AI Builds Stuff in Minecraft

Video Statistics and Information

Captions
Hi. First off, I reached 100,000 subscribers. Wow, that is incredible. Thank you so much to everyone who watched, subscribed, or supported me in any way. Life has been a little hectic recently, but I wanted to celebrate with a fun little video about Mindcraft, the project that lets AI chatbots play Minecraft. I've made a little update that lets AIs other than ChatGPT control agents, and I want to compare their creative building skills.

So we have three agents, each powered by a different large language model: Google's Gemini, Claude 3 from Anthropic, and an upgraded GPT-4 Turbo. These should all do a lot better than GPT-3. Now, they can't chat with each other, that's not the point of this video, and I can't chat with all of them at once, but I can talk to each of them individually. Okay guys, you need to share. Come over here, let's give each other some space. Okay, so I'm just going to give them a bunch of resources like cobblestone and planks and stuff, and then ask each one, using the same prompt, to first check their inventory and then use what they have to build various things. In this case, a house with a door.

GPT's up first. Let's see how it does. Right off the bat, it gets the sequence of responses correct: it checks its inventory and then calls new action to perform some custom behavior. This allows the language model to write a piece of JavaScript code that controls the bot. Here it uses the place block function to build a structure. This is immediately much more reliable than GPT-3. It forgot the ceiling, but it got the door. Usually GPT-4 does better. This is not a very fair demonstration, but it's what it happened to do.

Okay, now for Claude. This is Claude Opus, which is basically the state of the art for language models right now. It is big and smart and slow. Well, that's basically just a box without a door, and it doesn't seem like it's done building. It seems to have recognized that something went wrong, and then it tries to build again and ends up building a second house that overlaps with the first one. Honestly, this second house is pretty good. It's got windows, but still no door.

Okay, now for Gemini. So it got confused. It did not call new action, and I have to tell it to do so, and then it calls the wrong command. It calls place here, which just places one block in the current location. You can't build houses with it. That's a common confusion, and Gemini failed.

Next challenge. This time we'll do some pyramid building. I've cleared their inventories and given them some sandstone, and then asked them to build a pyramid. I'm asking all of them at the same time, so we can watch all of them at once.

All right, that's pretty good. Now we know how they built the real pyramids: they used ChatGPT. Claude built a slightly bigger pyramid with alternating layers, pretty cool, and GPT made the capstone a chiseled sandstone block. Neat. I think they both did pretty well. And Gemini completely failed again. It made the same mistake.

Okay, now I've given them a bunch of logs and leaves and flowers and asked them to make a garden. This is the Emergent Garden channel, after all. I like it. It's got some funny-looking trees, but it's a pretty little garden. I think Claude ran into some issues with its first attempt and is trying again. That's pretty good, but I definitely prefer GPT's garden. And now for Gemini. I'm going to call new action myself and force it to write code to build the garden. And off he goes. Where are you going? Okay, yeah, good job. That's a really nice garden, buddy. Why don't you come hang out in here for a while and let the big boys play for now?

Okay, I'll be back for the final challenge. I will be giving them all of this stuff and then asking them to build a creative and interesting structure. I've left it very open-ended to test their creativity. By the way, you will notice that a lot of the cobblestone or dirt that they place isn't actually part of the final structure. They are scaffolding blocks that they use to get up to places that are out of reach. Right now I don't have a great way of removing those scaffolding blocks, so they're just part of the structure.

So ultimately Claude ended up running out of resources and then tried to fix the problem by building another tower, which was inside of the first tower, so it ends up being this really messy, weird tower. But I'd say it's more interesting than GPT's creation, which is basically just a box with alternating patterns on the walls. So I'd give this one to Claude. Let's check in here. How are you doing, Gemini? Could you give me a sec? Okay, okay. I guess Gemini's pretty busy right now. Guys, we're going to have to come back later. Stupid.

So ultimately I would say that GPT-4 and Claude 3 Opus are pretty neck and neck. They're just about the same. Sometimes Claude is better, sometimes GPT-4 is better. But this is not a very rigorous scientific comparison, so take it for what it's worth. Gemini comes in dead last, and this isn't super fair to Gemini, because this is basically their worst model. I don't know how to get my hands on Gemini 1.5. But even so, the cheapest Claude model is light-years better than Gemini 1.0. So yeah, not very impressed with Gemini.

Okay, finally, I want to show off the coolest thing I have ever gotten any of these agents to make. I wanted to make a skyscraper. Claude was struggling with it, but after a few tries I was able to get GPT-4 to come up with something really impressive. I did have to kind of cheat and constantly supply it with resources as it ran out, but every single block was placed by AI. I'll let the results speak for themselves.
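The captions describe the model writing JavaScript that calls a block-placing function to build a structure. As a rough illustration of what such generated code could look like for the pyramid challenge, here is a minimal sketch. It is not the project's actual code: `planPyramid`, `executePlan`, and the `placeBlock(blockType, x, y, z)` callback are hypothetical names chosen for this example, and the real function's signature may differ.

```javascript
// Plan a solid stepped pyramid: every layer is a square that shrinks by one
// block on each side as the height increases. The top layer uses a
// different block as the capstone (as GPT did in the video).
function planPyramid(origin, baseSize, body = 'sandstone', capstone = 'chiseled_sandstone') {
  const plan = [];
  const layers = Math.ceil(baseSize / 2); // number of layers until the square vanishes
  for (let layer = 0; layer < layers; layer++) {
    const size = baseSize - 2 * layer;
    const blockType = layer === layers - 1 ? capstone : body;
    for (let dx = 0; dx < size; dx++) {
      for (let dz = 0; dz < size; dz++) {
        plan.push({
          blockType,
          x: origin.x + layer + dx,
          y: origin.y + layer,
          z: origin.z + layer + dz,
        });
      }
    }
  }
  return plan;
}

// Place blocks bottom-up, in plan order, so each block has support below it.
// `placeBlock` is a hypothetical async callback standing in for whatever
// function the bot framework exposes.
async function executePlan(plan, placeBlock) {
  for (const step of plan) {
    await placeBlock(step.blockType, step.x, step.y, step.z);
  }
}

// Demo with a fake placeBlock that only records calls instead of
// touching a real Minecraft world.
const placed = [];
const fakePlaceBlock = async (blockType, x, y, z) => {
  placed.push({ blockType, x, y, z });
};
executePlan(planPyramid({ x: 0, y: 64, z: 0 }, 5), fakePlaceBlock)
  .then(() => console.log(`placed ${placed.length} blocks`));
```

Separating the plan from its execution keeps the geometry easy to check without a running game, and placing layers from the bottom up is one way to avoid needing scaffolding blocks for a shape like this.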
Info
Channel: Emergent Garden
Views: 464,690
Id: Xd5PLYl4Q5Q
Length: 12min 44sec (764 seconds)
Published: Fri Mar 29 2024