BabyAGI: A Real First Test

Video Statistics and Information

Captions
Today we're exploring BabyAGI. I've been excited about this one for a while. This is from Yohei Nakajima. What I'd like to do today is just speed-run it. We have the instructions from GitHub; just search "BabyAGI". How to use: step one, make a directory for it. Clone the repository. Looks good. We will copy the .env example and make it just .env. Let's see what we've got. Only a handful of variables here. By default it is running GPT-3.5 Turbo, which I think was a recent change from GPT-4 in order to reduce expense, and I have seen good results with 3.5 Turbo. And man, it puts the turbo in Turbo. The original temperature is zero; we'll keep that in mind. Super secret API key goes here. I'm hoping that it'll do the local Chroma database out of the box, and maybe we'll do something a little more modest than "solve world hunger", considering Auto-GPT had a hard time with "find me bicycles".

I do have high hopes for BabyAGI. It seems like it has a more complete architecture. To keep it simple and to do apples to apples, I'm going to rerun the example that I had from the previous engagement with Auto-GPT: create a table comparing the specifications of the top three electric bicycles for an all-weather five-mile commute costing less than eight hundred dollars. Here we're just defining a single initial task instead of five sub-goals, so I'm going to say "figure out what specs a good candidate e-bike should have", and that's all I'm gonna do before I let it go. Let's make sure we install the requirements. All right, looks like the only thing to do is to run it. It's so exciting hitting Enter on something like this, just not knowing what's gonna happen.

Wow, that just jumped. It generated the entirety of the text before it displayed it. What I'm seeing is pretty spot on. It's nailing the 250-watt motor, 20 miles of range. It's even going after subjective quality-of-life things like frame material making it lighter. Already we're seeing some results. Look at the Ancheer, oh, and our friend the Swagtron. Oh thank goodness, I wouldn't
be able to evaluate an autonomous agent if it didn't try and sell me a Swagtron. That's going to be my first product sponsorship. Swagtron's gonna have to send me an EB A5. I'll see if I can power our local language model. I am already feeling warm fuzzies about the information that's coming across. I don't know if there is a setting for pausing it or breaking out of it on demand.

Interesting: it already came up with a task result of comparing the three candidates and their max speed and range, and now it's generating another table comparing their foldability and weight. This is really impressive so far, and again, I think this is running on GPT-3.5 Turbo. Look at this table already: maximum speed, range, weight, foldable. Already it's blown Auto-GPT out of the water on results for this task. It's getting thorough. Now it's going after weight capacity. It seems to be doing well combining the data together into tables continuously, but it's going to blow me away if it puts it all together in one at the end. It's interesting, because it seems to be pulling a lot of the same information, but it seems to be handling it so much better. Maybe everything's behind the scenes, but it is just not throwing errors. It's talking about very similar things, but when you look at it, it keeps looking at different metrics in these tables and asking from different angles, and thus far we're only on task five.

Man, I'm getting a bit concerned about this task list right here: "research and compare the top three electric bicycles for an all-weather five-mile commute", same thing for the next task, comma, "costing less than 800", "based on power and speed", "based on their battery life and range". I wonder if this is going to work out in the end. They're doing slightly different things, so if it's not recycling and searching from the web every single time like we saw before, if it's drawing on the Chroma DB, then we might still be okay. It's not setting any land speed records, as usual. It seems to be talking about a lot of the same
specifications and the same candidates, so it's a bit concerning, but there are new variations to it, like durability and maintenance, that make you think it is making slow progress. While all this is running, I've been reading reviews about autonomous agents, and a lot of people have the same concerns: has anyone seen an actual good use case for this? Has anyone seen one of these agents complete and give you what you want? I really have my fingers crossed on BabyAGI on this run. This is very promising. We already have a well-formatted table, and the tasks in the remaining queue are related to adding new columns to it from the data store. I'm actually getting kind of excited.

The table that it's working with keeps bouncing around. It keeps me on the edge of my seat. It's not just growing with one more column after the other as it goes through the steps. When it refers to them by the rows, it has bike one, bike two; down here it's referring to them as bike A and B; previously it had their names. I think it's keeping them straight, but it keeps you guessing. I'm worried now. The next task here is to compare the weight of the top three electric bicycles and add the information to the table. Two tasks later we have the same exact next task, but it loses the price. Oh come on, man, you're so close and yet so far away.

Before I shut this down, let's make sure that it's actually failing. Going up the list over the past actions: it wanted to add the price, the maximum distance, price, weight, maximum distance, and it's working on weight again. Obviously this is in a death loop of its own. It's doing so many promising things, but it's just not finishing out. It's regenerating partial tables the entire time. So let me regroup. Ctrl-C out of that. Hold on, I have an idea. Hmm, how do you mount a Chroma Parquet file to query it?

Well, now you've gone and made me do it: I had to open up the code. The majority of BabyAGI is pretty straightforward. The main part of it is under 500 lines, and it's composed of functions like the task
creation agent that returns a list of tasks, which inside is using the OpenAI call function. It's really exciting how much of a Lego-block system this is. All this digging got started because I wanted to know more about the Chroma database, and it turns out they just recently replaced Pinecone with Chroma as the default. So what I'd love to do is open up these Parquet files that are persistent storage for the Chroma database. You have Chroma collections and Chroma embeddings. This is the data that it collected throughout the run. I would love to be able to query this and see if it's actually a work product that's useful. So I went down the rabbit hole of Chroma DB and wasn't immediately able to find out how to mount those files, and that's how I find myself in babyagi.py, the actual Python code. What I'm going to do is insert an option, when you fire up BabyAGI, to just query the database. One thing I'm seeing is this is designed to be an infinite loop. At the same time, if it completes all of its tasks, it's supposed to sleep for five seconds and recheck the task list. It looked like there were future development plans to be able to inject commands into the task list in real time.

Okay, so I modified things so that it will ask for user input, so that when we fire it up, if we enter "run" it'll start the normal BabyAGI infinite loop, if we say "query" that'll allow us to query the database, and any other input will exit out. It's a good day to remember Python. All right, let's see it do some query stuff. "query". Enter the query: "bike". All right. Okay, man, I thought it was so cool I just turned into a surfer dude. So what we're seeing is the information that was stored to Chroma DB from the original run. Let's look at "watts": compare the motor. Let's try "speed" and "weight". This is exactly what I was trying to test: if I rerun BabyAGI, it should have this information still stored, so it has some sort of memory. And this is the beauty of the structure: this is the only code that I added in order to use what's
here to query the Chroma database. So the result here is that we can query something and it will give back the top 10 related items in the database: "compare the weight of the top three electric bikes and add the information to the table", "motor power added to the table". What I've done is make a quick modification so that it will actually prompt you if you want to go into query mode, and when you do that you can type in a query like "add to table" and it will give you the top 10 results from the database that are related to that query. This is quite interesting. So theoretically, if you keep running the script in the same place, it will gain knowledge over time that it can use as context. There probably isn't much intelligence baked into it to prioritize what's getting queried, but that's just a matter of implementation. The core capability is there, unlike Auto-GPT, which left you with some bare-bones text files. Perhaps I'm shortchanging it though, because I think you can bolt on Pinecone.

Now that we know that it is storing some information in the database between runs, for test number two what I'm going to do is run a pared-down list of requirements. The main difference here is that I'm specifying what columns I want in the table, and I will use my new input box to start the agent. Now that I know that its default mode of operation is an infinite loop, what I'm hoping for is for it to settle on an output table, and then we'll probably go into a five-second periodic hold with an empty task list.

So here's what I was looking for. I think these got pulled from the database. Look at that, generating the table. So, interestingly, similar to the first run, it has the end result but it still has quite a few tasks in the task list. Interestingly, it went back to referring to the bikes as A, B, and C. What's this new entrant, electric bicycle D? All right, this is kind of falling apart. It started adding frame material again. It has a partially formed task on the task list. So I think it's actually being hurt more
than helped by the old information being stored in the Chroma database. I will cancel this, clear the persistent files, and give it another run. Same prompt as before. It looks like we're starting fresh with the task list, and of course the same candidates are showing up, although the third one is different, really lending credence to this being a fresh internet search. And it looks like it got the critical stats right off the bat. It has a table that looks like it answers the mail, but it's still chugging along. We're down to two tasks, but where did our table go? Curses, it has gone back to making up columns that it's looking for. Unfortunately, this baby is stuck in a death loop. All the way up here on the fourth task it actually had an answer. The table was well formed, all the candidates met the criteria, but it just kept going, and the task list never got shorter. It changed, so it wasn't a repetitive death loop. Maybe it needs a new term: death by permutation, where it starts decaying and going down rabbit holes of its own devising the longer it goes. One of the problems is that BabyAGI is supposed to run forever. It's supposed to reach a steady state of zero tasks.

What are the takeaways from this first real test of BabyAGI? Did it understand the assignment? Yes-ish, but it got into death loops, or more accurately task decay loops, where the longer it ran, the more its tasks repeated themselves or went in the wrong direction. Shout-out to Chroma DB: I was able to get it to query data that had been generated from the run. While this doesn't seem to help subsequent runs in my experience, it was really cool to see inside, and potentially this could lead to subsequent runs having some sort of persistent memory, but don't hold me to that; it hasn't happened yet. The bottom line is BabyAGI has a lot of promise. Number one, it is much lower cost than Auto-GPT's run, definitely helped by only hitting GPT-3.5 Turbo and text-embedding-ada-002, to the tune of 513 calls across all the runs I did and a grand
total of 27 cents. So it is lower cost, but don't think that GPT-3.5 usage makes it less intelligent. It was very capable on the task that I had. If it hadn't had task decay, I think it would have come to a very decisive result. The last thing that gives me a lot of hope for the future of BabyAGI is that the project is well structured and easily modified, as is evidenced by the minor changes I was able to make in order to add the new functionality of querying the Chroma DB store, which I did share to GitHub: Context Found slash BabyAGI, link in description. It's a very similar bottom line: get in there and experiment with BabyAGI. It's a really cool framework, and I expect it'll have rapid innovation, and maybe some of that innovation can come from you.
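
The run / query / exit prompt described in the captions can be sketched roughly as follows. This is a minimal illustration and not the code shared in the video: the names `start` and `top_matches` are hypothetical, and the word-overlap ranking is only a toy stand-in for the embedding similarity search that Chroma performs. It assumes the stored task results are available as plain strings.

```python
from collections import Counter
from typing import Callable, List


def top_matches(query: str, documents: List[str], n_results: int = 10) -> List[str]:
    """Return up to n_results stored results that share words with the query.

    Toy stand-in (an assumption): a real vector store ranks by embedding
    similarity, not by counting shared words.
    """
    query_words = Counter(query.lower().split())
    scored = []
    for position, doc in enumerate(documents):
        doc_words = Counter(doc.lower().split())
        overlap = sum((query_words & doc_words).values())
        if overlap > 0:
            # Negative overlap sorts best matches first; position breaks ties.
            scored.append((-overlap, position, doc))
    scored.sort()
    return [doc for _, _, doc in scored[:n_results]]


def start(
    mode: str,
    run_loop: Callable[[], None],
    ask_query: Callable[[], str],
    documents: List[str],
) -> List[str]:
    """Dispatch on the startup input: 'run', 'query', or anything else to exit."""
    mode = mode.strip().lower()
    if mode == "run":
        run_loop()  # the normal agent loop, supplied by the caller
        return []
    if mode == "query":
        return top_matches(ask_query(), documents)
    return []  # any other input exits without doing anything
```

In an interactive script, `mode` would come from something like `input("run / query / exit: ")` and `ask_query` from a second `input` call, with the matches printed one per line.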
Info
Channel: Context Found
Views: 9,195
Id: ldRHBHrQgPQ
Length: 14min 32sec (872 seconds)
Published: Thu Apr 27 2023