Auto-GPT: A Real Second Test

Video Statistics and Information

Captions
In the age of AI, a lot can change in 24 hours, including my opinion about Auto-GPT. I left off on the last video saying that I might try some additional approaches with my example of buying an electric bicycle. In that first attempt I gave it an open directive: hey, find me the best bike that meets these criteria. And it failed-ish. I called it a win because it did find a bike, and I ended up buying that one, but it didn't complete; it was caught in a death loop with an error. But I thought that was pretty anomalous, and it did have a lot of promise, and I was impressed with it, and that's where I left things off.

Fast forward to the next day and attempt 2.0. I called it Comparison Bot. What I wanted this to do is take in a file with different product links that I had manually collected in the head-to-head competition last video, and I wanted Auto-GPT to go out and do some research, collect the results, and put them in a comparison table. I didn't feel like this was too big of an ask, even though this was a more generalized process where it is saying, hey, you are Comparison Shopper Bot; whatever links I give you in a text file, go and compare those. I was hoping to generalize it a little bit while giving it a softball of defining the links I wanted it to draw from.

So what happened was I prompted it and it took off, and it successfully read in the file, which is great, and I see it pulling in the links from the file. But unfortunately it then tried to parse it, and it tried to run a Python command, it looks like, and what came back is "use list comprehension only if it improves readability." And this happened over and over again. Once again Auto-GPT is beating its head on a brick wall. I really think it can have a counter for how many times it runs the same command and then automatically break out and try a different tack; that would be great. So attempt 2 was a failure: Auto-GPT got stuck in a death loop of not being able to read the links.

All right, attempt 2.1. Same thing, you're a comparison bot,
but this time I will format the input for you so that it is Python-friendly. And of course, by "me" I mean I'm going to let GPT-4 do it for me. It's at this point that I remember how slow GPT-4 can be under peak load. And let's fire this up. Yes, I will take the same configuration, please. I will let it go with a hundred commands as specified.

Interestingly, it looks like it is repeating the same web search, so I wonder if it's going to get hung up or if it's going to move on to the next one. It had looked like this had gotten stuck in a loop, but it's re-reading the same page each time it needs to query something else. So it'll re-download a whole Amazon page just to see about specifications, then it might re-access the whole page, breaking it apart into nine parts and crunching it all through, just in order to get reviews. So there is definitely room for increased efficiency here, but at least it's not in a death loop.

Once again I have concerns. Here's the thought: "Use Google command to find more information about the second candidate product." Now, the second candidate product is the Ancheer 350 watt mountain bike, but here it is, Google arguments: "what are the specifications of a Razor E300 electric scooter." We're now looking for scooters. It's not even on the list. It must have caught a link somewhere, and now it's going down a rabbit hole. Oh good, now it's shopping for headphones. Brilliant. How did we even get here? I don't see headphones on here. Oh great, okay, now we're gonna shop for a Galaxy A series. All right, I'm gonna have to shut down the experiment on this one. It has gone down a total rabbit hole. I gotta regroup and see how I can direct this in a way that it will stay on task.

So I canceled out of it with Ctrl+C. Let's see what it did get in its working files. It has knowledge of product one's text file, a high-level single paragraph; at least it's a bike. And let's see what product one has. It appears to be some of the specifications for the first link, the Ancheer bike. Yep, this is the one that it was
looking at. So the only information it was able to gather in its files was based off of the first link, before it just dove off a cliff into searching for headphones. So attempt 2.1: fail. It just plain got lost. I am in the market for headphones, so maybe it's more advanced than I thought, but right now it had the directive of just shopping for bikes. Okay, buddy, it's not easy being a general intelligence.

So let's try this one more time. "Welcome back." I would not like to run with the same settings. Your name is Bike Shopper Bot; you shop for bikes. This time I'm making it very explicit: this bot is an electric bicycle comparison shopper. No fiddling about trying to figure out what that means; I'll spoon-feed it to you. You compare and contrast the e-bikes given to you in products.txt. There are only three bikes by name: the Hover-1 Instinct, the Lectric XP Lite, and the Totem Victor 2. So your mission is to compare and contrast those three bikes given in the text file in, may I say, a list that Python can understand. Goal number one: find information about each bicycle. Goal number two: create a comparison table with a row for each bike and columns for important traits like price, motor watts, top speed, range, and customer rating. I want a matrix of products to features so I can compare them; that's what I've been looking for this whole time. Goal three: there is none. Keep it simple.

Let's try that. Let's have this run five operations. It's going through a lot of effort to formulate tables for displaying, which is promising for the output. I'm maybe five minutes or more into this process, and let me do a little race: I'm gonna see if I can create what I asked for in less time than it takes Auto-GPT. And time. It took me 11 minutes. I was able to find all these specs and put them in this ASCII-only table. This is the kind of comparison shopping info that I am looking for. From this, the Totem Victor 2 is looking like the winner of these three. It took me 11 minutes to put it together. How is Auto-GPT doing? Well,
it's an infinite loop of trying to run some Python code to search through the file, I believe. I'm starting to sense a theme with Auto-GPT, and that is the death loop. This time around it didn't even attempt to search the web.

Let's try it one more time. It's time for Bike Shopper Bot 2. I'm gonna fix it not searching the internet by saying right in the prompt that you do internet research. Unfortunately I do not have a recording of this run, but here's what went down. It did decide that it needed to do an internet search. Unfortunately, the Google search term literally included "products.txt," the name of the file. It did not reference the contents of products.txt, so it had no idea what it was looking for. Swing and a miss, buddy.

This must mean it is time for Bike Shopper Bot 3, wherein I very much emphasized that it should get the names of the products from the local text file, products.txt. Other than that, the same instructions. So let's fire it up. So it's going to read the file; this is very promising. Yea, verily, go forward for five steps. One thing I've learned about dealing with autonomous agents is that they are not a real-time tool. They're definitely for use in parallel or running asynchronously, where you set it off on its mission and come back to it. Yeah, unfortunately, six minutes in and it has entered yet another death loop. This is becoming a troubling habit.

This is actually the second time I ran this version three of the e-bike shopper. Last time I had the recording paused, but what it did is that it actually got into the Google search. It read the file, got the names of the bikes, it had it right there. It searched on Google for the first one using the name and "specifications," and this gave it the information it needed right away, which it captured in a structure. But then the second search it did, it tried to sell me a Swagtron. I swear, Auto-GPT, brought to you by Swagtron.

Look, I've gone through a lot of time and a lot of effort to set up the prompts as simply and directly as
possible, getting simpler and simpler, getting more and more explicit in what I wanted it to do. I was able to get the results I wanted quicker manually. Auto-GPT couldn't accomplish the mission under any amount of time. It made 128 API requests to GPT-3.5, text-embedding-ada-002, and GPT-4. It used all the might of the large language models and it wasn't able to get an answer. It went into a death loop, it went searching off on its own, it was unable to understand that when I mentioned a file it should look in it and get the information from there; it just tried to Google it. I had better results in my previous video giving it much more open-ended requirements.

So a thought occurred to me. I want to be fair: maybe my temperature setting is causing problems. By default the temperature is set to zero in the .env configuration. This is the temperature that will be used by OpenAI. So what I did is I turned it up to 0.4, which is just at the lower end of medium temperature, sometimes equated to creativity. So let's rerun Bike Shopper Bot 3 just like I had it before, with a temperature of 0.4, and I did confirm that the workspace was empty.

First thing it's doing is a Google search for the first item, and once again it is condensing that information really well. On the second step it looks like it is going after the second product in the file this time around. Uh-oh. So once again, for some reason Auto-GPT is doing its own thing and going after products that I didn't ask for. My theory is that it's getting confused by links returned from the Google search, because it hasn't gone to any website other than hitting Google to find out more information. I am going to let this run; if it can complete, I'm curious if it can compile all the results into the table as I requested. I am going to say that I don't think temperature played a part in the process. Right now we're getting the same failure mode as before: going after random items. I just scrolled back to review where it got off track. I legitimately
don't know how it got it in its head to go after the Red Rover. This is the Google search result from the successful second item that it picked up; the text returned from the search doesn't contain that keyword. I was considering stopping the process, but then I realized that the input also allows you to provide plain-text guidance. So I said, "stop searching and compile the results." I wanted to move on from this exploration phase and try to generate the output table. So with my feedback, it looks like it's going to try and go through the plain text files that it used and compile them. This thing's in a death loop. It's trying to go after the files, and it thinks they have an "_specs" extension on them, but they do not. If I look at the output, once again it gets partly there. Unfortunately, this is another failed attempt.

So getting to some of the results: on the API usage front, there were a surprising number of calls yet again, 236. This is far lower than what I had last time, but this is mostly due to the fact that Auto-GPT got tripped up so many times and couldn't even get to the point where it was utilizing the LLM. All in all, no big financial loss; it cost about 86 cents. This is with using GPT-3.5 Turbo for most of it and text-embedding-ada-002. But the failure today really can't be blamed on lack of smarts. I have the key for GPT-4; it used the API with GPT-4, and it still did poorly.

In summary, did Auto-GPT understand the assignment? No. While I got partial results in my previous video, today what we're looking for is some sort of completion with better prompting under better circumstances, and that just didn't happen. The system failed even when the pool of candidates was limited. The system got off track searching for random items. It wasn't able to read the plain text file, either because it wasn't formatted for Python or because it was confused by the file names that it created. And, with the theme of the day, it would usually get stuck with some sort of death loop. I have yet to see Auto-GPT complete with a successful run, let alone output the result I requested.

My takeaways from the experiments today: I don't know, man, maybe you can do better. Please let me know in the comments if you have some magic prompt setup that is getting you results with Auto-GPT. Perhaps you could micromanage the system through the input field instead of just blindly saying "yes, go about your next business." It really defeats the purpose to constantly have to babysit this, especially when it takes so long to respond. This is not a real-time tool, as I mentioned before. This is using GPT-4; it is not a lack of state-of-the-art technology.

What's my bottom line? Autonomous agents are very promising. I'm excited to continue experimenting with this, but at least in Auto-GPT's case, it's just not ready for use in real scenarios by real people right now. But I'm not giving up. I will apply some of these test cases to other autonomous agents: God Mode, BabyAGI, Teenage AGI, Jarvis. And I think this really speaks to a hole in the market for more targeted autonomous agents, for example a dedicated shopping bot like the one I was trying to create with my prompting. Additionally, I'm going to look for examples where people have successfully deployed Auto-GPT and other autonomous agents and see what these winning scenarios are. I feel like a lot of people are excited about it, as am I, but don't thoroughly run the systems through the wringer and see what the output is. When you dig deeper, you start to see some of the limitations. The upside to this being it's an open system: when you identify those limitations, or dare I say opportunities, you can do something about them and make this whole system better. I welcome your input, and if you have anything that you want me to test out, give me a heads up.
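
The repeat counter suggested in the captions (count how many times the same command runs, then break out and try something else) could look roughly like the sketch below. This is only an illustration under stated assumptions, not Auto-GPT's actual code: the class name `RepeatGuard`, the `max_repeats` threshold, and the sample command and argument strings are all hypothetical.

```python
from collections import Counter


class RepeatGuard:
    """Counts identical (command, arguments) pairs and flags excessive repeats.

    Hypothetical helper for illustration; not part of Auto-GPT.
    """

    def __init__(self, max_repeats: int = 3) -> None:
        self.max_repeats = max_repeats
        self._seen: Counter = Counter()

    def should_break(self, command: str, arguments: str) -> bool:
        """Record one proposed call; return True once it exceeds max_repeats."""
        key = (command, arguments)
        self._seen[key] += 1
        return self._seen[key] > self.max_repeats


# Simulate an agent proposing the same (made-up) command five times in a row.
guard = RepeatGuard(max_repeats=3)
proposed = [("execute_python", "parse_links.py")] * 5
decisions = [guard.should_break(cmd, args) for cmd, args in proposed]
print(decisions)  # the fourth and fifth identical calls are flagged
```

An agent loop using something like this would check `should_break` before executing each proposed command and, when it returns True, ask the model for a different approach instead of running the same call again.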
Info
Channel: Context Found
Views: 4,351
Keywords: Auto-GPT, Autonomous Agents
Id: yiJjESh-kao
Length: 16min 29sec (989 seconds)
Published: Sun Apr 23 2023