OpenAI Assistant Creation: FIRST LOOK! Let's start making AGENT SWARMS! Tutorial Walkthrough

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
good morning YouTube it is a good day to be in AI so yesterday we had the uh just complete uh tectonic shift that open AI did when they dropped a nuclear bomb uh figuratively speaking on the AI world so we've got a whole ton of new ideas and so what I figured I'd do is I just jump in and experiment and get my hands dirty so this is this is kind of old school how I used to do things where I would just kind of guess and um go from there so what I'm going to do is I'm looking at the new playground so we've got the playground we've got assistant and threads and so on and so forth um and one of the first things that people said on my video yesterday was hey why don't you um do uh an assistant of like your cognitive architecture and stuff so uh we were working on this um in another GitHub repo where it's like let's download all my transcripts and make them searchable well uh open AI Dropped a Bomb where they just it looks like they've built in rag retrieval augmented generation into the assistance so but rather than feed it the raw unstructured transcripts I'm going to trans transition everything into spr so spr is sparse priming representation which I made a video a while ago was a little bit controversial um but I was like don't use mgpt use spr and so a lot of people pointed out accurately that like you can do both right because um so the theory behind PR is that the language model already knows a lot so all you need is a few little bits and pieces basically some memory pointers to tell it how to reconstruct a memory um and this is actually inspired by how human brains work so human brains uh we have we have sparse memories meaning that basically what your brain does is over time as it compresses and compacts your memories it just remembers a few details and so like say for instance a party that happened 10 years ago you don't remember the exact details you remember who was there and so those are just little pointers in your brain that that you know says Bill Sally Steve and you know Todd or whatever if you go to a a party full of white guys like me um and so but then those pointers point to memories that are associated with those other people and so then it's like okay all of my memories about Todd are associated together and then the the the place where the party happened is its own set of pointers so basically the way that human memory works is it's like sparse priming representations embedded a Knowledge Graph uh everything that I know about uh neuroscience and human memory that's that's pretty much the closest thing so basically what I'm going to do is I'm going to take my YouTube transcripts and convert them into sparse priming representations I probably won't show you that I'll show you the script after I finish it but I just wanted to show you that I took one of my longest videos uh which this was 58 kilobytes worth of text and converted it down into 33 uh statements uh now it is a little bit lossy just like human memory but that's okay so anyways um I've got this copied on my clipboard so let's go ahead and jump over to the assistants so this is what I'm going to be doing so like Dave Shapiro YouTube uh questions or whatever uh and then I'm going to give it some instructions so these instructions will be like um answer uh questions about Dave's YouTube uh videos uh to the best of your ability if the uh question is not directly answerable um in the transcripts then uh do your best to infer impute uh or otherwise uh guess what the answer would be just tell the user that you don't have the exact answer and that you're speculating okay so then we'll probably choose the latest and greatest model cuz it's faster cheaper and smarter uh not only that it has a huge window and so then we'll do uh retrieval and upload files so I don't have this done yet so I'm going to pause the video real quick and come back and show you once I've got the goods to deliver uh but yeah so that's where we're going uh and uh yeah stay tuned okay it took an embarrassingly uh long amount of time to get this working but so I opened I updated my open AI um API uh or module in Python so let me walk you through some of the changes that I had to make so first if you when you update sorry about that so first when you update the uh the the client you'll need to start using it slightly differently so from open aai import open AI uh and then the response so then you have to instantiate a client so client equals open Ai and you set your key here uh obviously you're supposed to do it with an environment variable but I use Windows and environment variables are whatever remember I a very lazy coder so anyways then you use client chat completions create uh so that looks a little bit different than it used to um but then also for the for the response you do uh so this is the response object response choices zero message content um and that seems to work pretty well anyways so now what I've got this doing is I've got it churning through all of my transcripts um and it goes pretty fast so I just started this a few minutes ago or a few moments ago and we've already done 18 and so each of these is uh going to be much smaller than the original video so the these are the spr so these are the spr uh is just the compressed version so that you get like the cliffs notes um which is should be enough to just feed into it uh so this is using this in conjunction with retrieval augmented generation or whatever they're doing in the background should work really well uh and also you notice they're much smaller so the first 21 of these um is a grand total of 30 kilobytes um whereas we come over here to the originals and we select the first um let's see actually exclude those let's see the first 1920 that's uh half more than half a megabyte um so we're getting we're getting a reduction ratio of more than 10: one um which will be good because again uh feeding Superfluous tokens into the model one that slows it down but two it increases cost so this is an optimization step um when I was still doing consulting one of the things that I would often recommend for for clients is to um is if you can if speed is part of the ux consideration the user experience then do as much pre-processing as you can particularly the slow stuff um and then use the fast stuff at uh at inference Tim or at runtime such as using Vector search so if you've already done the processing this is slow and expensive so you do the slow expensive thing once and then you use Vector search which is cheap and fast uh in production we're converting all this this will take a little bit but uh like I said you know I'll just show you the good stuff at the end okay so while this is running um I thought maybe it would be good to just Riff on some ideas uh you guys seem to like my uh unstructured thoughts on some of this stuff so um as I mentioned in my video introductory video to open AI big big conference um like wait just wait because you know we were working on mem GPT and retrieval augmented generation and I was working on the Basher Loop and open AI comes and just nukes the entire thing I don't even need to think about it anymore and this is the pattern that I have seen over the last four years starting with gpt2 gpt3 GPT 4 and now gp4 turbo is that the things that are most useful the things that are just like kind of the basic tools uh just get built into the into the system and so like you know this pretty much completely invalidates everything that I was doing with you know Remo which was the rolling episodic memory organizer and the Basher Loop um and those sorts of things now that's not to say that there aren't cases where those very specific um things aren't needed but the next step after just having basic uh you know retrieval augmented generation into Bots is automatic knowledge graphs and Auto a atic sprs and that sort of thing um and so like you know yeah we can work really hard you know days weeks months um creating some of the tools but really on the path towards AGI as I've been watching uh open Ai and talking with people the last couple weeks um it really seems like their their approach is going to be a tools uh first approach and so what I mean by that is by adding one tool after another so figuring out what is what is the next bottleneck so for for instance with context Windows even at 128,000 tokens that's still not enough of a context window to do some things uh for instance if you've got a a scientific chatbot that needs to read literally thousands of papers and thousands of of pages of data and other stuff that's a lot to keep track of I think 128,000 tokens is like 6,000 pages so that's probably enough for most things but you still remember that like there's there's a lot of text out there and also some some of these things are very token heavy like raw data can be very token heavy and there can be a lot to keep track of now uh with 128,000 tokens that solves a lot of problems so it's like okay just make the window bigger great and if you know if we go up by another 32x that's going to be 3 million tokens this time next year uh which is like at that at a certain point it's just kind of like okay the the the the window size is solved then you just make sure that it can handle the memory inside and one of the things that Sam Alman said is that not only is the is the window uh bigger but it's better at keeping track of what information it needs within that gigantic context window which doesn't surprise me having done a few videos on um on LM infinite from meta as well as the um oh what was the name of the one from Microsoft the one for a billion tokens anyways I did videos on both of those so there's hundreds of little algorithmic improvements on the attention mechanisms and and and attention masks that you can do so you know I like I said I wouldn't be surprised if we have a million tokens or 3 million tokens this time next year um which is that's getting close to the amount of text that a human reads in an entire year uh so you know you think like okay so uh the the ratio is roughly right now it seems like it's it's kind of like seven uh 7 to 10 so like 100,000 tokens is 70,000 words so that's like a good book so you you you double that or you you go up 10x so a million tokens is rough is roughly U maybe 10 books or whatever so anyways the idea is like most people read 5 to 10 books a year um and so a million tokens is literally a year worth of reading now obviously that doesn't include all the emails and all the text messages and all the videos that you watch um but like you know you can see running in the background it's compressing literally like hundreds of hours worth of videos into a few succinct statements so like this knowledge compression is really like ramping up um and so uh going back to the idea of open AI is building tools so Knowledge Management that is a huge like tool um and then all the all the function calls all the other tools that it has access to so like you combine those basic capabilities with the reasoning engine of the uh of the GPT model and then you add a software architecture uh where you have some agents that are responsible for organizing the whole thing you have other agents that are responsible for doing you have other agents that are respons responsible for Morality ethics and legality then you have the the beginnings of autonomous systems and it really like I mentioned this yesterday and I'm just going to say it again because I think it it Bears repeating is that we are working towards um a swarm intelligence is going to be the best way to characterize the way that AGI is going to emerge and so it's not going to be like one thing that you talk to it's going to be more like a bive or an ant colony um now will we have a board Queen to help out that could be pretty cool um let's see rate limit oh you can see how fast it's spinning okay I need to pause this cuz I hit my rate limits oh yep it killed itself all right let's see how far we got so under transcripts we have a total of 282 and then under sprs we got to 86 you know what I think this is probably good enough to get started so let's just move on to uh building our assistant so let's zoom in a little bit all right create David Shapiro YouTube uh chatbot doubl spaced okay so you are a Q&A bot for David Shapiro's YouTube channel use the transcripts uh to answer questions as best you can you are also allowed to infer impute or guess if the question is not immediately answerable just do your best to fill in the blanks and let the user know that you are speculating okay so then the model we're going to use is the latest GPT 4106 preview and then we're going to do retrieval and we're going to upload a bunch of files so let's come over here to YouTube chapter generator SPS and away we go 80 videos is only 123 kilobytes so like that's actually a relatively trivial amount of of data overall and we can see it's uploading so this shouldn't take too long there we go so now we'll do save uh let's see validation error request make sure it has at most 20 items oh darn okay so unfortunately it looks like we can only do 20 files right now so that's uh that's kind of understandable I suspect that this limit is going to get uh is going to get removed very quickly especially since we're doing really small files so let let me try this differently so basically this entire thing was like a wasted waste of opportunity um okay so let's go to upload let's try um well here we can only do 20 so that's fine um let's choose all my AGI ones so that's um let's see that's six and then uh let's see anything that says cognitive architecture uh core objective functions that should be nine if I'm if I'm keeping trying um then Ace Ace is good stuff far priming representations all right that's good enough all right so this one will be just a few uh unfortunately it looks like we're limited to 20 files maybe what I could do is is is reformat everything to fit within those 20 files but let's just start here all right and I this is my first run so like I'm learning as I go uh so you're learning with me so let's see what happens next sips coffee buer buer that's that's an old reference Ferris Beer's day off hey shout out in the comments if you if you watch that movie that movie came out when I was like pretty little and so then I remember watching watching it and being confused about like I don't know if you saw like an explicit sex scene but it was like an implied sex scene and I'm like I remember asking my dad like what are they talking about like did they sleep together and he's just like I'll tell you when you're older but I remembered it and I put it together later not that you needed to know that this is really taking its time you can tell I'm getting bored when I'm starting to Riff on Ferris buer day off maybe I should stop recording so actually I wonder um since it's taking so long to upload I wonder if it's doing some kind of processing in the background you know it might be helpful if I go look up the documentation maybe I'll do that okay here we go I found the documentation for knowledge retrieval uh let's see how it works either passes a file content for short documents performs a vector search so it does do retrieval augmented generation I'm wondering why it's taking so long though uploading files for retrieval okay files can be added to the message in a thread so okay I mean it seems it seems like it's pretty pretty straightforward but let's see if it's done still not done okay well I guess I'm going to pause the video and we'll come back when it's done okay so I'm trying for a third time and it keeps timing out so we might just have to call it a day which is really sad but it's also not surprising cuz it just launched so I imagine like a million and a half other people are trying to build assistance right now but uh so like I've I've hit save and it's timed out and I keep coming to my assistant page and it's not here so really disappointing but it's also day one um but at least I walked you through the process of what I would do and some of the things that I figured out so now that you can uh you can do it too all right uh have a good one like subscribe etc etc uh as I mentioned on my videos I do have a uh exclusive Discord community that you can get access to through patreon links in the description also if you're not into Discord that's fine I have exclusive content um that is available to all patreon supporters there's two tiers there's the basic tier um which has uh one additional weekly video and then the premium tier has another additional weekly video on top of that so if you're on the premium tier you get two additional weekly videos uh that are raw unfiltered and deep dives into various topics Plus we also have live uh live stream Q&A sessions uh Discord Town Halls they'll be recorded and that sort of thing but yeah Jump On In um and I look forward to seeing you there have a good one
Info
Channel: David Shapiro
Views: 78,271
Rating: undefined out of 5
Keywords: ai, artificial intelligence, python, agi, gpt3, gpt 3, gpt-3, artificial cognition, psychology, philosophy, neuroscience, cognitive neuroscience, futurism, humanity, ethics, alignment, control problem
Id: Xp5uVthS-A0
Channel Id: undefined
Length: 18min 11sec (1091 seconds)
Published: Tue Nov 07 2023
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.