AI SDK 3.1 First Impressions

Captions
The Vercel AI SDK is easily one of the pieces of technology I'm most excited about right now. It has been incredibly fun building out real applications with it. I didn't expect to do much with it, and I've very quickly fallen into actually wanting to make stuff with it, because it feels really new. It's bleeding edge, and it lets us do new things that aren't just basic CRUD applications. It's a really, really cool set of tools, and they just released version 3.1, so I want to talk about what's in the release, show you how I'm using it, and share how I'm thinking about the future of these kinds of AI applications, because I'm actually building one right now and figuring out how to fold this into my bigger projects. A lot of really good stuff here, so let's jump in.

The first item in "Vercel AI SDK 3.1": ModelFusion joins the team. I looked at ModelFusion briefly beforehand, and it's really just the underlying provider layer they seem to be using to get APIs for all of these different AI models, so it's not super important here. The things I'm really interested in are the AI SDK Core, the AI SDK UI, and the AI SDK RSC.

First is the AI SDK Core. Drawing inspiration from projects like Drizzle and Prisma, you can imagine the AI SDK Core as an ORM-style abstraction for LLMs, and I really, really like this. It feels like a great way of thinking about how we work with these models. For a while now we've had ChatGPT, Anthropic's Claude, and all these different AI models, and we've mostly just been building wrappers on top of them; that's where the "GPT wrapper" term came from. This gives us a really good way of interfacing with them to enhance our applications and do more complex stuff. Obviously you can just put a front end on a basic OpenAI call and have a chat interface, but as they say here, these new APIs provide a set of unified low-level primitives to work with LLMs in any JavaScript environment, abstracting away the quirks between major model providers. This simplifies integrating LLMs down to just two decisions. One: what kind of data do you want to generate, text or a structured object? The structured-object piece is incredibly useful; I'll show you how I'm using it in the real world pretty soon. Two: how do you want it delivered, incrementally streamed or all at once? You'd think you'd always want to stream, but there are actually a lot of cases where you want it all at once. I had a use case where I needed to parse text out of the user's message and then run a search off it, so we couldn't do anything until we had the full response from the AI; that's a normal generateText call. Versus, if you're streaming down to the end user, say handing them a recipe or a breakdown of their data, you want to stream it, because that creates a better user experience. So the SDK gives you the ability to pick. They also have a nice diagram of the abstractions we're working with: at the base level are the providers, which are OpenAI, Mistral, Google, and Anthropic's Claude, and I think they're adding more later. Hopefully Llama gets added at some point; I really want to see more open-source models in here, and that's something I'm personally going to try to do. I'm building with just the GPT models because it's easiest.
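The "two decisions" above can be sketched without the SDK at all. This is not the SDK's code, just the mental model: the same answer either returned in one piece (like generateText) or yielded in chunks a UI could paint as they arrive (like streamText). Chunking by word is purely illustrative.

```typescript
function generateAllAtOnce(answer: string): string {
  // Caller blocks until the whole response exists, e.g. the
  // parse-the-user's-text-then-search use case described above.
  return answer;
}

function* streamIncrementally(answer: string): Generator<string> {
  // Caller can render each chunk immediately for a better UX.
  for (const word of answer.split(' ')) {
    yield word;
  }
}

const full = generateAllAtOnce('Here is your recipe');
const painted: string[] = [];
for (const chunk of streamIncrementally('Here is your recipe')) {
  painted.push(chunk);
}
console.log(painted.join(' ') === full); // true
```

Either way the content is identical; the choice is purely about when the caller gets to see it.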
But eventually, especially when I take these things to production and we start scaling them up, I'm going to want to figure out how to self-host some of those open-source models and fine-tune them a little. Obviously this is bleeding-edge tech and I'm super new to it. I have been building a real application, but there is so much to learn, and I think everyone is still figuring out how to actually use these things. Anyway, the way they frame it: we have these providers, and the AI SDK is just a way of talking to them. That's a very similar mental model to an ORM talking to a database, and I really, really like that framing. Here's what it actually looks like in code: all we do is call await generateText, pass in whichever model we want (in this case they're using Mistral), pass in a prompt, and we get the text response back. That's much more pleasant than the old base-level ChatGPT or Mistral APIs; I've worked with those before, and they hand you arrays back and a lot of weird, annoying things. Honestly, the SQL metaphor is really good here: generateText is kind of like a db.findFirst, if that makes sense. It just gives you one text entry, the way findFirst would give you one user profile entry, with type safety and all of those nice things. They also show that you can easily swap the model out for GPT-4 Turbo, and this whole core part is completely framework agnostic: I could use it in SvelteKit, in Next.js, in Remix, whatever, and it all works, which is really nice.

Another piece down here, which has been a godsend in building actual real-life applications, is generating structured data. This was really painful before SDKs like this existed. You'd have to bully the model, "hey, give me output in this exact format," and it would hand you some JSON text that you had to parse and hope it didn't error. They basically wrap all of that for you: you just declare the schema of the response you want, say a recipe with a name, a list of ingredients, and a list of steps, pass in a prompt like "generate a lasagna recipe," and it produces a validated, type-safe object. That's awesome; it abstracts so much boilerplate nonsense away from you, and you get back a classic lasagna with its ingredients and steps. Great stuff.

Then down here we have the AI SDK UI: a chat interface in seconds. This was actually one of the first things I played with. You create your little POST request up here, the backend endpoint that handles everything: we take the messages in from the request, stream out the text, and return the result as a streamed text response. Then on the page itself we just have this nice useChat hook.
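To make the structured-data idea concrete without the SDK, here is a dependency-free sketch of what generateObject is saving you from doing by hand: take the model's raw JSON reply and validate it against the shape you asked for before your app ever touches it. The Recipe type and validateRecipe helper are my own illustration, not the SDK's API (the real thing takes a Zod schema and handles this for you).

```typescript
type Recipe = { name: string; ingredients: string[]; steps: string[] };

function validateRecipe(raw: string): Recipe {
  // Step 1: the parse that used to randomly blow up on malformed output.
  const data = JSON.parse(raw);
  // Step 2: the schema check the SDK now does for you.
  if (typeof data.name !== 'string') throw new Error('name must be a string');
  for (const key of ['ingredients', 'steps'] as const) {
    if (!Array.isArray(data[key]) || !data[key].every((x: unknown) => typeof x === 'string')) {
      throw new Error(`${key} must be an array of strings`);
    }
  }
  return data as Recipe;
}

// Pretend this string came back from the model:
const reply =
  '{"name":"Classic Lasagna","ingredients":["lasagna noodles","ricotta","tomato sauce"],"steps":["Boil the noodles","Layer and bake"]}';
const recipe = validateRecipe(reply);
console.log(recipe.name); // Classic Lasagna
```

With generateObject you declare the schema once and get the validated object back directly; all of the parse-and-hope above disappears.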
That hook gives us all the pieces we need to work with this: a list of the messages, a really easy way of handling your input, handleInputChange, handleSubmit. It abstracts a lot of the boilerplate away, which I'm a huge fan of, and it's how I'm doing a lot of the stuff in the Weights AI application I'm building, which I'll show you pretty soon.

Finally, the one that really got me into this, and I think the most interesting and exciting of all of these, is the AI SDK RSC: moving beyond text. This is where we get the fancy generative UI stuff, the tool and function calls, which are incredibly useful and will let us build much more complicated, advanced, and genuinely useful applications. With the previous pieces you really are in basic GPT-wrapper territory; that useChat example is literally just putting a front end on a ChatGPT API call. With this, as they put it, we solve two key problems of LLMs: limited or imprecise knowledge, and plain-text or markdown-only responses. We fix that by providing UI ourselves and handing the model pieces that actually perform useful functions. Right here they have a nice example with a submitUserMessage function, which looks very similar to what I implemented in Weights AI. There's "use server" up top, so this runs server-side, and then we get our result and stream the UI down: we pick GPT-4 as our model, pass in our messages, and have our basic text response. But where it gets really interesting, and where the really cool stuff comes out, is the tools section. In the tools section you can provide the AI SDK a bunch of different things it can now do; you're basically handing it a bunch of functions it can call, and those functions return UI. Here they have the getCityWeather example: get the current weather for a city. You declare the parameters, so the city has to be passed in, and the LLM is smart enough that if you just say "hey, give me the weather for San Francisco," it parses San Francisco out of the query and passes it in as the city parameter. Then you generate the UI for it: yield a spinner, do const weather = await getWeather (a backend database call we provide ourselves; I'll show a real-life example of this later), and return the weather component with the info passed in. That's it: we've streamed down real UI, and the demo shows exactly that. "What is the weather in SF?" There it is. Really, really cool.

They close with "towards a complete TypeScript AI framework," which I think is exactly what they're going for: Vercel AI SDK 3.1 marks an important step towards delivering a complete TypeScript AI framework. With the AI SDK Core you get a unified API for calling LLMs that works anywhere JS or TS runs. With the AI SDK UI you can build chat interfaces in seconds. And with the AI SDK RSC you can go way beyond chat interfaces and deliver the next generation of AI-native applications. That's what I want to build: AI-native applications. That's what I'm really interested in right now. So, with that out of the way, let's look at the documentation, and then I'll show you what this looks like in the real world. They completely refreshed the docs, and it's so much better; the old ones worked, but they were very basic and a little chaotic. Let's start with prompts.
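The yield-then-return pattern described for getCityWeather can be sketched as a plain async generator, with strings standing in for React components. This is a sketch of the idea, not the SDK's streamUI API: yield a placeholder right away, then return the real UI once the slow lookup resolves.

```typescript
async function* getCityWeather(city: string) {
  // Shown to the user immediately while we wait:
  yield '<Spinner />';
  // Stand-in for the await getWeather(city) backend call we'd provide:
  const weather = await Promise.resolve(`62°F and foggy in ${city}`);
  // The final UI replaces the spinner once it's ready:
  return `<WeatherCard>${weather}</WeatherCard>`;
}

const gen = getCityWeather('San Francisco');
console.log((await gen.next()).value); // the spinner, streamed first
console.log((await gen.next()).value); // the weather card, swapped in when ready
```

The SDK does the same dance with actual server components, streaming each yielded piece down to the client as it becomes available.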
Prompts are the base layer we're working with in generateText, streamText, etc. There's a bunch of stuff in here, and you'll notice it feels like Prisma or Drizzle documentation: a bunch of functions for interfacing with our AIs the way Drizzle would interface with our database. We have result = await generateText, we pass in the model and the prompt, and we get back a nice response. We can also pass in parameters. The message prompt is where it starts to get fancier, because we can have multiple roles. Role "user" is what the end user typing into their little GPT wrapper would input; the "assistant" role is the AI's response; and then back to the user. This way we can basically record a conversation. Say we had ten exchanges: we record all of them, and every time we call generateText we pass in all of those messages, so the AI has the context of what was previously said. It would know the assistant had already said "hello, how can I help," so it can tailor future responses to make sense in that context. If, for example, it had already given you a piece of information, or had asked you for more information, it would know what it asked for. This lets us do more complex stuff. We can go further and add multimodal messages, and it's really cool what we can pass in here: we can actually pass in images, asking the model to describe an image in detail and literally handing it the image. I know OpenAI can do this, and I think a couple of the other providers can; hopefully eventually they all can.
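The role-based message history described above can be sketched in a few lines: every call to the model receives the whole array, which is the only reason the assistant "remembers" earlier turns. The types here are my own illustration; the SDK ships its own message types.

```typescript
type Role = 'system' | 'user' | 'assistant';
interface Message { role: Role; content: string }

const history: Message[] = [];

function addTurn(role: Role, content: string): void {
  // Nothing is summarized or discarded; the transcript just grows.
  history.push({ role, content });
}

addTurn('user', 'Hi!');
addTurn('assistant', 'Hello, how can I help?');
addTurn('user', 'Where is the best burger place?');

// A generateText call here would be handed all three messages, so the model
// knows it already asked "how can I help".
console.log(history.length); // 3
```

This is also why long conversations get expensive: the full history is re-sent on every call.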
So we can add stuff like "hey, describe this image." Really useful. Then we have tool messages, and this is where we get into really useful territory, where we can deeply integrate this into an application. Going over to the documentation for tool calls: what they let us do is provide a set of tools the AI can call, because obviously an AI can't tell you the current weather in some location. It's just trained on data from the past; it doesn't know current-day stuff, it doesn't know live, real-time stuff, and it can't generate interfaces, et cetera. So we provide it a bunch of tools to help it answer questions. We give each tool a description of when it should be used ("use this when the user wants to get the weather in a location"), the parameter we pass in is the location, and the execute function actually goes and gets the information. That lets the AI get a real answer and return it to the user, so this result will actually give us a real answer to "what is the weather in San Francisco," which is a really cool thing we can now do. Finally, down here we have system messages, which are really just for bullying the AI into acting and sounding the way you want. If you want a certain tone, or certain behavior, you set it here with basic system messages that give the model more context about what it's supposed to be. Those are the base-layer primitives we're now working with; again, they're just ways of interfacing with the actual AI. Now I want to jump into my real-life example and talk about how we're really using these things.
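The three-part tool shape described above (description, parameters, execute) can be sketched with no dependencies. In the real SDK the parameters are a Zod schema and the whole object goes inside generateText's tools option; here the weather lookup is faked so the shape itself is the focus.

```typescript
interface WeatherParams { location: string }

const weatherTool = {
  // Tells the model when this tool is the right choice:
  description: 'Use when the user wants to get the weather in a location',
  // Does the work the model can't do itself:
  execute: ({ location }: WeatherParams): string => {
    // Stand-in for a real weather API call:
    const fake: Record<string, string> = { 'San Francisco': '62°F and foggy' };
    return fake[location] ?? 'unknown location';
  },
};

// The model would extract { location: "San Francisco" } from the user's
// question, and the SDK would invoke execute with it:
console.log(weatherTool.execute({ location: 'San Francisco' })); // 62°F and foggy
```

The model never runs your code; it only decides which tool to call and with what arguments, and the SDK executes it and feeds the result back.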
Okay, so here is a real-world example of something I'm actually building with this AI SDK and these new AI tools: a personal workout assistant. Right now it's just a weights tracker and a scheduling system; eventually I'll add more basic stuff like goals and maybe calorie counting, really just a gym-bro companion type of thing. I've gotten pretty far with it, it's going to be a real thing, and I've been testing it out in the real world, and there's been a lot to learn. So first and foremost, let me show you how it works. I've already got a workout running, so I'll click "show me my workout's information." Within my action.tsx file I define the submitUserMessage function, and in it I have all of these different function calls: viewCurrentWorkout, addSets, viewAllWorkouts, et cetera. All of these functions are things the AI can call, and every time I submit a message it looks at the things I've given it access to, calls the one that makes the most sense, and sends out UI for that. The whole point is to abstract away the very spreadsheet-y nature of weight trackers and things like that. For example, this is a real workout I did a couple of days ago. If I want to say "I just did a tricep pushdown, 75 * 8," I can paste that in, I get this UI back, I can copy it or delete it, and then I save it. If I go into my active workout again, the information has been added: the tricep pushdown is in there.

It's basically a way for us to interact with our weight tracker via text instead of via UI or a spreadsheet, so this felt like a really natural way to test this and run with it, and I've learned a lot from doing it. Before I go any further, I want to diagram this out to give you a visual picture of how it actually works, because it's a very different way of thinking about applications. Say over here I have my application, the actual UI screen: my responses up top, my text input down here, something like "I just did bench." We pass that in. So what actually happens when we call this? Over here is our little backend interface, our "AI SDK actions" (we need good names for these things; I'll figure that out eventually). Within these actions we have a bunch of functions we can call: addSet, viewAllWorkouts, manageSchedule, and you can imagine more. And floating way out here is GPT, because we're using GPT with this SDK. Whenever I submit, it takes "I just did bench," passes it up to our actions, goes to GPT, and asks: "hey GPT, this is what the user just passed in, and here are all the functions I have access to. Which one should I pick?" GPT says: "well, he seems to be adding a set, so pick addSet." Once it's picked addSet, it executes that function.
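The flow in that diagram can be sketched as a dependency-free toy: the action hands the user's text and the list of available functions to the model, the model picks one, and we execute it to get UI back. A keyword match stands in for GPT's choice here; the tool names mirror the app's but everything in this block is hypothetical.

```typescript
type Tool = (input: string) => string;

// Each "tool" returns UI; strings stand in for rendered components.
const tools: Record<string, Tool> = {
  addSet: (input) => `<AddSetCard from="${input}" />`,
  viewAllWorkouts: () => '<WorkoutList />',
  manageSchedule: () => '<ScheduleCard />',
};

// Stand-in for GPT deciding which function makes the most sense:
function pickTool(input: string): keyof typeof tools {
  if (/\bdid\b|\d+\s*[x*]\s*\d+/i.test(input)) return 'addSet';
  if (/schedule/i.test(input)) return 'manageSchedule';
  return 'viewAllWorkouts';
}

const message = 'I just did bench';
console.log(tools[pickTool(message)](message)); // <AddSetCard from="I just did bench" />
```

The real version replaces pickTool with the model's tool choice, but the shape is the same: message in, one function chosen, UI out.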
What that function does is pull information out of the prompt: okay, he did bench. We want to match that against the exercises saved to that user, because we want to track these correctly in our database, and we want to pull out the weight information: how much weight, how many sets, how many reps, et cetera. Then we return some content, and what that actually does is return down to the UI itself. It creates a component, this little "add set" card, rendered with that data pre-populated. If we look at the actual code for this addSets function: we're letting the user input their set, filling in the exercise, the reps, the weight, and so on. We pass in the exercise, the reps, and the weight, and then we can do a bunch of stuff, because again, this is just a function call. That's the weird mental shift you have to get used to: you're basically letting the AI pick a function. We get the current workout, and if there isn't one we return what's basically an error, because whatever we return from this function is what gets sent into the chat UI, at least the way it's currently set up. Otherwise, and I'll talk about this in a different video because it's another thing I did, I actually used pgvector search to match the user's input to their actual exercise in the database, so if they had "bench press" saved and typed "bench," it knows those two match and ties them together. That's a topic for another day, but we go through and get that match.
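A toy stand-in for the pgvector matching just mentioned: map the user's free-text exercise name onto the one saved in their database. The real version uses embeddings and pgvector similarity search; a substring check is enough here to show where that step sits in the flow.

```typescript
function matchExercise(input: string, saved: string[]): string | null {
  const needle = input.trim().toLowerCase();
  for (const name of saved) {
    const hay = name.toLowerCase();
    // Either string containing the other counts as a match in this toy:
    if (hay.includes(needle) || needle.includes(hay)) return name;
  }
  return null;
}

console.log(matchExercise('bench', ['Bench Press', 'Squat', 'Tricep Pushdown'])); // Bench Press
```

Embeddings handle what this can't: "chest press" and "bench press" share no substring but sit close together in vector space.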
Then finally, the really important thing down here: we tell the AI "hey, we just called a function and provided the UI for the user to add their sets," and we return this card, the add-exercise-card server component, which is just a normal React component with its initial state passed in. The client-side code is also a normal React component: it uses "use client", has tRPC in it, and does a bunch of basic form-logic stuff. That's really it; it's just an enhanced way of building these applications.

I've been using this a lot myself, and one thing I've noticed about the flow Vercel is pushing, creating a new card or new UI every single time, is that it's good, but it's kind of tiring and a little annoying, especially when I'm on my phone in the gym. Every time I add something, I then have to say "hey, show me my current workout information." I solved this with a Band-Aid fix: these shortcut buttons down here, so if I want to manage my schedule I can do it quickly, edit it, add my exercises in, et cetera. But it's a little janky and not my favorite thing in the world; it's more button presses and feels more convoluted than I'd like. The thing I really do like is being able to type "bench 135 * 9" and have it know you want to add that. That's really, really nice, and it's what I wanted from the beginning. So what I'm exploring right now is a sort of hybrid interface. I'm in the middle of building this, and it's not done. I'll show it on mobile, because this will really be used on mobile. (That's another problem with the AI stuff right now: it's great on the web, but it's going to be tough to implement in a mobile app with the same power you get here.) What I'm doing is fixing this little card up top, so when you get into workout mode the card stays pinned, and the chat down here is our way of interfacing with that card, but the card itself also has its own controls. Here I did a 3x5, and I have all these sets under it: you can drop down and see them, and I'm going to add buttons to duplicate and delete them. So we get the nice user experience up top, but we keep the enhanced functionality of the AI down below. You can imagine, when this really scales up and becomes a real application, you'd have the ability to save sets, and also to ask questions about the sets themselves. That gets into a concept called RAG, retrieval-augmented generation, which is basically a fancy way of saying you provide the AI with more context when you make a query. For example, for "how does this workout compare to last week's workout?" I can give it the context of last week's workout: I pass in the raw JSON of last week's workout and this week's workout and say "answer this question about these data sets." We can keep adding features like that on top, so we gain all the benefits of AI, but we still keep these nice traditional user interfaces, because these very tailored experiences, having the card pinned up there where it doesn't go away and you can manage it quickly and efficiently, matter.
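The RAG idea just described is, at its core, string building: retrieve the relevant records (two workouts' JSON) and paste them into the prompt ahead of the question, so the model answers from real data. The shapes below are my own illustration, not the app's actual schema.

```typescript
interface Workout { week: string; sets: { exercise: string; weight: number }[] }

function buildRagPrompt(question: string, context: Workout[]): string {
  // The "retrieval" result, serialized into the prompt:
  const data = context
    .map((w) => `Workout (${w.week}): ${JSON.stringify(w.sets)}`)
    .join('\n');
  return `Answer using only this data:\n${data}\n\nQuestion: ${question}`;
}

const prompt = buildRagPrompt('How does this workout compare to last week?', [
  { week: 'last week', sets: [{ exercise: 'bench', weight: 135 }] },
  { week: 'this week', sets: [{ exercise: 'bench', weight: 140 }] },
]);
// The prompt now carries both workouts, so a generateText call can compare them.
console.log(prompt.includes('135') && prompt.includes('140')); // true
```

The "retrieval" step here is hardcoded; in a real app it would be a database or vector-search query that picks which records are worth pasting in.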
I think that's important. So a big question with all of this stuff and all this tech is going to be balance: how do we use these new AI tools to enhance our applications, make them better, and get real power out of them, because there is a lot of power here, without just making a gimmick app? That's one thing I've felt here: this is cool, and it does solve the one problem that spreadsheets suck, but a lot of these things do feel a little gimmicky. What's the balance? How do we do that? I don't fully have the answer yet, and I don't think anyone does, but we're trying to figure it out. It's bleeding-edge tech. That's a very long-winded way of saying this is a piece of technology I'm really excited about.

The last thing I want to close this video with is the "GPT wrapper" discourse and how I'm feeling about it. A big thing you'll see on Twitter, if you follow AI stuff, is people saying "it's just a GPT wrapper, it's going to get nuked by OpenAI," and so on, and to an extent that's kind of true. But I was listening to a talk, I think it was a YC panel, about all these new AI startups and the things they're building, and one of the points they brought up is that the GPT-wrapper framing is a little disingenuous, because if you think about it carefully, a lot of traditional applications we use all the time could be boiled down to SQL wrappers. At the end of the day, Twitter is kind of just a SQL wrapper. It's an extraordinarily large one, but it is a giant SQL database where the messages are SQL entries, and Twitter provides a really nice UI and capabilities for interacting with it. I think a similar thought process applies to these AI apps as we scale them up: yes, in a way you could say it's a GPT wrapper, but we're building something on top of it and adding more functionality, and I think there's a lot of value in the interface and the data and all the different things we use to tie it together. So at least for me, the lesson is not to psych yourself out with "well, it's a GPT wrapper, who cares," but to really think about what we can actually do to make these things useful. I'm playing with this a lot, and I'll definitely have more to say. I'm going to launch this as an actual product. It doesn't look particularly pretty right now, but that's just because it's generated with v0, Vercel's little AI UI generator; it looks very shadcn, not the prettiest thing in the world, but it works, and I'm really just testing the core value prop here. Oh yeah, one more thing: all the code I just showed off with Weights AI is completely open source, links down below, free to check out and try. Like I said, I'm going to offer a paid hosted version, but self-hosting will be available, so definitely check that out below. If you enjoyed this video, make sure you like and subscribe. If you made it this far, thank you, and I will talk to you soon.
Info
Channel: Ben Davis
Views: 5,677
Id: TC9inDok1x0
Length: 24min 12sec (1452 seconds)
Published: Mon May 13 2024