Pydantic is all you need: Jason Liu

Captions
[Music]

Hey guys. I didn't know I was going to be one of the keynote speakers, so this is probably going to be the most reduced-scope talk of today. I'm talking about type hints, and in particular about how Pydantic might be all you need to build with language models. I want to talk about structured prompting, which is the idea that we can use objects to define what we want back out, rather than praying to the LLM gods that the comma is in the right place and the bracket was closed.

Everyone here basically knows, or at least agrees, that large language models are eating software. But what this really means in production is that 90% of the applications you build are just ones where you're asking a language model to output JSON, or some structured output that you're parsing with a regular expression, and that experience is pretty terrible. The reason is that we really want language models to be backwards compatible with the existing software that we have. Code gen works, but a lot of the systems we have today are systems that we can't change. So although language models were introduced to us through ChatGPT, most of us are actually building systems and not chatbots. We want to process input data and integrate with existing systems via APIs or schemas that we might not have control over.

So the goal for today is to introduce OpenAI function calling, introduce Pydantic, and then introduce Instructor and Marvin as libraries that make using Pydantic to prompt language models much easier. What this gets us is better validation, and it makes your code a little bit cleaner. Afterwards I'll go over some design patterns that I've uncovered and some of the applications that we have.

This is basically almost everyone's experience here, right? Riley Goodside had a tweet about asking to get JSON out of Bard, and the only way you could do it was to threaten to take
a human life, and that's not code I really want to commit into my repos. And when you do ask for JSON, maybe it works today, but maybe tomorrow, instead of getting JSON, you're going to get something like "Okay, here you go, here's some JSON," and then again you pray that the JSON parses correctly. I don't know if you noticed, but here "user" is a key for one query and "username" is a key for another, and you would not really notice this unless you had good logging in place. But really this just shouldn't be happening to begin with. You shouldn't have to read the logs to figure out that the passwords didn't match when you're signing up for an account.

What this means is that our prompts, our schemas, and our outputs are all strings. We're writing code in TextEdit rather than an IDE, where you could get linting or type checking or syntax highlighting. OpenAI function calls somewhat fix this. We get to define a JSON schema for the output that we want, and OpenAI will do a better job of placing the JSON somewhere that you can reliably parse out. So instead of going from string to string to string, you go from string to dict to string, and then you still have to call json.loads, and again you're praying that everything is in there. A lot of this is praying to the LLM gods.

On top of that, if this code was committed to any repo I was managing, I would be pissed. Complex data structures are already difficult to define, and now you're working with the dictionary from json.loads, and that also feels very unsafe, because you get missing keys, missing values, and hallucinations; maybe the keys are spelled wrong, or you're missing an underscore. You get all these issues and then you end up writing code like this. It works for name and age and email, and then you're checking whether something is a bool by parsing a string, and it gets really messy. What Python has done to solve this is use Pydantic. Pydantic is a
library that does data model validation, very similar to dataclasses. It is powered by type hints, it has really great model and field validation, and it has 70 million downloads a month, which means it's a library that everyone can trust and use and know is going to be maintained for a long time. More importantly, it outputs JSON Schema, which is how you communicate with OpenAI function calling.

The general idea is that we can define an object like Delivery, say that the timestamp is a datetime and the dimensions are a tuple of ints, and even if you pass in a string as the timestamp and a list of strings as the tuple, everything is parsed out correctly. This is all code we don't want to write, right? This is why there are 70 million downloads. More interestingly, timestamp and dimensions are now things that your IDE is aware of. It knows their types, and you get autocomplete and spell checking. Again, just more bug-free code.

This really brings me to the idea of structured prompting, because now your prompt isn't a triple-quoted string. Your prompt is actual code that you can look at and review. Everyone has written a function that returns a data structure, and everyone knows how to manage code like this. Instead of migrating JSON schemas in the one-shot examples, well, I've done database migrations, I know how some of these things work. More importantly, we can program this way.

That's why I built a library called Instructor a while ago. The idea is just to make OpenAI function calling super useful. You import instructor and patch the completion API (debatable whether this is the best idea), then you define your Pydantic object and set it as the response model of that create call, and now you're guaranteed that the entity you extract has the type of that response model. So again you get nice autocomplete and type safety. Really great. I also want to mention that this only works
for OpenAI function calling. If you want a more comprehensive framework for some of this Pydantic work, I think Marvin is a really great library to try out. It gives you access to more language models and more capabilities on top of this.

But the general idea here isn't that this is going to make your JSON come out better. The idea is that when you define objects, you can define nested references, you can define methods for the behavior of that object, and you can return instances of that object instead of dictionaries. You're going to write cleaner code, and code that's easier to maintain as those objects are passed through different systems. So here, for example, you have a base model, but you can add a method if you want to. You can define the same class but with an address key. You can then define new classes like best friend and friends, which is a list of user details. If I were to write this in JSON Schema to make a POST request it would be very unmanageable, but this makes it a lot easier.

On top of that, when you have docstrings, the docstrings are now part of the JSON schema that is sent to OpenAI. This is because the model now represents the prompt, the data, and the behavior all in one. You want good docstrings and good field descriptions, and it's all part of the JSON schema that you send. Now your code quality, your prompt quality, and your data quality are all in sync. There's one thing you want to manage and one thing you want to review. What that really means is that you need good variable names, good descriptions, and good documentation, and this is something we should have anyway.

You can also do some really cool things with Pydantic without language models. For example, you can define a validator. Here I define a function that takes in a value. I check the string in that value, and if the check fails I raise an error; otherwise I return a lowercase version of it, because that just might be how I want
to parse my data. When you construct this object you get an error back out. We're not going to fix it, but we get a validation error, something we can catch reliably and understand.

But then, if you introduce language models, you can just import the LLM validator, and now you can have something that says "don't say mean things." When you construct an object that says the meaning of life is to be evil and steal things, you're going to get a validation error and an error message. This error message, "the statement is objectionable," is actually coming out of a language model API call. It's using Instructor under the hood to define that.

But it's not enough to just point out these errors. You also want to fix them. The easy way of doing that in Instructor is to just add max retries. Right now what we do is append the message that you had before, but we can also capture all the validation errors in one shot, send them back to the language model, and try again. The idea is that this isn't prompt chaining and this isn't constitutional AI. Here we just have validation, error handling, and re-asking, and these are just separate systems in code that we can manage. If you want something to be less than 10 characters, there's a character count validator. If you want to make sure that a name is in a database, you can just add a POST request if you want to. This is just classical code. Again, this is the backwards compatibility of language models.

But we can also do a lot more. Structured prompts get you structured outputs, but ideally the structure actually helps you structure your thoughts. So here's another example. It's really important for us to give language models an escape hatch, the ability to say that they don't know something or can't find something. Right now most people will say something like "return I DON'T KNOW in all caps," then check whether "I DON'T KNOW" is in the string, right?
Sometimes it doesn't say that, and it's very difficult to manage. But here you see that I've defined user details with an optional role that could be None, and the entity I want to extract is a MaybeUser. It has a result that's maybe a user, and then an error and an error message. So I can write code that looks like this. I get this object back out. It's a little bit more complicated, but now I can program with language models in a way that feels more like programming and less like chaining, for example.

We can also define reusable components. Here I've defined a work time and a leisure time as both being a time range, and the time range has a start time and an end time. If I find that this is not being parsed correctly, what I could do is add chain of thought directly in the time range component. Now I have modularity in some of these features. You can imagine a system where in production you disable that chain-of-thought field, and in testing you add it, to figure out the latency or performance trade-offs.

You could also extract arbitrary values. Here I define a property with a key and a value, and then I want to extract a list of properties. You might want to add a prompt that says "make sure the keys are consistent over those properties." We can also add validators to make sure that's the case, and then re-ask when it's not. If I want only five properties, I could add an index to the property and just say "now count them out, and when you count to five, stop," and you're going to get much more reliable outputs.

One of the things I find really interesting with this kind of method is prompting data structures. Here I have user details with age and name as before, but now I define an ID and a friends array, which is a list of IDs. If you prompt it well enough, you can basically extract a network out of your data. So we've seen that structured
prompting gives you really useful components that you can reuse and make modular. The idea again is that we want to model the prompt, the data, and the behavior. I haven't mentioned too many methods that could act on this object, but the idea is almost like when we went from C to C++: the thing we got was object-oriented programming, and that makes a lot of things easier. We've learned our lessons with object-oriented programming, so if we stay on the right track, I think we're going to get a lot more productive development out of these language models.

The second thing is that these language models can now output data structures. You can pull up your old LeetCode textbooks or whatever and actually figure out how to traverse these graphs, for example, and process this data in a useful way. So now they can represent knowledge, workflows, and even plans that you can just dispatch to a classical computer system. You can create the data that you want to send to Airflow rather than running a for loop and hoping it terminates.

I think I have about 6 minutes, so I'll go over some advanced applications. These are actually fairly simple, and I have more documentation if you want to see it later on, but let's go over some of these examples.

The first one is RAG. When we first started out, a lot of these systems ended up being ones where we embed the user query, make a vector database search, return the results, and then hope that those are good enough. But in practice you might have multiple backends to search from. Maybe you want to rewrite the user query, or maybe you want to decompose it. If you want to ask something like "what was something that was recent," you need time filters. You could define that as a data structure: the search type is email or video, and a search has a title, a query, a before date, and a type. Then you can just implement the
execute method that says, if the type is video do this, if it's email do that. Really simple. What you want to extract back out is multiple searches, which gives me a list of search queries, and then you can write something like an asyncio map across these things. Because all that prompting is embedded in the data structure, the prompt that you send to OpenAI is very simple: "You're a helpful assistant. Segment the search queries." What you get back out is an object that you can program with in the way you've managed code all your life. Something very straightforward.

But you can also do something more interesting. You can plan. Before, we talked about extracting a social network, but you can actually just produce the entire DAG. Here I have the same graph structure: an ID, a question, and a list of dependencies, with a lot of information in the description, and that's basically the prompt. What I want back out is a query plan. So now, if you send it to a query planner that says something like "You're a helpful query planner. Build out this query," you can ask something like "What is the difference in populations of Canada and Jason's home country?" What you can see is that, if I'm good at LeetCode, I could query the first two in parallel because there are no dependencies, then wait for dependency three to merge, and then wait for four to merge those two. But this requires only one language model call, and now it's just traditional RAG. If you have an IR system, you get to skip this for loop of agent queries.

An example that was really popular on Twitter recently was extracting knowledge graphs. Same thing here. What I've done is make sure that the data structure I model is as close as possible to the Graphviz visualization API. What that gets me is really simple code that does the creation and visualization of a graph. I just define
things one-to-one with the API, and now if I ask for something very simple, like "give me a description of quantum mechanics," I can get a graph out. That's basically 40 lines of code, because what you've done is model the data structure Graphviz needs to make the visualization, and we're trying to couple the two a lot more.

This is a more advanced example, so don't feel bad if you can't follow this one. Here I've defined a question-answer as a question and an answer, where the answer is a list of facts. A fact is a statement plus substring quotes from the original text; I want multiple quotes, each a substring of the original text. What my validators do is say: for every quote you give me, validate that it exists in the text chunk, and if it's not there, throw out the fact. Then the validator for the question-answer says: only show me facts that have at least one substring quote from the original document. So now I'm trying to encapsulate some of the business logic of not hallucinating, not by asking the model not to hallucinate, but by actually figuring out something like a paraphrase detection algorithm to identify what the quotes were. What this means is that instead of being able to say that the answer was on page seven, you can say the answer was this sentence, that sentence, and something else, and I know they exist in the text chunks.

So I think what we end up finding is that, as language models get more interesting and more capable, we're only going to be limited by the creativity we bring to prompting these things. You can have instructions per object, you can have recursive structures. It goes into domain modeling more than it goes into prompt engineering, and again, now we can use the code that we've always used. If you want more examples, I have a bunch of examples here on different
kinds of applications that I've had with some of my consulting clients. I think these are some really useful ones. I'll go to the next slide, which doesn't have the QR code. That's fine; the updated slide has a QR code, but instead you can just visit jxnl.github.io/instructor.

I also want to call out that we're experimenting with a lot of different UIs to do this structured evaluation, where you might want to figure out whether or not one response was mean, but you also want to figure out what the distribution of floats was for a different attribute, and be able to write evals against that. I think there's a lot of really interesting open work to be done. Right now we're doing very simple things around extracting graphs out of documents. You can imagine a world where we have multimodal models, in which case you could be extracting bounding boxes. One application I'm really excited about is being able to say, "Given an image, draw the bounding box for every image, and the search query I would need to go on Amazon to buy this product," and then you can instantly build a UI that just says, for every bounding box, render a modal. You can have generative UI over images, over audio. I think in general it's going to be a very exciting space to play more with structured outputs. Thank you.

[Applause]
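
The query-plan idea from the talk (each query has an ID, a question, and a list of dependencies, and independent queries can run in parallel) can be sketched with ordinary graph code. This is a minimal illustration, not the speaker's implementation: a standard-library dataclass stands in for the Pydantic model, the `Query` fields, the `execution_batches` helper, and the example plan are all hypothetical, and no language model is called.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class Query:
    # Stand-in for the Pydantic model described in the talk: an id, a question,
    # and the ids of the queries that must be answered first.
    id: int
    question: str
    dependencies: list[int] = field(default_factory=list)


def execution_batches(plan: list[Query]) -> list[list[int]]:
    """Group query ids into batches; queries within a batch are independent."""
    sorter = TopologicalSorter({q.id: q.dependencies for q in plan})
    sorter.prepare()  # raises graphlib.CycleError if the plan is not a DAG
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything whose dependencies are done
        batches.append(ready)
        sorter.done(*ready)
    return batches


# Hypothetical plan a model might return for the question used in the talk.
plan = [
    Query(1, "What is the population of Canada?"),
    Query(2, "What is Jason's home country?"),
    Query(3, "What is the population of that country?", [2]),
    Query(4, "What is the difference between the two populations?", [1, 3]),
]

for batch in execution_batches(plan):
    print(batch)
```

Each printed batch could be dispatched concurrently, with the next batch starting only once the previous one has finished, which is the "query the first two in parallel, then wait" behavior described above.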
Info
Channel: AI Engineer
Views: 170,137
Id: yj-wSRJwrrc
Length: 17min 55sec (1075 seconds)
Published: Wed Nov 01 2023