Launch an LLM App in One Hour (LLM Bootcamp)

Video Statistics and Information

Captions
The agenda for this talk: a few intro slides to set things up. Everybody has their own reason for being excited about large language models and artificial intelligence right now, but I'll give you mine, and the perspective we have at FSDL. Then we'll go through what it takes to get an LLM app up and going in an hour: first prototyping and iteration, figuring out what you can and can't do, and then deployment, getting something out there and running as an actual application people can interact with. The basic form of what we put together, and what I describe in this talk, will be available for you to interact with during the event, so we'll see what that looks like. And then we'll talk about what comes after that first exciting hour of quickly shipping, and explain why there are ten more hours of talks after this one.

All right. To introduce why this is a particularly good time to be thinking about artificial intelligence, I want you to imagine a world where there are disturbingly simple and heuristic computer programs that can mimic human cognitive processes. They're so good that humans are forming emotional attachments with their chatbot therapist. There are computers that can play chess and write mathematical proofs. There's this awesome result that they can pass the examinations we use in school to check people's intelligence. And, mind-blowingly, you can get investment advice from an artificial intelligence. You don't have to be some visionary co-founder to imagine this world, because it happened in 1965.

This is a paragraph from "Heuristic Problem Solving by Computer," a paper by Herbert Simon and Allen Newell that took a look at the landscape of artificial intelligence at that time. It noted a level of effectiveness comparable to humans: programs could play chess, do proofs, pass school examinations, and simulate the behavior of an investment fund portfolio manager. I reread this paper only within the last month or two and was shocked at its similarity to what people are doing now.

So the question is: clearly that didn't really pan out. We've not had artificial intelligence for 60 years. What's different this time? I think the clearest answer is that there's now one tool that can do all of those things, instead of a bunch of specialized tools, one for each particular task, each with its own mixture of human expert knowledge and knowledge learned from data. Now you can use one single tool and, with a tiny amount of configuration, get it going on a new task, which is more similar to the way we get humans to solve tasks for us than to the way machine learning models or old-school expert systems solved problems.

So, language models: what are these tools that changed the game and made this possible in a way it was not before? Language models model language. That's the surprisingly simple trick required to get this capability. What that looks like is something like this: a paragraph appears word by word, and you try to guess the next word as it appears. If you can do that, you're pretty smart, especially if you can predict more complicated things than just conversations and chatter. In the midst of speaking to somebody you might bring up,
maybe not a mathematical equation, but you might well talk about Python, and if you look at written documents, there are many that include these kinds of more complicated things. As those words appear on the screen, at certain places you'd be able to guess what the next word would be, in a way that required your knowledge of something like Python, your knowledge of science and the physical world, your knowledge of mathematics. Large language models are really, really good at guessing what the next word is going to be, and in order to get good at that task, they get good at all of these other things: adding numbers together, writing Python code, learning things about the physical world.

One of the key pieces unlocked by this particular path to artificial intelligence, and that we're excited to be talking about and sharing in this bootcamp, is language user interfaces. A language user interface is a type of user interface that's been speculated about as a way to interact with computers since the very beginning, since the 1960s or even the 1950s. We're familiar with graphical user interfaces and terminal user interfaces as ways to interact with computers, but before the GUI came about, before you could look at a computer and interact with a physical space presented to you visually, people thought: maybe the way I want to interact with a computer is the way I interact with a person, by talking to them, by speaking in my natural language, be it English or Spanish or whatever. So there was early work at trying to make this happen, like the famous chatbot therapist ELIZA in the 1960s, by Weizenbaum. That was a purely language interface, but people quickly realized that if you have a language interface that at least gets the ideas from natural language into a computer, you could also have that language affect something that wasn't linguistic, for example a blocks world, a world of simple objects. That was put together by Terry Winograd in the SHRDLU program around 1970.

The closest we've gotten so far to this general idea of language user interfaces is probably search engines, where you type in something that's a little bit Google-ese, maybe, rather than your full natural language, but you interact with it much more naturally than you interact with a programming language like Python. One of the first popular search engines, Ask Jeeves, really presented itself as a natural language interface to the internet. With large language models we can make much more flexible and much more capable language user interfaces that behave and process language much more the way people expect than these older systems, which were a lot more brittle and a lot more focused on one particular example: psychotherapy only, and of a very simple kind, or this blocks world only, a very limited world of discourse.

There's also this tremendous wave of hype that has triggered a lot of people's alarm bells about language models and artificial intelligence, in particular people who've seen these past waves. So I just want to get across the idea that this is something that was guessed at before: people thought this might happen if we got language modeling right.
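To make the "guess the next word" framing above concrete, here is a minimal sketch (assuming the Hugging Face transformers library and the small GPT-2 model, both chosen purely for illustration) of what next-token prediction looks like in code:

    # Minimal next-token-prediction sketch (assumes `pip install transformers torch`).
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")      # small, classic language model
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    prompt = "import numpy as np\narr = np.array([1, 2, 3])\nprint(arr."
    inputs = tokenizer(prompt, return_tensors="pt")

    with torch.no_grad():
        logits = model(**inputs).logits                    # scores for every token at every position

    next_token_logits = logits[0, -1]                      # distribution over the *next* token
    top = torch.topk(next_token_logits, k=5).indices
    print([tokenizer.decode(t) for t in top])              # the model's top guesses for what comes next

Scaled up enormously, that one simple objective is the training task behind the models discussed in this talk.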
This is a paper from the 1990s about which problems are "AI-complete": the problems that, if we solved them, would mean we had developed artificial intelligence. It's a slippery concept, named after ideas from classical computer science like NP-completeness, but without quite that degree of mathematical rigor; it represents the field's belief about what it takes to solve the problem. The first one on the list is natural language understanding, being able to understand natural language at a deep level. So this combination of a new kind of really nice user interface and this long-standing goal of natural language understanding is the reason such a large capability has been unlocked with the release and open availability of large language models.

So things are meaningfully different this time, and that's one reason for hope. But it's also important to look back and examine what did happen in 1966. Why didn't we end up with artificial intelligence? I'm sure there were people back then who would have been just as excited to share how their models worked. The main thing that happened is that, as a field, we oversold and under-delivered, which led to a phenomenon known as an AI winter: a period in which people became less interested and in which funding was much harder to come by for these kinds of projects. The prominent one came right after that big wave in the 1960s, and there was another smaller one in the 1990s. It's important to look back at some of the details of how that went down to understand how we can avoid it this time around.

One of the key documents of that time is a report by Sir James Lighthill for the British government. He was a mathematician who got interested in this whole artificial intelligence thing people were doing, looked into it, decided it was a load of nonsense, wrote up his feelings about it, and convinced the British government, the British public, and eventually the world that this was a bad idea. I pulled out a couple of key phrases from the report. He talks about high hopes that were very far from having been realized, which was causing enormous sums to be spent with very little result, in particular in machine translation. He also noted that this money was being spent on these problems and then wasn't being turned into anything commercially valuable. There's a separate section about research applications, but a lot of the report is about the fact that these programs were just not versatile enough to be commercially viable, and that meant this was a bad investment of public research money.

So to avoid another AI winter in the 2020s, we need to build actual products that people value. We need to build software, build tools, build things that people actually use, and then people will continue to be interested, will continue to fund our projects, and will make sure we can pay rent while we're playing with ChatGPT.

The good news is that this time around there is a lot more than just research happening. I went and grabbed the trending repositories on GitHub,
I think for the last month. Looking at them, the top trending repository is the DeepSpeed library, which is for doing fast training and inference with large models and is particularly used with a lot of these large language models. Two of the rest of the top five are actual applications that use large language models: Open Assistant from LAION, an attempt to build an open replication of something like ChatGPT from OpenAI, and an interactive image editing tool, Painter, that uses vision models and language models. And lastly there's a new database technology called Chroma, which bills itself as an AI-native embedding database. These are actual libraries people are building; people are creating companies around them, contributing to them by the thousands, downloading them and using them.

If you want an even faster firehose of things people are building, you can check out Hugging Face Spaces, which regularly shows cool demos. I think Nat Friedman put it best when he said that finally there's tinkering: finally people are actually putting things together with these models, useful little toys or simple demos, something that makes for a good tweet thread or an afternoon playing around with an image generation model or whatever.

The bad news is that the gap from a demo like that, something you can put together fairly quickly and use in certain cases, to a product is really big, even when you're building with a product orientation and aren't just a researcher. For example, I remember back in the mid-2010s when people were getting really excited about self-driving cars on the basis of high-quality image neural networks. There were a ton of demos of self-driving cars, like this one from Nvidia in 2017, and it's 2023 and we still don't have self-driving cars readily available. There are maybe a few ways to use them in very limited settings, but they're still closer to demo than product, and that gap can be really, really big. The first demonstration of a neural network driving a car goes back to NeurIPS in 1988, with ALVINN, the Autonomous Land Vehicle In a Neural Network. So just because you've made something that works in a specific setting, or is impressive in a particular context, doesn't mean you can immediately productize it.

But it's starting to happen: the field is starting to create and release products. Probably the most famous is OpenAI's ChatGPT, which directly productizes language modeling and which, according to some analysts, set a record for fastest-growing user base, maybe fastest to a million users, maybe fastest to 100 million users as well, faster than Instagram, faster than TikTok. That's a good sign. Tools for coding like GitHub Copilot and Replit Ghostwriter are also popular, powered by these large language models, as are some of the tools we use at Full Stack Deep Learning to build our content, like Descript for editing video and podcasts the way you would edit a document. So the product building is actually happening.
That's good news for avoiding an AI winter, and it also means there's kind of a playbook emerging, patterns for how to actually build these things, so you don't have to fully blaze the trail yourself. That's what we're going to cover over the course of these two days: a lot of what goes into building those kinds of applications, like ChatGPT or Copilot. And in this first quick bit we're going to just slapdash put one of those together.

All right, I've got a question slide in here, but I don't want to pause; post questions in our Q&A channel and we'll get back to them. Let's dive into this process of putting together a quick demo application.

In the past couple of months there have been a number of hackathons around the Bay Area and elsewhere oriented around putting together quick product demos with these tools. I've been attending a bunch of them, seeing what other people are doing and building stuff myself, so I wanted to walk through what that looks like. The first half hour is full of prototyping and iteration; it's a process of discovering what is possible. The takeaway here is that rather than spending a bunch of time thinking really hard about whether machine learning can possibly solve your problem, you can just try it: take some high-capability hosted model, like OpenAI's models, and try them out in a simple chat-type interface first. See what works and what doesn't. You'll often be surprised by what you can get going in that simple context. Of course, that's not what you're going to deploy, but you can at least figure out whether this is possible and what the core problems you'll have to solve are. Then you can jump into some environment where you can do really quick tinkering in a programmatic way; we'll do that in a quick notebook environment, my preferred environment for that kind of tinkering, and figure out how to make some of the things from the chat interface more programmatic. There's a large set of open-source frameworks and tools coming out that make that easy.

So the first thing I do when I want to figure out whether I can build something with large language models is drop into some convenient interface for interaction; a chat interface like ChatGPT is great for this, and that's usually where I start. It's helpful to start off with a particular problem statement. The problem I have is that I want to learn as much as possible about large language models, so I would love to be able to use large language models to help me learn about large language models. I'm sure there are lots of other people I'm connected with who have this problem too, and I'd like to help them as well.

So first off, I can go in and just ask ChatGPT questions about language models. It's a good idea to start with something where you know the right answer, because at the beginning you're going to end up simulating a lot of the components of your final application in your head.
You have to pretend to be a user; you have to think about how you're going to program this thing; you may have to step in and do things that you'll eventually automate with a language model. There are a lot of pieces here, and you want to make sure you're putting them together in the right way and getting good results. So start off with something you might already know. I picked something I already knew about language models: what is zero-shot chain-of-thought prompting?

The answer I get from ChatGPT is not really good: "zero-shot chain-of-thought prompting is an NLP technique used to generate a response to a prompt without having been explicitly trained on that prompt." That gets some components; it gets the zero-shot piece, but it doesn't get the chain-of-thought piece. It gives an example prompt, like "what is the capital of France?", and says the language model can generate "Paris" by recognizing the relationship. That's not quite the right answer, and it's missing a ton of additional information. The missing piece is that these language models are trained on data that only goes up to a certain point in time, around 2021 for most models, and a lot of the interesting work on language models has happened after the cutoff of their training data. So this isn't going to work as is.

Another piece is that I really want to be able to follow up on this information: I want to read more, not just from the language model, but to discover new resources. And in general, the outputs of these models include a bunch of stuff that does not exist. For example, in a previous session when I asked where I could read more, it suggested one paper that did exist, but it also suggested a bunch that didn't, like "How to Build OpenAI's GPT-3: a Step-by-Step Guide" on Towards Data Science, which sounds like something that might exist but, as far as I can tell, does not. And I don't think I'd want to read it if it did. So we have a problem: the language models don't know where the sources are, and there's information they don't have, so we need to provide it to them if we want a good solution for learning about language models.

In this chat interface, I'm forming a hypothesis about how the language model behaves, based off what I've read about these models, what I've learned, and other things I've seen online: a hypothesis about why it's succeeding or failing. But I should check that before I go on to building more of a prototype. My hypothesis is that if it just had the right paper in the context, it would give the right answer. So let me quickly look up the paper I'm thinking of on arXiv, grab its abstract, and feed it in: here's the abstract of the paper; based on that abstract, what is zero-shot chain-of-thought prompting? This isn't something we'd expect a user to do; it's something that will happen inside our application, but we're just mocking it by hand at this early stage. And now the model gets the answer correct: zero-shot CoT prompting is a technique for performing complex multi-step reasoning without any handcrafted few-shot examples, and it involves simply adding the phrase "let's think step by step" before each answer.
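A minimal sketch of the prompt being mocked by hand here, written against the pre-1.0 OpenAI Python SDK that was current at the time of this talk (the model name and the truncated abstract are placeholders):

    import openai  # assumes `pip install openai` and OPENAI_API_KEY set in the environment

    abstract = "..."  # the zero-shot CoT paper's abstract, pasted in by hand for now

    prompt = f"""Here is the abstract of a paper:

    {abstract}

    Based on that abstract, what is zero-shot chain-of-thought prompting?"""

    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",  # any hosted chat model works for this feasibility check
        messages=[{"role": "user", "content": prompt}],
    )
    print(response["choices"][0]["message"]["content"])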
So great: if we can get our sources into our language model by pulling them from somewhere, and we can ask the language model questions with those sources in the prompt, then we can probably get decent answers. At this point, once I've tried something like this out and done a little back and forth, I hop over to an ephemeral notebook environment like Colab to see what it would look like to actually start automating some of these steps.

The first thing, apart from setup, is: how am I going to interact with the language model? OpenAI offers an API, so you don't have to be typing things in by hand, and there's a Python SDK for it along with a bunch of other SDKs. You can build your application directly on top of that, and lots of people do when they want that level of control. There's also a bunch of emerging open-source frameworks for interacting with these models, probably the most popular being the library LangChain. In my experience, every time I need to do something new with a language model, like "oh, you know what, I could do my vector store in Redis; looks like that's not in LangChain, let me code it up really quick," a week later LangChain adds it as a feature. They're moving extremely fast and adding all the things you might need, serving a role similar to what a framework like React serves for front-end development: giving you all the nuts and bolts so you don't have to reinvent the wheel. For example, it gives us an abstraction over LLMs where we just call them like Python functions, which is convenient for not having to think too carefully about what's going on. One hot tip for this prototyping phase: just displaying this text nicely is surprisingly challenging, and one of my favorite tricks is that in a notebook environment you can render things as Markdown.

All right, so we've got a place where we can run the language models and see what their outputs look like. Now we need to do that process of finding information and bringing it into the context. At this point you might think, oh gosh, I've got to come up with a way to scrape information from arXiv. For me, my white whale was spending a very long time trying to figure out the YouTube SDK, but it turns out there's almost always a Python library for the scraping you need to do. So just pip install arxiv, import it, and you're ready to grab a paper's summary. Maybe that's what it'll look like in our final application, maybe not, but first a quick check: if we grab that summary and feed it into our language model, do we still get a good answer? "Zero-shot chain-of-thought prompting ... multi-step reasoning ... single prompt template": okay, that looks pretty good.
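Roughly, the notebook cells for that summary-based check might look like the sketch below, assuming early-2023 LangChain and the arxiv package (exact APIs have shifted since, so treat this as illustrative):

    # Rough sketch of the notebook workflow (assumes `pip install langchain openai arxiv`).
    import arxiv
    from langchain.llms import OpenAI
    from IPython.display import Markdown, display

    llm = OpenAI(temperature=0)  # call the hosted model like a Python function

    # Grab the zero-shot CoT paper's summary straight from arXiv.
    search = arxiv.Search(query="Large Language Models are Zero-Shot Reasoners", max_results=1)
    paper = next(search.results())

    prompt = f"""Here is the abstract of a paper:

    {paper.summary}

    Based on that abstract, what is zero-shot chain-of-thought prompting?"""

    answer = llm(prompt)
    display(Markdown(answer))  # hot tip: render model output as Markdown in the notebook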
But the answer isn't always going to be in the summary; it's often going to be elsewhere in the paper, and now we run into the problem of how to get information out of PDFs. Luckily, not only is there a Python package for extracting information from PDFs, it's also built into LangChain as a document loader. A lot of these core pieces that you might need across lots of different applications are already built in, with lots of options, so I just picked a simple PDF loader. The information for this question is in the first two pages, so we grab those, feed them in, and see whether we get the right answer; again, the model answers correctly.

The last piece is how to solve this at a larger scale: I have a bunch of information, and I want to search through it and find the relevant parts. We'll hear a ton about the different ways to solve that problem in Josh's lecture on augmented language models, but for now we'll do what's become popular in the LLM community, which is embedding search: turn documents into vectors and check the similarity of those vectors. This is also built into LangChain: split things into pieces, turn them into vectors, and then check which of those vectors have content similar to the question. It's a heuristic, much like other heuristics such as keyword matching, and you can definitely do better, but maybe it'll work for our simple example. And it does: the first thing returned has our example in it, zero-shot CoT, the zero-shot template prompting for chain-of-thought reasoning.

We could continue prompt fiddling to solve the remaining problems, like how to take these retrieved chunks and put them into the prompt, or how to get the language model to cite its sources. But at this point, once I've got the core going, I go check Twitter and LangChain to see who else has built something similar, and you'll often find a really good chassis to jump off of. This question-answering-with-sources thing I'm doing here has emerged as a bit of a "pet store API" or to-do-list-app kind of introductory example, so there are tons of options available, and we'll just run with LangChain's default example. Once again we get something that's callable like a Python function and uses a language model under the hood. So now we've got a source citation and an answer coming out of our language model. We've demonstrated feasibility; now we just have to turn this into something that can actually be used by people, and at a larger scale.
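A rough sketch of that retrieval-plus-QA-with-sources setup, again assuming early-2023 LangChain plus a local FAISS index standing in for whatever vector store you prefer:

    # Sketch of embedding search + QA with sources
    # (assumes `pip install langchain openai faiss-cpu pypdf tiktoken`).
    from langchain.chains import RetrievalQAWithSourcesChain
    from langchain.document_loaders import PyPDFLoader
    from langchain.embeddings import OpenAIEmbeddings
    from langchain.llms import OpenAI
    from langchain.text_splitter import RecursiveCharacterTextSplitter
    from langchain.vectorstores import FAISS

    docs = PyPDFLoader("zero_shot_cot.pdf").load()                  # one Document per page
    chunks = RecursiveCharacterTextSplitter(chunk_size=1000).split_documents(docs)

    index = FAISS.from_documents(chunks, OpenAIEmbeddings())        # chunks -> vectors

    chain = RetrievalQAWithSourcesChain.from_chain_type(
        llm=OpenAI(temperature=0),
        retriever=index.as_retriever(),                             # similarity search under the hood
    )
    result = chain({"question": "What is zero-shot chain-of-thought prompting?"})
    print(result["answer"], result["sources"])

The chain returns both an answer and the sources it drew on, which is the behavior the rest of the app is built around.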
Let's jump back. That's your first half hour, and again, the ability to prototype and tinker like this has not really been part of the machine learning workflow. When I started, it was: okay, first define your task really carefully, then collect gigabytes of data, and only then can you try to figure out whether it's feasible, and only once you know it's feasible can you try to build the user interface around it. So this is what's most different now, what's most unlocked by this new wave.

Okay, so lastly, once we've got that basic thing going, it's time to put together an MVP version that's deployable, and ship it. Building a demo is easy; what makes building the actual application hard is figuring out what's actually useful, not just to people like you but to a broad collection of users. So your priority should be putting together a user interface where you can start getting feedback from users as quickly as possible. The other major unlock here, coming not from the machine learning world but from the broader software engineering world, is that it's much easier to get started now thanks to all the amazing tooling for cloud-native or cloud-based development. This makes it super easy to get started and super easy to scale when you finally blow up on social media; just make sure to limit your spending.

So we have this working Q&A bot that answers questions. How can I turn that into a user interface where some readily available users will try it out and give me feedback? Well, Full Stack Deep Learning has a Discord where we have people from our past classes, so let's build a quick Discord bot. This is a popular enough way of building things with large models that it's the primary user interface for Midjourney, the text-to-image model. So, you know, start here, finish anywhere.

What does putting something like that together look like? We're going to need some way to get hold of data and put it into storage, pulling information from all of our different sources, and then we need a way to quickly index it so we can pull out the relevant pieces. Since we're starting with a very early version of the application, we don't want a bunch of servers running all the time, so we want a serverless backend that communicates with the language model and with our various persistence solutions; that's where some of the heavier lifting happens. And then we have a lightweight Discord bot server, so the whole thing is relatively cheap to run.

There are potentially lots of different stacks that can solve this. The particular collection I'm using in this demo application is: OpenAI for our language models, Pinecone to give us quick search over vectors, and MongoDB as our data storage, just because we're moving fast here and a schema is only going to slow us down, at least for now. For both our serverless backend and for loading data into storage, we'll use Modal, which provides quickly scalable serverless compute for data science. And for the Discord bot server we can use something like AWS: lightweight EC2 instances are free for the first year, so make a new Gmail account, sign up for AWS, get a new credit card, and now you have a whole year of free Discord bot hosting.
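A minimal sketch of what that lightweight Discord bot server could look like with discord.py; the BACKEND_URL endpoint and its JSON shape are assumptions about how the serverless backend is exposed, not the actual ask-fsdl code:

    # Sketch of a /ask slash-command bot (assumes `pip install discord.py requests`).
    import asyncio
    import os

    import discord
    import requests
    from discord import app_commands

    BACKEND_URL = os.environ["BACKEND_URL"]          # hypothetical URL of the serverless Q&A backend

    intents = discord.Intents.default()
    client = discord.Client(intents=intents)
    tree = app_commands.CommandTree(client)

    @client.event
    async def on_ready():
        await tree.sync()                            # register the /ask command with Discord

    @tree.command(name="ask", description="Ask a question about LLMs")
    async def ask(interaction: discord.Interaction, question: str):
        await interaction.response.defer()           # retrieval + LLM call takes a few seconds
        resp = await asyncio.to_thread(
            requests.post, BACKEND_URL, json={"query": question}, timeout=60
        )
        await interaction.followup.send(resp.json()["answer"])

    client.run(os.environ["DISCORD_BOT_TOKEN"])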
There's not really enough time to go through the full details of how this is set up, but I did want to give one example of why this cloud-native tooling actually accelerates the process. The bottleneck I repeatedly ran into (Eugene mentioned running into this too) is always data processing; that's always the part that's heavy and slow and fiddly in the details. I ran into this problem where I had a couple hundred PDFs I wanted to process; the primary source for this bot is my lit review of about 300 papers on language modeling, and it takes a long time, even for a computer, to read 300 PDFs. That was bottlenecking me in development. With Modal, I can just put a little wrapper around a couple of my functions for loading and extracting PDFs, throw it up onto their infrastructure, and launch 100 containers, one for each PDF. Why would I put multiple PDFs in a single container? That would be mixing concerns; obviously we should split this out. So you can quickly solve a lot of your data problems without needing to set up Kubernetes, hire a whole data engineering team, or pay to maintain a bunch of servers. That's really great for small teams, and it's critical at a time when a lot of capabilities have been unlocked all at once by language models.

At that point, we take some of the stuff from that notebook, put it into scripts, and wrap it with Modal. We'll talk in a lot more detail about what this looks like in the afternoon, when we do a walkthrough of the codebase of a slightly more mature version of this application, so you can see what it actually looks like.
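A rough sketch of that kind of Modal wrapper, assuming the 2023-era Modal API (Stub, .map) and pypdf for extraction; this is illustrative, not the actual ask-fsdl ETL code:

    # Fan PDF extraction out over many containers with Modal (run with `modal run this_file.py`).
    import modal

    stub = modal.Stub("pdf-etl")
    image = modal.Image.debian_slim().pip_install("pypdf", "requests")

    @stub.function(image=image)
    def extract_pdf(url: str) -> str:
        """Download one PDF and pull out its text. One PDF per container: no mixed concerns."""
        import io
        import requests
        from pypdf import PdfReader

        reader = PdfReader(io.BytesIO(requests.get(url).content))
        return "\n".join(page.extract_text() or "" for page in reader.pages)

    @stub.local_entrypoint()
    def main():
        urls = ["https://arxiv.org/pdf/2205.11916.pdf"]   # in practice, a few hundred paper URLs
        texts = list(extract_pdf.map(urls))               # Modal fans out one container per input
        print(sum(len(t) for t in texts), "characters extracted")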
But that's the basic playbook for getting something going really quickly, and this version is already running: it's in the ask-fsdl bot channel in our Discord. It has information about LLMs, and it also has information from our past classes, so a bunch more data sources. Type /ask and you can ask the bot questions in the Discord; let me do a quick example.

All right, our first question is coming in: help me visualize what happens behind the scenes. The answer: visualizing what happens behind the scenes can be done by constructing a belief graph. That's interesting; it points to an interesting paper from back in the BERT days that tried to figure out what the knowledge base of a language model was. An interesting resource, and not a bad answer to the question. The bot also has some handling for irrelevant questions, like "how are you doing today?", and you can emoji-react to its answers. So just type /ask, the question prompt pops up: what is zero-shot CoT prompting? Boom. And as one of the eventual billions of users who might come in pointed out, it's cool that it links to a specific timestamp. "What is an index and why do I need it?" Up here it's got links not only to papers but also to material from our past classes, with links to timestamped sections of our YouTube videos; we'll see a bit more about how that's done.

I encourage you to try this out: ask questions in different languages and see what it does, try to prompt-inject it (I'm sure there are ways, if you're familiar with that), just try stuff out. We're collecting this usage data and we'll analyze it in the afternoon, because one of the next steps after you put something like this together is to start monitoring it, so you can see how people are interacting with it, what's going well and what's going poorly, and start making improvements. The data you provide will hopefully give us some interesting lessons for how language-model-powered applications fail and succeed.
Info
Channel: The Full Stack
Views: 67,489
Keywords: deep learning, machine learning, mlops, ai
Id: twHxmU9OxDU
Length: 39min 32sec (2372 seconds)
Published: Thu May 11 2023