GPT-4 - How does it work, and how do I build apps with it? - CS50 Tech Talk

All right, well, this is a CS50 Tech Talk. Thank you all so much for coming. About a week ago we circulated a Google Form, as you might have seen, at 10:52 a.m., and by 11:52 a.m. we had 100 RSVPs, which I think is a testament to just how much interest there is in this world of AI and OpenAI and GPT, ChatGPT, and the like. In fact, if you're generally familiar with what everyone's talking about but haven't tried it yourself, this is the URL at which you can try out this tool you've probably heard about, ChatGPT. You can sign up for a free account there and start tinkering with what everyone else has been tinkering with. And if you're more of the app-minded type, which you probably are if you're here with us today, OpenAI in particular has its own low-level APIs via which you can integrate AI into your own software. But of course, as is the case in computer science, there are all the more abstractions and services that have been built on top of these technologies, and we're so happy today to be joined by our friends from McGill University and Steamship, Sil and Ted, from whom you'll hear in just a moment, to speak to us about how they are making it easier to build, to deploy, and to share applications using some of these very same technologies. So our thanks to them for hosting today, and to our friends at Plympton, Jenny Lee, an alumna who's here with us today. But without further ado, allow me to turn things over to Ted and Sil. Pizza will be served shortly after 1 p.m. outside. All right, over to you, Ted.

Thanks a lot. Hey everybody, it's great to be here. I think we've got a really good talk for you today. Sil is going to provide some research grounding into how it all works, what's going on inside the brain of GPT as well as other language models, and then I'll show you some examples that we're seeing on the ground of how people are building apps and what apps tend to work in the real world. Our perspective is that we're building AWS for AI apps, so we
get to talk to a lot of the makers who are building and deploying their apps, and through that see both the experimental end of the spectrum and also what kinds of apps are getting pushed out there and turned into companies, turned into side projects. We did a cool hackathon yesterday. Many thanks to Neiman, to David Malan and CS50 for helping us put all of this together, and to Harvard for hosting it. There were two sessions, and lots of folks built things. If you go to steamship.com/hackathon you'll find a lot of guides and a lot of projects that people built, and you can follow along; we have a text guide as well, just as a quick plug, if you want to do it remotely or on your own.

So, to tee up: we're going to talk about basically two things today that I hope you'll walk away with and really know how to use as you develop and as you tinker. One is, what is GPT and how is it working? Get a good sense of what's going on inside of it, other than as just this magical machine that predicts things. And two is, how are people building with it, and, importantly, how can I build with it too? If you're a developer and you have a CS50 background, you should be able to pick things up and start building some great apps. I met some of the CS50 grads yesterday, and the things they were doing were pretty amazing. So I hope this is useful. I'm going to kick it over to Sil to talk about some of the theoretical background of GPT.

Yeah, so thank you, Ted. My name is Sil. I'm a graduate student in the digital humanities at McGill. I study literature, computer science, and linguistics in the same breath, and I've published some research over the last couple of years exploring what is possible with language models and culture in particular. My half, or whatever, of the presentation is to describe to you what GPT is. That's really difficult to explain in 15 minutes, and there are even a lot of things that we don't know, but a good way to approach that is to first
consider all the things that people call GPT, all the descriptors. You can call them large language models. You can call them universal approximators, from computer science. You can say that it is a generative AI. We know that they are neural networks. We know that it is an artificial intelligence. To some it's a simulator of culture; to others it just predicts text. It's also a writing assistant: if you've ever used ChatGPT, you can plug in a bit of your essay and get some feedback; it's amazing for that. It's a content generator: people use it to do copywriting, Jasper.ai, Sudowrite, et cetera. It's an agent: the really hot thing right now, if you've seen it on Twitter, is AutoGPT and BabyAGI, where people are giving these things tools and letting them run a little bit free in the wild to interact with the world, computers, et cetera. We use them as chatbots, obviously. And the actual architecture is a transformer. So there are lots of ways to describe GPT, and any one of them is a perfectly adequate way to begin the conversation, but for our purposes we can think of it as a large language model, and more specifically a language model.

A language model is a model of language, if you'll allow me the tautology, but really what it does is produce a probability distribution over some vocabulary. Let us imagine that we had the task of predicting the next word of the sequence "I am". If I give a neural network the words "I am", what, of all words in English, is the most likely word to follow? That, at its very core, is what GPT is trained to answer. And how it does it is this: it has a vocabulary of 50,000 words, and it knows roughly, given the entire internet, which words are likely to follow other words of those 50,000 in some sequence, up to 2,000 words, up to 4,000, up to 8,000, and now up to 32,000 with GPT-4. So you give it a sequence, here "I am", and over the vocabulary of 50,000 words it gives you the likelihood of every single word that follows. So here it's
"I am", and perhaps the word "happy" is fairly frequent, so we'll give that a high probability if we look at all utterances of English. It might be "I am sad"; maybe that's a little less probable. "I am school": that really should be at the end, because I don't think anybody would ever say that. "I am Björk": that's not very probable, less probable than "happy" or "sad", but there's still some probability attached to it. And when we say it's probable, that's literally a percentage: "happy" follows "I am" maybe five percent of the time, "sad" follows "I am" maybe two percent of the time, or whatever. So for every word that we give GPT, it tries to predict what the next word is across 50,000 words, and it gives every single one of those 50,000 words a number that reflects how probable it is.

The really magical thing that happens is that you can generate new text. If you give GPT "I am" and it predicts "happy" as being the most probable word over 50,000, you can then append it to "I am", so now you say "I am happy", and you feed it into the model again. You sample another word, you feed it into the model again, and again and again. There are lots of different ways that "I am happy" or "I am sad" can go, and you add a little bit of randomness, and all of a sudden you have a language model that can write essays, that can talk, and a whole lot of things, which is really unexpected and something we didn't predict even five years ago.

So this is all relevant as we move on. As we scale up the model and give it more compute (in 2012 AlexNet came out, and we figured out we can run the model on GPUs, so we can speed up the process), we can give the model lots of information downloaded from the internet, and it learns more and more. The probabilities that it gives you get better as it sees more examples of English on the internet. So we have to train the model to be really large, really wide, and we have to train it for
a really long time, and as we do that the model gets better and better, more expressive and capable, and it also gets a little bit intelligent, for reasons we don't understand. But the issue is that because it learns to replicate the internet, it knows how to speak in a lot of different genres of text and a lot of different registers. If you begin the conversation the way you would with ChatGPT, "Can you explain the moon landing to a six-year-old in a few sentences?", GPT-3 (this is an example drawn from the InstructGPT paper from OpenAI) would have just been like, "OK, so you're giving me an example like 'explain the moon landing to a six-year-old'; I'm going to give you a whole bunch of similar things, because those seem very likely to come in a sequence." It doesn't necessarily understand that it's being asked a question and has to respond with an answer. GPT-3 did not have that apparatus, that interface for responding to questions.

The scientists at OpenAI came up with a solution, and that's: let's give it a whole bunch of examples of questions and answers. We first train it on the internet, and then we train it with a whole bunch of questions and answers, such that it has the knowledge of the internet but really knows that it has to be answering questions. That is when ChatGPT was born, and that's when it gained 100 million users in one month; I think it beat TikTok's record of 20 million in one month. It was a huge thing, and a lot of people went, "Oh, this thing is intelligent. I can ask it questions, it answers back, we can work together to come to a solution." And that's because it's still predicting words, it's still a language model, but it knows to predict words in the framework of a question and answer. So that's what a prompt is; that's what instruction tuning is, that's a key word; that's what RLHF is, if you've ever seen that acronym, reinforcement learning from human feedback. All of those combined mean that the models that are coming out today,
the types of language predictors that are coming out today, operate in a Q&A form. GPT-4 only has the aligned model available. And this is a really great, solid foundation to build on, because you can do all sorts of things. You can ask ChatGPT, "Can you do this for me? Can you do that for me?" You might have seen that OpenAI has allowed plugin access to ChatGPT, so it can access Wolfram, it can search the web, it can do Instacart for you, it can look up recipes. Once the model knows not only that it has to predict language but that it has to solve a problem, the problem here being "give me a good answer to my question", it's suddenly able to interface with the world in a really solid way. From there on, there have been all sorts of tools that build on this Q&A form that ChatGPT uses. You have AutoGPT, you have LangChain, you have ReAct (there's a ReAct paper where a lot of these come from), and turning the model into an agent with which to achieve any ambiguous goal is where the future is going. This is all thanks to instruction tuning. And with that, I think I will hand it off to Ted, who will be giving a demo, or something along those lines, of how to use GPT as an agent.

All right. So I'm a super applied guy. I look at things and think, OK, how can I add this Lego, add that Lego, clip them together, and build something with it? And right now, if you look back in computer science history, at the kinds of things that were being done in 1970, right after computing was invented, right after microprocessors were invented, people were doing research like "how do I sort a list of numbers?" And that was meaningful work, and importantly it was work that was accessible to everybody, because nobody knew what we could build with this new kind of oil, this new kind of electricity, this new unit of computation we'd created. Anything was game, and anybody could participate in that game to figure it out. And I think one of the
really exciting things about GPT right now is that, yes, in and of itself it's amazing, but then what could we do with it if we call it over and over again, if we build it into our algorithms and start to build it into broader software? So the world really is yours to figure out those fundamental questions about what you could do if you could script computation itself, over and over again, in the way that computers can; not just talk with it, but build things atop it.

We're a hosting company; we host apps, and these are just some of the things that we see. I'm going to show you demos of this with code and try to explain some of the thought process, but I wanted to give you a high-level overview. You've probably seen these on Twitter, but when it all sorts out to the top, these are some of the things that we're seeing built and deployed with language models today.

Companionship: that's everything from "I need a friend" to "I need a friend with a purpose": I want a coach, I want somebody to tell me to go to the gym and do these exercises, I want somebody to help me study a foreign language.

Question answering: this is a big one. This is everything from your newsroom having a Slack bot that helps assist you ("does this article conform to the style guidelines of our newsroom?") all the way through to "I need help on my homework", or "hey, I have some questions that I want you to ask Wikipedia, combine with something else, synthesize the answer, and give it to me."

Utility functions: I would describe this as a large set of things that computers could do if only they had access to language computation, language knowledge. An example would be: read every tweet on Twitter and tell me the ones I should read. That way I only get to read the ones that actually make sense to me, and I don't have to skim through the rest.

Creativity: image generation, text generation, storytelling, proposing other ways to do things.

And then these wild experiments and
kind of BabyAGI, as people are calling them, in which the AI itself decides what to do and is self-directed. So I'll show you examples of many of these and what the code looks like. If I were you, I would think about these as categories within which to both think about what you might build and also seek out starter projects online for how you might go about building them.

All right, so I'm just going to dive straight into demos and code for some of these, because I know that's what's interesting to see as fellow builders, with a high-level diagram for some of them as to how it works. Approximately, you can think of a companionship bot as a friend that has a purpose to you. There are many ways to build all of these things, but one of the ways you can build this is simply to wrap GPT, or a language model, in an endpoint that additionally injects into the prompt some particular perspective or some particular goal that you want it to use. It really is that easy in a way, but it's also very hard, because you need to iterate and engineer the prompt so that it consistently performs the way you want it to perform.

A good example of this is something somebody built in the hackathon yesterday, and I just wanted to show you the project they built. It was a Mandarin idiom coach. I'll show you the demo first, and then what the code looked like. I think I already pulled it up. Here we go. The buddy that this person wanted to create was a friend that, if you gave it a particular problem you were having, would pick a Chinese idiom, a four-character chengyu, that described it poetically, like "here's a particular way you could say this", and would tell it to her. The person who built this was studying Chinese and wanted to learn more about it. So I might say something like "I'm feeling very sad", and it would think a little bit, and if everything's up and running it will generate one of these four-character phrases and respond with an
example. Now, I don't know if this is correct or not, so if somebody can tell that this is actually incorrect, please call me out. And it will then finish up with something encouraging, saying, "Hey, you can do it. I know this is hard. Keep going."

So let me show you how they built this. I pulled up the code right here. This was the particular starter Replit that folks were using in the hackathon yesterday, and we pulled things up into, basically: you have a wrapper around GPT, and there are many things you could do, but we're going to make it easy for you to do two things. One of them is to inject some personality into the prompt, and I'll explain what that prompt is in a second. The second is to add tools that might go out and do a particular thing: search the web, or generate an image, or add something to a database, or fetch something from a database. Having done that, now you have something more than GPT. Now you have GPT, which we all know what it is and how we can interact with it, but you've also added a particular lens through which it's talking to you, and potentially some tools. So for this particular Chinese tutor, all it took to build that was four lines.

So here's a question that I think is frying the minds of everybody in the industry right now: is this something that we'll all do casually? Nobody really knows. Will we all just say to the LLM in the future, "Hey, for the next five minutes, please talk like a teacher"? Maybe. But also, definitely in the meantime and maybe in the future, it makes sense to wrap up these personalized endpoints, so that when I'm talking to GPT I'm not just talking to GPT; I have a whole army of different buddies, different companions, that I can talk to. They're kind of human and talk to me interactively, because I pre-loaded them with: hey, by the way, you in particular, I want you to be a kind, helpful Chinese teacher that responds to every situation by explaining the chengyu that fits it; speak in English and explain
the chengyu and its meaning, then provide a note of encouragement about learning language. And so, just adding something like that in the program, it'll pop it up to the web; it'll take it over to a Telegram bot that you can then interact with ("hey, I'm feeling too busy") over Telegram, over the web. And this is the kind of thing that's now within reach for everybody, from CS 101 grads (sorry, I'm using the general-purpose framing) all the way through to professionals in the industry, that you can do with just a little bit of manipulation on top of this raw unit of conversation and intelligence. So companionship is one of the first common types of apps that we're seeing.

A second kind of app that we're seeing, and this blew up, for those of you who follow Twitter, I think in the last few months, is question answering. I want to unpack a couple of different ways this can work, because I know many of you have probably already tried to build some of these kinds of apps. The general framework is: a user queries GPT, and maybe it has general-purpose knowledge, maybe it doesn't, but what you want it to say back to you is something specific about an article you wrote, or something specific about your course syllabus, or something specific about a particular set of documents from the United Nations on a particular topic. What you're really seeking is what we all hoped the customer service bot would be like. We've all interacted with these customer service bots, and we're kind of smashing our heads on the keyboard as we do it, but pretty soon we're going to start to see very high-fidelity bots that interact with us comfortably.

This is approximately how to do it as an engineer, so here's your game plan. Step one: take the documents that you want it to respond to. Step two: cut them up. Now, if you're an engineer, this is going to madden you:
you don't cut them up in a way that you would hope. For example, you could cut them up into clean sentences or clean paragraphs or semantically coherent sections, and that would be really nice. Honestly, the way that most folks do it, and this is a simplification that tends to be just fine, is you window: you have a sliding window that goes over the document, and you just pull out fragments of text.

Having pulled out those fragments of text, you turn them into something called an embedding vector. An embedding vector is a list of numbers that approximates some point of meaning. You've all already dealt with embedding vectors yourselves in regular life, and the reason I know you have is that everybody's ordered food from Yelp before. When you order food from Yelp, you look at what genre of restaurant it is: is it a pizza restaurant, is it an Italian restaurant, is it a Korean barbecue place? You look at how many stars it has: one, two, three, four, five. You look at where it is. All of these you can think of as points in space, dimensions in space. Korean barbecue restaurant, four stars, near my house: it's a three-number vector. That's all this is. So this is a thousand-number vector or a ten-thousand-number vector; different models produce different-size vectors. All it is is chunking pieces of text and turning each into a vector that approximates meaning.

Then you put it in something called a vector database, and a vector database is just a database that stores numbers. But having that database, now when I ask a question I can search the database and say: the question was "what does CS50 teach?"; what pieces of text in the database have vectors similar to the question "what does CS50 teach?" There are all sorts of tricks, and empires being made, on refinements of this general approach, but in the end you, the developer, can model it simply as this. Then, when you have your query, you embed it, you find the document fragments, and then you put them into a
prompt. And now we're just back to the personality, the companionship bots. Now it's just a prompt, and the prompt is: "You're an expert in answering questions. Please answer the user-provided question using the source documents, the results from the database." That's it. So after all of these decades of engineering these customer service bots, it turns out that with a couple of lines of code you can build this.

So let me show you. I made one just before the class with the CS50 syllabus, so we can pull that up. I added the PDF right here. I apologize, I don't know if it's an accurate or recent syllabus; I just searched the web for "CS50 syllabus PDF" and put the URL in here, and it loaded it. This is just a hundred-line piece of code, deployed, that will now let me talk to it, and I can say, "What will CS50 teach me?"

Under the hood, what's happening now is exactly what that slide just showed you. It takes that question, "what will CS50 teach me?", and turns it into a vector. That vector approximates, without exactly representing, the meaning of that question. It looks into a vector database, which Steamship hosts, of fragments from that PDF, and then it pulls out a document and passes it to a prompt that says: "Hey, you're an expert at answering questions. Someone has asked you what CS50 teaches. Please answer it using only the source documents and source materials I've provided." Those source materials are dynamically loaded into the prompt. It's just basic prompt engineering, and I want to keep harping on that. What's amazing about right now for builders is that so many things just boil down to very creative, tactical rearrangement of prompts, and then using those over and over again in an algorithm and putting that into software.

So the result, and again, it could be lying, it could be making things up, it could be hallucinating, is: CS50 will teach students how to think algorithmically and solve problems efficiently, focusing on topics such
as abstraction. And then it returns the source document from which it was found.

So this is another big category, of which there are tons of potential applications, because you can repeat it for each context. You can create arbitrarily many of these once it's software, because once it's software you can just repeat it over and over again. So for your dorm, for your club, for your Slack, for your Telegram, you can start putting pieces of information in and then responding to it. And it doesn't have to be documents; you can also load it straight into the prompt. I think I have it pulled up here, and if I don't I'll just skip it. Oh, here we go.

One other way you can do question answering, because I think it's healthy to always encourage the simplest possible approach to something: you don't need to engineer this giant system. It's great to have a database, it's great to use embeddings, it's great to use this big approach; it's fancy, it scales, you can do a lot of things. But you can also get away with a lot by just pushing it all into a prompt. As an engineer, you know, someone on our team always says that engineers should aspire to be lazy, and I couldn't agree more. You as an engineer should want to set yourself up so that you can pursue the lazy path to something.

So here's how you might do the equivalent of a question answering system with a prompt alone. Let's say you have 30 friends and each friend is good at a particular thing (this is isomorphic to many other problems). You can simply say: "Hey, I know certain things. Here are the things I know. A user is going to ask me something. How should we respond?" And then you load that into an agent; that agent has access to GPT; you can ship it, deploy it, and now you've got a bot that you can connect to Telegram, you can connect to Slack. And that bot, now, it won't always give you the right answer, because at a certain level we can't control the variance of the model underneath, but it will tend to answer
with respect to this list. And the degree to which it tends to is, to a certain extent, something that industry is working on, to give everybody as a capacity, but also something you do with prompt engineering, to tighten up the error bars on it.

So I'll show you just a few more examples, and then in about eight minutes I'll turn it over to questions, because I'm sure you've got a lot about how to build things. Just to give you a sense of where we are: this is one I don't have a demo for, but if you were to come to me and say, "Ted, I want a weekend hustle, man, what should I build?" Holy moly, there are a set of applications that I would describe as utility functions. I don't like that name, because it doesn't sound exciting, and this is really exciting. It's low-hanging fruit that automates tasks that require basic language understanding. Examples of this are: generate a unit test (I don't know how many of you have ever been writing tests and you're just like, "Ah, come on, I can get through this, I can get through this"; if you're a person who likes writing tests, you're a lucky individual), look up the documentation for a function, rewrite a function, make something conform to your company guidelines, do a brand check. All of these are relatively context-free operations, or scoped-context operations, on a piece of information that require linguistic understanding. And really you can think of them as something that is now available to you as a software builder, as a weekend project builder, as a startup builder; you just have to build the interface around it and present it to other people in a context in which it's meaningful for them to consume.

The space of this is extraordinary. I mean, it's the space of all human endeavor, now with this new tool; I think that's the way to think about it. People often joke about how, when you're building a company, when you're building a project, you don't want
to start with a hammer, because you want to start with a problem instead. And it's generally true, but my God, we've just got a really cool new hammer, and to a certain extent I would encourage you to, at least casually, on the weekends, run around and hit stuff with it and see what can happen, from a builder's, from a tinkerer's, from an experimentalist's point of view.

Creativity: this is another huge mega-app. Now, I'm primarily living in the text world, so I'm going to talk about text-based things. I think so far this has mostly been growing in the imagery world, because we're such visual creatures, and the images you can generate with AI are just staggering. It certainly brings up a lot of questions, too, around IP and artistic style. But the template for this that we're seeing in the wild, if you're a builder, is approximately the following, and the thing I want to point out is domain knowledge. Really the purpose of this slide is to touch on the importance of domain knowledge.

Many people approximately find the creative process to be as follows: come up with a big idea; over-generate possibilities; edit down what you over-generated; repeat. Anybody who's been a writer knows that when you write, you write way too much, and then you have to delete lots of it, and then you revise, and you write way too much, and you have to delete lots of it. This particular task is fantastic for AI. One of the reasons it's fantastic for AI is that it allows the AI to be wrong. You've pre-agreed that you're going to delete lots of it. So if you pre-agree, "Hey, I'm just going to generate five possibilities of the story I might tell, five possibilities of the advertising headline, five possibilities of what I might write my thesis on," you've pre-agreed that it's OK if it's a little wrong, because you are going to be the editor that steps in. And here's the thing that you really should bring to the table: don't think about this as a technical
activity. Think about this as your opportunity not to put GPT in charge but instead to grasp the steering wheel tighter, I think, at least in Python or whatever language you're using to program, because you have the domain knowledge to wield GPT in the generation of those possibilities.

So let me show you an example of what I mean by that. This is a cool app that someone created for the Writing Atlas project. Writing Atlas is a set of short stories, and you can think of it as Goodreads for short stories. You can go in here, you can browse different stories, and this was something somebody created where you can type in a description of a story that you'd like. This is going to take about a minute to generate, so I'm going to talk while it's generating. While it's working, what it's doing, and I'll show you the code in a second, is searching through the collection of stories for similar stories. And here's where the domain knowledge part comes in: then it uses GPT to look at what it was that you wanted, and uses knowledge of how an editor, how a bookseller, thinks to generate a set of suggestions specifically through the lens of that perspective, with the goal of writing that beautiful handwritten note that we sometimes see in a local bookstore, tacked on underneath a book. So it doesn't just say, "Hey, you might like this; here's a general-purpose reason why you might like this," but specifically, "here's why you might like this with respect to what you gave it."

It's either stalling out or it's taking a long time. Oh, there we go. So here are its suggestions. And in particular, these are things that only a human could know, at least for now; two humans, specifically: the human who said they wanted to read a story (that's the text that came in), and then the human who added domain knowledge to script a sequence of interactions with the language model, so that you could provide very targeted reasoning over something, informed by that domain knowledge. So for these utility apps,
bring your bring your domain knowledge let me actually show you how this looks in code because I think it's it's useful to see how simple and accessible this is this is really a set of prompts so why might they like a particular location well here's the prompt that did that this is an open source project and and it has a bunch of examples and then it says well here's the one that we're interested in here's the audience here's a couple of examples of why might people like a particular thing in terms of audience it's just another prompt same for topic same for explanation and if you go down here and look at how it was done suggesting the story is what is this line 174 to line 203 it really is and again like over and over again I want to impress upon you this really is Within Reach it's really just what 20 odd lines of Step One search in the database for similar stories step two given that I have similar stories pull out the data step three with my domain knowledge in Python now run these prompts step four prepare that into an output so the thing we're scripting itself is some approximation of human cognition if you're willing to go there metaphorically we're not you know we're not sure I'm not going to weigh in on on where we are in the is open AI uh a life form argument all right uh one kind of really far out there thing and then I'll uh tie it up for questions because I know there's probably a lot and I also want to make sure you get a great pizza in your bellies and that is a baby AGI Auto GPT is what you might have heard them called on Twitter I think of them as multi-step planning bots so everything I showed you so far was approximately One-Shot interactions with GPT so this is the user says they want something and then either python mediates interactions with GPT or GPT itself does some things with the inflection of a personality that you've added from some prompt engineering really useful pretty easy to control if you want to go to production if you want to 
build a weekend project if you want to build a company that's a great way to do it right now this is wild and if you haven't seen this stuff on Twitter I would definitely recommend going to search for it this is what happens the simple way to put it is if you put GPT in a for Loop if you let GPT talk to itself and then tell itself what to do so it it's an emergent Behavior like and like all emergent behaviors it starts with a few simple steps the Conway's Game of Life many elements of reality turn out to be math equations that fit on a t-shirt but then when you play them forward in time they generate DNA or they generate human life so this is approximately step one take a human objective step two your first task is to write yourself a list of steps and here's the critical part repeat now do the list of steps now you have to embody your agent with the ability to do things so it's really only limited to do what you give it the tools to do and what it has the skills to do so obviously this is still very much a set of experiments that are running right now and and but it's something that we'll see unfold over the coming years and this is the scenario in which python stops becoming so important because we've given it the ability to actually self-direct what it's doing and then it finally gives you a result and I want to give you an example still of just again impressing upon you how much of this is prompt engineering which is wild how little code this is let me show you what baby AGI looks like so here is a baby AGI that you can connect to Telegram and this is an agent that has two tools so I haven't explained to you what an agent is I haven't explained to you what tools are I'll give you a quick one sentence description an agent is just a word to mean GPT plus some bigger body in which it's living maybe that body has a personality maybe it has tools maybe it has python mediating its experience with other things tools are simply ways in which the agent can choose to do 
things like imagine if GPT could say order a pizza and instead of you seeing the text order a pizza that caused the pizza to be ordered that's a tool so these are two tools it has one tool is generate a to-do list one tool is do a search on the web and then down here it has a prompt saying hey your goal is to build a task list and then do that task list and then this is just placed into a harness that does it over and over again so after the next task kind of enqueue the results of that task and keep it going and so in doing that you get this kickstarted loop where essentially you kick start it and then the agent is talking to itself so this unless I'm wrong I don't think this has yet reached production in terms of what we're seeing in the field of how people are deploying software but if you want to dive into sort of the wildest part of experimentation this is definitely one of the places you can start and it's really within reach all you have to do is download one of the starter projects for it and you can kind of see right in the prompting here's how you kick start that process of iteration all right so I know that was super high level uh I hope it was useful uh it's I think from the field from the bottom up what we're seeing and what people are building kind of these high level categories of apps that people are making all of these apps are apps that are within reach to everybody which is really really exciting uh and I suggest Twitter is a great place to hang out and build things uh there's a lot of AI builders on Twitter uh publishing and I think we've got a couple minutes before pizza is arriving maybe 10 minutes keep on going oh so if there's any questions why don't we uh kick it to that because I'm sure there's some uh questions that you all have I guess I ended a little early yes I'm giving it like a physics problem from a pset and we want to do that yeah yeah 40 percent of the time just wrong yeah do you have any like
actionable recommendations that we as developers should be doing to make it hallucinate less or maybe even things that OpenAI on the back end should be doing to reduce hallucinations would it be something where you like use RLHF so the question was how approximately how do you manage the hallucination problem like if you give it a physics lecture and you ask it a question on the one hand it appears to be answering you correctly on the other hand it appears to be wrong to an expert's eye 40 percent of the time 70 percent of the time 10 percent of the time it's a huge problem and then what are some ways as developers practically you can use to mitigate that I'll give an answer Sil you may have some specific things too so one high level answer is the same thing that makes these things capable of synthesizing information is part of the reason why it hallucinates for you so it's hard to have your cake and eat it too to a certain extent so this is part of the game in fact humans do it too like people talk about you know just folks who kind of are too aggressive in their assumptions about knowledge I can't remember the name for that phenomenon where you'll just say stuff right so we do it too um some things you can do are kind of a range of activities depending on how much money you're willing to spend how much technical expertise you have that can range from fine-tuning a model to practically so I'm in the applied world so I'm very much in a world of duct tape and sort of how developers get stuff done so some of the answers I'll give you are sort of very duct tape answers giving it examples tends to work for acute things if it's behaving in wild ways the more examples you give it uh the better that's not going to solve the domain of all of physics so for the domain of all of physics I'm gonna bail and give it to you because I think you are far more equipped than me to speak on that sure so the model doesn't have a ground truth it doesn't know anything any sense of meaning that is
derived from the training process is purely out of differentiation one word is not another word words are not used in the same context it understands everything only through examples given through language it's like someone who learned English or how to speak but they grew up in a featureless gray room they've never seen the outside world they have nothing to rest on that tells them something is true and something is not true so from the model's perspective everything that it says is true it's trying its best to give you the best answer possible and if lying a little bit or conflating two different topics is the best way to achieve that then it will decide to do so it's a part of the architecture we can't get around it there are a number of cheap tricks that surprisingly get it to confabulate or hallucinate less one of them includes recently there was a paper that's a little funny if you get it to prepend to its answer my best guess is that will actually improve or reduce hallucinations by about 80 percent so clearly it has some sense that some things are true and other things are not but we're not quite sure what that is to add on to what Ted was saying a few cheap things you can do include letting it Google or Bing as in Bing chat what they're doing it cites its information asking it to make sure its own response is good if you've ever had ChatGPT generate a program and there's some kind of problem and you ask ChatGPT I think there's a mistake often it'll locate the mistake itself why it didn't produce the right answer at the very beginning we're still not sure but we're moving in the direction of reducing hallucinations now with respect to physics you're going to have to give it an external database to rest on because internally for really really domain specific knowledge it's not going to be as deterministic as one would like these things work in continuous spaces these things they don't know what is wrong what is true and as a result we have to give it to
us so everything that Ted demoed today is really striving at reducing hallucinations actually really and giving it more abilities I hope that answers your question one of the ways to I mean I'm a simple guy like I tend to think that all of the world tends to be just a few things repeated over and over again and we have human systems for this you know in a team like companies are a team sport and we're not right all the time even when we aspire to be and so we have uh systems that we've developed as humans to deal with things that may be wrong so you know human number one proposes an answer human number two checks their work human number three provides the final sign-off this is really common anybody who's worked in a company has seen this in practice the interesting thing about the state of software right now we tend to be in this mode in which we're just talking to GPT as one entity but once we start thinking in terms of teams so to speak where each team member is its own agent with its own set of objectives and skills I suspect we're going to start seeing a programming model in which the way to solve this might not necessarily be make a single brain smarter but instead draw upon the collective intelligence of multiple software agents each playing a role and I think that that would certainly follow the human pattern of how we deal with this to give it an analogy space shuttles things that go into space spacecraft they have to be good if they're not good people die they have like no margin for error at all and as a result we over engineer in those systems most spacecraft have three computers and they all have to agree in unison on a particular step to go forward if one does not agree then they recalculate they recalculate until they arrive at something the good thing is that hallucinations are generally not a systemic problem in terms of its knowledge it's often a one-off the model something tripped it up and it just
produced a hallucination in that one instance so if there's three models working in unison just as Ted is saying that will generally speaking improve your success assertions like you are an engineer you are an AI you are a teacher yeah what's the mechanism by which that influences the allocation of probability sure I'm going to give you what might be an unsatisfying answer which is it tends to work but I think we know why it tends to work and again it's because these language models approximate how we talk to each other so if I were to say to you hey help me out I need you to mock interview me that's a direct statement I can make that kicks you into a certain mode of interaction or if I say to you help me out I'm trying to apologize to my wife she's really mad at me can you role play with me that kicks you into another mode of interaction and so it's really just a shorthand that people have found to kick the agent in to kick the LLM into a certain mode of interaction that tends to work in the way that I as a software developer am hoping it would work and to really quickly add on to that um being in the digital humanities that I am I like to think of it as a narrative a narrative will have a few different characters talking to each other their roles are clearly defined two people are not the same this interaction with GPT it assumes the personality it can simulate personalities it itself is not conscious in any way but it can certainly predict what a conscious being would react like in a particular situation so when we're going you are X it is drawing up that personality and talking as though it is that person because it is like completing a transcript or completing a story in which that character is present and interacting and is active so yeah I think we've got about five minutes until the pizza outside eight minutes yes sir no I'm not I'm not yes person but um it's been a fun thing with this and I understand the sort of word by word generation and
the sort of vibe the feeling of it you know the narrative some of my friends and I have tried giving it logic problems like things from the LSAT for example and it doesn't work like and I'm just wondering why that would be so it like it will generate answers that sound very plausible rhetorically like given this condition ask you and this would be why but it'll often like even contradict itself in its answers but it's almost never correct so I was wondering why that would be like it just can't reason it can't like think and like would we get to a place where it can so to speak I mean not you know what I mean I don't mean think like it's conscious I mean like have thoughts you want to talk about ReAct so gpt4 when gpt4 released back in March I think it was it was passing the LSAT it was yeah yes yes it just passed as I understand it because that's one of the weird things is that yeah if you pay for ChatGPT they give you access to the better model and one of the interesting things with it is prompting it's so finicky it's very sensitive to the way that you prompt there were earlier on when gpt3 came out some people were going look it can pass literacy tests or no it can't pass literacy tests and then people who are pro or anti-gpt would be like I modified the prompt a little bit suddenly it can or suddenly it can't these things are not conscious their ability to reason is like an alien's they're not us they don't think like people they're not human but they certainly are capable of passing some things empirically which demonstrates some sort of rationale or logic within the model but we're still slowly figuring out like a prompt whisperer what exactly the right approach is have you seen instances where it directly creates some sort of business value in sort of a startup or a company where is there like a real added value of having sort of these little AI apps yeah I mean we host companies on top of us who that's their their primary
product uh the value that it adds is like any company I mean it's you know what is the Y Combinator motto make something people want I mean I wouldn't think of this as GPT inherently provides value for you as a builder like that's their product that's OpenAI's product you pay ChatGPT for prioritized access where your product might be is how you take that and combine it with your data somebody else's data some domain knowledge some interface that then helps apply it to something two things are both true there are a lot of experiments going on right now uh both for fun and people trying to figure out where the economic value is but folks are also spinning up companies that are 100 percent supported by applying this to data I think that it is likely that today we call this GPT and today we call these llms and tomorrow it will just slide into the ether I mean imagine what the progression is going to be today there's one of these that people are primarily playing with there's many of them that exist but one people are primarily building atop tomorrow we can expect that there will be many of them and the day after that we can expect they're going to be on our phones and they're not even going to be connected to the internet and for that reason I think that like today we don't call our software microprocessor tools or microprocessor apps like the processor just exists I think that one useful model five years out ten years out is to even if it's only metaphorically true and not literally true I think it's useful to think of this as a second processor we had this before with uh floating point co-processors and graphics co-processors already as recently as the 90s where it's useful to think of the trajectory of this as just another thing that computers can do and will be incorporated into absolutely everything hence the term foundation model which also crops up I'm sorry so pizza is ready one more question
maybe one more and then uh then we'll break for some food in the glasses right there sorry I was just being told we need to get two more so yeah it's hard to get it to do that reliably it's incredibly useful to get it to do reliably so some tricks you can use are you can give it examples you can just ask it directly those are two common tricks um and look at the prompts that others have used to work I mean there's a lot of art to finding the right prompt right now a lot of it is magic incantation another thing you can do is post process it so that you can do some checking and you can have a happy path in which it's a one shot and you get your answer and then a sad path in which maybe you fall back on other prompts so then you're going for the diversity of approach where it's fast by default it's slow but ultimately converging upon higher likelihood of success if it fails and then something that I'm sure we'll see people do later on is fine-tune instruction tuning style models which are more likely to respond with computer-parsable uh output so I guess one last question sure um so the one you talked a couple of things one is as you talk about domain expertise here and you're encoding a bunch of domain expertise in terms of the prompts that you're putting there where do those prompts end up do those prompts end up back in the ChatGPT model and is there a privacy issue associated with that that's a great question so the question was and I apologize I just realized we haven't been repeating all the questions for the YouTube listeners so I'm sorry for the folks on YouTube if you weren't able to hear some of the questions the question was what are the privacy implications of some of these prompts if one of the messages is so much depends upon your prompt and the fine-tuning of this prompt what does that mean with respect to my IP maybe the prompt is my business I can't offer you the exact answer but I can paint for
you what approximately the landscape looks like so in all of software and so too with AI what we see is there's the SaaS companies where you're using somebody else's API and you're trusting that their terms of service will be upheld there's the set of companies in which they provide a model for hosting on one of the big cloud providers and this is a version of the same thing but I think with slightly different mechanics this tends to be thought of as the enterprise version of software and by and large the industry has moved over the past 20 years from running my own servers to trusting that Microsoft or Amazon or Google can run servers for me and they say it's my private server even though I know they're running it and I'm okay with that and you've already started to see that Amazon with Hugging Face Microsoft with OpenAI Google too with their own version Bard are going to do these where you'll have the SaaS version and then you'll also have the private VPC version and then there's a third version that I think we haven't yet seen practically emerge but this would be the maximalist I want to make sure my IP is maximally safe version of events in which you are running your own machines you are running your own models and then the question is is the open source and or privately available version of the model as good as the publicly hosted one and does that matter to me and the answer is right now realistically it probably matters a lot in the fullness of time you can think of any one particular task you need to achieve as requiring some fixed point of intelligence to achieve and so over time what we'll see is the privately obtainable versions of these models will cross that threshold and with respect to that one task yeah sure use the open source version run it on your own machine but we'll also see the SaaS intelligence get smarter it'll probably stay ahead and then your question is which one do I care more about do I want like the better aggregate
intelligence or is my task somewhat fixed point and I can just use the open source available one for which I know it'll perform well enough because it's crossed the threshold so to answer your question specifically yes uh you might be glad to know that ChatGPT recently updated their privacy policy to not use prompts for the training process but up until now everything went back into the bin to be trained on again okay and that's just a fact so I think now is pizza time yay okay [Applause]
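
The multi-step planning loop described in the talk (take a human objective, have the model write itself a list of steps, do the steps, enqueue any follow-ups, repeat) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the BabyAGI or Steamship code shown on screen: `fake_llm` is a stand-in for a real model call so the sketch runs offline, and `run_agent`, the `PLAN:` / `DO:` / `NEXT:` prompt prefixes, and the `search` tool are hypothetical names invented for this example.

```python
from collections import deque
from typing import Callable, Dict, List

# Hypothetical type: any function that maps a prompt string to a completion string.
LLM = Callable[[str], str]


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call (assumption: a real app would call a hosted model)."""
    if prompt.startswith("PLAN:"):
        # The "model" writes itself a list of steps, one per line.
        return "search the web for background\nsummarize the findings"
    if prompt.startswith("NEXT:"):
        # This stub never proposes follow-up tasks.
        return ""
    # Otherwise pretend to carry out the task named after the "DO:" prefix.
    return "done: " + prompt[len("DO:"):].strip()


def run_agent(
    objective: str,
    llm: LLM,
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 10,
) -> List[str]:
    """Plan once, then work through the task queue until it is empty or max_steps is hit."""
    plan = llm("PLAN: " + objective)
    tasks = deque(line.strip() for line in plan.splitlines() if line.strip())
    results: List[str] = []

    while tasks and len(results) < max_steps:
        task = tasks.popleft()
        # Use a tool if the task text mentions its name; otherwise ask the model directly.
        tool = next((fn for name, fn in tools.items() if name in task), None)
        result = tool(task) if tool else llm("DO: " + task)
        results.append(result)

        # Let the model enqueue follow-up tasks based on the latest result.
        follow_ups = llm("NEXT: objective=" + objective + " last_result=" + result)
        tasks.extend(line.strip() for line in follow_ups.splitlines() if line.strip())

    return results
```

Calling `run_agent("write a report", fake_llm, {"search": lambda t: "searched: " + t})` walks the two planned steps in order, routing the first through the `search` tool and the second back to the model. The `max_steps` cap is a design choice for the sketch: because the model can keep enqueueing work for itself, an unbounded loop is easy to create by accident.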

Info

Channel: CS50

Views: 669,201

Keywords: cs50, harvard, computer, science, david, malan

Id: vw-KWfKwvTQ

Length: 53min 51sec (3231 seconds)

Published: Mon May 01 2023