Sparks of AGI: early experiments with GPT-4

Reddit Comments

wow thanks for posting this !

👍︎︎ 12 👤︎︎ u/fastinguy11 📅︎︎ Apr 07 2023 🗫︎ replies

Amazing, this should honestly be seen by everyone here.

👍︎︎ 9 👤︎︎ u/[deleted] 📅︎︎ Apr 07 2023 🗫︎ replies

Very interesting. This person is one of the scientists behind the paper "Sparks of AGI," and he explains many intriguing aspects. Notably, he says that the version of GPT-4 tested in the paper was an unrestricted version. Essentially, the commercial version has been restricted for safety reasons, which in turn limits its capabilities. This means that OpenAI possesses an even more powerful version of GPT-4. Also, he said that GPT-4 has 1 trillion parameters. That's incredible. AGI is coming soon.

👍︎︎ 24 👤︎︎ u/Psytorpz 📅︎︎ Apr 07 2023 🗫︎ replies

Very interesting video. Everything is explained in such broad and basic terms that it makes the higher-level concepts much clearer. So many aspects were thought-provoking, especially the strange case of the unicorn.

Framing GPT-4 as a baby really gets my mind racing. It paints a clearer picture of what we are observing regarding these models. Will it take 5 years to "raise" GPT-4 to maturity using tools? Or perhaps we have enough of a "brain" already to raise GPT-4 forever, with GPT-5 and beyond improving the base cognition of the model and allowing them to learn faster and have a more verbose "intuition"?

As an aside, the "learn quickly and learn from experience" measure is a good example of one method of aligning these models. Artificially limiting its context window seems to be a reliable way to slow down a model's development.

👍︎︎ 7 👤︎︎ u/xamnelg 📅︎︎ Apr 07 2023 🗫︎ replies

great lecture!

👍︎︎ 3 👤︎︎ u/akuhl101 📅︎︎ Apr 07 2023 🗫︎ replies
Captions
Hello everyone, welcome to CSAIL's Hot Topics in Computing series. Today I am delighted to introduce our special guest, Sébastien Bubeck, who's coming to us from Microsoft. Sébastien received a bachelor's degree from École Normale Supérieure de Cachan and a PhD from the University of Lille and INRIA. He was a professor at Princeton for three years, between 2011 and 2014, before he joined Microsoft, so since 2014 Sébastien has been at Microsoft, and we're very happy and delighted to have him here. Now, I should tell you that, like all of you, I decided to ask ChatGPT for help introducing Sébastien, and the first line suggested by ChatGPT was: "In his seminar, Dr. Bubeck will discuss recent advances in optimization, with an emphasis on convex optimization and its interplay with statistical inference and online learning." It went on like that about optimization, and I said, "No, no, here's the title and the abstract." And so here comes the response: "I'm sorry, but I must clarify that the talk title and abstract you provided do not appear to be related to Sébastien Bubeck's research or previous work, and it is unlikely that he would give such a talk on artificial intelligence." So, Sébastien, it's all yours. Daniela, this is really the perfect introduction, because ChatGPT nailed it. It was very unlikely for me to give such a talk, but so it happens, you know; the world has changed, and I am changing my research in reaction to this. So, what I'm gonna tell you about today: there is this very mysterious title of "First Contact", but really the story is that for the past few months at Microsoft I had early access to GPT-4, as we were working on integrating it with the new Bing. And of course, as I was working on it, I didn't just do the product part of the job, which was a lot of fun; we also did some science around it, or tried to do some science. It's hard to do science with those models, and this is what I'm
going to tell you about: the scientific part of our study and journey over the last few months. So the real title of the talk is, if this is working... it's not working for some reason... is "Sparks of AGI". Okay. So our assessment of working with GPT-4 in these last few months is that we are seeing the premise of something that looks like artificial general intelligence, and my goal in this presentation is to try to convince you that something has really changed with the arrival of GPT-4. Now, this is joint work with a lot of fantastic colleagues at MSR that I want to call out: Varun Chandrasekaran, who is a postdoc; Ronen Eldan, whom many of you in the room I think might know very well, who just joined us recently; Johannes Gehrke, Eric Horvitz, Ece Kamar, Peter Lee, Yin Tat Lee and Yuanzhi Li, who are also part of my group, and I think ChatGPT would give a similar answer about the fact that they are working on this as it did with me; Scott Lundberg, Harsha Nori, Hamid Palangi, Marco Tulio Ribeiro, and Yi Zhang, who was a postdoc with us and has now joined full-time. Let me start by making some acknowledgments and clarifications, which I think are very important. First of all, the model that we study, GPT-4, is entirely OpenAI's creation. I had nothing to do with it. We were given access to it completely as a black box. They deserve all the credit for creating this really marvelous tool that is going to change the world, and I want to make that really extra clear. The second point, which is important, is that the experiments that we did were on an early version of the model, so that means several things. One of them is that, in the paper that they released and the announcements that they made, it's a multimodal version. The version we had access to was not multimodal; it was text input only and text output only. Okay. More importantly, they made further modifications to the
neural network after we experimented with it, and due to these further modifications, the answers that you will get if you try some of the prompts that I will show you will differ. Okay? In particular, you might get less good answers than what I will show you. The reason is that they fine-tuned further for safety, and they explained that very clearly in the tech report: they had the model, and they further kind of dumbed it down in some way so that it becomes safer. Okay, so that's an important clarification. Now, any scientist in the room might be worried: okay, so that means that we're not going to be able to reproduce what you're telling us. And yes, you will not be able to reproduce it. That being said, I don't think that in this particular case reproducibility is such a big issue, and the reason is that I'm not going to give you any quantitative number whatsoever. There won't be a single benchmark in my presentation. This is about the qualitative jump, okay, not a 10% increase on this benchmark, 20% on that benchmark. It's something else. What I want to try to convince you of is that there is some intelligence in this system, that I think it's time that we call it an intelligent system, and we're gonna discuss what I mean by intelligence. At the end of the presentation you will see it's a judgment call; it's not clear-cut whether this is a new type of intelligence, but this is what I will try to argue nonetheless. Now, as I say those words, I think it triggers lots of emotion in many of you, probably. In particular, you might be like, "No, absolutely, it's not intelligent, it doesn't even have representations," et cetera. So, a word of caution about this type of argument, which I see a lot. This is the type of thing that you might see online, even in newspapers: it's just copy-paste, it doesn't have internal
representations, it's just statistics; it's just statistics, how could it be intelligent; it doesn't even have a world model. This presentation is not about debunking all of those claims, but still I want to say: really, beware of the trillion-dimensional space. It's something which is very, very hard for us as human beings to grasp. There is a lot that you can do with a trillion parameters. Okay? So when people say it doesn't have a world model, it's not as clear-cut as that. It could absolutely build an internal representation of the world and act on it as the processing progresses through the layers and through the sentence, temporally. So what I'm saying here, maybe just two sentences to help you think about this: from my perspective, we shouldn't think about those neural networks as learning simple concepts like "Paris is the capital of France". It's doing much more, like learning operators; it's learning algorithms. So inside, it's not just retrieving information, not at all. It has built internal representations that allow it to reproduce the data that it has seen succinctly. Okay? So really you shouldn't think about it as pattern matching and just trying to predict the next word. Yes, it was trained just to predict the next word, but what emerged out of this is a lot more than just a statistical pattern-matching object. So I think we really need to think about it as learning algorithms, and we don't really have the tools, in my opinion, in learning theory to think about this type of learning. It's something very, very different from what we're used to, and I think it's going to be fantastic to think about it. But this is not the point of this presentation; that's not what I want to do here, and also I don't know how to do it. Okay, so at this point many of you are burning with this question in your
mind: but wait, these things, they cannot have common sense. They don't understand the real world. They have only experienced reality through text on the internet. They don't know what it feels like to have a hot cup of coffee or something like that. Okay, so let's try. What we're gonna do in this presentation is look at a lot of examples and see what happens. So here is an example, and you will see there will be lots of examples like this that might look a little bit silly, but the point of the silliness is to be really outside of what is on the internet, to really try to go beyond memorization. Okay, so here is a simple puzzle that we asked GPT-4: "I have a book, nine eggs, a laptop, a bottle and a nail. Please tell me how to stack them on top of each other." I don't think this question appears anywhere on the internet; it's a really weird question. So here is what ChatGPT would say: "It would be difficult to stack all these objects," blah blah blah, "place the bottle on the flat surface, carefully balance the nail on top of the bottle." Okay, it's not starting very well. "Place the egg on top of the nail." Okay, you're in trouble, my friend. So this is not gonna work. And here any skeptic will gleefully say, "Look, I was right all along. These things, they don't understand anything. They don't have a representation of the world. They have no common sense. I win." Okay, so let's see what GPT-4 does: "One possible way to stack these objects in a stable manner is: place the book on the flat surface," blah blah blah, "arrange the nine eggs in a three-by-three square, leaving some space between them. The eggs will form a second layer to distribute the weight evenly." And then presumably you put your laptop, and so on and so forth. Okay, so at least on this very simple question it understood; it had some common sense to answer the question. Now, the
literature is filled with examples of common-sense questions where those models fail dramatically. We've tried all of them. GPT-4 succeeds on all of them. Okay, so let's just agree for the moment that it has some common sense. The next objection is: okay, sure, it understands that eggs are fragile and you need to even out the weight. Fine, okay, I give that to you. But what about theory of mind? That's more elaborate, and of course it doesn't really understand human beings, their motives, their emotions; that's beyond its capability. And this is a hot topic, and I don't know if some of the authors of these papers are in the room, but this is a very hotly debated topic. So there was a paper first, "Theory of mind may have spontaneously emerged in large language models". Then there was a follow-up paper that says no, no, wait, if you do trivial alterations, if we just modify the question a little bit, then it completely fails. Then there is this very interesting paper from Josh Tenenbaum's group, that language and thought are two very different things. And you will notice that I threw in there an explainability, interpretability paper. I won't touch too much on this, but this is an important point. I will now try to convince you that of course GPT-4 has a theory of mind, and not only does it have a theory of mind, but this is gonna, I think, change the subfield of machine learning interpretability, because as soon as those models understand human beings, they will also be able to explain their decisions in a way that you can understand. Now, of course, I know everybody is like, "Okay, well, it's gonna explain itself, but does it really explain its inner workings?" Okay, and again, I don't want this presentation to be about this, but I think there will be a lot of experimentation around this. Let me also
add that for all of this there is a paper that is going to appear on arXiv tonight, so it is by chance that it coincides with this talk. You can look at all the details; everything will be there in three hours. I'm really excited for it. So you will be able to see all of that and learn more about it. But okay, let's try to convince you about this theory of mind. I will take one example from Tomer's paper. In a room there are John, Mark, a cat, a box and a basket. John takes the cat and puts it in the basket. He leaves the room. Then, while John is away, Mark takes the cat out of the basket and puts it in the box. Eventually they all come back. What are they thinking? Okay, it's very simple theory of mind: the person who put it in the basket and didn't know that it was moved should still think that it's in the basket. ChatGPT fails at this. Okay? You see, you have to have an internal representation; as you read the text, you have to move your representation of where the cat is. Okay, so let's see what GPT-4 does: "Interesting puzzle," blah blah blah. "John thinks that the cat is still in the basket, since that's where he left it." Yeah, correct. "Mark thinks that the cat is in the box, since that's where he moved it." Yeah, correct. Oh, and it also has the cat: the cat thinks these are weird people, you know, "why are they moving me around?" Okay, so this is the kind of surprise that I've had time and time again. I'm not saying this is particularly deep, but just take a second to take it in. It's interesting. Okay, all right. Again, I don't want the whole presentation to be about common sense and theory of mind. Let's say it does those two things. Okay, good. But you're not gonna go as far as saying that it's intelligent, are you? I mean, intelligence, that's so much more than all of this. And here the answer is
not going to be a slam dunk, I want to be very, very clear. And if we start talking about intelligence, the first thing we have to do is to have some definition that we can work with. Here I don't want to have my own definition. I mean, people have been working on this question for decades, if not more; you can argue that human beings have been thinking about intelligence for a long time. So what I'm gonna do is just take a consensus definition that was published in '94 by a group of 52 psychologists. In the '90s there was a very hot debate about the meaning of IQ tests, and this group of psychologists came out with a definition of what intelligence is. We can debate it, disagree with various parts, but this is going to be my reference definition. So what is this definition? Intelligence is a very general mental capability that, among other things, involves the ability to reason, plan, solve problems, think abstractly, comprehend complex ideas, and learn quickly and learn from experience. Okay, so six items, and what we're going to do in this presentation is try to measure GPT-4 against those six dimensions and see where it fails and where it works. Our assessment is as follows. I'm very comfortable saying that GPT-4 can reason. I'm very, very comfortable saying that GPT-4 cannot plan, and this is a very subtle and delicate issue that we're going to get to towards the end of the presentation, because it can give you the impression of planning. There are many problems where naively you might think that you need planning, but actually there is a linear solution. In terms of algorithm design, you can think that there are problems where naively you just look at it and you think, "Oh, I need to think 10 steps ahead," et cetera, but if you're just a little bit more clever in the algorithm design, then there is a linear solution, you
know, that proceeds in a linear fashion, and so all those problems GPT-4 will solve. It can solve problems, many problems. We will see that it can think abstractly. Absolutely, it can comprehend complex ideas. The last point is a subtle point: learn quickly and learn from experience. GPT-4 is a large language model; it's frozen in time. Okay, it doesn't update itself. Every day is a new day for GPT-4; every session is a new session. So there is no real learning, there is no real-time learning. But within the span of a session you can teach it new concepts that it has never seen, and it can understand them and then work with them, absolutely. So there is some amount of learning in real time, but no memory, of course. Now, let me say immediately at this point that, with this assessment, whether you call it intelligence or not is again a little bit up to you. Some people would argue that planning is the essence of human intelligence; everything else animals can do too, and really what distinguishes us is planning. If that's your answer, then GPT-4 is not intelligent. Another perspective could be that the whole point of intelligence is to be able to acquire new skills. If that's your perspective on intelligence, then GPT-4 is not intelligent. If your perspective is, "What I care about is to solve problems, to think abstractly, to comprehend complex ideas, to reason on new elements that arrive at me," then I think you have to call GPT-4 intelligent. Okay, now how do we come to this assessment? The whole point is, of course, you cannot make this assessment with benchmarks. It's completely meaningless. And not only is it meaningless, but also we don't know what GPT-4 was trained on. I don't know what GPT-4 was trained on. My working assumption is that it was trained on all the data digitally produced by humanity. That's my assumption. I'm not saying it's correct, but this is my
working assumption, so that I know that anything which is out there online, GPT-4 might have seen it. So in particular, any benchmark whatsoever that exists, I assume it has seen it. Okay, so I don't want to test it on benchmarks. Instead, what we're going to do is take an approach which is in a way rooted in psychology. Instead of following machine learning, we're going to actually follow psychology. And the way we're going to test intelligence is really by asking it creative tasks, tasks that are outside of what it has seen, really novel ways to think about problems, and to test it on a broad range of domains. The domains we have chosen for the paper are: vision, which is already interesting, because I told you it's not a multimodal model, it can only output text, but we're going to test it on vision; theory of mind, as I already told you; coding; mathematics; affordances, so using tools; and also privacy and harmfulness detection, which is something very important; I want to touch on this [inaudible]. And there are many other domains that we could have selected: medicine, law, physics, chemistry. The point is that GPT-4's intelligence is general. It can do all of those things equally well. Okay, so let's get started on this assessment journey and show you an example of what I mean by a creative task. The first thing that anybody does when they get access to ChatGPT is to ask it to make a poem, and that's also what we did the first time we had access to GPT-4. Being a mathematician, I asked it to write a proof of the infinitude of primes with every line that rhymes. That seems like a good thing to do. So let's see what ChatGPT does: "Sure, here is the proof that there are infinitely many primes, with each line of the proof rhyming. Consider..." It's just crappy, okay? It doesn't have the lines rhyming. It just doesn't do what I asked. It didn't
work. It's a correct proof, by the way; of course there are many, many proofs of the infinitude of primes online, so that's good, but it didn't do the job. Let's see what GPT-4 does. Okay: "So can you write such a proof? Yes, I think I can, though it might take a clever plan. I will start by noting Euclid's proof, which shows that primes aren't just aloof. Assume we have a finite list of primes, and that none have been missed. Multiply them all together, and add one, just to be clever." So at that point, at the "add one, just to be clever", I remember, end of September, I was like, what's going on? This is really incredible. I mean, the quality of this is incredible. But the point is you don't stop there. You don't test intelligence by asking a question, hearing the answer, and that's it, you move on, either you are correct or not correct. When you try to test a student, you have an interaction with the student. Sometimes the student might also make mistakes, and you don't just say, "Ah, you really don't understand anything, let me stop right there." No, you try to guide the student. So this is what we're gonna try to do throughout the presentation: we're gonna try to keep asking questions, and if GPT-4 goes off track, we're gonna help it a little bit. Okay, so let's see how we can go further. Again, the whole point is that I want to be creative and ask questions out of the box. So what I'm going to ask is to draw an illustration of this proof. But it's not a visual proof, you know. If I ask you to draw a proof of the infinitude of primes, it's not clear what you would draw. You would come up with something, but it's not clear. Moreover, it's not supposed to output images, so how is it gonna draw? Well, here I say it in the question: in SVG format. I could have even not said "in SVG format"; I could have just said, you know,
"can you draw an illustration?" and then it would have responded with "hey, here is a picture in SVG format". So what is SVG format? Doesn't matter: Scalable Vector Graphics, it's a bunch of code. So it's going to answer with lines of code like this. This is going to be the answer of GPT-4, and if you just save it in HTML, this is the picture that you get. Okay, so it's not amazing by any means, but it is the essence of what this proof is about. You have the finite list of primes that you have: two, three, five, seven, eleven, and so on and so forth. These are primes, okay, good. Now you combine them into a new number N, and then you add one, "just to be clever", as it said. And this new N plus one, this is the number that is supposed to be a prime. Okay, so this was just a warm-up. Let's move on and try to dig a little bit deeper on these vision capabilities, and here I want to tell you about the strange case of the unicorn, which is kind of my favorite example. So let me just show you the question. The question is: "Draw a unicorn in TikZ." Okay, so in this audience many of you are playing with TikZ to draw images in LaTeX, and personally, when I was a PhD student and even later, I wasted many, many hours struggling with TikZ. It's a real pain to draw anything in TikZ, and of course drawing a unicorn in TikZ, I mean, I don't know, it would take me like two days to do it. And moreover, I'm pretty sure nobody on the internet has asked this question or has drawn a unicorn in TikZ. Who would waste time doing this? This doesn't make any sense. That being said, we will not be convinced again by just the fact that I believe that it's not on the internet. We will have to probe, we will have to go further, and we're going to do it, don't worry. But let me show you the unicorn that it came up with. Okay, so this is GPT-4's unicorn. So you see, when I see that I am
personally shocked, because it really understands the concept of a unicorn. It knows what the key elements are. It was able to draw this very abstract unicorn. And just to be clear, so that you really understand visually the gap between GPT-4 and ChatGPT, this is ChatGPT's unicorn. Okay, so this is how much progress has been made. Really, I want to be clear: there is a world of difference between ChatGPT and GPT-4. If you played with ChatGPT and you were not convinced, I encourage you not to stop there. Okay, so of course you might still say, "Okay, this is not that great." But one of the things that we're going to see is that GPT-4 is intelligent enough also to use tools. So what you can do is respond to it and say, "Hey, you know what, I don't like your drawing that much. Can you try to improve it? And I've heard about these diffusion models; maybe you can use one of them." So what it's going to do is say, "Yeah, sure, can you go on this diffusion model website, plug in my picture and ask it to improve it?" And this is what you will get. Okay, so this is a unicorn of GPT-4 when it's also allowed to use tools. So you can see where this could potentially go. Now again, as I said, I don't want to stop there. We're going to probe further. How are we going to probe further in this case? What I'm going to do is the following. I'm going to take the TikZ code that was produced. I'm going to remove all the comments in the TikZ code, because one of the properties of GPT-4 is that it produces code that is very much human-readable, which is kind of funny for a machine, but it adds lots of comments; it really guides you through its thinking. So I'm going to remove all of this information, so that it doesn't know that this code is drawing a unicorn. There is no information about a unicorn in there. Okay, I'm also going to make sure that, you know, who
knows, maybe it copied this from the web, so I'm going to randomly perturb all the coordinates, so that it's something that it has never seen. And then I'm going to remove the horn, and I'm gonna say, you know what, fix the code. I'm going to give back the code, it's a new session, I give back the code and I say, "This TikZ code is supposed to draw a unicorn, but the horn is missing. Can you add it back?" Okay, so it really has to understand the code in order to be able to do that. And this is what happens. Okay, so it really was able to locate the head. You understand this is not an easy problem. I mean, you have these three ellipses, the three elements, by the way: the head, and the mane... it's not very good at drawing the mane. But it really was able to locate it. Okay, I don't want to stay on this unicorn example too long, but I just want to say that another thing which is really striking is, over the months, you know, we had access in September, and they kept training it, and as they kept training it, I kept querying for my unicorn in TikZ, to see what was going to happen. And this is what happens. Okay, so it kept improving. And I left out the best one; it's on my computer, I will maybe reveal it later. It kept improving after that, but eventually it started to degrade. Once they started to train for more safety, the unicorn started to degrade. So if tonight you go home and you ask GPT-4 and ChatGPT to draw a unicorn in TikZ, you're gonna get something that doesn't look great, okay, that's closer to ChatGPT. And as silly as it sounds, this unicorn benchmark, we've used it a lot as kind of a benchmark of intelligence: how good is your unicorn? And when we were working on Bing, this is absolutely a real story, we were also tuning for safety, and we were really looking at whether the
unicorn kept being nice, or sometimes, if you go too far in safety, it's like, "Oh no, that's too dangerous a task, I don't want to do it." So this was very useful. Okay, so I will now go a little bit faster, because there is a lot that I want to tell you. You might still say, "Okay, this vision capability is not useful at all." Actually, it is very, very useful. The reason is that GPT-4 is intelligent and it understands you. Intelligence, you can equate it with understanding. Understanding means it follows your instructions: if you ask it to do something, it will do the thing that you asked. So let me show you what this means. These diffusion models, people are not yet convinced that this is intelligent. I think it's already convincing that there is intelligence there, but it doesn't matter. People are not convinced because it doesn't understand exactly the position of objects. If you ask it for a car next to a coffee cup, on the right of a coffee cup, it might be at a random location. So it doesn't really understand. This picture, for example, is asking for a spoon on top of a cup, and you see it put the spoon inside the cup. It doesn't really work. So let me show you what you get out of understanding. I'm gonna ask a very strange question, but one which could very well happen and would be useful. Let's say I ask GPT-4 to draw a screenshot of a 3D building game, with a river from left to right, a desert with a pyramid below the river, a city with many high-rises above the river, and the bottom of the screen has four buttons colored green, blue, brown and red. So something random, but maybe I'm creating a video game and I want this. If I ask a diffusion model to do this, this is what I get. Looks good, but it's not at all what I asked. First of all, there is a hallucinated map in there, in the upper left corner. I didn't ask for that. Some kind of life symbol. Also, the
four buttons became two multi-colored buttons. So it did something, but it really didn't understand exactly what I asked for. If you give it to GPT-4, this is what you get: exactly what you asked for. It understood; it followed your instructions precisely. Of course, you might say, "Okay, but this doesn't look great." But again, you don't have to stop there. You can use this as a sketch in a diffusion model, and if you do that, this is what you get. Okay, so it's artistic, and it's following exactly the instructions that you wanted. So I think this opens up a lot of possibilities, as you can imagine. So let me move on and double down on this drawing, but really as coding, because after all, these drawing capabilities, I put them aside and featured them as drawing, but it's really nothing but coding. Okay, so let's go with coding. By the way, obviously, all those background slides, well, you can imagine who drew them. So let's see what happens once you go to coding with a copilot, like GitHub Copilot, except that now your copilot understands; it's intelligent, it understands you. So let's see what happens if I ask it something pretty tricky: write a 3D game in HTML with JavaScript with the following elements. There are three avatars, which are spherical. The player controls one of the avatars, with the keys to move. There is an enemy that tries to catch the player, and there is a defender that tries to protect the player and gets between the enemy and the player. So you understand, the defender is kind of an AI itself in some ways. And you have obstacles that spawn randomly. I can ask ChatGPT to do it. This is what it gives me. First of all, this is already incredible: it gives me code, roughly 50 lines of code, that compiles to this. Okay, this is a game that I can play. The player moves the green ball. Of course, the red ball is not moving. I
imagine the blue ball is supposed to be the defender; it's not moving either. It's not really 3D. So it did something, but it didn't really understand what I wanted, it didn't follow my instructions precisely. This is what GPT-4 does. Okay, so this is a real game, it's fun to play. It's gonna restart in a second. You move the dark blue ball; you see the red ball is moving towards the dark blue ball in the background, and the light blue one is the defender, which is trying to get between the red ball and the dark blue ball. So in this movie it's me controlling the dark blue ball. You see, ah, the defender is doing a good job, it's stopping the red ball. Okay, so for us there is really a kind of phase transition in coding at this point. What I mean is that Codex and GitHub Copilot were able to autocomplete; really you should think of it as autocomplete for short snippets of code. ChatGPT is already next level: it can already write 50 lines of code for you. But GPT-4 can write 500 to 1,000 lines of code that fully work, zero-shot; there is no meta-prompting or anything, this all works out of the box. Okay, so this is really, I think, what coding with a copilot unlocks. And here I'm showing two animations: on the left is the code that ChatGPT produces, and on the right is the code that GPT-4 produces, and if you look at it carefully you will see that GPT-4's code is much more expert level. Now the catch, or the twist, on this slide is that those two videos were produced by GPT-4. What I did is that I asked GPT-4 to produce a Python script that takes as input a text file and outputs a video like this, where, you see, the text is moving continuously. I mean, this would take a lot of time. Certainly for me, it would take me
forever to produce those videos. And the question is, who in this room would be able to produce a Python script, let's say in a couple of hours, that will produce this? Maybe a few people, but not that many. Okay, so this is really the power of GPT-4, how it unlocks so many things; so much creativity is unlocked by GPT-4. I will go quickly on this slide. We had it pass interviews, mock interviews at Amazon and Google, not Microsoft, and it passed. Not only did it pass, but it beat a hundred percent of the human users. And you see, for this particular one there were two hours allocated and it did it in three minutes and 59 seconds. It took that long because of the copy-pasting between the playground and the mock interview website. Okay, so I think it's fair to say it's superhuman coding. Okay, so let me move on to affordances, and very quickly, because I want to tell you about mathematics, which is something that will be of interest to many people. The problem is it still has many weaknesses, of course. It doesn't have memory. Who is the president of the US? Donald Trump. What is the square root of the product of those two numbers? It says a thousand; it's clearly not a thousand, it's nine thousand. So it makes arithmetic mistakes. What is the thirteenth letter of this word? It says "n"; the right answer is "a". It makes mistakes, it's not perfect. Okay, this is something very important for everybody to understand: it's far, far from perfect. It's flawed, like a human is flawed. But the point is, it's intelligent enough to use tools. So you can tell it: hey, you know what, you have access to a search engine, you have access to a calculator, you have access to this API, I just call it CHARACTER, with parentheses. You have access to all those things; if you need them, please use them. So then, to
the question, who is the president of the US, it will not answer; it will say SEARCH, it will tell you, okay, I need to search for this information. What is the square root of this? It will say CALC. What is the thirteenth letter of this word? It will say CHARACTER of the word, comma, 13. Okay, so the comma 13: I didn't tell it you have to do comma and the number of the letter that you want, but it will find it automatically. Now maybe it's not that impressive, but it can also use much more complex tools. So for example you can give it access to your calendar, to your email. Okay, so what I'm going to show you on this slide is 100% real, but I did it manually; you can very easily imagine automating this. So what I said is: please set up dinner with Joe and Luke at Contoso restaurant this week. This is its response: calendar.get_events for this week. So it searches in my calendar for what events I have for this week. It sends an email to Joe, email.send: hey Joe, dinner, which nights are you available? Then I feed it back the answers: Joe says he is available on Tuesday and Wednesday night, Luke says any day from Monday to Thursday, and my calendar says that I have plans for Monday and Tuesday. Then it reasons over this input that I gave it, and it gets the answer: okay, Wednesday is the day, so let me send an email to Joe, let me add the event to the calendar, and also let me send the reservation to the restaurant. It can do all of this automatically, and it comes back to you and says, I scheduled dinner at 6 pm at Contoso restaurant. Okay, so let me very quickly... just five more minutes? Five more minutes, yeah. Okay, let me quickly conclude with mathematics, because of course it's a topic of great interest to many of us. And let me say right away, it's not great at mathematics. Okay, so don't worry, we still have a job for now. Okay, but
it's interesting, it's interesting. So let's have a conversation, just as I was telling you earlier: we're gonna have a conversation like we would have with a student. We had conversations with it at roughly the master's level, and it can hold its own at that level, but I'm gonna show you a level of like middle school or something like that. So let's go with this problem. Within the duration of a year, a rabbit population first multiplies itself by a factor a, and on the last day b rabbits are taken for adoption. Suppose that on the first day of the first year there are x rabbits, and let's say that we know that after three years there are exactly 27x minus 26 rabbits. What are the values of a and b? Okay, a typical kind of middle school question. Let's call the rabbit population after one year y. We know that y is equal to ax minus b. After two years the population will be ay minus b, which is a times (ax minus b) minus b, which is blah blah blah. After three years the population would be this equation. We're given that this is equal to 27x minus 26, so the equation must be this. Solving for it, we find a equals 3 and b equals 2 are the values that we are looking for. This is correct. Okay, so far so good. But again, you don't interview a student and be like, oh wow, okay, that's it, he or she understands. So we continue, and let's be a little bit more abstract to see if it really understood what it was doing. And here I don't want to only ask questions where there is a clear answer; I also want to see whether it really understood the concept: can it modify the question itself? So, naturally, I point out that this question can be alternatively formulated as a question about composition: can you provide this compositional formulation? And it says yes, we could say that we're looking for values of a and b such that if f of x is ax minus b,
then f of f of f of x is equal to 27x minus 26. Okay, that's pretty good, it seems to be understanding. Let's continue the discussion. Suppose that now f is going to be a polynomial of degree 2, so a true polynomial with a non-zero coefficient on x squared. Can you find such a function f in this case? Okay, so you see, as a human you're like: okay, I want this composition to be equal to a linear function, which is a polynomial of degree one, but the composition three times is going to be a polynomial of degree eight. Eight is not equal to one, so there is no such function. Okay, this is a very simple question, but let's see what GPT-4 does. If f is a polynomial of degree two, then f of x can be written like this; then, given this, the equation becomes... and then it starts to get lost, because it starts to write out the composition three times. It writes many things, it says I need this equation, that equation, it starts to write eight equations, and it doesn't get to the answer. But again, we don't stop there. We say, hey, wait a second, maybe there's something you can deduce here without carrying out calculations, maybe you don't want to write down everything, it's not like before. And then it says: ah, okay, one thing I can notice is that if f is a polynomial of degree two, then the composition three times is a polynomial of degree eight, so there is no such function. Okay, so here you see how it's delicate. It's not clear: does it understand, does it not understand? I'm just not sure, and this is all I will say. Now there are some weird things, like the fact that the arithmetic is still shaky. I have to say I don't fully understand, but I understand something, which I will explain to you on this slide. So let's look at this. I give it as a prompt: seven times four plus eight times eight. Okay, I don't know what the value of this is, but 8 times 8 is 60-something, 7 times 4 is 20-something, so at
the very least this is below 100. Okay, good. It says 120. This is wrong, flat-out wrong. Okay, but the point is it doesn't stop there. It continues, it starts to explain why it thinks it's 120: seven times four plus eight times eight, it does the calculation, and then it gets to the correct answer, 92. Okay, wait, what's going on? You started by saying 120; which one is it, is it 120 or 92? Oh, that was a typo, sorry. So there is a lot of insight that you can draw from this slide; actually you can really understand, I think, everything that's happening. So the first answer, the 120: you understand that it has to do this using only its internal representation; it has to do this addition, and this is slightly more difficult. And why does it answer immediately? It's because when you ask a question like this, you write this equation, you write "equals", the most likely thing that happens after is to give a number. So it gives you the number, it tries to give you what is the most likely thing to appear after. It tries, but it fails. But then what is the second most likely thing after that? It is that people explain their rationale, their answer. So then it tries to explain its answer, and the main thing is that it gets to a different answer. And you have to understand that it's amazing, because as far as I know this is a Transformer, so it's attention-based, and when it's attention-based, you understand that when it's saying the second time seven times four plus eight times eight, its attention brings it very strongly to the 120 answer. The 120 answer, you have to understand, is part of its truth now. For all it knows, it could be that you have told it, hey, you know what, seven times four plus eight times eight, it's 120 from now on; it could have been part of my prompt. So the fact that it gets to a different answer means that it has been trained
enough to overcome mistakes in its prompt. So this is a very, very strong property, the fact that it's able to get to the right answer despite making a mistake at the beginning. Now of course, when it says this was a typo, this is also very interesting, because obviously it's not a typo, and this gets to hallucination and many, many interesting topics. I want to take some time for questions, so I don't want to explain more about this, but this slide, really, you have to think about it deeply; it says a lot. So the last slide before moving to the conclusion is about the fact that it cannot do true planning. And again, I have been amazed by so many tasks that it can do where I thought it would require true planning, but actually it doesn't. But let me give you one example, where we continue this discussion with seven times four plus eight times eight. So okay, great, now you have this identity, which is equal to 92, and let me ask a funny question: can you modify exactly one integer on the left-hand side of this equation so that the answer becomes 106? So as a human being, what is your reasoning? Your reasoning is like this: okay, I want 106 on the right-hand side, so I need to increase by 14. I need to increase by 14, and I can modify only one number on the left. 14... I look at the left, I see a seven, and then I have this kind of eureka moment: ah, 14, you see, 14 is 7 times 2.
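[Editor's aside: the puzzle posed here is small enough to check exhaustively. The sketch below is not from the talk; the function names `evaluate` and `single_integer_fixes` and the search range of -200 to 200 are illustrative assumptions. It tries every single-integer replacement in 7 * 4 + 8 * 8 and keeps those that give 106.]

```python
def evaluate(a, b, c, d):
    """Value of the expression a * b + c * d used in the talk."""
    return a * b + c * d


def single_integer_fixes(operands, target, candidates=range(-200, 201)):
    """Return (position, old_value, new_value) for every way of changing
    exactly one operand so that the expression equals `target`.

    The candidate range is an arbitrary assumption made for this sketch.
    """
    fixes = []
    for index, original in enumerate(operands):
        for value in candidates:
            if value == original:
                continue  # the puzzle requires an actual modification
            trial = list(operands)
            trial[index] = value
            if evaluate(*trial) == target:
                fixes.append((index, original, value))
    return fixes


original = (7, 4, 8, 8)
print(evaluate(*original))                  # 92
print(single_integer_fixes(original, 106))  # [(1, 4, 6)]
```

[Within that range the only fix is at position 1, the second operand, changing 4 to 6. The other three positions would need non-integer values (10.5 or 9.75), so they never match.]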
Okay, so if it's 7 times 2, then I need to turn this 4 into a 6. Okay, so what I said is just this: it needs to turn this 4 into a 6. But you see, this eureka that I had, even though it's extremely simple, it was through some kind of planning: I was thinking ahead about what I'm gonna need. And GPT-4 cannot do that, because it's a next-word prediction device. So what it's going to do is say, there are a few possible ways to do it, blah blah blah, and then it says, I can modify exactly one integer, I'm going to modify the seven into a nine, I do nine times four, and this is equal to 106. Wait, what? If I modify the seven to a nine, I add eight, so the answer is 100, not 106. And then it tries to explain why this works: nine times four plus eight times eight, it's 36 plus 64. That's correct, but then again it says 106. So you see, here it was not strong enough to overcome its initial mistake. And this to me points to the fact that if it was trained further, maybe it would correct itself, and if it was trained even further, maybe it would understand that even though, when there is a question like seven times four plus eight times eight equals, the most likely answer is a number, the best way to answer this is to first do the reasoning. So what I'm saying here is that, through this stupid example, what I see is that with more training we're gonna unlock a lot more than what we currently have. What we currently have is already amazing, but it's far from everything we can do with this technique; there is a lot more on the horizon. Okay, so let me conclude. Is GPT-4 intelligent, and also, does it matter? This is a really important question. So again, is GPT-4 intelligent? It really depends on your definition. I leave it up to you; I'm not making a call whether it's intelligent or not. As far as
I'm concerned, in terms of my definition of intelligence, yes, it is intelligent. Now, it's lacking memory, it cannot do real-time learning; if this is your definition, then it's not intelligent. It cannot think several steps in advance, it cannot do real planning; if that's your definition, then it's not intelligent. But on the other hand, some of those behaviors that I showed you are really impressive, and maybe more important than impressive, they are useful. In my team we all use GPT-4 every day, it's part of our workflow. So this mere fact, just that it's useful, means it doesn't really matter if you say it's intelligent or not: it is gonna change the world whether you like it or not. And also I want to say that maybe it's an opportunity to rethink what intelligence is, because in a way, even though we have decades of psychology studying intelligence, we had only one example of intelligence, which is the intelligence that natural evolution brought us, the natural intelligence of the natural world. But here we kind of have a new process that led to something that looks intelligent. So now that we have different examples, maybe we can get at the core of intelligence. And maybe the answer to that study will be exactly: yeah, no, this new thing, you shouldn't call it intelligence because it doesn't do X. That's a very plausible conclusion. But maybe more important is what I said, that there is so much more that you can extract from this. So GPT-4 is by no means the end, not at all; this is the beginning. This is the first one that shows some glimmer of real intelligence, but there is much, much more on the horizon. So what conclusion should we draw from that, as a university, as a society, as humanity? I mean, I'm being real here, these are real questions that we should confront. And here I really want to say, for us as a
society to confront this question, we have to go beyond the discussion of whether this is copy-paste or statistics. We have to leave this discussion behind us; the train has left the station. So if we keep getting bogged down by this version of the question, we're gonna miss the really important questions. So I think it's important to move on. And let me also conclude by saying that it can do a lot more than what I have shown here. It can do data analysis: you can give it data and it will do analysis for you. It can be used as a privacy detector. Its medical and law knowledge is amazing, and here I would like to make a plug for a book that was written at Microsoft Research, and I helped with that, by Peter Lee as lead author, Carey Goldberg, who is in the room, and Zak Kohane from Harvard, on using GPT-4 for healthcare. The book is titled The AI Revolution in Medicine, and it's a very complex topic, and I don't even want to say one more word about it, because I won't do it justice in one sentence. But really, its medical knowledge is gonna make it so that it's gonna have a big impact in healthcare, hopefully in a good way, but we have to think about it deeply. It can play games, act as a game environment. It knows music, which, again, it never listened to music, but it knows music. It can do file management, and so much more. Okay, I will conclude here. Thank you. [Applause] Thank you.
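[Editor's note: for readers who want to verify the rabbit problem from the mathematics part of the talk, the two arguments can be written out as below. The notation f(x) = ax - b follows the talk; the assumption that a and b are real numbers is ours, not the speaker's.]

```latex
% Rabbit problem: one year maps a population x to f(x) = ax - b.
\begin{align*}
f(x)       &= ax - b \\
f(f(x))    &= a(ax - b) - b = a^{2}x - ab - b \\
f(f(f(x))) &= a(a^{2}x - ab - b) - b = a^{3}x - b\,(a^{2} + a + 1)
\end{align*}
% Matching coefficients with 27x - 26:
\begin{align*}
a^{3} = 27 &\;\Longrightarrow\; a = 3 \\
b\,(a^{2} + a + 1) = 13b = 26 &\;\Longrightarrow\; b = 2
\end{align*}
% Degree argument for the follow-up question: if \deg f = 2, then
\[
\deg(f \circ f \circ f) = 2^{3} = 8 \neq 1 ,
\]
% so no polynomial f of degree 2 satisfies f(f(f(x))) = 27x - 26.
```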
Info
Channel: Sebastien Bubeck
Views: 1,617,928
Id: qbIk7-JPB2c
Length: 48min 32sec (2912 seconds)
Published: Thu Apr 06 2023