Breakthrough potential of AI | Sam Altman | MIT 2023

Captions
Some context for you: I know you're on this worldwide tour, trying to help control the fire that GPT-4 and ChatGPT have started. This particular room is maybe a little different from a lot of that. Most of the people here in this room are either building companies or working on plans to build companies in the ecosystem really triggered by ChatGPT. Yeah, I wish you were here too, actually. They are exactly your kind of people. I know that part of the mission here is to make the world a better place, but also to build on top of the platform that you've created. Obviously you navigated to the position you're in in life very deliberately, and you're the perfect person to help advise them. So we're going to try to keep this focused in a way that helps this room as much as possible, helps these 900 people create successful companies.

The first thing I'm going to ask you about is this: if AGI is in the near-term future, then we're right now at this inflection point where human history has a period of time up until AGI and then obviously has a completely different history from here forward. So it seems to me that at this stage you're going to be a centerpiece of the history books no matter how this evolves. Do you think it's the same?

Do I think it's the same in terms of what?

In terms of the way history will describe this moment, this moment being this year of innovation in this field.

I hope this will be, you know, a page or a chapter in history books, but I think that over the next several billion years such unbelievable things are going to happen that this will be just one small part, and there'll be new and bigger and more exciting opportunities and challenges in front of us.

Yep. So one of the things that a lot of people are asking: with prior iterations of GPT, open source iterations, you had a whole variety of ways of taking that source code and
making a vertical company out of it, or an adjacent company, something with federated learning or something. In the future iteration of these companies, you've got this highly tunable closed API to start from. Any quick advice on: OK, I'm starting a company now, I have to make some decisions right out of the gate, what do I start with, how do I make it work in any given vertical use case?

I think there's always more that stays the same about how to make a good company than what changes. A lot of people, whenever there's a new platform shift like this, think that just because they're using the platform, that's what's going to guide business strategy. It doesn't. Nothing lets you off the hook for building a product that people love, for being very close to your users and fulfilling their needs, for thinking about a long-term durable business strategy. That's actually probably only more important during a platform shift, not less. If we think back to the launch of the App Store, which is probably the most recent similar example, there were a ton of companies that built very lightweight things with, I don't want to call them exploitative mechanics, but it was not something durable. Those companies had incredible meteoric rises and falls, and then the companies that really did all the normal things you're supposed to do to build a great business endured for the last 15 years. So you definitely want to be in that latter category. The technology is just a new enabler, but what you have to do as a company is build a great company that has a long-term compounding strategic advantage.

And then what about foundation models just as a starting point? If I look back two years, one of the best ways to start was to take an existing foundation model, maybe add some layers, and retrain it in a vertical use case. Now the foundation model, or the base model, is maybe a
trillion parameters, so it's much, much bigger, but your ability to manipulate it without having to retrain it is also far, far more flexible. I think you have 50,000 tokens to play with right now in the basic model. Is that right?

About 32,000 in the biggest model. Thirty-two thousand; 8,000 in the base one.

OK. And so how's that going to evolve? There are new iterations that are going to come out pretty quickly.

We're still trying to figure out exactly what developers want in terms of model customization. We're open to doing a lot of things here, and developers are our users, so our goal is to make developers super happy and figure out what they need. We thought it was going to be much more of a fine-tuning story, and we had been thinking about how to offer that in different ways, but people are doing pretty amazing things with the base model and, for a bunch of reasons, often seem to prefer that. So we're actively reconsidering what customization to prioritize, given what users seem to want and seem to be making work. As the models get better and better, it does seem like there is a trend towards less and less of a need to fine-tune, and you can do more and more in the context.

And when you say fine-tune, you mean changing parameter weights?

Yeah.

Is there going to be an ability at all to change the parameter weights in the GPT world?

Yeah, we'll definitely offer something there, but right now it looks like maybe that will be less used than the ability to offer super cheap context, like 1 million, if we can ever figure that out on the base model.

Yeah, let's drill in on that just a little bit, because it seems like, regardless of the specifics, the trend is that as the models get bigger and bigger, so you go from 1 trillion to 10 trillion parameters, the amount you can achieve with just prompt engineering, or changing the tokens that are feeding into it, is
growing disproportionately to the model size. Does that sound right?

Disproportionately to the model size, yes, but I think we're at the end of the era where it's going to be these giant, giant models, and we'll make them better in other ways. But I would say it grows proportionate to the model capability.

Yep. And then the investment in the creation of the foundation models is on the order of 50 million, 100 million, just in the training process. What's the magnitude there?

We don't share, but it's much more than that.

OK, and rising, I assume, over time?

Yeah.

So then somebody trying to start from scratch is trying to catch up to something.

Or maybe we're all being incredibly dumb and we're missing one big idea, and all of this is not as hard or expensive as we think, and there will be a totally new paradigm that obsoletes us, which would be great. Not great for us, but great for the world.

Yeah. So let me get your take on something. Paul Graham calls you the greatest business strategist that he's ever encountered, and of course all these people are wrestling with their business strategy and what exactly to build and where. I've been asking you questions that are more or less about vertical use cases that sit on top of GPT-4 and ChatGPT and soon GPT-5 and so on, but there are also all these business models that are adjacent, things like federated learning or data conditioning or just deployment, and those are interesting business models too. If you were just investing in a class of company that's in the ecosystem, any thoughts on where the greater returns are, where the faster-growing, more interesting business models are?

I don't think PG quite said that. I know he said something in that direction, but in any case I don't think it'd be true. I think there are people who are unbelievable business
strategists, and I'm not one of them, so I hesitate to give advice here. The only thing I know how to do, I think, is this one strategy again and again, which is very long time horizon, capital-intensive, difficult technology bets. And I don't even think I'm particularly good at those; I just think not many people try them, so there's very little competition, which is nice. I mean, I don't have a lot of competition. But the strategy that it takes to now take a platform like OpenAI and build a new fast-growing, defensible consumer or enterprise company, I know almost nothing about. I know all of the theory but none of the practice, and I would go find people who have done it and get the practice, get the advice from them.

All right, good advice. A couple of questions about the underlying tech platform here. I've been building neural networks myself since the parameter count was sub-1 million, and they were actually very useful for a bunch of commercial applications. Then I kind of watched them tip into the billions, with GPT-2 at, I think, about one and a half billion or so, and then GPT-3 and now GPT-4. We don't know the current parameter count, but I think it was 175 billion in GPT-3, and it was just mind-blowingly different from GPT-2, and then GPT-4 is even more mind-blowingly different. So the raw underlying parameter count seems like it's on a trend, just listening to Nvidia's forecasts, where you can go from a trillion to 10 trillion, and then they're saying up to 10 quadrillion in a decade. So you've got four factors of 10, or 10,000x, in a decade. Does that even sound like it's in the right ballpark?

I think it's way too much focus on parameter count. I mean, parameter count will trend up for sure, but this reminds me a lot of the gigahertz race in chips in the 90s and 2000s, where everybody was trying to point to a big number. Probably most of you don't know how many gigahertz are on
your iPhone, but it's fast. What we actually care about is capability, and I think it's important that what we keep the focus on is rapidly increasing capability. If there's some reason that parameter count should decrease over time, or we should have multiple models working together, each of which is smaller, we would just do that. What we want to deliver to the world is the most capable, useful, and safe models. We are not here to jerk ourselves off about parameter counts.

Yeah, yeah. Well, that's my... OK. Well, thank you for taking that away from me. But one thing that's absolutely unique about this class of algorithm versus anything I've ever seen before is that it surprises you with raw horsepower. Regardless of whether you measure it in parameter count or some other way, it does things that you didn't anticipate purely by putting more horsepower behind it, and so it takes advantage of the scale. The analogy I was making this morning is: if you have a spreadsheet, you coded it up, and you run it on a computer that's 10,000 times faster, it doesn't really surprise you. It's nice and responsive, but it's still a spreadsheet. Whereas this class of algorithm does things that it just couldn't do before.

One of our partners in our venture fund wrote an entire book on GPT-2, and you can buy it on Amazon. It's called Start Here, or Start Here: Romance. I think about 10 copies have sold. I bought one of them, so maybe nine copies have sold. But if you read the book, it's just not a good book. And that was four years ago. It's only been four years, and now the quality of the book has gone, with GPT-2, 3, 4, from not a good book, to a somewhat reasonable book, to now it's possible to write a truly excellent book. You have to give it the framework, you're still effectively writing the concept, but it's filling in the words just beautifully. So for an author that could be a
force multiplier of something like 10 or 100, and it just enables an author to be that much more powerful. So with this class of algorithm, if the underlying substrate is getting faster and faster, it's going to do surprising things on a relatively short time scale. I think one of the things the people in the room need to predict is: OK, what is the next real-world, society-benefiting use case that hits that tipping point on this curve? So any insights you can give us into what's going to be possible that wasn't possible a year prior, two years prior?

OK, I said I don't have business strategy advice; I just thought of something I do. I think in new areas like this, one of the right approaches is to let tactics become strategy instead of the other way around. I have my ideas, I'm sure you all have your ideas. Maybe we'll be mostly right, we'll be wrong in some ways, and even the details of how we're right we'll be wrong about. I think you never want to lose sight of vision and focus on the long term, but a very tight feedback loop of paying attention to what is working and what is not working, doing more of the stuff that's working and less of the stuff that's not working, and just very, very careful user observation can go super far. So I can speculate on ideas, you all can speculate on ideas; none of that will be as valuable as putting something out there and really deeply understanding what's happening and being responsive to it.

For the next question: Sam, when did you know your baby, ChatGPT, was something really special, and what was the special sauce that allowed you to pull off something that others haven't? And Dave will come back, but... oh, who likes Sam so far? All right. If Sam was hiring, would you consider being part of his team? OK, all right, we got a lot of hands.

Great. Yeah, please come. We really need help, and it's going to be a pretty exciting next few years. I mean, we've been
working on it for so long that you kind of know, with gradually increasing confidence, that it's really going to work. But, you know, we've been doing the company for seven years. These things [unclear]. In terms of why it worked when others haven't, it's just because we've been on the grind, sweating every detail for a long time, and most people aren't willing to do that. In terms of when we knew that ChatGPT in particular was going to catch fire as a consumer product, probably 48 hours after launch.

Yeah. All right, so before Dave comes back, I asked Lex to ask a sexy question. Hey, Lex.

Hey.

Do you want to use the communicator, or you're good?

What is it?

It's from Star Trek. Or you're good?

I'm good. OK. I grew up in the Soviet Union; we didn't have...

Chekov. Second season.

Yeah. Let me ask some sexy, controversial questions. So you've got legends in artificial intelligence, Ilya Sutskever and Andrej Karpathy, over there. Who's smarter? Just kidding, just kidding. You don't have to answer that. That was a joke.

He was about to. He was thinking about it.

All right, I like it. So we're at MIT, and from here, Max Tegmark and others put together this open letter to halt AI development for six months. What are your thoughts about this open letter?

There are parts of the thrust that I really agree with. We spent more than six months after we finished training GPT-4 before we released it, so taking the time to really study the safety of a model, to get external audits, external red teamers, to really try to understand what's going on and mitigate as much as you can, that's important. It's been really nice since we launched GPT-4 how many people have said, wow, this is not only the most capable model OpenAI has put out, but by far the safest and most aligned, and unless I'm trying to get it to do something bad, it won't. So that I totally agree with. I also agree that as
capabilities get more and more serious, the safety bar has got to increase. But unfortunately I think the letter is missing most technical nuance about where we need the pause. An earlier version of the letter claimed that OpenAI is training GPT-5 right now. We are not, and won't for some time, so in that sense it was sort of silly. But we are doing other things on top of GPT-4 that I think have all sorts of safety issues that are important to address and were totally left out of the letter. So I think moving with caution and an increasing rigor for safety issues is really important. The letter, I don't think, is the optimal way to address it.

Just a quick question from me, one more: you have been extremely open, having a lot of conversations, being honest, others at OpenAI as well. What's the philosophy behind that, compared to other companies that are much more closed in that regard? And do you plan to continue doing that?

We certainly plan to continue doing that. The trade-off is that we say dumb stuff sometimes, stuff that turns out to be totally wrong, and I think a lot of other companies don't want to say something until they're sure it's right. But I think this technology is going to so impact all of us that we believe that engaging everyone in the discussion, putting these systems out into the world, deeply imperfect though they are in their current state, so that people get to experience them, think about them, and understand the upsides and the downsides, is worth the trade-off, even though we do tend to embarrass ourselves in public and have to change our minds with new data frequently. So we're going to keep doing that because we think it's better than any alternative. A big part of our goal at OpenAI is to get the world to engage with this and think about it and gradually update and build new institutions, or adapt our existing institutions, to be able to
figure out what the future we all want is. So that's kind of why we're here.

So we only have a few minutes left, and I have to ask you a question that has been on my mind since I was 13 years old. I think if you read Ray Kurzweil or any of the luminaries in this sector, the day when the algorithms start writing the code that improves the algorithms is a pivotal day. It accelerates the process toward infinity, or in the singularity view of the world, to absolute infinity. Now a lot of the companies that I'm an investor in or have been a co-founder of are starting to use LLMs for code generation, and it's interesting: there's a very wide range of lifts, or improvement in the performance of an engineer, ranging from about 5 percent to about 20x. It depends on what you're trying to do, what type of code, how much context it needs; a lot of it is related to tuning in the system. So there are two questions in there. First, within OpenAI, how much of a force multiplier do you already see within the creation of the next iteration of the code? And then the follow-on question is: OK, what does it look like a few months from now, a year from now, two years from now? Are we getting close to that day where the thing is so rapidly self-improving that it hits some...

Yeah, great question. I think that it is going to be a much fuzzier boundary for getting to self-improvement or not. I think what will happen is that more and more of the improvement loop will be aided by AIs, but humans will still be driving it, and it's going to go like that for a long time. And there's a whole bunch of other things. I have never believed in the one-day or one-month takeoff, for a bunch of reasons, but one of which is how incredibly long it takes to build new data centers, bigger data centers. Even if we knew how to do it right now, just waiting for the concrete to dry, getting the power into the building, this stuff takes a while. But I think what will
happen is humans will be more and more augmented and be able to do things in the world faster and faster. Most of these things don't end up working out quite like the sci-fi books, and neither will this one, but the rate of change in the world will increase forevermore from here as humans get better and better tools.
Info
Channel: Imagination in Action
Views: 230,022
Keywords: emergingtech, AI, blockchain, womenintech, blockchainAI, blockchaininaction, web3, imaginationinaction, imaginators, mit
Id: T5cPoNwO7II
Length: 21min 25sec (1285 seconds)
Published: Mon May 08 2023