Greg Brockman: OpenAI and AGI | Lex Fridman Podcast #17

Reddit Comments

If the GPT-2-Large model can almost generate coherent stories a page or two long, perhaps by scaling that up 1000x they will be able to go all the way to complete, coherent short stories, with no obvious semantic/commonsense errors. Or they will be able to build chatbots that are extremely good at holding a conversation for a few rounds (say 5 minutes to 10 minutes long).

πŸ‘οΈŽ︎ 10 πŸ‘€οΈŽ︎ u/Yuli-Ban πŸ“…οΈŽ︎ Apr 05 2019 πŸ—«︎ replies

When are we going to get a chance to use it? Enough with the secrecy. There’s potential here for great art.

πŸ‘οΈŽ︎ 4 πŸ‘€οΈŽ︎ u/UNOBTANIUM πŸ“…οΈŽ︎ Apr 05 2019 πŸ—«︎ replies

Given the exponential rate of advance, it wouldn't surprise me if 100x is a paper away and 1000x two papers from now. Amazing

πŸ‘οΈŽ︎ 3 πŸ‘€οΈŽ︎ u/SantoshiEspada πŸ“…οΈŽ︎ Apr 05 2019 πŸ—«︎ replies

Timestamp to Ilya discussing 'we'll scale up GPT-2 10x, 100x, 1000x': https://youtu.be/bIrEM2FbOLU?t=2740 'fast forward to not GPT-2 but GPT-20 and think about what that can do'

πŸ‘οΈŽ︎ 3 πŸ‘€οΈŽ︎ u/gwern πŸ“…οΈŽ︎ Apr 15 2019 πŸ—«︎ replies

It will be pretty interesting to see how it turns out.

πŸ‘οΈŽ︎ 2 πŸ‘€οΈŽ︎ u/dethb0y πŸ“…οΈŽ︎ Apr 05 2019 πŸ—«︎ replies
Captions
The following is a conversation with Greg Brockman. He's the co-founder and CTO of OpenAI, a world-class research organization developing ideas in AI with the goal of eventually creating a safe and friendly artificial general intelligence, one that benefits and empowers humanity. OpenAI is not only a source of publications, algorithms, tools, and datasets; their mission is a catalyst for an important public discourse about our future with both narrow and general intelligence systems. This conversation is part of the Artificial Intelligence podcast at MIT and beyond. If you enjoy it, subscribe on YouTube, iTunes, or simply connect with me on Twitter at Lex Fridman, spelled F-R-I-D. And now, here's my conversation with Greg Brockman.

So in high school and right after, you wrote a draft of a chemistry textbook, I saw, that covers everything from the basic structure of the atom to quantum mechanics. So it's clear you have an intuition and a passion for both the physical world, with chemistry and now robotics, and the digital world, with AI, deep learning, reinforcement learning, and so on. Do you see the physical world and the digital world as different, and what do you think is the gap?

A lot of it actually boils down to iteration speed. I think that a lot of what really motivates me is building things. Think about mathematics, for example, where you think really hard about a problem, you understand it, and you write it down in this very obscure form that we call a proof. But then it's in humanity's library, right? It's there forever. This is some truth that we've discovered. Maybe only five people in your field will ever read it, but somehow you've moved humanity forward. So I actually used to really think that I was going to be a mathematician, and then I started writing this chemistry textbook, and one of my friends told me, "You'll never publish it because you don't have a PhD." So instead I decided to build a website and try to promote my ideas that way, and then I discovered programming. In programming, you think hard about a problem, you understand it, and you write it down in a very obscure form that we call a program, but then once again it's in humanity's library, and anyone can get the benefit from it, and the scalability is massive. So I think the thing that really appeals to me about the digital world is that you can have this insane leverage: a single individual with an idea is able to affect the entire planet. That's something I think is really hard to do if you're moving around physical atoms.

But you said mathematics. So if you look at the thing over here, our mind, do you ultimately see it as just math, as just information processing? Or is there some other magic, as you've seen through biology and chemistry and so on?

I think it's really interesting to think about humans as just information processing systems, and it seems like that's actually a pretty good way of describing a lot of how the world works, or a lot of what we're capable of. If you just look at technological innovations over time, in some ways the most transformative innovation we've had has been the computer, and in some ways the internet. What is the internet? The internet is not about these physical cables. It's about the fact that I am suddenly able to instantly communicate with any other human on the planet, and I'm able to retrieve any piece of knowledge that the human race has, in some ways, ever had. Those are insane transformations.
Do you see our society as a whole, the collective, as another extension of the intelligence of the human being? So if you look at the human being as an information processing system, and you mentioned the internet, the networking, do you see all of us together, as a civilization, as a kind of intelligent system?

Yeah, I think this is actually a really interesting perspective to take and to think about. You sort of have this collective intelligence of all of society. The economy itself is this superhuman machine that is optimizing something, and in some ways a company has a will of its own, right? You have all these individuals, all pursuing their own individual goals and thinking really hard about the right things to do, but somehow the company does something that is this emergent thing, and that is a really useful abstraction. So I think that in some ways we think of ourselves as the most intelligent things on the planet and the most powerful things on the planet, but there are things that are bigger than us, these systems that we all contribute to. And so I think it's interesting to think about, if you've read Isaac Asimov's Foundation, there's this concept of psychohistory in there, which is effectively this: that if you have trillions or quadrillions of beings, then maybe you could actually predict what that huge macro-being will do, almost independent of what the individuals want.

I actually have a second angle on this that I think is interesting, which is thinking about technological determinism. One thing that I actually think a lot about with OpenAI is that we're kind of coming onto this insanely transformational technology of general intelligence, which will happen at some point, and there's a question of how you can take actions that will actually steer it to go better rather than worse. And I think one question you need to ask is: as a scientist, as an inventor, as a creator, what impact can you have in general? You look at things like the telephone, invented by two people on the same day. What does that mean? What does that mean about the shape of innovation? And I think that what's going on is that everyone is building on the shoulders of the same giants, and so you can't really hope to create something no one else ever would. If Einstein wasn't born, someone else would have come up with relativity. He changed the timeline a bit, right? Maybe it would have taken another twenty years, but it wouldn't be the case that humanity would never discover these fundamental truths.

So there's some kind of invisible momentum that some people, like Einstein, or OpenAI, is plugging into, that anybody else can also plug into, and ultimately that wave takes us in a certain direction. Is that what you mean?

That's right, that's right. And this kind of seems to play out in a bunch of different ways: there's some exponential that is being ridden, and the exponential itself, which one it is, changes. Think about Moore's Law: an entire industry set its clock to it for fifty years. How can that be? How is that possible? And yet somehow it happened. So I think you can't hope to ever invent something that no one else will. Maybe you can change the timeline a little bit.
But if you really want to make a difference, I think the only real degree of freedom you have is to set the initial conditions under which a technology is born. Think about the internet: there were lots of other competitors trying to build similar things, and the internet won, and the initial conditions were that it was created by a group that really valued anyone being able to plug in, this very academic mindset of being open and connected. And I think that the internet for the next forty years really played out that way. Maybe today things are starting to shift in a different direction, but I think those initial conditions were really important in determining the next forty years' worth of progress.

That's really beautifully put. Another example of that I think about: I recently looked at the formation of Wikipedia, and I wonder what the internet would be like if Wikipedia had ads. There's an interesting argument about why they chose not to put advertisements on Wikipedia. I think Wikipedia is one of the greatest resources we have on the internet. It's extremely surprising how well it works and how well it was able to aggregate all this kind of good information, and essentially the creator of Wikipedia, I don't know, there's probably some debate there, set the initial conditions, and now it carried itself forward. That's really interesting. So the way you're thinking about AGI, or artificial intelligence, is you're focused on setting the initial conditions for the progress.

That's right.

That's powerful. Okay, so let's look into the future. If you create an AGI system, one that can ace the Turing test, natural language, what do you think would be the interactions you would have with it? What do you think are the questions you would ask? What would be the first question you would ask it, her, him?

That's right. I think that at that point, if you've really built a powerful system that is capable of shaping the future of humanity, the first question that you really should ask is: how do we make sure that this plays out well? And so that's actually the first question that I would ask a powerful AGI system.

So you wouldn't ask your colleagues? You wouldn't ask Ilya? You would ask the AGI system?

Oh, we've already had the conversation with Ilya, right, and everyone here. And so you want as many perspectives and as much wisdom as you can for answering this question. So I don't think you necessarily defer to whatever your powerful system tells you, but you use it as one input to try to figure out what to do. But I guess fundamentally, what it really comes down to is: if you've built something really powerful, then think about, for example, shortly after the creation of nuclear weapons. The most important question in the world was: what's the world order going to be like? How do we set ourselves up in a place where we're going to be able to survive as a species? With AGI, I think the question is slightly different. There is a question of how we make sure that we don't get the negative effects, but there's also the positive side. Imagine what AGI will be like, what it will be capable of. I think that one of the core reasons an AGI can be powerful and transformative is actually due to technological development. If you have something that's as capable as a human, and it's much more scalable,
you absolutely want it to go read the whole scientific literature and think about how to create cures for all the diseases. You want it to think about how to go and build technologies to help us create material abundance, and to figure out the societal problems that we have trouble with, like how we're supposed to clean up the environment. Maybe you want this thing to go and invent a bunch of little robots that will go out and be biodegradable and turn ocean debris into harmless molecules. And I think that positive side is something people miss sometimes when thinking about what an AGI will be like. So I think that if you have a system that's capable of all of that, you absolutely want its advice about: how do I make sure that we're using your capabilities in a positive way for humanity?

So what do you think about the psychology that looks at all the different possible trajectories of an AGI system, many of which, perhaps the majority of which, are positive, and nevertheless focuses on the negative trajectories? I mean, you get to interact with folks, you get to think about this, maybe within yourself as well. You look at Sam Harris and so on. It seems to be, sorry to put it this way, almost more fun to think about the negative possibilities. Whatever that is, it's deep in our psychology. What do you think about that, and how do we deal with it? Because we want AI to help us.

So I think there are kind of two problems entailed in that question. The first is more the question of: how can you even picture what a world with a new technology will be like? Imagine we're in 1950, and I'm trying to describe Uber to someone.

Apps and the internet.

Yeah, I mean, that's going to be extremely complicated.

But it's imaginable.

It's imaginable, right? But now imagine being in 1950 and predicting Uber. You need to describe the internet, you need to describe GPS, you need to describe the fact that everyone's going to have this phone in their pocket. And so I think the first truth is that it is hard to picture how a transformative technology will play out in the world. We've seen that before with technologies that are far less transformative than AGI will be. So one piece is that it's just hard to imagine, to really put yourself in a world where you can predict what that positive vision would look like.

And I think the second thing is that it is always easier to support the negative side than the positive side. It's always easier to destroy than create, and less in a physical sense, more in an intellectual sense, right? Because with creating something, you need to get a bunch of things right, and to destroy, you just need to get one thing wrong. And so I think what that means is that a lot of people's thinking dead-ends as soon as they see the negative story. But that being said, I actually have some hope. I think that the positive vision is something that we can talk about, and I think that simply saying this fact, that there are positives and there are negatives, and everyone likes to dwell on the negative, people tend to respond well to that message and say: huh, you're right, there's a part of this that we're not talking about, not thinking about. And that's actually been a key part of how we think about AGI at OpenAI. You can kind of look at it as, okay,
OpenAI talks about the fact that there are risks, and yet they're trying to build this system. How do you square those two facts?

So do you share the intuition that some people have, from Sam Harris, even Elon Musk himself, that it's tricky, as you develop AGI, to keep it from slipping into the existential threats, into the negative? What's your intuition about how hard it is to keep AI development on the positive track?

To answer that question, you can really look at how we structure OpenAI. We have three main arms. There's capabilities, which is actually doing the technical work and pushing forward what these systems can do. There's safety, which is working on technical mechanisms to ensure that the systems we build are aligned with human values. And then there's policy, which is making sure that we have governance mechanisms, answering the question of: well, whose values? I think the technical safety one is the one that people talk about the most. Think about all of the dystopian AI movies: a lot of that is about not having good technical safety in place. And what we've been finding is that a lot of people look at the technical safety problem and think it's just intractable. This question of what humans want: how am I supposed to write that down? Can I even write down what I want? No way. And then they stop there. But the thing is, we've already built systems that are able to learn things that humans can't specify. Even the rules for how to recognize whether there's a cat or a dog in an image: it turns out it's intractable to write that down, and yet we're able to learn it. And what we're seeing with systems we build at OpenAI, and these are still at an early proof-of-concept stage, is that you are able to learn human preferences. You're able to learn what humans want from data. And so that's the core focus of our technical safety team, and we've actually had some pretty encouraging updates in terms of what we've been able to make work.

So you have an intuition and a hope that from data, looking at the value alignment problem from data, we can build systems that align with the collective better angels of our nature, aligned with the ethics and the morals of human beings?

To even say this in a different way: think about how we align humans. Think about how a human baby can grow up to be an evil person or a great person, and a lot of that is learning from data. You have some feedback as a child is growing up; they get to see positive examples. And so I think that, given that the only example we have of a general intelligence that is able to learn from data, to align with human values, and to learn values, works that way, I think we shouldn't be surprised if the same sorts of techniques end up being how we solve value alignment for AGIs as well.
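The idea of learning what humans want from data is often operationalized as preference learning: rather than hand-writing a reward function, you fit a reward model to pairwise human judgments between behaviors. A minimal sketch of that idea, using a Bradley-Terry model on toy feature vectors (illustrative only; all names and data are made up, and this is not OpenAI's actual implementation):

```python
import numpy as np

# Toy preference learning (Bradley-Terry reward model).
# Each behavior is summarized by a feature vector; humans give us
# pairwise judgments "behavior a is preferred over behavior b".
# We fit a linear reward r(x) = w . x so that preferred behaviors
# score higher, by minimizing logistic loss on the comparisons.

rng = np.random.default_rng(0)

def fit_reward(comparisons, dim, lr=0.1, steps=2000):
    """comparisons: list of (x_preferred, x_rejected) feature pairs."""
    w = np.zeros(dim)
    for _ in range(steps):
        xa, xb = comparisons[rng.integers(len(comparisons))]
        # P(a preferred over b) = sigmoid(r(a) - r(b))
        p = 1.0 / (1.0 + np.exp(-(w @ xa - w @ xb)))
        # Gradient step on -log p with respect to w
        w += lr * (1.0 - p) * (xa - xb)
    return w

# Hidden "true" human preference the model must recover from data.
true_w = np.array([2.0, -1.0, 0.5])
xs = rng.normal(size=(200, 3))
pairs = []
for i in range(0, 200, 2):
    a, b = xs[i], xs[i + 1]
    # Simulated human labeler: a noisy comparison of true rewards
    if true_w @ a + rng.normal(scale=0.1) > true_w @ b:
        pairs.append((a, b))
    else:
        pairs.append((b, a))

w = fit_reward(pairs, dim=3)
# The learned reward should rank behaviors like the true one does.
print("cosine(true, learned):",
      true_w @ w / (np.linalg.norm(true_w) * np.linalg.norm(w)))
```

The deep-learning version of this replaces the linear model with a neural network and collects the comparisons from humans watching pairs of agent behaviors, as in OpenAI's 2017 work on deep reinforcement learning from human preferences, but the loss being minimized is the same.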
So let's go even higher. I don't know if you've read the book Sapiens, but there's an idea that, as a collective, us human beings kind of develop together the ideas that we hold. There's no, in that context, objective truth; we just all agree to certain ideas and hold them as a collective. Do you have a sense that there is, in the world, good and evil? Do you have a sense that, to a first approximation, there are some things that are good, and that you could teach systems to be good?

So I think that this actually blends into our third team, the policy team, and this is the aspect I think people really talk about way less than they should. Because imagine that we build super-powerful systems, and we've managed to figure out all the mechanisms for these things to do whatever the operator wants. The most important question becomes: who's the operator, what do they want, and how is that going to affect everyone else? And this question of what is good, what are those values, I mean, you don't even have to go to those very grand existential places to start to realize how hard this problem is. You just look at the different countries and cultures across the world, and there's a very different conception of how the world works and of what kinds of ways society wants to operate. And so I think the really core question is actually very concrete, and it's not a question that we have ready answers to: how do you have a world where all the different countries that we have, the United States, China, Russia, and the hundreds of other countries out there, are able to continue to operate in the way that they see fit, but where the world that emerges, where you have these very powerful systems operating alongside humans, ends up being something that empowers humans more, that makes human existence a more meaningful thing, where people are happier and wealthier and able to live more fulfilling lives? It's a non-obvious thing, how to design that world once you have that very powerful system.

So if we take a little step back: we're having a fascinating conversation, and OpenAI is in many ways a tech leader in the world, and yet we're thinking about these big existential questions, which is fascinating and really important. I think you're a leader in that space, and it's a really important space: just thinking about how AI affects society in a big-picture view. So Oscar Wilde said we're all in the gutter, but some of us are looking at the stars, and I think OpenAI has a charter that looks to the stars, I would say: to create intelligence, to create general intelligence, make it beneficial, safe, and collaborative. Can you tell me how that came about, how a mission like that, and the path to creating a mission like that, came about when OpenAI was founded?

Yeah, so I think that in some ways it really boils down to taking a look at the landscape. If you think about the history of AI, basically for the past sixty or seventy years, people have thought about this goal of: what could happen if you could automate human intellectual labor? Imagine you could build a computer system that could do that. What becomes possible? We have a lot of sci-fi that tells dystopian stories, and increasingly you have movies like Her that tell a little bit more of a utopian vision. You think about the impacts that we've seen from being able to have bicycles for our minds in computers, and I think that the impact of computers and the internet has just far outstripped what anyone really could have predicted. And so I think it's very clear that if you can build an AGI, it will be the most transformative technology that humans will ever create. So what it boils down to then is the question of: well, is there a path? Is there hope? Is there a way to build such a system? And I think that for sixty or seventy years, people got excited,
and they ended up not being able to deliver on the hopes that people had pinned on them. And I think that then, after two winters of AI development, people almost stopped daring to dream. Really talking about AGI, or thinking about AGI, became almost taboo in the community. But I actually think that people took the wrong lesson from AI history. If you look back, 1959 is when the perceptron was released, and this was basically one of the earliest neural networks. It was released to what was perceived as massive overhype. In the New York Times you have this article saying that the perceptron will one day recognize people, call out their names, instantly translate speech between languages, and people at the time looked at it and said: your system can't do any of that, and basically spent ten years trying to discredit the whole perceptron direction, and succeeded. All the funding dried up, and people went in other directions. In the '80s there was a resurgence, and I'd always heard that the resurgence in the '80s was due to the invention of backpropagation and these algorithms that got people excited, but actually the causality was due to people building larger computers. You can find articles from the '80s saying that the democratization of computing power suddenly meant that you could run these larger neural networks, and then people started to do all these amazing things. The backpropagation algorithm was invented, but the neural nets people were running were these tiny little twenty-neuron neural nets. What are you supposed to learn with twenty neurons? And so of course they weren't able to get great results.
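For concreteness, the perceptron in question is a single linear neuron with an update rule simple enough to fit in a few lines. The sketch below (toy data, illustrative only) is essentially the whole algorithm behind that 1950s hype; a single such unit provably cannot represent functions like XOR, which is part of why the tiny networks of that era hit a wall:

```python
import numpy as np

# Rosenblatt-style perceptron: a single "neuron" that predicts
# sign(w . x + b) and nudges its weights toward every example
# it misclassifies.

def train_perceptron(X, y, epochs=20):
    """X: (n, d) inputs; y: labels in {-1, +1}."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (w @ xi + b) <= 0:   # misclassified
                w += yi * xi             # the perceptron update rule
                b += yi
    return w, b

# Toy linearly separable data (logical AND): the rule converges here,
# but no single linear unit can represent XOR.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([-1, -1, -1, +1])

w, b = train_perceptron(X, y)
print([int(np.sign(w @ xi + b)) for xi in X])  # matches y
```

Stacking many such units into layers and training them with backpropagation is, at this level of abstraction, the same machinery, which is consistent with the point that compute, rather than the algorithm, was the binding constraint.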
It really wasn't until 2012 that this approach, almost the simplest, most natural approach, the one people had come up with in the '50s, in some ways even in the '40s, before there were computers, with the McCulloch-Pitts neuron, suddenly became the best way of solving problems. And I think there are three core properties that deep learning has that are very worth paying attention to. The first is generality: we have a very small number of deep learning tools, SGD, deep neural nets, maybe some RL, and it solves this huge variety of problems, speech recognition, machine translation, game playing, all these problems with a small set of tools. So there's the generality. There's a second piece, which is the competence: if you want to solve any of those problems, throw out forty years' worth of computer vision research, replace it with a deep neural net, and it's going to work better. And there's a third piece, which is the scalability: the one thing that has been shown time and time again is that if you have a larger neural network and you throw more compute and more data at it, it will work better. Those three properties together feel like essential parts of building a general intelligence. Now, it doesn't just mean that if we scale up what we have, we will have an AGI. There are clearly missing pieces, there are missing ideas, we need to have answers for reasoning. But I think the core here is that, for the first time, it feels like we have a paradigm that gives us hope that general intelligence can be achievable. And as soon as you believe that, everything else comes into focus. The timeline, I think, remains uncertain, but certainly within our lifetimes, and possibly within a much shorter period of time than people would expect. If you can really build the most transformative technology that will ever exist, you stop thinking about yourself so much, and you start thinking about: how do you have a world where this goes well? You need to think about the practicalities of how you build an organization and get together a bunch of people and resources, and make sure that people feel motivated and ready to do it. But then you start thinking about: well, what if we succeed? And how do we make sure that, when we succeed, the world is actually a place that we would want ourselves to exist in, almost in the Rawlsian, veil-of-ignorance sense of the word. So that's the broader landscape. OpenAI was really formed in 2015 with that high-level picture: that AGI might be possible sooner than people think, and that we need to try to do our best to make sure it's going to go well. And then we spent the next couple of years really trying to figure out what that means and how we do it. Typically with a company, you start out very small, you and a co-founder, and you build a product, you get some users, you get product-market fit, then at some point you raise some money, you hire people, you scale, and then down the road the big companies realize you exist and try to kill you. And for OpenAI, it was basically everything in exactly the opposite order.

Let me just pause for a second. You said a lot of things, and let me just admire the daring aspect of what OpenAI stands for, which is daring to dream. I mean, you said it, and it's pretty powerful. It caught me off guard, because I think that's very true: the step of just daring to dream about the possibilities of creating intelligence in a positive, in a safe way, but just even creating intelligence, is a much-needed, refreshing catalyst for the AI community. So that's the starting point. Okay, so then: the formation of OpenAI.

I'll just say that when we were starting OpenAI, the first question that we had is: is it too late to start a lab with a bunch of the best people possible? That was an actual question.

That's a really interesting question.

That was the core question. There was a dinner in July of 2015, and that was really what we spent the whole time talking about. Because if you think about where AI was, it had transitioned from being an academic pursuit to an industrial pursuit, and so a lot of the best people were in these big research labs, and we wanted to start our own one. No matter how much resources we could accumulate, it would pale in comparison to the big tech companies, and we knew that. And there was a question of: are we actually going to be able to get this thing off the ground? You need critical mass. You can't just be you and a co-founder building a product; you really need to have a group of five to ten people. And we kind of concluded it wasn't obviously impossible, so it seemed worth trying.

Well, you're also dreamers, so who knows, right?

That's right.

Okay, so speaking of that, competing with the big players. Let's talk about some of the tricky things you think through in this process of growing, of seeing how you can develop these systems at scale that compete. So you recently formed OpenAI LP, a new capped-profit
company that now carries the name OpenAI. So OpenAI is now this official company, and the original nonprofit still exists and carries the OpenAI nonprofit name. Can you explain what this company is, what the purpose of its creation is, and how you arrived at the decision to create it?

Yep. OpenAI, the whole entity, and OpenAI LP as a vehicle, is trying to accomplish the mission of ensuring that artificial general intelligence benefits everyone. And the main way that we're trying to do that is by actually trying to build general intelligence ourselves and make sure the benefits are distributed to the world. That's the primary way. We're also fine if someone else does this. It doesn't have to be us: if someone else is going to build an AGI and make sure that the benefits don't get locked up in one company or with one set of people, we're actually fine with that. And so those ideas are baked into our Charter, which is the foundational document that describes our values and how we operate, but they're also really baked into the structure of OpenAI LP. The way that we've set up OpenAI LP is that, in the case where we succeed, if we actually build what we're trying to build, then investors are able to get a return, but that return is capped. And if you think about the value AGI could really create, you're talking about the most transformative technology ever created; it's going to create orders of magnitude more value than any existing company, and all of that value beyond the cap will be owned by the world, legally titled to the nonprofit, to fulfill the mission. So that's the structure.

So the mission is a powerful one, and it's one that I think most people would agree with. It's how we would hope AI progresses. So how do you tie yourself to that mission? How do you make sure you do not deviate from that mission, that other incentives that are profit-driven don't interfere with the mission?

So this was actually a really core question for us for the past couple of years. The way our history went is that for the first year we were getting off the ground: we had this high-level picture, but we didn't know exactly how we wanted to accomplish it. Really two years ago is when we first started realizing that, in order to build AGI, we're just going to need to raise way more money than we can as a nonprofit. You're talking many billions of dollars. And so the first question is: how are you supposed to do that and stay true to this mission? We looked at every legal structure out there and concluded none of them was quite right for what we wanted to do. And I guess it shouldn't be too surprising that if you're going to do some crazy, unprecedented technology, you're going to have to come up with some crazy, unprecedented structure to do it in. A lot of our conversations were with people at OpenAI, the people who really joined because they believe so much in this mission, thinking about how we actually raise the resources to do it and also stay true to what we stand for. And the place you've got to start is to really align on: what is it that we stand for? What are those values? What's really important to us? So I'd say that we spent about a year really compiling the OpenAI Charter. And if you even look at the first line item in there, it says that, look, we
expect we're going to have to marshal huge amounts of resources, but we're going to make sure that we minimize conflicts of interest with the mission. Aligning on all of those pieces was the most important step towards figuring out how we structure a company that can actually raise the resources to do what we need to do.

I imagine the decision to create OpenAI LP was a really difficult one, and there was a lot of discussion, as you mentioned, for a year, and there were different ideas, perhaps detractors within OpenAI, different paths that you could have taken. What were those concerns? What were the different paths considered? What was that process of making that decision like?

Yep. So if you look actually at the OpenAI Charter, there are almost two paths embedded within it. There is: we are primarily trying to build AGI ourselves, but we're also okay if someone else does it. And this is a weird thing for a company.

It's really interesting, actually. There is an element of competition, in that you do want to be the one that does it, but at the same time you're okay if somebody else does, and we'll talk about that trade-off a little bit. That's really interesting.

And I think this was the core tension as we were designing OpenAI LP, and really the OpenAI strategy: how do you make sure that you have a shot at being a primary actor, which really requires building an organization, raising massive resources, and really having the will to go and execute on some really, really hard vision? You need to really sign up for a long period, to go and take on a lot of pain and a lot of risk. And to do that, normally you just import the startup mindset: you think about, okay, how do we out-execute everyone? You have this very competitive angle. But you also have the second angle of saying that, well, the true mission isn't for OpenAI to build AGI. The true mission is for AGI to go well for humanity. And so how do you take all of those first actions and make sure you don't close the door on outcomes that would actually be positive and fulfill the mission? So I think it's a very delicate balance, and going one hundred percent in one direction or the other is clearly not the correct answer. Even in terms of just how we talk about OpenAI and think about it, one thing that's always in the back of my mind is to make sure that we're not just saying OpenAI's goal is to build AGI. It's actually much broader than that. First of all, it's not just AGI, it's safe AGI; that's very important. But secondly, our goal isn't to be the ones to build it. Our goal is to make sure it goes well for the world. And so figuring out how you balance all of those, and getting people to really come to the table and compile a single document that encompasses all of that, wasn't trivial.

So part of the challenge here is that your mission is, I would say, beautiful, empowering, and a beacon of hope for people in the research community and just people thinking about AI. So your decisions are scrutinized more than, I think, a regular profit-driven company's. Do you feel the burden of this, in the creation of the Charter and just in the way you operate?

Yes.

So why lean into the burden by creating such a charter? Why not keep it quiet?

I mean, it just boils down to the mission. I'm here, and everyone else is here, because we think this is the most important
mission.

Right, dare to dream. So what do you think: can you be good for the world, or create an AGI system that's good, when you're a for-profit company? From my perspective, I don't understand why profit interferes with positive impact on society. I don't understand why Google, which makes most of its money from ads, can't also do good for the world, or other companies, Facebook, anything. I don't understand why those have to interfere. Profit isn't the thing, in my view, that affects the impact of a company. What affects the impact of the company is the charter, is the culture, is the people inside, and profit is the thing that just fuels those people. So what are your views there?

Yeah, so I think that's a really good question, and there are some real, long-standing debates in human society that are wrapped up in it. The way that I think about it is: just think about what the most impactful nonprofits in the world are, and what the most impactful for-profits in the world are.

It's much easier to list the for-profits.

That's right. And I think there's some real truth here: the system that we've set up, the system for how today's world is organized, is one that really allows for huge impact, and part of that is that for-profits are self-sustaining and able to build on their own momentum. I think that's a really powerful thing. It's also something that, when it turns out we haven't set the guardrails correctly, causes problems. Think about logging companies that go and deforest the rainforest: that's really bad, and we don't want that. And it's actually really interesting to me, this question of how you get positive benefits out of a for-profit company. It's very similar to how you get positive benefits out of an AGI: you have this very powerful system, it's more powerful than any human, it's kind of autonomous in some ways, superhuman on a lot of axes, and somehow you have to set the guardrails to get good to happen. But when you do, the benefits are massive. And so when I think about nonprofit versus
for-profit, I think it's just that not enough happens in nonprofits. They're very pure, but it's hard to get things done there. In for-profits, in some ways, too much happens, but if it's shaped in the right way, it can actually be very positive. And so with OpenAI LP, we're picking a road in between. Now, the thing I think is really important to recognize is that, the way we think about OpenAI LP, in the world where AGI actually happens, in a world where we are successful and we build the most transformative technology ever, the amount of value we're going to create will be astronomical. And so in that case, the cap that we have will be a small fraction of the value we create, and the amount of value that goes back to investors and employees looks pretty similar to what would happen in a pretty successful startup. That's really the case that we're optimizing for: in the success case, making sure that the value we create doesn't get locked up. I expect that in other for-profit companies it's possible to do something like that, but I think it's not obvious how to do it. As a for-profit company, you have a lot of fiduciary duty to your shareholders, and there are certain decisions you just cannot make. In our structure, we've set it up so that we have a fiduciary duty to the Charter: we always get to make the decision that is right for the Charter, even if it comes at the expense of our own stakeholders. And so when I think about what's really important, it's not really about nonprofit versus for-profit. It's really the question of: if you build AGI, and humanity is now in this new age, who benefits? Whose lives are better? And I think what's really important is to have an answer that is: everyone's.

Which is one of the core aspects of the Charter. So one concern people have, not just with OpenAI but with Google, Facebook, Amazon, anybody really that's creating impact at scale, is: how do we avoid, as your Charter says, enabling the use of AI or AGI to unduly concentrate power? Why would a company like OpenAI not keep all the power of an AGI system to itself?

The Charter.

So how does the Charter actualize itself in the day-to-day?

So first, to zoom out: the way that we structure the company is so that the power for dictating the actions that OpenAI takes ultimately rests with the board, the board of the nonprofit. And the board is set up in certain ways, with certain restrictions that you can read about in the OpenAI LP blog post, but effectively the board is the governing body for OpenAI LP, and the board has a duty to fulfill the mission of the nonprofit. So that's how we thread all these things together. Now there's a question of the day-to-day: the board gets to call the shots at the high level, but the people who are actually executing are the employees, the people here on a day-to-day basis who have the keys to the technical kingdom. And there I think the answer looks a lot like how any company's values get actualized. A lot of that comes down to the fact that you need people who are here because they really believe in that mission, and they believe in the Charter, and
they are willing to take actions that maybe are worse for them but are better for the Charter. That's something that's really baked into the culture, and honestly, I think that's one of the things we really have to work to preserve as time goes on, and it's a really important part of how we think about hiring people and bringing people into OpenAI.

So there are people here who could speak up and say, hold on a second, this is totally against what we stand for, culture-wise?

Yeah, yeah, for sure. I think that's a pretty important part of how we operate. Even with designing the Charter and designing OpenAI LP in the first place, there has been a lot of conversation with employees here, and a lot of times when employees said, wait a second, this seems like it's going in the wrong direction, we talked about it. And here's actually one thing I think is very unique about us as a small company: if you're at a massive tech giant, it's a little bit hard for someone who's a line employee to go and talk to the CEO and say, I think we're doing this wrong. You look at companies like Google that have had some collective action from employees to make ethical change around things like Project Maven, so maybe there are mechanisms at other companies that work. But here it's super easy for anyone to pull me aside, to pull Sam aside, to pull Ilya aside, and people do it all the time.

One of the interesting things in the Charter is this idea that it'd be great if you could try to describe or untangle: switching from competition to collaboration in late-stage AGI development. It's really interesting, this dance between competition and collaboration. How do you think about that?

Yeah, so assuming you can actually do the technical side of AGI development, I think there are going to be two key problems in figuring out how you actually deploy it and make it go well. The first of these is the run-up to building the first AGI. Look at how self-driving cars are being developed: it's a competitive race, and the thing that always happens in a competitive race is that you have huge amounts of pressure to get rid of safety. And so that's one thing we're very concerned about: that multiple teams figure out that they can actually get there, but think, if we take the slower path that is more guaranteed to be safe, we will lose, and so we're going to take the fast path. And so the more that we can, ourselves, be in a position where we don't generate that competitive race, where we say, if the race is being run and someone else is further ahead than we are, we're not going to try to leapfrog, we're going to actually work with them, we will help them succeed: as long as what they're trying to do is to fulfill our mission, then we're good. We don't have to build AGI ourselves. And I think that's a really important commitment from us, but it can't just be unilateral. I think it's really important that other players who are serious about building AGI make similar commitments. Again, to the extent that everyone believes that AGI should be something to benefit everyone, then it actually really shouldn't matter which company builds it, and we should all be concerned about the case where we just race so hard to get there that something goes wrong.

So what role
do you think government, our favorite entity, has in setting policy and rules about this domain, from research to development, to early-stage and late-stage AGI development?

So I think that, first of all, it's really important that government's in there in some way, shape, or form. At the end of the day, we're talking about building technology that will shape how the world operates, and there needs to be government as part of that answer. So that's why we've done a number of different congressional testimonies and we interact with a number of different lawmakers. Right now, a lot of our message to them is that it's not the time for regulation; it is the time for measurement. Our main policy recommendation is that people, and the government does this all the time with bodies like NIST, spend time trying to figure out just where the technology is, how fast it's moving, and really become literate and up to speed with respect to what to expect. So I think that today the answer really is about measurement, and I think there will be a time and place where that will change. It's a little bit hard to predict exactly what that trajectory should look like.

So there will be a point where regulation, federal, in the United States, where the government steps in and helps be the, I don't want to say the adult in the room, but makes sure that there are strict rules, maybe conservative rules, that nobody can cross?

Well, I think there are kind of two angles to it. Today, with narrow AI applications, I think there are already existing bodies that are responsible, and should be responsible, for regulation. Think about, for example, self-driving cars: you want the National Highway Traffic Safety Administration to be on top of that.

That makes sense.

Right. Basically what we're saying is that we're going to have these technological systems performing applications that humans already do. Great, we already have ways of thinking about standards and safety for those, so I think actually empowering those regulators today is pretty important. And then for AGI, there's going to be a point where we'll have better answers, and maybe a similar approach of first measurement, and then starting to think about what the rules should be. I think it's really important that we don't prematurely squash progress. It's very easy to smother a budding field, and that's something to really avoid. But I don't think the right way of doing it is to say, let's just try to blaze ahead and not involve all these other stakeholders.

So you've recently released a paper on GPT-2, language modeling, but did not release the full model because you had concerns about the possible negative effects of the availability of such a model. Beyond the decision itself, this is super interesting because of the discussion it creates at a societal level, the discourse. So it's fascinating in that aspect. But on the specifics here: first, what are some negative effects that you envisioned? And of course, what are some of the positive effects?

Yeah, so again, to zoom out: the way that we thought about GPT-2 is that with language modeling, we are clearly on a trajectory right now where we scale up our models and we get qualitatively better performance. GPT-2 itself was actually just a scale-up of a model that we'd released in
the previous June, and we just ran it at much larger scale and got these results where it was suddenly starting to write coherent prose, which was not something we'd seen previously. And what are we doing now? Well, we're going to scale up GPT-2 by 10x, by 100x, by 1000x, and we don't know what we're going to get. So it's very clear that the model we released last June is kind of a good academic toy. It's not something that we think can really have negative applications, or, to the extent that it can, the positive of people being able to play with it far, far outweighs the possible harms. You fast-forward to not GPT-2 but GPT-20, and you think about what that's going to be like, and I think the capabilities are going to be substantive. So there needs to be a point in between the two where you say: this is something where we are drawing the line, and we need to start thinking about the safety aspects. And I think for GPT-2, we could have gone either way. In fact, when we had conversations internally, we had a bunch of pros and cons, and it wasn't clear which one outweighed the other. And when we announced that, hey, we've decided not to release this model, there was a bunch of conversation where various people said, it's so obvious that you should have just released it, and other people said, it's so obvious you should not have released it. I think that almost definitionally means that holding it back was the correct decision: if it's not obvious whether something is beneficial or not, you should probably default to caution. So I think the overall landscape is that this decision could have gone either way, there are great arguments in both directions, but for future models down the road, and possibly sooner than you'd expect, because scaling these things up doesn't have to take that long, those you're definitely not going to want to release into the wild. And so we almost view this as a test case: how do you have a society, how do you have a system, that goes from having no concept of responsible disclosure, where the mere idea of not releasing something for safety reasons is unfamiliar, to a world where you say, okay, we have a powerful model, let's at least think about it, let's go through some process? Think about the security community: it took them a long time to arrive at responsible disclosure. You think about this question of, well, I have a security exploit: I send it to the company, and the company tries to prosecute me, or just ignores it. What do I do? But the alternative, of always just publishing your exploits, doesn't seem good either. And so it really took a long time, and it was bigger than any individual. It was really about building a whole community that believed: okay, we'll have this process where you send it to the company, and if they don't act in a certain time, then you can go public, and you're not a bad person; you've done the right thing. And in AI, part of the response to GPT-2 just proves that we don't have any concept of this. So that's the high-level picture. I think this was a really important move to make, and we could have maybe delayed it for GPT-3, but I'm really glad we
did it for GPT-2. So now you look at GPT-2 itself, and you think about the substance of: okay, what are the potential negative applications? You have this model that's been trained on the internet, which means it's also trained on a bunch of very biased data and a bunch of very offensive content, and you can ask it to generate content for you on basically any topic. You just give it a prompt and it'll start writing, and it writes content like you see on the internet, even down to saying "Advertisement" in the middle of some of its generations. And you think about the possibilities for generating fake news or abusive content. It's interesting seeing what people have done with the smaller version of GPT-2 that we released: people have done things like, take my own Facebook message history and generate more Facebook messages like me, and people are generating fake politician content. There are a bunch of things there where you at least have to ask: is this going to be good for the world? Then there's the flip side, which is that I think there are a lot of awesome applications that we really want to see, like creative applications. If you have sci-fi authors who can work with this tool and come up with cool ideas, that seems awesome. If we can write better sci-fi through the use of these tools, that's great, and we've actually had a bunch of people write in to us asking, hey, can we use it for a variety of different creative applications. So the positives are actually pretty easy to imagine; the usual NLP applications are really interesting.
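Since the smaller GPT-2 model was publicly released, the kind of prompted generation described here is easy to reproduce. A minimal sketch using the Hugging Face transformers port of the released weights (the library choice and sampling parameters are assumptions for illustration, not anything prescribed in the interview):

```python
# pip install torch transformers
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")   # released small model
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Condition the model on an arbitrary prompt; it continues the text
# with internet-style prose, which is exactly the capability (and the
# risk) being discussed.
prompt = "In a shocking finding, scientists discovered"
ids = tok(prompt, return_tensors="pt").input_ids
out = model.generate(
    ids,
    max_length=80,        # total tokens, prompt included
    do_sample=True,       # sample instead of greedy decoding
    top_k=40,             # truncated sampling, as in the GPT-2 paper
    temperature=0.8,
    pad_token_id=tok.eos_token_id,
)
print(tok.decode(out[0], skip_special_tokens=True))
```

Rerunning the same prompt yields different continuations each time, which is what makes the generations feel like the open-ended internet text the model was trained on.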
But it's not that all hope is lost. Think about how we already authenticate ourselves: we have systems, we have social security numbers if you're in the US, we have ways of identifying individual people. Having real-world identity tied to digital identity seems like a step towards authenticating the source of content rather than the content itself. Now, there are problems with that: how can you have privacy and anonymity in a world where the only way you can trust content is by looking at where it comes from? So I think building out good reputation networks may be one possible solution.

But yeah, this question is not an obvious one, and maybe sooner than we think we'll be in a world where, you know, today I'll often read a tweet and think, this feels like a real human wrote it, or, this doesn't feel genuine; I kind of judge the content a little bit. In the future that just won't be the case. Take, for example, the FCC comments on net neutrality: it came out later that millions of those were auto-generated, and researchers were able to use various statistical techniques to show that. What do you do in a world where those statistical techniques don't work, where it's just impossible to tell the difference between humans and AIs, and in fact the most persuasive arguments are written by AI?

All that stuff, it's not sci-fi anymore. You can ask GPT-2 to make a great argument for why recycling is bad for the world, and you read it and go, huh, you're right.

Yeah, that's quite interesting. I mean, ultimately it boils down to the physical world being the last frontier of proving: like you said, networks of people, humans vouching for humans in the physical world, and somehow the authentication ends there. If I had to ask you, and you're way too eloquent for a human, to authenticate, to prove it, how do I know you're not a robot, and how do you know I'm not a robot? So far, in this conversation we've just had, maybe the physical movements we make are the biggest gap between us and AI systems. So maybe that's the last frontier.

Well, here's another question: why is solving this problem important? Which aspects are really important to us? I think probably where we'll end up is we'll hone in on what we really want out of knowing whether we're talking to a human, and again, this comes down to identity. So the internet of the future, I expect, will be one that has lots of agents out there that will interact with you, but the question of whether this is a real flesh-and-blood human or an automated system will, I think, be less important.
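On the detection side, one simple statistical signal (with no claim that this is what the FCC-comment researchers actually used) is that text sampled from a language model tends to be unusually likely under that same model, so the average per-token negative log-likelihood can serve as a crude score. A sketch, again assuming the Hugging Face transformers library:

```python
# Sketch: score text by its average per-token loss under GPT-2.
# Lower scores mean "more model-like"; thresholds must be calibrated.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def mean_nll(text: str) -> float:
    """Average per-token negative log-likelihood of `text` under GPT-2."""
    ids = tokenizer.encode(text, return_tensors="pt")
    with torch.no_grad():
        loss = model(ids, labels=ids).loss   # cross-entropy over tokens
    return loss.item()

print(mean_nll("The quarterly report was filed on Tuesday."))
```

A real detector would calibrate its threshold on known human and machine text, and, as the conversation notes, the signal weakens as the models improve.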
Let's actually go there. GPT-2 is impressive; now look at GPT-20. Why is it so bad that all my friends are GPT-20? Why is it so important, on the internet, to interact only with human beings? Why can't we live in a world where ideas can come from models trained on human data?

I think this is actually a really interesting question. It comes back to: how do you even picture a world with some new technology? One thing I think is important is, I guess, honesty. If you have, almost in the Turing-test-style sense of the technology, AIs that are pretending to be humans and deceiving you, that feels like a bad thing. I think it's really important that we feel like we're in control of our environment, that we understand who we're interacting with, and that whether it's an AI or a human is not something we're being deceived about. But the flip side, can I have as meaningful an interaction with an AI as I can with a human? Here I actually think you can turn to sci-fi, and Her, I think, is a great example of asking this very question. One thing I really love about Her is that it starts out almost by asking how meaningful human virtual relationships are. Then you have a human who has a relationship with an AI, and you really start to be drawn into that, and all of your emotional buttons get triggered in the same way as if there were a real human on the other side of that phone. So that's one way of thinking about it: I think we can have meaningful interactions, and if there's a funny joke, in some sense it doesn't really matter whether it was written by a human or an AI. But what you don't want, and where I think we should really draw hard lines, is deception. Why do we build AI systems at all? The reason we want to build them is to enhance human lives, to let humans do more things, to have humans feel more fulfilled. If we can build AI systems that do that, sign me up.

So the process of language modeling: how far do you think it can take us? Look at the movie Her. Do you think dialogue, natural-language conversation, as formulated by the Turing test, for example, could be achieved through this kind of unsupervised language modeling?

I think the Turing test, in its real form, isn't just about language; it's really about reasoning too. To really pass the Turing test, I should be able to teach calculus to whoever's on the other side and have them really understand calculus and be able to go solve new calculus problems. So to really solve the Turing test, we need more than what we're seeing with language models; we need some way of plugging in reasoning. Now, how different will that be from what we already do? That's an open question. It might be that we need some sequence of totally radical new ideas, or it might be that we just need to shape our existing systems in a slightly different way. But in terms of how far language modeling will go, it's already gone way further than many people would have expected. There are a lot of really interesting angles to poke at, like how much GPT-2 understands the physical world. You read a little bit about fire underwater in GPT-2's output, so okay, maybe it doesn't quite understand what these things are. But at the same time you also see things like smoke coming from flame, and a bunch of these associations that GPT-2 has picked up despite having no body and no physical experience; it has just statically read data. And if the answer is that we don't know yet, then these are questions we're now starting to be able to actually ask of real systems that exist, and that's very exciting.
What's your intuition? Do you think that if you just significantly scale language modeling, reasoning can emerge from the same exact mechanisms?

I think it's unlikely that if we just scale GPT-2 we'll get reasoning in the full-fledged way. The type signature is a little bit wrong. There's something we do that we call thinking, where we spend a variable amount of compute to get to better answers: I think a little bit harder, I get a better answer. That kind of type signature isn't quite encoded in GPT-2. GPT-2 has spent, in effect, a long evolutionary history baking in all this information, getting very, very good at this predictive process, and then at runtime it just does one forward pass and generates stuff. So there might be small tweaks to what we do in order to get the type signature right. For example, it's not really one forward pass: you generate symbol by symbol, so maybe you generate a whole sequence of thoughts and you only keep the last bit, or something like that. At the very least, I would expect you'd have to make changes like that.

Yeah, exactly how we, as you said, think: the process of generating thought by thought, and, like you said, keeping the last bit, the thing we converge towards.

And I think there's another piece which is interesting, which is out-of-distribution generalization. Thinking somehow lets us do that: we haven't experienced a thing, and yet somehow we keep refining our mental model of it. This is again something that feels tied to whatever reasoning is. Maybe it's a small tweak to what we do; maybe it's many ideas and will take many decades.

Yeah, so the assumption there, with generalization out of distribution, is that it's possible to create new ideas. You know, it's possible that nobody ever creates new ideas, and then, scaling GPT-2 to GPT-20, you would essentially generalize to all the possible thoughts that are ever going to be had, just to play devil's advocate.

I mean, how many new story ideas have we come up with since Shakespeare, right?

Yeah, exactly. It's just all different forms of love and drama and so on.
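To make the type-signature contrast concrete, here is a purely schematic sketch: a stub stands in for the language model, and the only point is the shape of the computation, one fixed pass versus a variable scratchpad budget where only the last bit is kept. Every name and number here is hypothetical.

```python
import random

class ToyLM:
    """Stub standing in for a language model; emits placeholder tokens."""
    def generate(self, prompt: str, max_tokens: int) -> str:
        tokens = [f"thought{random.randint(0, 9)}" for _ in range(max_tokens)]
        return prompt + " " + " ".join(tokens)

def fixed_compute_answer(model: ToyLM, prompt: str) -> str:
    # GPT-2-style type signature: a constant amount of compute per answer.
    return model.generate(prompt, max_tokens=8)

def variable_compute_answer(model: ToyLM, prompt: str, budget: int) -> str:
    # Hypothetical "thinking" wrapper: spend a variable token budget on a
    # scratchpad of intermediate thoughts, then keep only the last bit.
    scratchpad = model.generate(prompt + " Let me think.", max_tokens=budget)
    return scratchpad.split()[-1]

model = ToyLM()
print(fixed_compute_answer(model, "Q: why is the sky blue?"))
print(variable_compute_answer(model, "Q: why is the sky blue?", budget=64))
```

Nothing here reasons; the sketch only shows where a variable amount of compute would enter the loop.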
OK. Not sure if you've read The Bitter Lesson, a recent blog post by Rich Sutton. He basically says something that echoes some of the ideas you've been talking about: that the biggest lesson that can be read from so many years of AI research is that general methods that leverage computation are ultimately going to win out. Do you agree with this, basically, at OpenAI and in general, about the ideas you're exploring, whether it's GPT-2 language modeling or OpenAI Five playing Dota: that a general method is better than a more fine-tuned, expert-tuned method?

So, one thing I found really interesting about the reaction to that blog post was that a lot of people read it as saying that compute is all that matters. That's a very threatening idea, and I don't think it's a true idea either. It's very clear that we have algorithmic ideas that have been very important for making progress. To really build AGI, you want to push as far as you can on the computational scale, and you want to push as far as you can on human ingenuity, so I think you need both. But the way you phrased the question is actually very good: it's really about what kinds of ideas we should be striving for. And absolutely, if you can find a scalable idea, where you pour more compute into it, you pour more data into it, and it gets better, that's the real holy grail. So I think the answer is yes, that's really how we think about it. Part of why we're excited about the power of deep learning and the potential for building AGI is that we look at the most successful AI systems that exist and realize that if you scale those up, they're going to work better. That scalability is something that really gives us hope for being able to build transformative systems.

This is partially an emotional thing. A response people often have is: compute is so important for state-of-the-art performance, but an individual developer, maybe a 13-year-old sitting somewhere in Kansas, might not even have a GPU, or may have a single GPU, a 1080 or something like that, and there's this feeling of, how can I possibly compete with, or contribute to, this world of AI if scale is so important? Can you comment on that, and, in general, do you think we need in the future to focus on democratizing compute resources as much as we democratize the algorithms?

The way I think about it is that there's this space of possible progress, a space of ideas and systems that will work and move us forward. There's a portion of that space, to some extent an increasingly significant portion, that does just require massive compute resources, and for that piece I think the answer is kind of clear. Part of why we have the structure that we do is that we think it's really important to be pushing the scale and building these large clusters and systems. But there's another portion of the space that isn't about large-scale compute. These are the ideas where, again, for the ideas to really be impactful and really shine, they should be ideas that, if you scaled them up, would work way better than they do at small scale, but you can discover them without massive computational resources. If you look at the history of recent developments, think about things like the GAN or the VAE: these are ones that you could come up with, and in practice people did come up with, without massive computational resources.

I just talked to Ian Goodfellow, and the thing is, the initial GAN produced pretty terrible results, right? It was only noticed because they were smart enough to know that it's quite surprising it can generate anything at all. Do you see a world, or is it too optimistic and dreamer-like to imagine, where compute resources are owned by governments and provided as a utility?

To some extent, this question reminds me of a blog post by one of my former professors at Harvard, this guy Matt Welsh, a systems professor. I remember sitting in his tenure talk. He had literally just gotten tenure, went to Google for the summer, and then decided he wasn't going back to academia.
In his blog post, he makes this point: look, as a systems researcher, I come up with these cool system ideas, I build a little proof of concept, and the best thing I can hope for is that the people at Google or Yahoo, which was around at the time, will implement it and actually make it work at scale. That's the dream: I build the little thing and they build the big thing that actually works. And for him, he said, I'm done with that; I want to be the person who's actually doing the building and deploying. I think there's a similar dichotomy here. There are people who really find value, and I think it is a valuable thing to do, in being the person who produces the ideas, who builds the proof of concept. You don't get to generate the coolest possible GAN images, but you invented the GAN. So there's a real trade-off there, and I think it's a very personal choice, but there's value on both sides.

Do you think that in creating AGI, or some new models, we would see echoes of the brilliance even at the prototype level, so you'd be able to develop those ideas without scale, in the initial seeds?

I always like to look at examples that exist, real precedent. So take the June 2018 model that we released, which we scaled up to turn into GPT-2. You can see that at small scale it set some records: this was the original GPT. We had some cool generations that weren't nearly as amazing and stunning as the GPT-2 ones, but it was promising, it was interesting. So I think it is the case that with a lot of these ideas, you do see promise at small scale. But there's an asterisk here, a very big asterisk, which is that sometimes we see behaviors emerge that are qualitatively different from anything we saw at small scale, where the original inventor of an algorithm looks at it and says, I didn't think it could do that. This is what we saw with Dota. PPO was created by John Schulman, who's a researcher here, and with Dota we basically just ran PPO at massive scale. There were some tweaks to make it work, but fundamentally it's PPO at the core, and we were able to get this long-term planning, these behaviors, to really play out on a time scale that we just thought was not possible. John looked at that and said, I didn't think it could do that. That's what happens when you're at three orders of magnitude more scale.

Yeah, but it still has the same flavors, at least echoes, of the expected brilliance, although I suspect with GPT scaled more and more you might get surprising things.

Yeah, you're right. It's interesting how difficult it is to see how far an idea will go when it's scaled. It's an open question.
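Since PPO is named here as the core algorithm, it is worth seeing how small that core is. The heart of PPO is the clipped surrogate objective from Schulman et al.'s 2017 paper; here is a minimal PyTorch sketch of just that loss, leaving out the value function, entropy bonus, and all of the distributed machinery a system like the Dota bot wraps around it:

```python
import torch

def ppo_clipped_loss(logp_new, logp_old, advantages, clip_eps=0.2):
    """Clipped surrogate objective from PPO (Schulman et al., 2017).

    logp_new:   log pi_theta(a_t | s_t) under the current policy
    logp_old:   log-probs recorded when the actions were sampled
    advantages: advantage estimates A_t (e.g., from GAE)
    """
    ratio = torch.exp(logp_new - logp_old)            # r_t(theta)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * advantages
    # Take the pessimistic bound, negate because optimizers minimize.
    return -torch.min(unclipped, clipped).mean()
```

The clip term keeps each update close to the policy that gathered the data, which is one reason the same objective can stay stable when scaled across thousands of workers.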
We've also seen that with Dota and PPO. Here's a very concrete one: one thing that's very surprising about Dota, that I think people don't pay much attention to, is the degree of out-of-distribution generalization that happens. You have this AI that's trained against other bots for the entirety of its existence...

Sorry, to take a step back: can you talk through the story of Dota, the story leading up to OpenAI Five, and what the process of self-play and training looked like?

Yeah. So Dota is a complex video game, and we started trying to solve it because we felt it was a step towards the real world, relative to other games like chess or Go. Those are board games where you just have this board and very discrete moves. Dota starts to be much more continuous-time: you have a huge variety of different actions, a 45-minute game with all these different units, and a lot of messiness that really hadn't been captured by previous games. Famously, all of the hard-coded bots for Dota were terrible; it was just impossible to write anything good, because the game is so complex. So it seemed like a really good place to push the state of the art in reinforcement learning. We started by focusing on the one-versus-one version of the game, and we were able to solve that: we beat the world champions, and the skill curve was this crazy exponential. We were constantly scaling up and fixing bugs, and if you look at the skill curve, it was a very smooth one. It's actually really interesting to see how that human iteration loop yielded very steady exponential progress.

One side note: it's an exceptionally popular video game, and a side effect is that there are a lot of incredible human experts at it, so the benchmark you're trying to reach is very high. And second, can you talk about the approach that was used, initially and throughout, to train these agents to play the game?

Yeah. The approach we used is self-play. You have two agents that don't know anything; they battle each other; they discover something a little bit good, and now they both know it; and they just get better and better without bound. That's a really powerful idea. We then went from the one-versus-one version of the game and scaled up to five versus five: think of something like basketball, a team sport where you need all this coordination. We were able to push the same idea, the same self-play, to really get to the professional level at the full five-versus-five version of the game.

The thing I find really interesting here is that these agents, in some ways, have an almost insect-like intelligence. They have a lot in common with how an insect is "trained": an insect lives in an environment for a very long time, or the ancestors of that insect have been around for a long time and had a lot of experience, and that gets baked into the agent. It's not really smart in the sense of a human; it's not able to go and learn calculus. But it's able to navigate its environment extremely well, and it handles unexpected things it's never seen before pretty well. We see the same sort of thing with our Dota bots: within this game, they're able to play against humans, something that never existed in their evolutionary environment, with totally different play styles from humans versus bots, and yet they handle it extremely well. That was very surprising to us, something that doesn't really emerge from what we've seen with PPO at smaller scale.
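Here is that self-play loop shrunk to a toy: the learner plays against snapshots of its past selves, and a hill-climbing update stands in for the PPO updates the real system uses. Every detail is illustrative rather than a description of OpenAI Five.

```python
import copy
import random

class Agent:
    """Toy agent: a single 'skill' number stands in for policy parameters."""
    def __init__(self, skill: float = 0.0):
        self.skill = skill
    def act(self) -> float:
        return self.skill + random.gauss(0, 1.0)   # noisy move quality

def play_episode(a: Agent, b: Agent) -> int:
    """Toy game: higher noisy skill wins. Returns +1 if a wins, else -1."""
    return 1 if a.act() > b.act() else -1

learner = Agent()
opponent_pool = [copy.deepcopy(learner)]           # past selves as opponents

for step in range(20_000):
    opponent = random.choice(opponent_pool)        # sample a past version
    challenger = Agent(learner.skill + random.gauss(0, 0.1))
    # Keep the perturbed variant when it outperforms the current learner
    # against the sampled opponent (hill climbing in place of a gradient step).
    if play_episode(challenger, opponent) > play_episode(learner, opponent):
        learner = challenger
    if step % 1_000 == 0:                          # snapshot into the pool
        opponent_pool.append(copy.deepcopy(learner))

print(f"toy skill after self-play: {learner.skill:.2f}")
```

Because opponents are drawn from past snapshots, the difficulty of the curriculum automatically tracks the learner's current level, which is the property behind the smooth skill curve described above.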
Right, and the kind of scale we're running this stuff at: something like a hundred thousand CPU cores, running with hundreds of GPUs. It was something like hundreds of years of experience going into this bot every single real day. So the scale is massive, and we start to see very different kinds of behavior out of the algorithms that we all know and love.

So with Dota: you mentioned you beat the world expert at one versus one, but then you weren't able to win at five versus five this year against the best in the world. What's the comeback story? First, talk through that exceptionally exciting event, and then what do the following months and this year look like?

Yeah, so one thing that's interesting is that we lose all the time, because the Dota team at OpenAI plays the bot against better players than our system all the time, or at least we used to. The first time we lost publicly was when we went up on stage at The International and played against some of the best teams in the world, and we ended up losing both games. But we gave them a run for their money: both games were around 25 to 30 minutes, and they went back and forth, back and forth. So I think that really shows we're at the professional level, and looking at those games, we think the coin could have gone a different direction and we could have had some wins. That was actually very encouraging for us. And it's interesting, because The International was at a fixed time, so we knew exactly what day we were going to be playing, and we pushed as far as we could, as fast as we could. Two weeks later, we had a bot with an 80% win rate versus the one that played at TI. So the march of progress: you should think of that as a snapshot rather than an end state. In fact, we'll be announcing our finals pretty soon; I actually think we'll announce our final match prior to this podcast being released. We will be playing against the world champions, and for us, the way we think about what's upcoming is that this is the final competitive milestone for the project. Our goal in all of this isn't really about beating humans at Dota; our goal is to push the state of the art in reinforcement learning, and we've done that. We've actually learned a lot from our system, and we have a lot of exciting next steps we want to take. So, as a final showcase of what we built, we're going to do this match, but for us it's not really about success or failure, about seeing whether the coin flip goes in our direction or against us.

Where do you see the field of deep learning heading in the next few years? Where do you see the work in reinforcement learning heading? And more specifically with OpenAI, with all the exciting projects you're working on, what does 2019 hold for you?

Massive scale.

Scale?

I'll put an asterisk on that and just say: I think it's about ideas plus scale. You need both.
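For a rough sense of how "hundreds of years of experience every single real day" can come out of a hundred thousand cores, here is a back-of-the-envelope check. The per-game speed is an assumption, since the conversation doesn't give the system's actual throughput:

```python
# Hypothetical throughput: one self-play game per CPU core, each assumed
# to run at real-time speed, around the clock.
parallel_games = 100_000                 # matches the ~100k cores mentioned
realtime_fraction = 1.0                  # assumed simulation speed per game
sim_seconds_per_day = parallel_games * realtime_fraction * 86_400
years_per_day = sim_seconds_per_day / (86_400 * 365)
print(f"~{years_per_day:.0f} years of gameplay per real day")   # ~274 years
```

Even with conservative assumptions, the figure lands in the hundreds-of-years range, which is why behaviors can emerge at this scale that never show up in smaller runs.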
That's a really good point. So the question, in terms of ideas: you have a lot of projects exploring different areas of intelligence. When you think of scale, do you think about growing the scale of those individual projects, or about adding new projects? And if you're thinking about adding new projects, what's the process of coming up with new projects and new ideas?

We really have a lifecycle of projects here. We start with a few people just working on a small-scale idea, and language is actually a very good example of this: it was really one person here pushing on language for a long time. Then you get signs of life. So with the original GPT, say, we had something that was interesting, and we said, okay, it's time to scale this, to put more people on it and more computational resources behind it. Then we just keep pushing and keep pushing, and the end state is something that looks like Dota or robotics, where you have a large team of ten or fifteen people running things at very large scale, and you're able to have material engineering and machine learning science coming together to make systems that work and get results that would have been impossible otherwise. So we do that whole lifecycle; we've done it a number of times, and typically, end to end, it takes probably two years or so. The organization has been around for three years, so maybe we'll find we also have longer-lifecycle projects, but we work up to those. One team we're actually just starting, which Ilya and I are kicking off, is a new team called the reasoning team, to really try to tackle how you get neural networks to reason. We think this will be a long-term project, and we're very excited about it.

In terms of reasoning, a super exciting topic: what kind of benchmarks, what kind of tests of reasoning do you envision? What would impress you, if you sat back with whatever drink, that this system is able to do? What would that look like?

Theorem proving.

Theorem proving. So some kind of logic, and especially mathematical logic?

I think so. And there are other problems that are dual to theorem proving in particular: you think about programming, or even something like security analysis of code. These all capture the same sorts of core reasoning, and the ability to do some amount of out-of-distribution generalization.

It would be quite exciting if the OpenAI reasoning team were able to prove that P equals NP. That would be very nice.

It would be very, very exciting, especially if it turns out that P equals NP. That'll be interesting too.

It would be ironic and humorous. So what problem stands out to you as the most exciting, challenging, and impactful to work on, for us as a community in general and for OpenAI this year? You mentioned reasoning; I think that's a heck of a problem.

Yeah, I think reasoning is an important one, but I think it's going to be hard to get good results in 2019. Again, think about the lifecycle: it takes time. For 2019, language modeling seems to be on that ramp. It's at the point where we have a technique that works; we want to scale it 100x, 1000x, and see what happens.
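For a concrete picture of what a theorem-proving benchmark asks of a system: given a formal statement, produce a proof a machine can check. A toy instance in Lean, chosen for illustration only; the conversation names no particular prover or benchmark:

```lean
-- The task format: the system is handed the statement and must fill in
-- the proof. Here the "proof" is a one-line appeal to a library lemma.
theorem add_comm_toy (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

A research system would be judged on statements where no single library lemma closes the goal and multi-step search is required.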
Awesome. Do you think we're living in a simulation?

I think it's hard to have a real opinion about it. It's actually interesting: I separate out things that can yield materially different predictions about the world from ones that are just fun to speculate about, and I kind of view simulation as more like, is there a flying teapot between Mars and Jupiter? Maybe, but it's a little hard to know what that would mean for my life.

There is something actionable, though. Some of the best work OpenAI has done is in the field of reinforcement learning, and some of the success of reinforcement learning comes from being able to simulate the problem you're trying to solve. So do you have hope for the future of reinforcement learning and for the future of simulation? Whether we're talking about autonomous vehicles or any kind of system, do you see that scaling, so that we'll be able to simulate systems and hence create a simulator that echoes our real world, proving once and for all, even though you're denying it, that we're living in a simulation?

For the core of it, can we use simulation for self-driving cars and the like: take a look at our robotic system, Dactyl. That was trained in simulation, using the Dota system in fact, and it transfers to a physical robot. Everyone looks at our Dota system and says, okay, it's just a game; how are you ever going to escape to the real world? And the answer is, well, we did it with a physical robot that nobody could program by hand. So I think the answer is that simulation goes a lot further than you think, if you apply the right techniques to it.
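The "right techniques" carry a lot of weight in that answer. OpenAI's published account of Dactyl leans heavily on domain randomization: re-sample the simulator's physics every episode, so the real world ends up looking like just one more variant. A schematic sketch; the parameter names and ranges are made up, and the environment hooks are hypothetical stubs:

```python
import random

def sample_physics() -> dict:
    """Draw a random physical world for one training episode (made-up ranges)."""
    return {
        "friction":    random.uniform(0.5, 1.5),
        "object_mass": random.uniform(0.8, 1.2),   # kg, illustrative
        "motor_delay": random.uniform(0.0, 0.03),  # seconds, illustrative
    }

def train(num_episodes: int) -> None:
    for episode in range(num_episodes):
        physics = sample_physics()       # new randomized world each episode
        # env = make_sim_env(**physics)  # hypothetical simulator hook
        # run_rl_episode(env)            # policy must cope with every variant
        print(episode, physics)

train(3)
```

A policy that succeeds across all of these variants has little choice but to learn behavior robust enough to survive the sim-to-real gap.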
Now, there's a question of whether the beings in that simulation are going to wake up and have consciousness. That one seems a lot harder to reason about. You really have to think about where exactly human consciousness comes from, and our own self-awareness, and whether it's just that once you have a complicated enough neural net, you have to worry about the agents feeling pain. I think there's interesting speculation to do there, but, again, it's a little hard to know for sure.

Well, let me keep with the speculation. Do you think that to create general intelligence you need, one, consciousness, and, two, a body? Do you think either of those elements is needed, or is intelligence orthogonal to them?

I'll stick to the non-grand answer first. The non-grand answer is just to look at what we're already making work. Look at GPT-2: a lot of people would have said that to even get these kinds of results, you need real-world experience, you need a body, you need grounding. How are you supposed to reason about any of these things, how are you supposed to even know about smoke and fire, if you've never experienced them? And GPT-2 shows you can go way further than that kind of reasoning would predict. So in terms of do we need consciousness, do we need a body: it seems the answer is probably not. We can probably just continue to push the systems we have. They already feel general. They're not as competent, or as general, or able to learn as quickly as an AGI would be, but they're at least kind of proto-AGI in some way, and they don't need any of those things.

Now let's move to the grand answer: if our neural nets were conscious already, would we ever know? How can we tell? Here's where the speculation starts to become at least interesting or fun, and maybe a little bit disturbing, depending on where you take it. It certainly seems that when we think about animals, there's some continuum of consciousness. My cat, I think, is conscious in some way, though not as conscious as a human. You could imagine building a little consciousness meter: you point it at a cat, it gives you a small reading; you point it at a human, it gives you a much bigger reading. What would happen if you pointed one of those at a Dota neural net? If you're training this massive simulation, do the neural nets feel pain? It becomes pretty hard to know that the answer is no, and it becomes pretty hard to really think about what it would mean if the answer were yes. And it's very possible. For example, you could imagine that maybe the reason humans have consciousness is that it's a convenient computational shortcut. If you have a being that wants to avoid pain, which seems pretty important for surviving in this environment, and wants to eat food, then maybe the best way of doing that is to have a being that's conscious: in order to succeed in the environment, you need those properties, and how are you supposed to implement them? Maybe consciousness is a way of doing that. If that's true, then maybe we should expect that really competent reinforcement learning agents will also have consciousness. But it's a big if, and I think there are a lot of arguments you can make in other directions.

I think that's a really interesting idea, that even GPT-2 has some degree of consciousness. It's actually not as crazy to think about, and it's useful as we think about what it means to create the intelligence of a dog, the intelligence of a cat, and the intelligence of a human. So, last question: do you think we will ever fall in love, like in the movie Her, with an artificial intelligence system, or an artificial intelligence system falling in love with a human?

I hope so.

If there's any better way to end it, it's on love. Greg, thanks so much for talking today.

Thank you for having me.
Info
Channel: Lex Fridman
Views: 68,876
Id: bIrEM2FbOLU
Length: 85min 7sec (5107 seconds)
Published: Wed Apr 03 2019