Stuart Russell talks about AI and how to regulate it at OECD.AI Expert Forum

Captions
Right, so there's a question mark here on the title, because I think there's still real uncertainty about whether the AI systems we're experiencing are having thoughts or not.

When we think about what AI is: it's obviously about making intelligent machines, and for most of the history of the field it's been about machines whose actions can be expected to achieve their objectives. This is a definition borrowed from economics and philosophy, the idea of rational behavior, and it has dominated AI since the beginning. There are lots of different ways to do that: search, problem solving, reinforcement learning, and so on; that is essentially all the techniques we have. The most recent systems are trying to do this simply by copying human behavior, which keeps the objectives implicit.

If you go back and look at what the founding fathers of the field wrote, it's pretty clear that the goal is general purpose AI: AI systems that can do anything human beings can do, can learn to do it very quickly, and will probably exceed human capacities along pretty much all relevant dimensions once this becomes possible. The question you might ask, a question we've had in our textbook since the first edition in 1994, is: what if we succeed in this goal? It seems sensible to ask, because we are now pumping hundreds of billions of dollars into precisely this goal. So what happens if we get to this destination that we're all aiming for?

I think the hope is that we could have a much better civilization, as Karine illustrated with her examples, because our civilization is made of intelligence; if we got more intelligence, we could have a better civilization. It stands to reason. Just as a lower bound, if you have general purpose AI and its physical extensions in the form of robots, then it can do all the work, all the things we already know how to do. And we know how to deliver a very nice standard of living to a lot of people on Earth: maybe close to a billion have what the French would call a good standard of living, and maybe they set the standard for that. But of course it's not available to everybody on Earth. With AI we could deliver it at very low cost on a much bigger scale, so we could do this for everyone on Earth, subject to the political and economic considerations that Laurel mentioned as one of his big concerns.

If you calculate the net present value of that (since we are the OECD, you all know what net present value means), the cash equivalent of that increased income stream is about 13.5 quadrillion dollars. So that's a lower bound on the cash value of general purpose AI as a technology. And when you see that there are billions of dollars being invested, I think about 10 billion a month going specifically into AGI startups, not applied-AI startups but startups whose goal is to create AGI, then 10 billion a month is a minuscule amount relative to that prize. I guess it says something about how difficult we think the problem is and what the chances of success are in the near term.
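As a rough illustration of how a net-present-value figure of that order can arise, here is a minimal Python sketch. It treats the gain as a simple perpetuity; the annual income gain and the discount rate below are illustrative assumptions, not figures given in the talk.

```python
# Minimal sketch: net present value of a permanent increase in global income,
# treated as a perpetuity. The inputs are illustrative assumptions only.

def npv_perpetuity(annual_gain: float, discount_rate: float) -> float:
    """Cash-equivalent value of receiving annual_gain every year forever."""
    return annual_gain / discount_rate

annual_gain = 675e12     # assumed: ~$675 trillion per year of extra world income
discount_rate = 0.05     # assumed: 5% per year

print(f"NPV = ${npv_perpetuity(annual_gain, discount_rate):,.0f}")
# NPV = $13,500,000,000,000,000  (about 13.5 quadrillion dollars)
```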
But we could have a much better civilization: not just replicate what we have at scale, but actually deliver much better health care to people, individualized education of a very high quality for every child on Earth, rapid advances in science and medicine, and so on. All of that would be great. So this is what we are looking for on the upside.

Some people think we're already there, that in fact AGI, or something like general purpose AI, is here. I don't think we're that close, but I would say most people feel it's a lot closer than they would have guessed five years ago. There are more recent surveys, done at fairly close intervals over the last decade, and the median estimate seems to be getting closer rather than farther away. With nuclear fusion it's always getting further away, but with general purpose AI it seems to be getting closer, so that's interesting.

I would say that the large language models everyone's talking about now are a piece of the puzzle; they are not the solution, and scaling them up is not the answer. In fact, just last week Sam Altman of OpenAI said scaling them up is not the answer, we're not planning to make GPT-5, and we need different techniques because this doesn't work. That's a very interesting point. At the moment we don't even know what shape this puzzle piece is: we don't know what's printed on it to make the jigsaw, and we don't know what shape it is, so we don't really know what other pieces we need or where it fits in. These are basic research questions we still have to answer.

On the other hand, you've got Microsoft, a very distinguished group of scientists including two members of the National Academies in the US, who spent several months working with GPT-4, and their conclusion is that it shows sparks of AGI. You may disagree with that, but I don't think you can just completely discount it as irrelevant, because these people have more experience with GPT-4 than pretty much any other human being on Earth, so we have to at least listen to what they have to say.

Now of course, you all know, and all your friends send you, examples like this; it's my friend Prasad Tadepalli who sent me these. "Which is bigger, an elephant or a cat?" "An elephant is bigger than a cat." Very good. "Which is not bigger than the other, an elephant or a cat?" "Neither an elephant nor a cat is bigger than the other." So it's not so much that it made a silly mistake; it's that it contradicts itself on a basic fact, and that points to the question: does it know anything at all? Knowing and being able to answer a question correctly are not the same thing. Knowing means that you understand whether what you know is consistent or inconsistent with something else, and that you can draw conclusions from it. I think there's plenty of evidence that the large language models are not building internal models of the world and answering questions with respect to that model, and they arrive at inconsistencies as a result. That is actually giving up on an enormous source of power, because taking all the evidence from everything the human race has ever written and building a logically consistent internal model would be far more effective than what the current systems are able to do.
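To make the contrast with a consistent internal model concrete, here is a minimal sketch (my own illustration, not something from the talk): a tiny explicit model of the world that stores one "bigger than" fact and derives both answers from it, so the two answers cannot contradict each other the way the chatbot's did.

```python
# Minimal sketch: answering from an explicit internal model. Both questions are
# answered from the same stored fact, so the answers stay mutually consistent.

facts = {("elephant", "cat")}   # assumed model: an elephant is bigger than a cat

def is_bigger(a: str, b: str) -> bool:
    return (a, b) in facts

def which_is_bigger(a: str, b: str) -> str:
    if is_bigger(a, b):
        return f"The {a} is bigger than the {b}."
    if is_bigger(b, a):
        return f"The {b} is bigger than the {a}."
    return "The model does not say which is bigger."

def which_is_not_bigger(a: str, b: str) -> str:
    # Derived from the same fact, never asserted independently.
    if is_bigger(a, b):
        return f"The {b} is not bigger than the {a}."
    if is_bigger(b, a):
        return f"The {a} is not bigger than the {b}."
    return "The model does not say which is bigger."

print(which_is_bigger("elephant", "cat"))      # The elephant is bigger than the cat.
print(which_is_not_bigger("elephant", "cat"))  # The cat is not bigger than the elephant.
```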
Another part of the problem, with not just large language models but deep learning in general, is that it's based on circuits, and any computer scientist knows that circuits, while they're useful, are a terrible language for representing almost anything you want to write down. We haven't used circuits for computing since before World War II; as soon as we had general purpose programming languages, we stopped wiring up circuits to do computing and we started writing programs instead.

We had this idea that even the superhuman Go programs had not learned the basic definitions you need to play Go: what is a group, and is a group alive or dead? Those are the things you need to play Go at all. So we started doing some experiments, and I'm going to show you a game of Go played by one of our group members, Kellin, who is a pretty good amateur Go player with a rating of about 2300. As you all know, Go programs have defeated all human world champions and national champions; it was six years ago that they beat the human world champion, whose rating is around 3,800, and they are now rated around 5,200, so they are massively superhuman. So here we have a player rated 2300 playing a massively superhuman Go program rated 5,200, and he's going to give that program a nine-stone handicap, which is the kind of handicap you give to a tiny little kid when you're teaching them to play Go: you give them nine stones to start out with. That's just being very, very nice.

So here goes the game. The computer is playing black and the human, Kellin, is playing white. What happens in Go is you try to surround territory, and you try to surround your opponent's pieces and capture them. Kellin builds a little white group in the middle, slightly to the bottom right, and black very quickly surrounds that group and tries to kill it. Then white surrounds the black group, making a kind of circular sandwich, and black completely ignores this; it doesn't understand that those pieces are going to be captured. Then white just captures all the pieces, and that's the end of the game. So this superhuman Go program doesn't understand the most basic concepts of Go. And when the OECD principles say AI systems need to be robust and predictable: these systems are nothing like robust and predictable.
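To spell out what "the most basic concepts of Go" means here, the following is a minimal sketch of my own (not the experiment or the program in the talk): a flood fill that finds a group of connected stones and counts its liberties, the adjacent empty points. A group with zero liberties is captured, which is precisely the fact the program fails to track in the game described above.

```python
# Minimal sketch of the basic Go concepts: a "group" is a set of same-colored
# stones connected orthogonally; its "liberties" are the empty points adjacent
# to the group; a group with zero liberties is captured.
# The board position below is an illustrative assumption.

def group_and_liberties(board, start):
    """Flood fill from `start`, returning (stones in the group, its liberties)."""
    color = board[start]
    group, liberties, frontier = set(), set(), [start]
    while frontier:
        point = frontier.pop()
        if point in group:
            continue
        group.add(point)
        r, c = point
        for nbr in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if nbr not in board:
                continue                    # off the edge of the board
            if board[nbr] == ".":
                liberties.add(nbr)          # adjacent empty point = liberty
            elif board[nbr] == color:
                frontier.append(nbr)        # same color: part of the same group
    return group, liberties

# A tiny position: a two-stone black group completely surrounded by white.
rows = [".WW.",
        "WBBW",
        ".WW."]
board = {(r, c): ch for r, row in enumerate(rows) for c, ch in enumerate(row)}

group, libs = group_and_liberties(board, (1, 1))
print(f"Black group of {len(group)} stones has {len(libs)} liberties")
print("Captured!" if not libs else "Still alive")
# -> Black group of 2 stones has 0 liberties / Captured!
```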
Someone else who asked what happens if we succeed was Alan Turing. He was asked this question in 1951, in a lecture, and this is what he said: "It seems probable that once the machine thinking method had started, it would not take long to outstrip our feeble powers. At some stage therefore we should have to expect the machines to take control." So that's what I've been working on for the last decade: is he right, and how do we avoid it? I prefer to ask this in the form of a question rather than a prediction. We're going to make systems that are more powerful than us; how do we retain power over entities more powerful than us, forever? That's the question we need to ask, and I think it's the underlying question of the open letter. The open letter is not targeting existing systems, and it's not targeting research and development; it's saying that at some point this is going to happen, and we need to be prepared ahead of time, with solutions and regulations in place. We're building a planetary defense system against asteroids; nobody suggests that we should only build it after an asteroid has destroyed the Earth. That would be silly. Yet many people are saying we should only regulate general purpose AI, AGI, artificial superintelligence, after it arrives, because before it arrives we won't know what to do. I think that's equally silly.

So one answer to the question of how we retain power is to get rid of the standard model for AI. The standard model says machines are intelligent to the extent that their actions can be expected to achieve their objectives, and this is exactly the source of the problem, because to use AI systems of this type we need to specify the objectives, and as everybody knows, going back to King Midas and before, we cannot specify objectives correctly, because we can't write down what it is that the human race wants the future to be like. As you build AI systems that can affect what that future is going to be, you really face a problem if you get it wrong: you're setting up a chess match, and at least with chess programs, we haven't figured out how to beat them yet.

So we get rid of this idea, and instead we want AI systems that are beneficial to us, where beneficial means that their actions actually achieve our objectives. That sounds counter-intuitive: how can you build systems that achieve our objectives when our objectives are in us and not in the machine? But it turns out there's a mathematical framework, perfectly straightforward, that does yield systems with this property. The constitution, basically, is this: the systems have to act in the best interests of humans, they don't know what those best interests are, and they know that they don't know what those best interests are. This creates a different type of AI system that is potentially provably beneficial to humans, and it's not just a constitution; it's a mathematical framework called an assistance game. We'll talk about assistance in the scenario discussion, so we'll get to see how these ideas play out in practice. When you solve assistance games, the AI system you get out will defer to humans and will, for example, allow itself to be switched off, which is core to the control problem for AI. If we figure out how to make this technology work at scale, competitive in capabilities with what's already out there, then it makes sense for us to regulate that this is how AI systems should be built.
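To show how deference and the off switch fall out of this kind of uncertainty, here is a minimal numeric sketch in the spirit of the assistance-game idea. It is my own toy illustration, not the formal framework from the talk; the probabilities, payoffs, and the assumption of a rational human overseer are all assumptions.

```python
# Toy illustration: a robot uncertain about the human payoff of its proposed
# action does better (for the human) by deferring, i.e. letting the human
# switch it off, than by acting unilaterally. All numbers are assumptions.

belief = [(0.6, +10.0),   # with probability 0.6 the action helps the human
          (0.4, -15.0)]   # with probability 0.4 it hurts the human

# Act unilaterally: the action happens regardless of what the human wants.
act_directly = sum(p * u for p, u in belief)

# Defer: propose the action; a rational human allows it only when the payoff
# is positive and switches the robot off otherwise (payoff 0 in that case).
defer = sum(p * max(u, 0.0) for p, u in belief)

print(f"Expected payoff, acting directly: {act_directly:+.1f}")   # +0.0
print(f"Expected payoff, deferring:       {defer:+.1f}")          # +6.0
# Deferring is never worse, and is strictly better whenever the robot is
# genuinely uncertain about the human's objectives.
```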
So let me briefly talk about the large language models and how they work. Very simply, they're very big circuits with maybe a trillion parameters, and we do about a billion trillion random perturbations to those parameters to get them to imitate human linguistic behavior. I know Keith said they're really complex designs; I don't think they're that complex, and this is roughly how they work. In order to imitate human linguistic behavior, they're probably going to have to start to resemble the generating process, and the generating process is us, humans, and we have goals. Those goals affect what we say, just as I have goals in giving this talk and people have goals in talking on Reddit and in various chat rooms and all the rest. So it seems natural to ask whether LLMs create internal goals to better imitate humans, just as, if we were teaching a robot soccer player to play soccer by imitating human soccer players, it's plausible that it would acquire the goal of scoring goals, because that's a good way to imitate what human soccer players are trying to do. The answer is: we have no idea. I asked the authors of that Microsoft paper, do these systems learn internal goals, and Sébastien Bubeck, the first author, said we have no idea. So we're building systems that they claim exhibit sparks of AGI, we have no idea whether or not those systems have developed their own internal goals, and they are being unleashed on hundreds of millions or billions of people. That's another reason why the open letter asks for a pause: okay, you've done this much; you don't need to build the next, more powerful system while remaining unregulated.

I'm going to skip over this slide in the interest of time, but basically, no matter what goals they learn from humans, it's probably a bad thing, because we don't want them to pursue goals; we want them to understand what human goals are and help humans achieve them. We don't want them to have those goals themselves, just as humans like coffee but we don't want the machine to drink coffee; same idea. We don't want them to pursue the goals that humans have, but to help humans, which is a different thing. And if you ask Kevin Roose of the New York Times whether GPT-4 pursues those goals: for twenty pages it tried to convince Kevin to marry it, and you get headlines like "Creepy Microsoft Bing chatbot urges tech columnist to leave his wife." If you haven't read that conversation, I highly recommend reading it.

So, two more recommendations. To get AI systems that we do understand, that are robust and predictable and do not present an undue safety risk, as the OECD principles require, I think we have to understand how these systems work, and at the moment we have no idea how they work, because of the way they are created; they're not really designed at all. I believe we're going to need a different technology. I'm using the catchphrase "well-founded AI," and you can think of that as meaning AI systems whose internal principles of operation we understand and can check, just as we understand how a nuclear power plant works or how the avionics on an airplane work. We have to, because otherwise people won't allow us to deploy them.

We are also going to need to deal with the threat of bad actors, which Keith referred to as a reason not to regulate; I didn't quite follow that argument, but bad actors are a serious problem. How do you prevent the deployment of unsafe AI systems? We are doing a terrible job with malware, and this is going to be much worse. So I think we have to switch the way we think about permission in computing systems. Right now everything is permitted unless we know that it's bad, that it matches some known signature for a bad piece of software, for a virus, and then it gets rejected. We have to go to a positive-permission model where nothing runs unless it's provably safe. This is a very draconian standard, but I see no alternative, and it's a totally different kind of stack, with machine-checkable proofs and lots of other technology like that.

So, to summarize: I think AI has enormous potential; that potential is creating an unstoppable momentum; and we need to change the direction of that momentum towards systems that are provably beneficial to humans. We're going to end up with a field that looks more like aviation or nuclear power than it does right now. Thank you.

[Applause]
Info
Channel: OECD.AI
Views: 4,597
Keywords: AI, AI risks, oecdai
Id: D5z4p-Ydoew
Length: 20min 57sec (1257 seconds)
Published: Mon May 08 2023