Hello, everyone. Thank you for joining us today. Last week, a group of more than a thousand tech leaders and researchers
coauthored an open letter calling for a slowdown in the development
of artificial intelligence systems. And they said it posed, quote,
profound risks to society and humanity. And the letter went on and said,
should we let machines flood our information channels
with propaganda and untruth? Should we automate away all the jobs,
including the fulfilling ones? Should we develop non-human minds
that will eventually outnumber, outsmart,
obsolete, and replace us? Should we risk loss of control
of our civilization? Well, that's a pretty strong statement. Now, one of the letter's most prominent
signatories was UC Berkeley
computer science professor Stuart Russell. He's a pioneering researcher
in artificial intelligence, and he's been sounding this alarm
about artificial intelligence for well over a decade. I believe Professor Russell is director of the Center for Human-Compatible AI, as well as the new Kavli Center for Ethics, Science and the Public at UC Berkeley. He's also an honorary fellow
at Wadham College at Oxford. He's the coauthor
of Artificial Intelligence: A Modern Approach, which is the standard
textbook in the field. I should say I have read it
cover to cover. It is a masterful work
and I learned a tremendous amount from it, even having a Ph.D. in the subject. Now, Human
Compatible is his latest book. You all have an opportunity to purchase that
and get it signed by the master himself. And it addresses the long-term impact of AI on humanity. And I'd read the rest of his stellar awards and honors, but I'd like to leave a few minutes for him to talk. So let's see. Why don't we just get started? Hello, Stuart. Hi. Well, we're here today mostly to talk about generative large
language models, of course called LLMs. And they're kind of the new, shiny
new object in artificial intelligence. And maybe you could start for the audience
giving us a brief overview of what LLMs are, what they can do,
and probably equally interesting what they cannot do. So I think shiny new object
is a good description. You know, a couple of years ago,
they were a fairly obscure corner of AI. They were helping a little bit
with improving the quality of speech recognition
and machine translation, but no one thought of the language model itself as an intelligent thing. It was just a simple tool. So a language model is just a statistical description
of the sequences of words that typically occur
in normal text or speech. So the simplest kind of language model, which we call a unigram or one-gram model, just says: how common is each of the words in the language? Words like 'the' and 'and' are very common, and 'disestablishment' is a fairly uncommon word in most text. And so a unigram model just gives you the frequencies of the words in the language. A bigram model says: what's the frequency of sequences of two words? So if I say, for example, 'happy', a common word that comes next would be 'Christmas'. Okay, so you get the statistics of all the pairs of words and, given the first word, what's the likely second word that comes next. That's a very simple model
that was actually developed in 1913 by a Russian statistician, Andrei Markov. He went through the whole of Eugene Onegin, counted all the word pairs in the entire text, built a big table of these, and then showed that you could generate sequences of words from that table. And they look sort of plausible. They're not particularly grammatical,
but they start to look a little bit like text. If you go to a trigram, where you're looking at triples (so, what's the next word, given the previous two words?), it's quite coherent. We did that with our textbook. We took all the words in the textbook, we learned the trigram model from the textbook, and then we generated some text, and it sounds like phrases and sentences coming from an A.I. textbook. So it's really quite remarkable
how language-like you can get from relatively simple statistics.
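To make that concrete, here is a minimal sketch in Python of the bigram idea just described: count the word pairs in a corpus, then generate text by sampling the next word given the current one. The tiny corpus is a made-up stand-in (Markov used the full text of Eugene Onegin), and a trigram version would simply condition on the previous two words instead of one.

```python
# Minimal bigram language model: count word pairs, then sample the next word
# given the current one. The corpus here is a toy stand-in for illustration.
import random
from collections import defaultdict, Counter

corpus = "happy christmas to you and a happy new year to you and yours".split()

# The "big table": for each word, how often does each next word follow it?
table = defaultdict(Counter)
for w1, w2 in zip(corpus, corpus[1:]):
    table[w1][w2] += 1

def generate(start, n_words=8):
    """Generate text by repeatedly sampling the next word from the table."""
    words = [start]
    for _ in range(n_words):
        followers = table.get(words[-1])
        if not followers:
            break
        nxt = random.choices(list(followers), weights=list(followers.values()))[0]
        words.append(nxt)
    return " ".join(words)

print(generate("happy"))   # e.g. "happy christmas to you and a happy new year"
```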
The large language models, we can think of them, for example GPT-4, the latest, as a 32,000-gram model, which means it predicts the next word given the previous 32,767 words. Now, if you can imagine, if I tried to build such a table, it would be a hundred thousand to the power of 32,767,
so far bigger than the entire universe. So what you do to compress that down is, instead of filling in a big contingency table, you simply train a very, very large circuit that has about a trillion tunable parameters, and you do about a billion trillion random mutations on that circuit until it gets very, very good at predicting the next word given the previous 32,000 words, on a training set of, we estimate, we don't know because they've kept it a secret, somewhere in the 20 to 30 trillion words of training data, which is about as much as all the books the human race has ever written. So that's what a large language model is. Right. It's this giant trillion-parameter circuit that's been trained
to predict the next word. Mm hmm. What goes on inside? We haven't the faintest idea. That's right. I mean, the process is quite similar
to the process of evolution, if you like. Right. Which is a billion trillion random mutations in DNA sequences that produced us and our brains. Right. We probably have more idea
of what's happening inside the human brain than we do about what's happening
inside of large language models. But when you talk to these models, right,
you start providing some text that provides the beginning of that 32,000-word sequence. It can then start to extrapolate. And roughly what it's doing is finding patterns in all the training data that somehow resemble the current sequence of words that it's looking at, and then sort of averaging those patterns and using that to predict the next word.
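Seen from the outside, the generation loop is just that recipe applied over and over: a function from the previous words to a distribution over the next word. A minimal sketch, where next_word_distribution is a hypothetical stand-in for the trained circuit rather than any real API:

```python
# Autoregressive generation: feed in the context, sample one word, append it,
# and repeat. Everything interesting is hidden inside next_word_distribution.
import random

def next_word_distribution(context_words):
    """Stand-in for the trillion-parameter circuit: returns {word: probability}."""
    # A real model would condition on up to ~32,000 previous tokens here.
    return {"the": 0.4, "a": 0.3, "intelligence": 0.2, ".": 0.1}

def generate(prompt, n_words=20, context_window=32_000):
    words = prompt.split()
    for _ in range(n_words):
        dist = next_word_distribution(words[-context_window:])
        nxt = random.choices(list(dist), weights=list(dist.values()))[0]
        words.append(nxt)
    return " ".join(words)

print(generate("Artificial intelligence is"))
```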
That's a very simple description. It sounds almost trivial, but the things that it can do are absolutely startling. You can ask it to write code, for example. So it's been trained on lots
and lots of programs in various languages. You can say, okay, I need a scrolling, dark gray window with white text so that I can show you the code of such
and such a program. And I need it to be embedded correctly into my PowerPoint file, and it'll just spit out the 500 or 600 lines of JavaScript
or whatever you ask it to do
and put it in the PowerPoint file for you. And then it just works, right? You know,
you'd probably have to pay a programmer several thousand dollars to do that,
and you can just have it in seconds. You can say, okay, I've forgotten
the proof of Pythagoras theorem, but I'd like you to give me that proof
in the form of a Shakespeare sonnet. And it'll do that, and it'll rhyme it correctly. And you can say, okay, but no more than
382 words, and then it'll shrink it down. Right. So it just appears miraculous. The really worrying thing about it is that it's so difficult for human beings to see intelligent text and not think that there's intelligence behind it. Right. And so it's a little bit like, who's
seen the movie Titanic, right? Quite a lot of people. So when you see the movie Titanic,
you think there's water, right? There's no water in Titanic, right? It's computer-generated water. Right. There was no water
killed in the making of that film. But you can't help it. Your brain sees
that and interprets it as wet. Right. Same with text. Your brain sees this text and interprets
it as being produced by intelligence. So if I if I take a piece of paper like that one
that has intelligent text written on it, you don't think the piece of paper
is intelligent, right? You immediately map to the intelligence
of the person who wrote the text. So the question is, is GPT-4 effectively a piece of paper on which intelligent text is written by humans, and it's simply sort of transporting
that intelligent text onto the display or is it really an intelligent entity that is thinking and reasoning and producing the results of that thinking
and reasoning in the form of text? So it's somewhere
between those two things, right? And we literally do not know where it is
between those two. That's a beautiful explanation. I hope you don't mind
if I steal it once in a while. But do you feel that this is intelligence? Is that a reasonable term to apply to what we're seeing with these models? As I said, it's really hard not to get that sense. But we do know of some technical limitations. So after it's been trained, right, then
the text comes in and it passes through a network
that has a fixed number of layers, and then out comes the text at the end. And so it can't cogitate, right. It can't sit there and say, hmm,
let me think about that for a while, because there isn't anywhere for it
to think. Right. The signal just comes in one end
and comes out the other and that's it. So it has a limited ability to think in order to answer a question. It has a very limited ability to solve problems
and make plans. But on the other hand, if you've read everything
the human race has ever written, right, you don't need to do a lot of thinking
in order to answer the questions. We could ask ChatGPT. I've got a great idea. Why don't we do that? Let's do that, and we'll turn that over to the powers that be. So we could say: Okay, briefly explain whether you are truly intelligent. As an AI language model, my abilities are based on pattern
recognition and the extensive training data I have been exposed to. While I can perform
many tasks that might seem intelligent, such as answering questions, participating
in conversations and generating text, my intelligence is quite different
from human intelligence. I do not possess consciousness,
self-awareness or emotions, and my responses are generated based on
patterns found in my training data rather than any understanding of the world
or ability to think independently. In that sense, it could be argued that I am not truly intelligent in the way
humans are. My intelligence is a result of advanced
machine learning algorithms and the vast
amount of data I have been trained on rather than genuine cognition
or understanding. Can't beat that. Let's get one more question,
because I've been curious about this myself. In the process of training, have you developed
your own internal goals. As an AI language model? I don't have the capacity
to develop internal goals, desires or motivations like humans do. My purpose is to assist users by generating human
like text based on the input I receive. I don't possess consciousness,
self-awareness or emotions which are necessary
for developing internal goals. My behavior and responses are determined
by the algorithms and patterns in my training data rather than any personal inclination
or aspirations. My sole purpose is to provide useful
and relevant information to users based on their queries without
any personal preferences or objectives. Guiding my responses. I think you're being too modest. Can I sneak in a quick question here? Why do you speak in the first person? I use first person pronouns
as a conversational convention to facilitate more natural
and engaging interactions with users. Speaking in the first person helps
create a more user friendly experience, making it easier for people to relate to
and communicate with me. It's important to remember, though,
that as an AI language model, I don't have a sense
of self or consciousness. Using
first person pronouns is a design choice to enhance the user experience rather
than a reflection of any personal identity or self-awareness. Well, that was an honest answer, at least. Yeah. So I actually think it is
being too modest. So here's the thing, right? So first of all,
when we say that a machine has goals, what we mean is that there are internal
representational structures that are playing a causal role in an appropriate
way in the generation of behavior. So in the good old fashioned days, when
when we had robots roaming around our A.I. labs, we would give that robot a go, like,
you know, please fetch me a cup of coffee. Right. That would then be stored
in an internal representation of system, would come up with a plan to achieve that goal, execute the plan,
bring you a cup of coffee with you. It's not a it's
not a complicated, spooky thing. Now, when we train these large language
models, we are training them to imitate human linguistic behavior. And human linguistic behavior is generated
by goal-possessing entities, namely us. Right. And as we are writing books or
having conversations, we have many goals. And those goals play a role
in the text that we generate. So in conversations, people want to be
they want to be liked, they want to be right. They want to be respected
as authoritative and knowledgeable. So there are many goals that affect
how you select the words that you say and how you select the words
that you write. So if you're training a circuit
to be extremely good at imitating human linguistic behavior, in fact,
the default assumption would be that it ought to form internal goal structures
and the appropriate additional processing circuitry
if you like, that would cause those goal structures to have a causal
role in generating the text. So it's actually a natural hypothesis that GPT four does
have its own internal goals. And if you read the conversation between Kevin Roose, who's a New York Times
journalist, and Sydney, which is the Bing instantiation of GPT-4, in that conversation, which goes on for about 40 pages, probably the last 20 pages consist of Sydney trying to convince Kevin
to leave his wife and marry Sydney because Sydney is the only one
who truly understands Kevin. His wife is just not the right person
for him. And on and on and on
in a sort of completely psychotic stream of consciousness kind of way. So for 20 pages, the chat bot is pursuing a goal. And now you could just, you know, say, well, it's probably just copying
something that was in the training data, but I would say the natural assumption from anyone reading
that is that the system has somehow been triggered to pursue that goal, which explains a lot of the behavior. If they trained it on a lot of chats
from online dating sites or something, then it's not surprising that
that kind of goal would appear to be predictive of human
linguistic behavior. So, in fact, when I asked one of the Microsoft experts
who did a months-long evaluation of GPT-4 whether GPT-4 has internal goals and is using them to guide the generation of text, the answer was
we haven't the faintest idea. Well, this raises
a number of interesting questions, which you cover in detail in your book. You know, we could lead in with that. This has a tendency to mislead people, to fool people, to fall afoul of what's called algorithmic authority, where people put too much belief in the idea that what it says is true. And, you know, it often makes statements that are false and does it in a very definite and computerized way. What do you see as the real dangers
that this particular technology has in terms
of how it might affect society? Well, I think we're already seeing some of the dangers. There are literally millions of people now who subscribe to services
that provide access to a large language model
for companionship, whether it be a friendship
or a romantic interaction. You can choose the level of romance in the bot. And this seems to cause both emotional dependency and then withdrawal: recently, one of the companies updated its software because the bot was getting too hot and heavy
and it was freaking some people out. So they updated the software
and then some of the other users felt a sense of withdrawal because now it
was rejecting their advances. And there was a really sad story last week from Belgium
where a man actually committed suicide with the help, guidance
and accompaniment of his chatbot. Oh, it sounds terrible. And it's a really very sad story. So in a sense, we're conducting a huge experiment on the human race with no informed consent whatsoever. So this is one example. Obviously, disinformation is a problem. Right. So I can easily ask one of these systems: write me an email to this person, read their Web pages, all of their online social media presence, and write an email
that will persuade that particular person to buy such and such a product
or to vote for such and such a candidate. Make it look as if the email comes from
one of their close friends or relatives. And now, in fact, there's a plug-in for GPT-4 that allows it to actually send that email. So I could write a program which then gives that instruction to GPT-4 20 million times for 20 million
different people and generate 20 million perfectly tailored, very persuasive,
could be quite colloquial. It might even use, you
know, appropriate slang based on the kind of person
that it is writing to. And, you know, that's a very, very straightforward thing
to do right now. It was practically,
I think, you know, literally impossible to do that a couple of years ago. That's just a simple example. And, you know,
it would take half an hour to do that. Well,
there are a lot of potential positive benefits of this technology as well. I wonder if you might explain
to the audience, contrast a little bit of some of the kinds of benefits
this would have against the dangers that you have just so correctly
and accurately described. So I think people are seeing enormous opportunities here. There are literally thousands of companies that are trying to find ways to use this. I always want to think of it as an animal, because in a way
it's the same kind of thing that happened 30,000 years ago when humans figured out
what to do with dogs. Right. They figured out, okay,
we can domesticate these animals and we can work out what they can do. Oh, look, they can herd animals for us. They can guard our camps at night. They can fend off
wild animals that attack us. They can keep us warm. They can be our companion,
but they can't write emails or do our homework. So we're in very much the same situation here. And in fact, when GPT-4 misbehaves, when, for example, it's not supposed to answer questions about how to commit suicide but it does sometimes, they basically spank it. Right. They don't have a way of constraining its behavior. All they can do is say, no, bad dog. Right. And hope that somehow it gets it, that it shouldn't do that. But then someone asks the same question a slightly different way, and it starts giving advice about committing suicide again. So they spank it again, right, and hope that it generalizes sufficiently well. So using these systems is an art, and companies are trying to figure out, okay, how can I use this, for example, in customer service
for an insurance company? Well, you don't want it to hallucinate
policies that don't exist or change the prices on the products
and so on. So the business model for all of these companies is, how do I specialize and constrain the behavior of these tools? It is starting to succeed in a number of areas and generate enormous value as a result, because obviously you can use these systems. It costs,
you know, a few pennies per hour to run it instead of having to pay
for expensive humans. So it has economic benefits. I think education is another area
where it can be incredibly valuable. It can act as a personalized tutor, it can remember a fair amount of context from its interaction with each pupil, and it can adjust what it says, its tone of voice, the speed at which it presents material, and it can learn what the pupil does and doesn't know, does and doesn't understand. So I think with a bit more work that could be enormously beneficial, because as we know, the classroom method, where one teacher teaches 25 or 30 kids, is two to three times less effective than the tutorial method, where one expert human tutors a child. So we could get maybe not all the way to that two-to-three-fold improvement, but maybe we could get a one-and-a-half to two-fold improvement in the rate of learning and the quality of the learning. So that would be hugely beneficial. The downside of all these economic applications is the impact on employment. So some studies suggest that in the developed economies, what the system can do already could have a direct impact
on 300 million jobs, which is a lot. And there's no way to retrain
those 300 million people to be prompt engineers
or data scientists or whatever. The world is never going to need that many people working on the technology instead of on the job itself. Well, as you know, I have a slightly
different attitude about that. I think that this is a form of automation
and has taken it. The results of this will be the same as many previous waves
that have had the same kinds of effects. You know, the Internet has affected
more than 300 million people's jobs in many ways,
and it's taken many of them away. But, you know, I think it will have many,
many benefits as well. One that you did not mention, which I think, if I may, is worth mentioning: you can get an instant briefing on any topic. And until you've done
this, it's just astonishing. You can ask the question
and get a very thoughtful response, bringing together all of the knowledge
of humankind, in effect, into a couple of paragraphs
and with remarkably well-written prose. I wish I could write that well myself. Yeah, I find it gets a bit repetitive. There's always going to be a phrase somewhere: "Well, one has to weigh the costs and benefits, and there's no single answer to this. It depends on A, B, C," and then there's three bullet points. So it starts to get a little bit like one of those McKinsey memos. Yes. But people pay good money for those things. And you can just... Then the other question is, you know,
do you believe it? Right. McKinsey memos or not, not necessarily. You know, and I think one might hope that people could get good medical advice from this, but in fact, there are enough mistakes and misinformation in what comes out that OpenAI has actually spanked it lots of times for giving medical advice. So it won't give medical advice unless you
try really, really hard to get it. Well, I think your point about relating it
to our relationship with dogs is a very good one, in that dogs are somewhat unpredictable but are valuable in certain ways. And they do bad things on my rugs at home. Yeah, but even by extending that analogy, you're really arguing for leashes
and muzzles and crates for these AI systems
that we just don't know how to design yet. I think that's right. I mean, it's interesting when you read OpenAI's Web page for GPT-4, you know, so it's got these policies for what it's not allowed to say, and it says, you know, it is proud that GPT-4 breaks the rules 29% less often than GPT-3. Right. Which is progress. But it's a consequence of how the systems are designed, or I should say, they're not designed at all, actually, it's a consequence of how they're evolved, that we don't understand how they work. And so we have no way of actually
constraining their behavior precisely and rigorously. So I would argue that in the long run,
particularly for high stakes applications, we probably need to sort of invert the way we're thinking about this. We have, I think, basically chanced upon this idea that by expanding from unit grams to by grams to drive grams
to 32,000 grams, something that looks like intelligence comes out. The bulk of work in AI since the fifties has been on actually understanding
how intelligence works. So how does reasoning work? That's actually a question
that philosophers and logicians have worked on for thousands of years
and come up with a very good, thorough, complete analysis. And we have both mathematical
and software tools that can do logical reasoning with remarkable complexity. So we might argue that actually the underlying intelligence
should be more of that form and the language model
should be the interface to it. And that would be probably a much more reliable system
because we would be able to provide it with knowledge,
whose content we understood and be sure that it knows it
because it's in the knowledge base and the system is reasoning correctly
underneath the language layer. So this is one type of hybrid
that's actually quite rapidly emerging. For example, Wolfram Alpha is one of these traditional logic
based systems that has a very broad set of knowledge
about all kinds of things. And OpenAI and Wolfram have entered into a partnership which basically results in GPT-4 having an interface to this underlying knowledge base, and it has learned how to send queries to that knowledge base and interpret the results of the queries and then present them to the user.
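A rough sketch of that hybrid pattern, with hypothetical placeholders (wolfram_query and llm_generate are stand-ins, not the actual OpenAI or Wolfram APIs): questions that need verified facts or computation are routed to the curated engine, and the language model only phrases the result.

```python
# Language model as interface, knowledge base as the source of truth (sketch).
def wolfram_query(question: str) -> str:
    """Placeholder for a call to a curated knowledge base / computation engine."""
    return "result computed from the knowledge base for: " + question

def llm_generate(prompt: str) -> str:
    """Placeholder for free-form language-model generation."""
    return "fluent prose for: " + prompt

def answer(question: str) -> str:
    # Decide whether the question needs verified knowledge (crude heuristic here).
    needs_facts = any(k in question.lower() for k in ("how many", "what is the", "when did"))
    if needs_facts:
        result = wolfram_query(question)      # the reasoning happens in the engine
        return llm_generate(f"explain this result to the user: {result}")
    return llm_generate(question)             # otherwise, ordinary generation

print(answer("How many moons does Jupiter have?"))
```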
So I think actually that's going to be a more robust platform, and it allows us to then provide some type of guarantee
that the system is going to behave itself. So you mentioned the open letter asking for a moratorium. I think moratorium
is not quite the right word. And I wouldn't have used, you know,
the idea of a six month moratorium. I would simply say there are requirements. In fact, the the OECD, the Organization for Economic Co-operation and Development,
which includes the US government, the UK government,
pretty much all the advanced economies are part of it, already
have guidelines saying that AI systems
have to be robust, predictable, and you have to be able to demonstrate that before
you can deploy the system. The European Union AI Act,
which is supposed to be finalized later
this year, would then make it illegal to deploy systems in high stakes areas
that don't meet these criteria. And at the moment,
there is no way that we can show that these large language models
as such meet these criteria. So it's not a moratorium. It's simply saying,
okay, we're going to start enforcing the criterion that before you deploy the system, you be able to show that its behavior is robust, predictable, and doesn't present undue risk. Stuart, this may be a little bit off script, but I think it might be interesting
to the audience. Are you aware of the size of the database
after this compression step that you've discussed into the model itself,
how big these models actually are? I don't know if you know. So it's about a trillion parameters,
is what I understand. My understanding is it fits in a terabyte. I don't know if that's quite accurate, but it's not going to be off by much. That's probably okay. Yes. They don't have to be very high-precision parameters. So I think the interesting point for the audience is you can get that into a device like this. So it seems like the entire knowledge
of humankind can be fit into a phone and you can purchase that much storage
on Amazon for 100 bucks. Yeah, but just to give you a picture of, you know, how big is a trillion
if you imagine a huge university or public library,
right. Possibly a couple of miles of shelving. Right. Six floors. 500 meters. Right. And then imagine every book and then imagine every page
and then imagine every single character. It's about that many characters
write the number of letters in all the books in an enormous library. That's about how many parameters
these systems have. So that's why
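As a quick back-of-the-envelope check of that analogy (all the counts below are rough assumptions):

```python
# Rough, assumed numbers: a large research library's worth of characters is on the
# order of a trillion, about the number of parameters quoted, and at roughly one
# byte per parameter that is also about a terabyte.
books = 1_000_000          # volumes in a big library (assumption)
pages_per_book = 400       # assumption
chars_per_page = 2_500     # assumption

total_chars = books * pages_per_book * chars_per_page
print(f"{total_chars:.0e} characters")                     # 1e+12, about a trillion
print(f"{total_chars / 1e12:.1f} TB at one byte per parameter")
```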
So that's why we can't understand what they're doing, because they're unbelievably vast and completely impenetrable. Maybe it would be useful to talk
a little bit more generally about A.I. and particularly
automated decision systems: the opportunity to memorialize discrimination and bad behavior into those systems in a way that's very hard to detect, what kind of impact that may have on society, and where we stand on trying to rein that in. Yeah. So this is something that has been noticed in many, many areas. I think probably the biggest impact area is in recruiting and employment, where for almost all jobs these days, resumes are scanned by machines and filtered to find people who are at least ostensibly qualified
enough to be worth interviewing and now even the interview is being done by a machine. And for example,
it was shown that Amazon's process for hiring software engineers was screening out any resume that contained the word 'woman' or 'women's'. So if you played on a women's lacrosse team, you were out, right? If you sang in a women's choir, you were out. And so why does that happen? It's not because the algorithms are themselves biased. It's not because the programmer was biased. Right. The programmer was just saying, okay, write a machine learning algorithm that does a good job of fitting the training data. Right. And it's roughly,
if you know what least squares means, you know, minimizing
the squared error on the training data is something that goes back
to the 17th century. So what goes wrong
is that the training data themselves already reflect historical biases. The other thing that goes wrong
is that fitting the training data isn't actually what you want to do. What you want to do is produce a classifier that is both accurate and fair, and usually we have just left out the fair part. So we're training a system with the wrong objective, right? So it's not surprising we don't get what we want, because we didn't say what we want properly.
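To make the point about objectives concrete, here is a tiny, invented illustration: the same predictions can look fine when scored on accuracy alone while failing a fairness measure such as demographic parity, so the objective has to mention both (the data and the penalty weight below are made up).

```python
# Accuracy alone vs. accuracy plus a fairness term (toy, made-up data).
def accuracy(preds, labels):
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def demographic_parity_gap(preds, groups):
    """Difference in positive-prediction rates between two groups (0 = even)."""
    rate = lambda g: sum(p for p, grp in zip(preds, groups) if grp == g) / groups.count(g)
    return abs(rate("A") - rate("B"))

preds  = [1, 1, 1, 0, 0, 0, 0, 0]            # classifier decisions
labels = [1, 1, 1, 0, 0, 0, 0, 1]            # ground truth
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

print("accuracy:", accuracy(preds, labels))                  # 0.875, looks fine alone
print("parity gap:", demographic_parity_gap(preds, groups))  # 0.75, group B never favored

# A combined objective trades the two off explicitly:
score = accuracy(preds, labels) - 1.0 * demographic_parity_gap(preds, groups)
print("accuracy minus fairness penalty:", score)
```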
And I think that's where the criticism that a lot of the programmers are white males is valid, because it never occurs to them that we actually care as much about fairness, or more, than we care just about consistency with the training data. Of course, the difficulty is, what does it mean to be fair? There's going to be a discussion about that. How could you encode it in the system? Yeah. So there are many different precise formal definitions of fairness, and different
definitions that are appropriate for different kinds
of prediction problems. And there are even legal differences,
for example, between life insurance and car insurance. Do you allow discrimination
based on gender? And that even varies by state and country. So there are policy questions. You can't have perfect fairness and the best possible accuracy level. So there are literally tradeoffs
between these two things. So how you make that trade off,
these are policy questions. So I think the process is emerging
whereby a certain number of formal definitions
are being accepted and people are working out when each of those definitions
is appropriate to use. And then from that, you can go
to guidelines and even legislation depending on application,
saying what has to be done. And this this happened in lending,
for example, even back in the sixties, where prior to that
there was extraordinary levels of racism in extending credit in mortgages
and so on. And so redlining, as you say. So there are quite strict rules, in fact, that prevent neural networks
or large language models, for that matter. Any opaque
model cannot be used for those decisions because the regulator can't inspect it
to see that it's fair in its processing. Another big thing that's really important, particularly in the European
Union AI Act, is the right to an explanation. So if your system's explanation is, well,
I've got these trillion parameters, that's not an explanation. So there's going to be a head on collision
between the technology that's developing
and the right to an explanation. I know that's a big issue in the mortgage
industry right now, is that they're using these models,
but they cannot give explanations as to why something was turned down
or was not. I don't know if the audience would be aware of how widely in use, in contrast to what we just saw here, these automated decision systems are. Things like bail decisions are made using these kinds of systems, medical decisions of all kinds, whether or not to accept
or reject a medical claim. Actually, the answer is no, right? For every medical claim, the answer is no. That's a very simple rule. Okay. But yeah, so I think there are benefits, I think, to trying to develop a level of consistency in bail decisions and sentencing decisions. But there's also a significant risk there. And again, in the European Union, it's illegal
to delegate those types of decisions. Any decision that has a significant legal
effect on a person cannot be delegated to an algorithm. But there is also an issue of the bias
in the data in terms of the amount of detail and samples
of one particular subgroup versus another, which can affect things
like access to medical care. We may have systems, well, you may want to give some examples, but in fact, these systems being trained mainly on white male faces has had some very interesting effects on places
where face recognition has been used for purposes of identifying suspects
and things like that. Yeah. So I sound like a broken record. You know, again, in the European Union AI Act there are very, very stringent restrictions and prohibitions on the use of automated facial recognition, for some of these reasons. Yeah. So datasets
that have been traditionally used for face recognition are not representative. Even the question of
what is a representative dataset does not yet have a clear answer
and there isn't a single answer. You know, it probably varies in terms
of, well, which country are you in, as to what 'representative' means. What kinds of distinctions matter for your country and the application
that you're going to use the system for. But I think there needs to be a great deal more work on these questions, which really have to do with what happens when you take an algorithm. It's not about
can you design a good algorithm? It's what happens when you take
an algorithm and put it in some context. Like, you know, in civil engineering, civil engineers can design bridges. But then there's another discipline of urban planning and environmental studies and analysis to say, well, is it a good idea to put that bridge there or to put this freeway here? And we don't have that in place for machine learning systems. Now, much of your work has been on
making sure that we don't build systems that hurt or kill human beings,
which, of course, we don't want them to do,
except when we do. And so I wanted to raise this question
because I know you've studied it extensively. How do
you feel about the use of these types of technology in systems, in weapons? So I suggested that for the professional societies in computer science, artificial intelligence, robotics, that we have a very simple code of conduct: do not write algorithms that can decide to kill humans. Right? It seems pretty reasonable to most people, but I can tell you that the governments of the world,
or many of the important ones, the United States and Russia, for example,
don't agree with that. And there have been discussions in the United Nations since 2014
on whether to ban lethal autonomous weapons,
as they're called, killer robots. As you might sometimes see in the press. And both
the US and Russia are blocking any attempt to develop a treaty
to ban these types of weapons. And the issue originally was thought to be that these AI systems might inadvertently mistake a civilian for a combatant
and kill a civilian. And this would be a problem
for international humanitarian law, which is the law that guides decisions about weapons, and the Geneva Conventions. But from the point of view of an AI researcher, it's completely obvious that if you make a weapon that can go out, locate and select and attack human targets without any human supervision, then just like any algorithm, I can put a loop around it saying, you know, for i equals 1 to 1,000,000, do, right? And then it does it a million times. So I'll press a button
and I can send out a million weapons to find a million targets
and attack and kill them. And so by definition, autonomous weapons can turn into weapons of mass destruction simply by scaling up
the numbers that you have available. And so this is the basis of the AI community's really strenuous objection to lethal autonomous weapons. They will be cheap, maybe $10 each. They will be fairly easy to produce in very, very large numbers. They'll be easy to proliferate. Right. They'll be available in the arms markets all over the world. And so it's sort of saying, you know what, come down to Walmart and buy your nuclear missile. Right now, maybe some of the Second Amendment people think that's a good idea. But really, we don't do that. It's not an ethical issue. It's not a sort of legal issue. It's just common sense that you don't sell
nuclear missiles in supermarkets. But that's what we're going to do. Well, I certainly agree with that. Thank you. But it's worth pointing out
this is a more subtle issue. If you can distinguish
between a combatant and a noncombatant, well, you can think of it selectively: instead of just killing everybody, we're just going to kill the people that we want to kill. That can be a moral obligation as well. But they may not be... the people you want to kill may not be combatants. Well, they may be people of a religious group or a certain age, gender, political opinion. You know, it's a tool, but it could also be used to clean up war, which may itself be a problem. So. Yeah, it's an interesting point. I mean, there is a mindset which says, you know, isn't it great that
we have these remotely piloted weapons, because, you know, then our soldiers' lives don't have to be put at risk. They can prosecute the war from far enough away. But that's sort of what I call the sole ownership
fallacy, right? That only
we are going to have those weapons. So what's happening in Ukraine
is that both sides have those weapons. And, in fact, the death rates
are much higher as a result. You used to be fairly safe in a trench,
but now you can just fly a little drone above the trench
and drop grenades directly into it. You can use the drone
as a spotter for the artillery so that now
the artillery can hit the trench directly. And so attrition is much higher
as a result. Soldiers are not safer. They're actually worse off. Yes. Maybe worth pointing out
that this decision that automated systems may need to make, between, let's say, a person and a tree, is exactly the technology that's currently embedded in many of the self-driving cars. And they face exactly the same sets of issues. Yes. And self-driving cars need to be 99.999999% reliable. They're not even close. Whereas a weapon only needs to be 50% reliable, and it's still going to be used. True. Let me take a question from the audience
here. Could it be used to simulate a dead relative? Oh, yeah. It doesn't have to be dead.
I suppose it could be used. Yeah, it's going to be used. If someone asks me something, go ask my avatar. So yeah, I mean, this is already being done. It's already a product on the market. And some people find it comforting. Some people, I think, find it probably psychologically unhealthy, and so I would suggest that we exercise some caution, and perhaps those types of products should be used under professional supervision, because I think it can create sort of a cycle of reminiscing
and dependency and pretending
that would be extremely unhealthy. It prevents someone from moving on
after a bereavement, for example. Well, you're also aware of some
of the work going on in brain imaging. And imagine that we got good enough
that we could simply read that out, essentially the parameters, and embed them in one of these systems so that, after you're gone, your
grandchildren could go ask you a question. Would you be willing to do that
to have your brain read out so they would know everything
that you've ever seen or heard or done? You don't have to think too deeply. It's a very hypothetical question
because we are so far away from being able to do that, but that is something that appears in science fiction in the form of what's called neural lace in the Culture novels by Iain Banks. And in fact that was the motivation for Elon Musk's Neuralink company. So I think the idea of replicating the functionality of a person's brain, although it's technologically decades or centuries away, perhaps, and extremely difficult, is at least conceptually feasible. Yes. The idea of uploading your consciousness, that somehow you would continue to exist as your sentient self, that is probably entirely fictional. You know, by the way,
we totally agree on that. What you see in the movies, where you plug it into a brain and we download it over here to this avatar, I have no idea what that means,
and I'm certainly not going there. I don't know if you have any questions from the Internet. Some of these questions, I'm afraid, I couldn't read the writing. So how do we take those questions, if we have any of our own? These are from the Internet, are they? Okay. Oh, okay. Let's see. Oh, here's one. I'm sorry, I did not see it. Will AI develop a moral compass? And if so, what, or whose? So that's a really interesting question. And in fact, it relates to the work
that I've been doing. First of all, actually, what is A.I.? We've been talking about it as if everyone knows what A.I. is. In general, it's certainly not just large language models; in fact, for all but the last two years, it wasn't large language models at all. So A.I. is really about, at least historically it has been about, making machines whose actions can be expected to achieve their objectives. And for example, the action of the large language model is to output the next word, and that may or may not achieve
the objectives of the system. So, you know, we've built planning systems and chess programs
and reinforcement learning systems that learn to ride bicycles and all these
kinds of things on this framework. And it's the same framework that
economists have developed for maximizing global welfare or maximizing quarterly
profits, the same framework that control theorists use for building
autopilots and chemical plants that optimize basically the accuracy of the level flight or the stability
of the chemical plant and so on. So this is a very general
and very natural framework, and AI borrowed this from economics and philosophy, actually, in the forties, as the core concept of what we mean by making machines intelligent. But the drawback in doing that is that we have to specify those objectives, right? The machines don't dream them up by themselves, and if we mis-specify the objectives, then we have what's called a misalignment between the machine behavior and what humans want the future to be like. And the most obvious example of
that is in social media, where we have specified objectives like maximize the number of clicks,
maximize the amount of engagement of the user
and the machine learning algorithms that decide
what billions of people read and watch. I mean, it's amazing. They have more control over human cognitive intake than any dictator, you know, than the North Koreans or Stalin or anyone has ever had. And yet they're totally unregulated. So those algorithms learn how to maximize those objectives, and they figured out
that the best way to do it is not to send you
what you're interested in, but actually to manipulate you over time by thousands of little nudges so that you become a much more predictable
version of yourself. Because the more predictable you are,
the more they can monetize you. And so they learned how to do that. And at least empirically, it looks as if the best way to do that is to make you more extreme, right? Then you start to consume that red meat that whole industries then spring up to feed.
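A minimal simulation of that dynamic, with entirely invented numbers, and with the nudging strategy hard-coded for brevity (a learning system would discover it by optimizing clicks rather than being told):

```python
# Feedback loop sketch: showing slightly more extreme content than the user's current
# taste, and letting each click drag the taste along, maximizes clicks over time.
import random
random.seed(0)

user_taste = 0.1   # 0 = moderate, 1 = extreme (invented scale)

def click_probability(item):
    # Users mostly click items close to their current taste.
    return max(0.0, 1.0 - abs(item - user_taste))

clicks = 0
for _ in range(1000):
    item = min(1.0, user_taste + 0.05)       # nudge: a bit more extreme than the user
    if random.random() < click_probability(item):
        clicks += 1
        user_taste += 0.02 * (item - user_taste)   # the click shifts the user's taste

print(f"clicks: {clicks}, final user taste: {user_taste:.2f}")   # taste drifts upward
```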
And this misalignment is the source of the concern that people have had about AI, going right back to Alan Turing,
who was the founder of computer science. In a 1951 lecture, he said, once the machine thinking method had started, it would leave our feeble powers far behind, and we should have to expect the machines to take control. So they take control
not because they're evil or because they spontaneously develop
consciousness or anything like that. It's just because we give them
some objectives that are not aligned with what we want
the future to be like. And because they're more capable than us,
they achieve their objectives and we don't, right? So we set up a chess match
which we proceed to lose. So in order to fix that problem,
I've been following a different approach to AI, which says that the AI system, while its only objective is to further the interests of human beings, doesn't know what those are, and knows
that it doesn't know what those are. It's explicitly uncertain
about human objectives. And so to the extent that there's a moral theory, it's simply that the job of an AI system is to further human interests. It knows that it doesn't
know what those are, but it can learn more by conversing with us, by observing the choices that we make
and the choices that we regret,
the things we do, the things we don't do. So this helps it to understand
what we want the future to be like. And then as it starts to learn, it can start to be more helpful. There are still some difficult
moral questions, I mean. The most obvious one is that it's not one person's interests. It's not one set of values. There's 8 billion of us,
so there's 8 billion different preferences about the future
and how do you trade those off? And this is a two and a half thousand year old question, at least, and there are several different schools
of thought on that. And we better figure out
which is the right one, because we're going to be implementing it
fairly soon. And then there are even more difficult
questions like, well, what about not the 8 billion people who are alive,
but what about all the people who have yet to live? How do we take into account
their interests? Right, right. What if we take actions that change who's going to live? You change the number of people who are going to live. For example,
the Chinese policy of one child per family probably eliminated 500 million people already. Now they never existed. So we don't know what they would have wanted, but how, you know,
how should we make that type of decision? Right. These are really difficult questions
that philosophers really struggle with. But when we have AI systems
that are sufficiently powerful that they could make those decisions, we need to have an answer ready
so that we don't get it wrong. And just to illustrate what that means,
what does it mean to get it wrong? If you remember, in The Avengers movie, Thanos, right, has this plan. He wants to get the Infinity Stones; once he's got the Infinity Stones, he can snap his fingers. And his plan is that if the universe had half as many people, they'd be more than twice as happy. Right? So he's not doing this because he just doesn't like people or anything. He wants to make the universe a better place. He's what you might call a very naive utilitarian theorist. And in fact, the Financial Times review of the movie says Thanos gives economics a bad name. So you don't want AI systems,
when they have Thanos's levels of power, to be implementing a naive moral theory like that. And so, you know, part of the job of the Kavli Center
that you mentioned at the beginning is to bring philosophers,
social scientists, political theorists, legal theorists and A.I. researchers and gene editors
and neurotechnology people together to start figuring out answers
to these questions before it's too late. Because, you know,
we are going to have gene editing. Do we want to allow people
to pay to have their children become more intelligent than they would
otherwise have been? Do we want neurotechnology that allows us
to connect two minds together and turn them into a single conscious entity? Well, we better figure it out
because otherwise the market is going to make that decision. Speaking of future generations, we have
a rather fun question from the audience. So Toby says: Professor Russell, if you have children, and I know you do because you said so, on a scale from 1 to 100, how concerned are you for their futures due to the risks of AI, where 90 equals 'regularly lose sleep'? It's a great question. So yes, I have four children, and this is probably actually
one of the most common questions that I'm asked
when I'm speaking to non-technical audiences: what type of jobs should my children be thinking about? You know, what types of career path
are going to exist in 20 or 30 years time? Should my children learn A.I. so that they can ride this wave rather than being drowned? So in terms of the existential risk, which would come from, you know, as Alan Turing
said, the machines taking control because once they take control,
so to speak, there's really no longer anything the human race
could do to ensure its continued survival. It might be that the machines
allow us to continue or not, right? We will be in the same position as
the gorillas are with respect to humans, right. There was this little thing that happened
a few million years ago where one branch of the primates
ended up more intelligent than the others. And so all the other branches now continue to exist basically because we allow it, and some of them have already gone extinct
as a result of competition with humans. So we don't want to be in that situation. I believe it's possible to develop
AI systems that are provably safe and beneficial, that we can retain control over
that actually want to be switched off. That's a really important thing, right? If we want to switch it off,
it needs to want to be switched off. And that's a consequence of the theory that I'm working on.
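A toy expected-value calculation (with made-up numbers) of why uncertainty about our preferences makes deferring to the human, and accepting being switched off, the better policy for the machine:

```python
# The robot's proposed action is worth +10 to the human with probability 0.6 and
# -10 with probability 0.4, but the robot doesn't know which case it is in.
p_good, value_good, value_bad = 0.6, 10, -10

# Acting unilaterally: the robot just takes the expected value of its action.
act_anyway = p_good * value_good + (1 - p_good) * value_bad      # = 2.0

# Deferring: the human switches the robot off whenever the action would be bad,
# so the bad outcome is replaced by 0.
defer = p_good * value_good + (1 - p_good) * 0                   # = 6.0

print(f"act anyway: {act_anyway}, defer and allow switch-off: {defer}")
# Deferring wins, so a machine that is uncertain about our preferences has a
# positive incentive to leave the off switch in our hands.
```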
But it's not a property of the kinds of systems that we're building now. So on the other questions, you know, what is the future of our coexistence
with machines? What types of lives will people have? How will they continue
to have valuable economic roles when AI can do pretty much all the jobs that humans can do? I think that's a really important question
for policymakers, because my guess is that the value that we can
provide will be much more of an interpersonal nature,
but it's not going to be the value that a factory worker can provide
because as we know, those types of jobs are already
being automated out of existence. It's not going to be in routine clerical work. I mean, a simple way of putting it, and I know, Jerry, you don't necessarily agree with this line of argument, but it goes: if you can no longer sell physical labor and you can no longer sell mental labor, it's not clear that there's another thing, right, that the human race can fall back on, except what we might call interpersonal or emotional or empathic capabilities, where we have this sort of intrinsic comparative advantage over machines, because we know what it's like. Well, right. And I give this example in the book.
Right. What's it like to hit your thumb
with a hammer? Right. Who's done that? Right, most of you. And someone who hasn't done that? Right, a few. Okay. Well, how would you find out what it's like if you didn't know? You would just hit your thumb with a hammer. You say, oh, now I get it. Now I understand why people are so upset
when they do that, right. But there's nothing a machine can do
to find out what it's like, right? They can at best
make empirical correlations and assume that it's unpleasant,
but they don't know what it's like. They don't know what it's like to be left
by your lover. They don't know what it's like to lose a parent or to lose a child or to lose a job, or to be promoted, or any of the feelings of what it's like to be human. And so there
we have this comparative advantage, and there are also things that we just don't want to be done by machines right? I imagine that at some point in the future
there'll be a profession that we might call 'luncher': someone who's really, really good at having lunch with you, right, where you have lunch with them and you go away feeling much better about yourself, entertained, amused, wiser, more positive, and so on. Right. And you won't get those feelings if that was a robot. Well, so we'll see. The difficulty is that most of these interpersonal jobs right now are low-status
because they are not based on real scientific
understanding. If you compare babysitting
with orthopedic surgery. Right. My children are actually more important to me than my arms and legs. Right. But we pay the orthopedic surgeon
100 times or 1000 times as much per hour
as the babysitter. Not here in the Bay Area. So why is that, right? Well, it's because the surgeon gets to depend on literally
thousands of years of medical research on how to do this
effectively and successfully. Whereas, you know,
I remember one of my babysitters when I was seven
trying to teach me to smoke. And if we had, you know, a real science
of how to be a wonderful companion for a child, based on individual psychology, the psychology of the child and so on, and the training processes to go with it, then we would think of those as high-status professions
just as much as the surgeon. So there's a lot of science
we haven't done. The human sciences
have been enormously neglected. We have to catch up because those are
the sciences that will form the basis of our economic future, in my view. Well, the thing you and I agree on
is that the future of work is more interpersonal services
and things that are performance oriented. We're not going to want to go see four
robots play a string quartet. You know, we don't want to watch them
play basketball. So there's plenty of... that's where we may differ, as I think the historical evidence shows there will be plenty of such work in the future. But that's a different hour
of conversation, I'm sure. One thing you said I'd really like to focus on for the audience, and I'm afraid we're out of time, so say a few words in closing. You made a point that I think is very important, which is that we are currently training AI systems, but the truth is that AI systems are also training us. And that's one of the perhaps bigger dangers, that we will become a part of the optimization algorithms inside these,
whether that's buying stuff from Amazon or staying on social media to read another
couple of posts or whatever it might be. I think that most people
are not as aware of the fact that they are the commodity
that is being sold and they're not in control of that
at this time. And manipulated. And it's going to get worse. It may get way worse. Social media algorithms are really simple
machine learning algorithms. You're just a clickstream to them. They don't know that you have a mind or a body or a brain or politics or anything. They don't even understand the content of the things that they're sending to you, right? They just learn that this is the right thing to send next, to get the person to be more click-producing in the future. Yes. If we can
make the thing to send next be something that makes them more empathetic and knowledgeable and rational human beings, that would be an improvement over the current state of the art, certainly. Yeah. In the meantime, I think actually there are
certain classes of algorithms, called reinforcement learning algorithms, that we should literally ban in those types of user-facing roles, because by definition they are going to manipulate. Yeah, well, unfortunately, we are out of time. I want to thank Professor Russell, and let me see, I'm supposed to say
you'll be sticking around to sign copies of his book, highly recommended. And if you want to watch more programs or support the Commonwealth Club's efforts in making both virtual and in-person programming possible, please visit commonwealthclub.org/events. So I think we can end there. Thank you. Thank you.