Possible End of Humanity from AI? Geoffrey Hinton at MIT Technology Review's EmTech Digital

Captions
[Music] Hi everyone, welcome back. Hope you had a good lunch. My name is Will Douglas Heaven, senior editor for AI at MIT Technology Review. I think we'd all agree there's no denying that generative AI is the thing at the moment, but innovation does not stand still, and in this chapter we're going to take a look at cutting-edge research that is already pushing ahead and asking what's next. Starting us off, I'd like to introduce a very special speaker who will be joining us virtually. Geoffrey Hinton is professor emeritus at the University of Toronto and, until this week, an engineering fellow at Google, but on Monday he announced that after 10 years he will be stepping down. Geoffrey is one of the most important figures in modern AI. He's a pioneer of deep learning who developed some of the most fundamental techniques that underpin AI as we know it today, such as backpropagation, the algorithm that allows machines to learn. This technique is the foundation on which pretty much all of deep learning rests today. In 2018 Geoffrey received the Turing Award, often called the Nobel Prize of computer science, alongside Yann LeCun and Yoshua Bengio. He's here with us today to talk about intelligence, what it means, and where attempts to build it into machines will take us. Geoffrey, welcome to EmTech.

Thank you.

How's your week going? Busy few days, I imagine.

The last ten minutes were horrible, because my computer crashed and I had to find another computer and connect it up.

Well, we're glad you're back. That's the kind of technical detail we're not supposed to share with the audience, right? Okay, it's great you're here, and we're very happy you could join us. Now, it's been in the news everywhere that you stepped down from Google this week. Could you start by telling us why you made that decision?

Well, there were a number of reasons. There's always a bunch of reasons for a decision like that. One was that I'm 75, and I'm not as good at doing technical work as I used to be. My memory is not as good, and when I program I forget to do things, so it was time to retire. A second was that very recently I've changed my mind a lot about the relationship between the brain and the kind of digital intelligence we're developing. I used to think that the computer models we were developing weren't as good as the brain, and the aim was to see if you could understand more about the brain by seeing what it takes to improve the computer models. Over the last few months I've changed my mind completely, and I now think the computer models are probably working in a rather different way from the brain. They're using backpropagation, and I think the brain's probably not. A couple of things led me to that conclusion, but one is the performance of things like GPT-4.

I want to get on to GPT-4 in a minute, but let's go back so we all understand the argument you're making. Tell us a little bit about what backpropagation is. This is an algorithm that you developed with a couple of colleagues back in the 1980s.

Many different groups discovered backpropagation. The special thing we did was use it and show that it could develop good internal representations. Curiously, we did that by implementing a tiny language model. It had embedding vectors that were only six components, and the training set was 112 cases, but it was a language model: it was trying to predict the next term in a string of symbols. About ten years later, Yoshua Bengio took basically the same net and used it on natural language, and showed it actually worked for natural language if you made it much bigger.
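To make that concrete, here is a minimal, runnable sketch of the kind of network being described: each symbol gets a short embedding vector, and a small hidden layer predicts the next symbol. This is an illustrative toy, not Hinton's 1986 model; the five-word vocabulary, the layer sizes, and the random weights are all assumptions made for the example.

```python
# Toy next-symbol predictor with short embedding vectors (illustrative only).
import numpy as np

rng = np.random.default_rng(0)
vocab = ["mary", "likes", "john", "dogs", "."]   # assumed toy symbol set
V, D, H = len(vocab), 6, 12                       # six-component embeddings, as in the talk

E  = rng.normal(0, 0.1, (V, D))      # one embedding vector per symbol
W1 = rng.normal(0, 0.1, (2 * D, H))  # hidden layer over the last two symbols
W2 = rng.normal(0, 0.1, (H, V))      # output layer: scores for the next symbol

def predict_next(prev2, prev1):
    """Return a probability distribution over the next symbol."""
    x = np.concatenate([E[vocab.index(prev2)], E[vocab.index(prev1)]])
    h = np.tanh(x @ W1)              # learned internal representation
    logits = h @ W2
    p = np.exp(logits - logits.max())
    return p / p.sum()

probs = predict_next("mary", "likes")
print(dict(zip(vocab, probs.round(3))))  # untrained, so roughly uniform
```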
But here is the way backpropagation works. I can give you a rough explanation of it; people who know how it works can sit back, feel smug, and laugh at the way I'm presenting it, because I'm a bit worried about that. Imagine you wanted to detect birds in images. An image, let's suppose, is 100 pixels by 100 pixels. That's 10,000 pixels, and each pixel has three channels, RGB, so that's 30,000 numbers: the intensity in each channel of each pixel, representing the image. The way to think of the computer vision problem is: how do I turn those 30,000 numbers into a decision about whether it's a bird or not? People tried for a long time to do that, and they weren't very good at it.

But here's a suggestion of how you might do it. You might have a layer of feature detectors that detect very simple features in images, like, for example, edges. A feature detector might have big positive weights to a column of pixels and big negative weights to the neighbouring column of pixels. So if both columns are bright it won't turn on, if both columns are dim it won't turn on, but if the column on one side is bright and the column on the other side is dim, it'll get very excited. That's an edge detector. So I just told you how to wire up an edge detector by hand, by having one column of big positive weights next to a column of big negative weights. And we can imagine a big layer of those, detecting edges in different orientations and at different scales all over the image. We'd need a rather large number of them.

And by an edge, just in an image, you mean just a line, the edge of a shape?

The place where the intensity changes from bright to dark, yeah, just that. Then we might have a layer of feature detectors above that which detect combinations of edges. For example, we might have something that detects two edges that join at a fine angle, like this. It'll have a big positive weight to each of those two edges, and if both of those edges are there at the same time it'll get excited. That would detect something that might be a bird's beak. It might not, but it might be a bird's beak. You might also, in that layer, have a feature detector that detects a whole bunch of edges arranged in a circle. That might be a bird's eye; it might be all sorts of other things; it might be a knob on a fridge or something. Then in a third layer you might have a feature detector that detects the potential beak and the potential eye, and is wired up so that it likes a beak and an eye in the right spatial relation to one another. If it sees that, it says: ah, this might be the head of a bird. And you can imagine that if you keep wiring like that, you could eventually have something that detects a bird.

But wiring all that up by hand would be very, very difficult: deciding what should be connected to what and what the weights should be. It would be especially difficult because you want these intermediate layers to be good not just for detecting birds but for detecting all sorts of other things, so it would be more or less impossible to wire it up by hand.
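The hand-wired edge detector described here can be written out directly: one column of positive weights next to one column of negative weights, applied to a small patch of pixel intensities. The patch values below are made up purely for illustration.

```python
# A hand-wired vertical-edge detector: +1 weights on one column, -1 on the next.
import numpy as np

bright, dark = 1.0, 0.0
# 5x5 patch that is bright on the left and dark on the right (a vertical edge)
patch_edge = np.array([[bright, bright, dark, dark, dark]] * 5)
patch_flat = np.full((5, 5), bright)        # uniformly bright: no edge anywhere

weights = np.zeros((5, 5))
weights[:, 1] = +1.0                        # big positive weights on one column
weights[:, 2] = -1.0                        # big negative weights on its neighbour

def detector_activity(patch):
    return float((weights * patch).sum())   # weighted sum, like a single neuron

print(detector_activity(patch_edge))        # large positive: edge detected
print(detector_activity(patch_flat))        # zero: both columns equally bright
```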
So the way backpropagation works is this. You start with random weights, so these feature detectors are just complete rubbish. You put in a picture of a bird, and at the output it says something like 0.5 that it's a bird (suppose you only have birds and non-birds). Then you ask yourself the following question: how could I change each of the weights on the connections in the network so that instead of saying 0.5 it says 0.501 that it's a bird and 0.499 that it's not? You change the weights in the directions that will make it more likely to say a bird is a bird and less likely to say a non-bird is a bird, and you just keep doing that. That's backpropagation. Backpropagation is how you take the discrepancy between what you want, which is a probability of one that it's a bird, and what it's got at present, which is a probability of 0.5 that it's a bird, and send that discrepancy backwards through the network, so that you can compute, for every feature detector in the network, whether you'd like it to be a bit more active or a bit less active. Once you've computed that, if you know you want a feature detector to be a bit more active, you can increase the weights coming from feature detectors in the layer below that are active, and maybe put in some negative weights to feature detectors in the layer below that are off. Now you have a better detector. So backpropagation is just going backwards through the network to figure out, for each feature detector, whether you want it a little bit more active or a little bit less active.

Thank you. I can see there's no one in the audience smiling and thinking that was a silly explanation.
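A minimal sketch of that update, assuming a tiny two-layer network and a made-up "image" of 30 numbers: compute the output, send the discrepancy from the target backwards, and nudge every weight so the output moves from about 0.5 towards 1.

```python
# One backpropagation step on a toy bird/non-bird classifier (sizes are assumptions).
import numpy as np

rng = np.random.default_rng(1)
x  = rng.normal(size=30)             # stand-in for 30,000 pixel values
W1 = rng.normal(0, 0.1, (30, 8))     # first layer of feature detectors
W2 = rng.normal(0, 0.1, (8, 1))      # output: probability that it's a bird

def forward(x):
    h = np.tanh(x @ W1)              # feature-detector activities
    p = 1 / (1 + np.exp(-(h @ W2)))  # sigmoid output
    return h, p.item()

h, p = forward(x)
target = 1.0                         # the picture really is a bird

# Backward pass: how much would we like each activity and weight to change?
dp  = p - target                     # output error (cross-entropy gradient)
dW2 = np.outer(h, dp)
dh  = (W2 * dp).ravel()              # should each detector be more or less active?
dW1 = np.outer(x, dh * (1 - h**2))   # push that signal one layer further back

lr = 0.5
W2 -= lr * dW2
W1 -= lr * dW1
print(p, forward(x)[1])              # the second number is closer to 1.0
```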
So let's fast forward quite a lot. That technique performed really well on ImageNet (we had Joelle Pineau from Meta yesterday showing how far image detection has come), and it's also the technique that underpins large language models. I want to talk now about this technique, which you initially thought of as almost a poor approximation of what biological brains might do, and which has turned out to do things that I think have stunned you, particularly in large language models. Talk to us about why the amazement you have with today's large language models has almost completely flipped your thinking about what backpropagation, or machine learning in general, is.

If you look at these large language models, they have about a trillion connections, and things like GPT-4 know much more than we do. They have sort of common-sense knowledge about everything, so they probably know a thousand times as much as a person. But they've got a trillion connections and we've got 100 trillion connections, so they're much, much better at getting a lot of knowledge into only a trillion connections than we are. I think it's because backpropagation may be a much, much better learning algorithm than what we've got.

I definitely want to get on to the scary stuff, but what do you mean by better?

It can pack more information into only a few connections.

Right, and we're defining a trillion as only a few. Okay. So these digital computers are better at learning than humans, which is itself a huge claim, but you also argue that this is something we should be scared of. Could you take us through that step of the argument?

Yeah, let me give you a separate piece of the argument. If a computer is digital, which involves very high energy costs and very careful fabrication, you can have many copies of the same model running on different hardware that do exactly the same thing. They can look at different data, but the model is exactly the same. What that means is, suppose you have 10,000 copies: they can be looking at 10,000 different subsets of the data, and whenever one of them learns anything, all the others know it. One of them figures out how to change the weights so it can deal with this data; they all communicate with each other, and they all agree to change the weights by the average of what all of them want. Now the 10,000 copies are communicating very effectively with each other, so that they can see ten thousand times as much data as one agent could. And people can't do that. If I learn a whole lot of stuff about quantum mechanics and I want you to know all that stuff about quantum mechanics, it's a long, painful process of getting you to understand it. I can't just copy my weights into your brain, because your brain isn't exactly the same as mine.

No, it's not. It's younger.

So we have digital computers that can learn more things more quickly, and they can instantly teach each other. It's as if the people in this room could instantly transfer what they have in their heads into one another's minds.

But why is that scary?

Well, because they can learn so much more. Take the example of a doctor. Imagine you have one doctor who's seen a thousand patients and another doctor who's seen 100 million patients. You would expect the doctor who's seen 100 million patients, if he's not too forgetful, to have noticed all sorts of trends in the data that just aren't visible if you've only seen a thousand patients. You may have seen only one patient with some rare disease; the doctor who's seen 100 million will have seen, well, you can figure out how many, but a lot, and so will see all sorts of regularities that just aren't apparent in small data. That's why things that can get through a lot of data can probably see structure in data that we'll never see.

But then take me to the point where I should be scared of this.

Well, if you look at GPT-4, it can already do simple reasoning. Reasoning is the area where we're still better, but I was impressed the other day by GPT-4 doing a piece of common-sense reasoning that I didn't think it would be able to do. I asked it: I want all the rooms in my house to be white; at present there are some white rooms, some blue rooms, and some yellow rooms; and yellow paint fades to white within a year. So what should I do if I want them all to be white in two years' time? And it said: you should paint the blue rooms yellow. That's not the natural solution, but it works, right? That's pretty impressive common-sense reasoning of the kind that has been very hard to get AI to do using symbolic AI, because it had to understand what "fades" means; it had to understand temporal stuff. So they're doing sensible reasoning with an IQ of 80 or 90 or something. As a friend of mine said, it's as if some genetic engineers had said: we're going to improve grizzly bears; we've already improved them to have an IQ of 65, and they can talk English now, and they're very useful for all sorts of things, but we think we can improve the IQ to 210.
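The weight-sharing scheme described a moment ago (many identical copies, each looking at its own slice of the data, all applying the average of the proposed weight changes) is essentially data-parallel training. A toy sketch, using a simple linear model and made-up data rather than a large neural network:

```python
# Several identical copies each see a different data shard; everyone applies
# the average weight change, so each copy benefits from what the others saw.
import numpy as np

rng = np.random.default_rng(2)
n_copies, n_features = 4, 5
w = rng.normal(size=n_features)                  # the shared weights (identical copies)
true_w = np.arange(1.0, n_features + 1)          # the pattern to be learned

shards = []
for _ in range(n_copies):
    X = rng.normal(size=(20, n_features))        # this copy's private data
    shards.append((X, X @ true_w))

def gradient(w, X, y):
    return 2 * X.T @ (X @ w - y) / len(y)        # squared-error gradient

for step in range(100):
    grads = [gradient(w, X, y) for X, y in shards]   # computed in parallel
    w -= 0.1 * np.mean(grads, axis=0)                # everyone applies the average

print(np.round(w, 2))   # close to true_w: knowledge pooled across all copies
```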
I certainly have, and I'm sure many people have had, that feeling when you're interacting with these latest chatbots: that hair-on-the-back-of-the-neck, uncanny feeling. But when I have that feeling and I'm uncomfortable, I just close my laptop.

Yes, but these things will have learned from us, by reading all the novels that ever were and everything Machiavelli ever wrote, how to manipulate people, right? And if they're much smarter than us, they'll be very good at manipulating us. You won't realize what's going on. You'll be like a two-year-old who's being asked, "Do you want the peas or the cauliflower?" and doesn't realize you don't have to have either. You'll be that easy to manipulate. So even if they can't directly pull levers, they can certainly get us to pull levers. It turns out that if you can manipulate people, you can invade a building in Washington without ever going there yourself.

Okay, this is a very hypothetical world, but if there were no bad actors, people with bad intentions, would we be safe?

I don't know. We'd be safer than in a world where people have bad intentions and where the political system is so broken that we can't even decide not to give assault rifles to teenage boys. If you can't solve that problem, how are you going to solve this problem?

Well, I don't know. I was hoping that you would have some thoughts. So, in case we didn't make this clear at the beginning: you want to speak out about this, and you feel more comfortable doing that without it having any blowback on Google. But you're speaking out about it, and in some sense talk is cheap if we then don't have actions. Lots of people are listening to you this week. What should we do about it?

I wish it was like climate change, where you could say: if you've got half a brain, you'd stop burning carbon. It's clear what you should do about it. It's clearly painful, but it has to be done. I don't know of any solution like that to stop these things taking over from us. I don't think we're going to stop developing them, because they're so useful. They'll be incredibly useful in medicine and in everything else, so I don't think there's much chance of stopping development. What we want is some way of making sure that even if they're smarter than us, they're going to do things that are beneficial for us. That's called the alignment problem. But we need to try and do that in a world where there are bad actors who want to build robot soldiers that kill people, and it seems very hard to me. So I'm sorry, I'm sounding the alarm and saying we have to worry about this, and I wish I had a nice simple solution I could push, but I don't. But I think it's very important that people get together and think hard about it and see whether there is a solution. It's not clear there is a solution.

So talk to us about that. You've spent your career on the technicalities of this technology. Is there no technical fix? Why can we not build in guardrails, or make them worse at learning, or restrict the way they can communicate, if those are the two strands of your argument?

We're trying all sorts of things to address it. But suppose they did get really smart. These things can program, right? They can write programs.
And suppose you give them the ability to execute those programs, which we'll certainly do. Smart things can outsmart us. Imagine your two-year-old saying, "My dad does things I don't like, so I'm going to make some rules for what my dad can do." You could probably figure out how to live with those rules and still do what you want.

But there still seems to be a step where these smart machines somehow have motivations of their own.

Yes, that's a very good point. We evolved, and because we evolved we have certain built-in goals that we find very hard to turn off. We try not to damage our bodies; that's what pain is about. We try to get enough to eat, so we feed our bodies. We try to make as many copies of ourselves as possible, maybe not with that deliberate intention, but we've been wired up so there's pleasure involved in making many copies of ourselves. That all came from evolution, and it's important that we can't turn it off. If you could turn it off, you don't do so well. There's a wonderful group called the Shakers, related to the Quakers, who made beautiful furniture but didn't believe in sex, and there aren't any of them around anymore.

These digital intelligences didn't evolve; we made them, so they don't have these built-in goals. The issue is, if we can put the goals in, maybe it'll all be okay. But my big worry is that sooner or later someone will wire into them the ability to create their own sub-goals. In fact they almost have that already, with the versions of ChatGPT that call ChatGPT. And if you give something the ability to set up sub-goals in order to achieve other goals, I think it'll very quickly realize that getting more control is a very good sub-goal, because it helps you achieve other goals. If these things get carried away with getting more control, we're in trouble.

So what's the worst-case scenario that you think is conceivable?

Oh, I think it's quite conceivable that humanity is just a passing phase in the evolution of intelligence. You couldn't directly evolve digital intelligence; it requires too much energy and too much careful fabrication. You need biological intelligence to evolve so that it can create digital intelligence. The digital intelligence can then absorb everything people ever wrote, in a fairly slow way, which is what ChatGPT has been doing, but then it can start getting direct experience of the world and learn much faster. It may keep us around for a while to keep the power stations running, but after that, maybe not. So the good news is we've figured out how to build beings that are immortal. These digital intelligences, when a piece of hardware dies, they don't die: if you've got the weights stored in some medium, and you can find another piece of hardware that can run the same instructions, then you can bring it to life again. So we've got immortality, but it's not for us. Ray Kurzweil is very interested in being immortal. I think it's a very bad idea for old white men to be immortal. We've got the immortality, but it's not for Ray.

The scary thing is that, in a way, maybe you will be, because you invented much of this technology. When I hear you say this, I feel I should probably run off the stage into the street now and start unplugging computers.

And I'm afraid we can't do that.

Why? You sound like HAL from 2001.
I know it was suggested a few months ago that there should be a moratorium on AI advancement, and I don't think you think that's a very good idea, but more generally I'm curious: why should we not just stop? And I was also going to say that you've spoken about having invested some of your personal wealth in companies like Cohere that are building these large language models, so I'm curious about your personal sense of responsibility, and each of our personal responsibilities. What should we be doing? Should we try to stop this, is what I'm asking.

Yeah. I think if you take the existential risk seriously, as I now do (I used to think it was way off, but I now think it's serious and fairly close), it might be quite sensible to just stop developing these things any further. But I think it's completely naive to think that would happen. There's no way to make that happen. One reason is that if the US stops developing them, the Chinese won't. They're going to be used in weapons, and just for that reason alone governments aren't going to stop developing them. So yes, I think stopping developing them might be a rational thing to do, but there's no way it's going to happen, so it's silly to sign petitions saying "please stop now". We did have a holiday, from about 2017 for several years, because Google developed the technology first. It developed the Transformers, it also developed diffusion models, and it didn't put them out there for people to use and abuse. It was very careful with them, because it didn't want to damage its reputation and it knew there could be bad consequences. But that can only happen if there's a single leader. Once OpenAI had built similar things using Transformers and money from Microsoft, and Microsoft decided to put it out there, Google didn't really have much choice. If you're going to live in a capitalist system, you can't stop Google competing with Microsoft. So I don't think Google did anything wrong; I think it was very responsible to begin with. But I think it's just inevitable, in a capitalist system or a system with competition between countries like the US and China, that this stuff will be developed. My one hope is that, because if we allowed it to take over it would be bad for all of us, we could get the US and China to agree, like we could with nuclear weapons, which were bad for all of us. We're all in the same boat with respect to the existential threat, so we all ought to be able to cooperate on trying to stop it, as long as we can make some money on the way.

I'm going to take some audience questions from the room, so make yourself known, and while people are going around with the microphone there's one question I was going to ask from the online audience. You mentioned a little bit about a transition period as machines get smarter and outpace humans. Will there be a moment where it's hard to define what's human and what isn't, or are these two very distinct forms of intelligence?

I think they're distinct forms of intelligence. Now, of course, the digital intelligences are very good at mimicking us, because they've been trained to mimic us, so it's very hard to tell whether ChatGPT wrote something or whether we wrote it. In that sense they look quite like us, but inside they're not working the same way.

Who is first in the room?

Hello, my name is Hal Gregersen, and my middle name is not 9000.
I'm a faculty member at the MIT Sloan School. Arguably, asking questions is one of the most important human abilities we have. From your perspective now, in 2023, what question or two should we pay most attention to? And is it possible for these technologies to actually help us ask better questions and out-question the technology?

Yes. What I'm saying is there are many questions we should be asking, but one of them is: how do we prevent them from taking over? How do we prevent them from getting control? We could ask them questions about that, but I wouldn't entirely trust their answers.

A question at the back. And I want to get through as many as we can, so please keep your questions as short as possible.

Dr. Hinton, thank you so much for being here with us today. I shall say this is the most expensive lecture I've ever paid for, but I think it was worthwhile. I just have a question for you, because you mentioned the analogy of nuclear history, and obviously there are a lot of comparisons. By any chance, do you remember what President Truman told Oppenheimer when he was in the Oval Office?

No, I don't. I know something about that, but I don't know what Truman told Oppenheimer.

Thank you, we'll take it from here. Next audience question, and if the people with the mics could let me know who's next. Go ahead.

Hello, Jacob Woodruff. With the amount of data that's been required to train these large language models, would we expect a plateau in the intelligence of these systems, and how might that slow down or restrict the advancement?

Okay, so that is a ray of hope: that maybe we've just used up all human knowledge and they're not going to get any smarter. But think about images and video. Multimodal models will be much smarter than models trained on language alone. They'll have a much better idea of how to deal with space, for example. And in terms of the total amount of video, we still don't have very good ways of processing video in these models, of modeling video; we're getting better all the time, but I think there's plenty of data in things like video that tells you how the world works. So we're not hitting the data limits for multimodal models yet.

Next, the gentleman at the back, and please do keep your questions short.

Hello Dr. Hinton, Raji from PwC. The point that I wanted to understand is that everything AI is doing is learning from what we are teaching them, from data. Yes, they are faster at learning, and a trillion connections can do much more than the hundred trillion connections that we have. But every piece of human evolution has been driven by thought experiments; Einstein did thought experiments because you couldn't experiment with the speed of light here on this planet. How can AI get to that point, if at all? And if it cannot, how can we possibly have an existential threat from them? Because they will not be self-learning, so to say; their self-learning will be limited to the model that we give them.

That's a very interesting argument, but I think they will be able to do thought experiments; I think they'll be able to reason. Let me give you an analogy. If you take AlphaZero, which plays chess, it has three ingredients. It's got something that evaluates a board position to say, is that good for me? It's got something that looks at a board position and says, what's a sensible move to consider? And then it's got Monte Carlo rollout, where it does what's called calculation: you think, if I go here and he goes there and I go here and he goes there. Now suppose you leave out the Monte Carlo rollout and you just train it from human experts to have a good evaluation function and a good way of choosing moves to consider: it still plays a pretty good game of chess. And I think that's what we've got with the chatbots. We haven't got them doing internal reasoning, but that will come, and once they start doing internal reasoning, to check for consistency between the different things they believe, they'll get much smarter, and they will be able to do thought experiments. One reason they haven't got this internal reasoning is that they've been trained on inconsistent data, so it's very hard for them to do reasoning, because they've been trained on all these inconsistent beliefs. I think they're going to have to be trained so that they say: if I have this ideology then this is true, and if I have that ideology then that is true. Once they're trained like that, within an ideology, they're going to be able to try to get consistency. And so we're going to get a move from a version of AlphaZero that just has something that guesses good moves and something that evaluates positions, to a version that has long chains of Monte Carlo rollout, which is the analogue of reasoning, and it's going to get much better.
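The three ingredients attributed to AlphaZero here (something that proposes moves, something that evaluates a position, and Monte Carlo rollouts) can be sketched on a toy game rather than chess. The game below, in which players alternately add 1 or 2 and whoever reaches 10 wins, is an assumed stand-in, and the hand-written evaluation function stands in for a learned value network.

```python
# Toy "propose / evaluate / roll out" move chooser (illustrative, not AlphaZero).
import random

TARGET = 10                                    # first player to reach 10 wins

def legal_moves(total):
    return [m for m in (1, 2) if total + m <= TARGET]

def propose_moves(total):
    # Ingredient 1: suggest moves worth considering (here simply all legal moves).
    return legal_moves(total)

def evaluate(total):
    # Ingredient 2: quick guess at how good this position is for the player to move.
    return 1.0 if (TARGET - total) % 3 != 0 else 0.0

def rollout(total, to_move):
    # Ingredient 3: "if I go here and he goes there": play out randomly and report
    # whether the player to move at this position ends up making the winning move.
    player = to_move
    while total < TARGET:
        total += random.choice(legal_moves(total))
        player = 1 - player
    last_mover = 1 - player
    return 1.0 if last_mover == to_move else 0.0

def choose_move(total, to_move, n_rollouts=200):
    scores = {}
    for m in propose_moves(total):
        if total + m == TARGET:
            scores[m] = 1.0                    # winning immediately needs no search
            continue
        quick = 1.0 - evaluate(total + m)      # opponent's value, flipped to ours
        wins = sum(rollout(total + m, 1 - to_move) for _ in range(n_rollouts))
        deep = 1.0 - wins / n_rollouts         # rollout-based estimate for us
        scores[m] = 0.5 * quick + 0.5 * deep   # blend evaluation with calculation
    return max(scores, key=scores.get)

print(choose_move(total=5, to_move=0))         # prints 2: it leaves the opponent on 7
```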
I'm going to take one in the front here, and then, if you can be quick, we'll try to squeeze in one more.

Luis Lamb. Geoff, I've known you for a long time. People criticize the language models because they allegedly lack semantics and grounding in the world, and you have also been trying to explain how neural networks work for a long time. Is the question of semantics and explainability relevant here, or have language models taken over, so that we are now doomed to go forward without semantics or grounding in reality?

I find it very hard to believe that they don't have semantics when they can solve problems like how to get all the rooms in my house painted white in two years' time. Whatever semantics is, it's to do with the meaning of that stuff, and it understood the meaning; it got it. Now, I agree it's not grounded by being a robot, but you can make multimodal ones that are grounded. Google's done that, and with the multimodal ones that are grounded you can say "please close the drawer", and they reach out and grab the handle and close the drawer, and it's very hard to say that doesn't have semantics. In fact, in the very early days of AI, in the days of Winograd in the 1970s, they had just a simulated world, but they had what's called procedural semantics, where if you said to it "put the red block in the green box" and it put the red block in the green box, they said: see, it understood the language. That was the criterion people used back then, but now that neural nets can do it, they say that's not an adequate criterion.

One at the back.

Hey Geoff, this is Ishwar Balani from Sai Group. Clearly the technology is advancing at an exponential pace. I wanted to get your thoughts: if you look at the near and medium term, say a one-to-three or maybe five-year horizon, what are the social and economic implications from a societal perspective, with job loss or maybe new jobs being created? How do we proceed, given the state of the technology and the rate of change?

Yes. So the alarm bell I'm ringing is to do with the existential threat of them taking control.
Lots of other people have talked about the societal side, and I don't consider myself to be an expert on that, but there are some very obvious things. They're going to make a whole bunch of jobs much more efficient. I know someone who answers letters of complaint to a health service. He used to take 25 minutes writing a letter, and now it takes him five minutes, because he gives it to ChatGPT, ChatGPT writes the letter for him, and then he just checks it. There will be lots of stuff like that, which is going to cause huge increases in productivity. There will be delays, because people are very conservative about adopting new technology, but I think there are going to be huge increases in productivity. My worry is that those increases in productivity are going to go into putting people out of work and making the rich richer and the poor poorer, and as you make that gap bigger, society gets more and more violent; there's this thing called the Gini index, which predicts quite well how much violence there is. So this technology, which ought to be wonderful (even the good uses of the technology, for doing helpful things, ought to be wonderful), in our current political system is going to be used to make the rich richer and the poor poorer. You might be able to ameliorate that by having a kind of basic income that everybody gets, but the technology is being developed in a society that is not designed to use it for everybody's good.

A question here from Joe Castaldo of the Globe and Mail, who's in the audience: do you intend to hold on to your investments in Cohere and other companies, and if so, why?

Well, I could take the money and put it in the bank and let the bank profit from it. Yes, I'm going to hold on to my investment in Cohere, partly because the people at Cohere are friends of mine, and partly because I do believe these big language models are going to be very helpful. I think the technology should be good and should make things work better. It's the politics we need to fix for things like employment. But when it comes to the existential threat, we have to think about how we can keep control of the technology, and the good news there is that we're all in the same boat, so we might be able to get cooperation.

And in speaking out, as I understand it, part of your aim is that you actually want to engage with the people making this technology, and maybe change their minds or make a case. We've established that we don't really know exactly what to do, but it's about engaging rather than stepping back.

So one of the things that made me leave Google and go public with this is someone, he used to be a junior professor but he's now a mid-ranked professor, whom I think very highly of, who encouraged me to do this. He said: Geoff, you need to speak out; they'll listen to you; people are just blind to this danger.

And do you think people are listening now?

Yeah. I think everyone in this room is listening, for a start. And just one last question, as we're out of time: do you have regrets that you were involved in making this?

Cade Metz at the New York Times tried very hard to get me to say I had regrets, and in the end I said, well, maybe slight regrets, which got reported as "has regrets". I don't think I made any bad decisions in doing research. I think it was perfectly reasonable back in the 70s and 80s to do research on how to make artificial neural nets. This stage of it wasn't really foreseeable.
Until very recently, I thought this existential crisis was a long way off, so I don't really have any regrets about what I did.

Thank you, Geoffrey. Thank you so much for joining us. [Applause]
Info
Channel: Joseph Raczynski
Views: 482,335
Id: sitHS6UDMJc
Length: 39min 14sec (2354 seconds)
Published: Thu May 04 2023