Murray Shanahan: The Future of Artificial Intelligence - Schrödinger at 75: The Future of Biology

Captions
We're moving slightly away from the biology for a brief section, and we're going to talk more about artificial intelligence and how it links with biology and the brain in this session. Our first speaker is Murray Shanahan, who is Professor of Cognitive Robotics at Imperial College London. He is also a senior research scientist at Google DeepMind, the world leader in artificial intelligence research, which is on a scientific mission to push the boundaries of AI to solve complex problems without needing to be taught how, so hopefully they'll make all of our lives a little bit easier. He is also on the external advisory board of Cambridge's Centre for the Study of Existential Risk. He graduated from Imperial with a first in computer science in 1984, obtained his PhD from Cambridge University in 1988, and since then he has carried out work and published on artificial intelligence, robotics, logic, dynamical systems, computational neuroscience, and philosophy of mind. His book Embodiment and the Inner Life was a significant influence on the film Ex Machina, and he was a scientific adviser on that movie. His current book, The Technological Singularity, runs through various scenarios following the invention of an AI that makes better versions of itself and rapidly outcompetes humans. So I hope that's a nice lead-on from where Isaac Asimov brought us with AI in the 60s. Please join me in welcoming Professor Murray Shanahan. [Applause]

OK, thank you very much for inviting me to this remarkable event; it's a particular honor to be in such esteemed company. My plan here is to situate my field of artificial intelligence in time, so my talk has a nice simple structure: I'm going to talk about the past, the present, and the future.

First of all, the past. It's common to trace the origins of the field of artificial intelligence to a conference that took place in the summer of 1956, the so-called Dartmouth conference. This was organized by John McCarthy, and McCarthy coined the term "artificial intelligence" in this document, the proposal for that summer school. It's very interesting to look back on this proposal because it's remarkable in many ways. You probably can't read this, so I'm just going to read it out to you. He says: "We propose that a 2 month, 10 man study of artificial intelligence be carried out during the summer of 1956 at Dartmouth College... The study is to proceed on the basis of the conjecture that every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it. An attempt will be made to find how to make machines use language, form abstractions and concepts, solve kinds of problems now reserved for humans, and improve themselves. We think that a significant advance can be made in one or more of these problems if a carefully selected group of scientists work on it together for a summer." This is a terrific document, I have to say, for many reasons, but I think it's safe to say that John McCarthy at that time seriously underestimated the difficulty of the problems he was setting out. He, with Marvin Minsky, Nathaniel Rochester, and Claude Shannon, the father of information theory, were the co-authors of this proposal. I want to give a couple more quotes from this remarkable proposal. It's only three or four pages long, but it really did anticipate the major
paradigms of research in the field of artificial intelligence. In particular, the predominant paradigm up to and including the 1980s was so-called symbolic artificial intelligence, which is all about manipulating language-like representations and carrying out logic-like inference. The proposal says: "A large part of human thought consists of manipulating words according to rules of reasoning and rules of conjecture. From this point of view, forming a generalization consists of admitting a new word and some rules whereby sentences containing it imply and are implied by others." That's really the essence of the one whole school of artificial intelligence that then dominated the field for a very long time. But there was a competing paradigm, the connectionist paradigm, much closer to neural networks and to being inspired by the workings of the brain, and that was anticipated in the proposal as well. First of all, the proposal discusses machine learning: "Probably a truly intelligent machine will carry out activities which may best be described as self-improvement. Some schemes for doing this have been proposed and are worth further study." So here we are, quite a few decades on, and we're still studying those things. Indeed, many of the largest challenges that face the field of artificial intelligence today were anticipated in this 1955 proposal for the conference the following year, 1956. For example, elsewhere in this short document it says: "How can a set of (hypothetical) neurons be arranged so as to form concepts? ... Partial results have been obtained but the problem needs more theoretical work." It certainly does: that was 1955, and it's still very much the case. And again: "A number of types of 'abstraction' can be distinctly defined and several others less distinctly. A direct attempt to classify these and to describe machine methods of forming abstractions from sensory and other data would seem worthwhile." Now, that is almost a perfect description of one of the main challenges to the whole neural network paradigm of machine learning today, and that was from 1955. So in looking forward, it's quite informative also to look back.

OK, now, people will be very aware, and it's no doubt one of the reasons I'm here, that AI is a very hot topic and has attracted a great deal of attention recently. But it hasn't always been this way. Over the decades following the Dartmouth conference in 1956 there has been a whole series of cycles of hype and disappointment, and the troughs of disappointment are often called AI winters. Really, there were two AI winters. Following the early successes that came after the Dartmouth project, the first AI winter occurred in the early 70s and coincided with the publication of the Lighthill report in the UK and other somewhat downbeat assessments of the outlook for the field, and there was a lull in interest and in investment and so on. Oh yes: when I first produced this slide, I didn't have anything on the y-axis here, and the rule of my lab is that you just don't ever come to me with a graph where the axes are not labeled; it's an absolutely fundamental rule. I realized as I was preparing for this talk that I had violated my
own fundamental lab rule, so I had to put something there to label the y-axis. What the heck is the y-axis representing? Well, it's something like scientific credibility, and media interest, and investment. All of those things seem to be highly correlated, but don't ask me to tell you what the causal relations among them are. OK, so following this first AI winter there was then a great surge of investment, partly driven by the Japanese fifth-generation project, which led to a lot of interest in so-called expert systems. That didn't really live up to expectations, and there was a second AI winter in the 90s. It's really only from the early 2010s that things started to take off again, largely thanks to the successes of an approach to machine learning called deep learning, which is based on neural networks. That's what I'm going to talk a little bit about now, and it leads us to the present. You'll notice, by the way, that this curve goes off into the stratosphere, and I'm not going to predict exactly what happens to it after that point.

So now, some of you will of course be very familiar with how artificial neural networks work, but the majority won't, so I want to give you a brief tutorial on artificial neural networks, explaining how they work and the kinds of successes they've led to today. A neural network is a collection of very simple computational elements, artificial neurons. An artificial neuron essentially computes the weighted sum of a number of inputs, then applies a nonlinearity, such as a hyperbolic tangent or a rectified linear unit (there are various choices), and its output goes on to all the neurons it's connected to. Now, in the typical arrangement we see in today's neural networks, there is a series of layers. On the right-hand side here you'll see a typical feed-forward network. On the left-hand side of this network you have an input layer, which is where whatever input you're putting in arrives; then you have a number of hidden layers (here I'm only showing one), which are also neurons; and they're connected to an output layer, which also contains a number of neurons. I'm only showing one hidden layer, but the whole point of a so-called deep neural network, which is where the much-hyped phrase "deep learning" comes from, is that there are many, many layers, which learn to represent features of increasing abstraction. That's one of the keys to why deep learning has been so successful. And it's a feed-forward network, so the flow of information is strictly from left to right in this diagram. I haven't shown the arrows on the connections, but there are no feedback connections, and in that respect it's pretty different from the kinds of networks we find in the brain that, for example, Danny Bassett was talking about earlier today. So that's what a neural network looks like. Well, what can we do with that kind of neural network? Given a whole set of input-output pairs, we can train it to perform a computation that we might want it to perform: for example, to take an image and label it with what's in the image, so a cat, a dog, a car, and so on.
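To make that concrete, here is a minimal sketch in Python (using NumPy) of the kind of network just described: a 784-neuron input layer for a flattened 28-by-28 image, one hidden layer of 30 neurons, and 10 output neurons. The layer sizes match the digit example that follows; the weight scale and the choice of rectified linear unit are illustrative assumptions, not anything specified in the talk.

```python
import numpy as np

def relu(x):
    # Rectified linear unit: one of the nonlinearities mentioned above.
    return np.maximum(0.0, x)

rng = np.random.default_rng(0)

# Weights for a tiny feed-forward network: 784 inputs (a flattened
# 28x28 image), one hidden layer of 30 neurons, 10 output neurons.
W1 = rng.standard_normal((784, 30)) * 0.1
b1 = np.zeros(30)
W2 = rng.standard_normal((30, 10)) * 0.1
b2 = np.zeros(10)

def forward(x):
    # Each neuron computes a weighted sum of its inputs and applies a
    # nonlinearity; information flows strictly forward, layer by layer.
    h = relu(x @ W1 + b1)   # hidden layer activations
    return h @ W2 + b2      # output layer (raw scores, one per class)

x = rng.random(784)         # stand-in for a flattened input image
print(forward(x).argmax())  # index of the most active output neuron
```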
So here's the canonical example that you learn in neural networks 101: the handwritten digit dataset. Ignore the training data on the left for a moment and just look at the pixelated number two. You present an image, in this case 28 by 28 pixels, to a set of input neurons; that's your input layer, the leftmost layer. Then you have a number of hidden layers, in this case just one, which might contain, say, 30 neurons, and then you have an output layer, where each neuron is trained to represent one of the digits that can appear in the image. So there are going to be ten of these output neurons, so that you can recognize the digits nought to nine, and in this particular case you want the neuron that represents the number two to produce a high value and all of the others to produce a low value; that indicates that what you've got there is the number two.

Now, the way these things are trained (sorry about the flicking backwards and forwards): as you'll see on the left-hand diagram here, learning works by adjusting all of the weights on the incoming connections to a neuron. This neuron here, represented by this circle, is going to compute the weighted sum of its inputs, if you recall, and these w's are the weights, the strengths of the incoming connections; you train the neuron by adjusting these weights up and down. The way it works is this. Suppose we present this digit number two as the input to the neural network. You start off with a network with random weights, so the output is not going to be right most of the time; it's going to tell you it's a number that it isn't. What you do is work out the error in the output, and then what you really want to know is how to adjust all the weights to make this network a little bit better at recognizing that particular image. Mathematically speaking, it's all about gradients: the gradient of the error with respect to the weights in your network. Having worked that out using calculus (you get a nice expression for it, because all of this is differentiable), you can adjust the weights a little bit to make the network a little bit better at recognizing this particular image. Of course, if you just do that once, it's not going to do very much, so what you really want is a very, very large amount of training data, and that's what we see on the left, a little sample of the training data. The training data has to come with labels, so you have to know what the correct answer is. Then you feed all of this training data through your neural network, and with each example you adjust the weights a little bit, adjust the weights a little bit, adjust the weights a little bit, and the network gradually gets better and better at recognizing digits. And then, of course, you want to test it on examples it has never seen before.
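The talk doesn't name a particular loss function or update rule, so the following sketch fills that in with common illustrative choices: stochastic gradient descent on a softmax cross-entropy loss, backpropagated through the two-layer network defined above. The random "training examples" are placeholders for real labeled digit images.

```python
def softmax(z):
    # Turn raw output scores into values between 0 and 1 that sum to 1.
    e = np.exp(z - z.max())
    return e / e.sum()

def train_step(x, label, lr=0.01):
    global W1, b1, W2, b2
    # Forward pass, keeping intermediate values for the backward pass.
    h_pre = x @ W1 + b1
    h = relu(h_pre)
    p = softmax(h @ W2 + b2)

    # Error at the output: we want the correct digit's neuron high and
    # the rest low (with cross-entropy the error signal is p - target).
    target = np.zeros(10)
    target[label] = 1.0
    d_out = p - target

    # Gradient of the error with respect to each weight, obtained by
    # the chain rule (backpropagation; everything is differentiable).
    dW2 = np.outer(h, d_out)
    db2 = d_out
    d_h = (W2 @ d_out) * (h_pre > 0)  # derivative through the ReLU
    dW1 = np.outer(x, d_h)
    db1 = d_h

    # Adjust every weight a little bit in the downhill direction.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

# Placeholder training loop: real training would sweep over many
# thousands of labeled images, nudging the weights a little each time.
for _ in range(100):
    train_step(rng.random(784), label=int(rng.integers(10)))
```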
Eventually you get something that's very, very good at recognizing these handwritten digits. So that's the way neural networks work, and it's a tremendously useful bit of technology. One of the reasons it's become so useful is that these days we can throw an enormous amount of computation at this problem and train on very, very large quantities of data. With a simple example like these little digits it's not very difficult, but what we really want is more realistic images, realistic photographs with objects in them, and we want to be able to label them. To learn on a larger dataset like that, you need a lot of data and a lot of computation, and that's really what led to the breakthrough that puts us where we are today: being able to apply a large amount of computation to a large amount of training data.

OK, so that's neural network learning. Reinforcement learning is a different kind of learning: it's basically learning by trial and error. Here the idea is that you've got an agent which is constantly getting observations from the environment, getting rewards, and carrying out actions, and you want this little agent to learn to carry out the actions that will maximize its reward over time. That's called reinforcement learning, and it's the kind of learning we all do if we're, say, learning to play tennis: you think, ah, that stroke worked, and even unconsciously your brain is saying, I'll do that one again.

Now, the great contribution that DeepMind made back in about 2014 was to put these two ideas together: to take a deep neural network, of the sort I described with many, many layers, and use it as the bit in the middle of this agent that works out what actions to perform, given a particular input, to maximize the reward over time. That's so-called deep reinforcement learning, and putting those two ideas together led to some pretty spectacular results. The first was a network, fairly medium-sized by today's standards, certainly not large, that was able to learn to play retro Atari video games like Space Invaders and Breakout. This system took as its input just the raw pixels and the reward signal, and learned to get to the level of a top player for a large number of games; it would be given a completely new game, start training from scratch, and after a certain amount of time it would be really good at that game. Basically the same technology, with some additional search (and quite a lot of engineering, of course), was then applied to the game of Go, and as you probably all know, AlphaGo then defeated the world's top Go players, which is a spectacular achievement.
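The talk doesn't go into the mechanics of reinforcement learning, so here is a hedged sketch of one classic algorithm, tabular Q-learning, on a made-up five-state "corridor" environment; the environment and all the parameters are illustrative assumptions. In deep reinforcement learning, the table below would be replaced by a deep neural network that maps raw observations, such as pixels, to estimated action values.

```python
import random

# Toy environment: states 0..4, start at 0; action 1 moves right,
# action 0 moves left; reaching state 4 yields a reward of +1.
def step(state, action):
    next_state = max(0, min(4, state + (1 if action == 1 else -1)))
    reward = 1.0 if next_state == 4 else 0.0
    return next_state, reward, next_state == 4

# Q[s][a] estimates the reward the agent can expect, over time, from
# taking action a in state s; learning proceeds by trial and error.
Q = [[0.0, 0.0] for _ in range(5)]
alpha, gamma, epsilon = 0.5, 0.9, 0.1  # learning rate, discount, exploration

for episode in range(200):
    state, done = 0, False
    while not done:
        # Mostly take the best-looking action, occasionally explore.
        if random.random() < epsilon:
            action = random.randint(0, 1)
        else:
            action = max((0, 1), key=lambda a: Q[state][a])
        next_state, reward, done = step(state, action)
        # Nudge the estimate toward reward plus discounted future value.
        target = reward + gamma * max(Q[next_state])
        Q[state][action] += alpha * (target - Q[state][action])
        state = next_state

print([round(max(q), 2) for q in Q])  # learned values rise toward the goal
```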
OK, so we've had the past, and that's roughly the present, but what about the future? Well, this kind of deep learning technology does have considerable applications; it has a great many uses, for example in manufacturing, scientific discovery, transport, healthcare, and marketing (Google is quite interested in marketing). But the AI systems deployed in all of these sectors today are all specialists. Basically, a new system needs to be put together, the architecture designed, and trained from scratch for each application. This is very different from human intelligence. Humans are generalists, and a general intelligence is able to perform a huge variety of tasks: the very same system can apply itself to an enormous number of different tasks, and very often, with very little new data or new learning, we can apply ourselves to a new situation or a new task that we've never seen before.

Now, the original goal of the field, the goal that was in the minds of founders like John McCarthy back in the 1950s, was very much to build this kind of artificial general intelligence. They didn't use that term in those days; "artificial general intelligence" is a recent introduction, coined to clearly distinguish the holy grail of building a general AI from the more specialist applications we have around at the moment. But that was certainly what they had in mind, and it's the kind of goal that the most ambitious AI researchers still have in mind today. There are quite a number of obstacles to achieving that kind of general artificial intelligence, and to be honest we don't really know exactly what those obstacles are today; if we did, we'd be a little further along the road to addressing them. But my favorite three, as I see it, are these: common sense, abstract concepts, and creativity.

What I mean by common sense is an understanding of the everyday world and the consequences of the actions we might carry out in it: things like moving things around and picking things up, and the consequences our actions have on people as well, the things we say and their social consequences. Having a deep understanding of that, and when I say deep I mean the kind of understanding any child has by a certain age, and endowing computers with that kind of common-sense understanding of the ordinary everyday world, is something we still haven't achieved.

Abstract concepts are a related challenge. By the ability to acquire an abstract concept I mean the ability to distill the essence of past experience and apply it in radically unfamiliar contexts. This morning John O'Keefe was talking about place as being a very abstract concept, and indeed it is; or take the concept of a room, of a building, of a table, even of something to sit on. Those are all quite abstract concepts. At the moment we know how to build networks that will take a picture of those things and often label it correctly, but those systems don't have any real understanding of what those objects are for, what kind of role they play in ordinary human life, the real semantics of those things. We still don't really know how to build neural networks, or any other kind of AI system, that are able to acquire that kind of abstract concept from the
ground up, from the raw data of interacting with the world. And then there's creativity, which characterizes the productive yet open-ended propensity to try out new actions and ideas. Here I'm not thinking about the kind of thing that Picasso or Schrödinger were good at; I'm thinking about the kind of thing that children are amazing at in just the first few years of their lives. So those are the obstacles.

Now, I've got two minutes left, and as you know there are no questions allowed in this session, so I'm just going to give you the answers to all the questions I know I always get at the end of these talks. The first: why build AGI at all? Current AI systems, as I alluded to in the previous slide, lack a true understanding of the world and what it contains. A medical diagnosis system doesn't really know what a person is, and that lack of real understanding of what a person is will manifest itself in flaws and shortcomings and mistakes, eventually. Similarly, an autonomous vehicle doesn't really know what a car is in the sense that we do. General intelligence goes hand in hand with true understanding, and that's a serious motive for trying to build general intelligence. In every domain where today we apply a specialist AI, an artificial general intelligence would certainly be more capable and more robust for having a real understanding of the world around it. Applying AGI to a new domain would also not require the significant engineering effort that it does today. And finally, an artificial general intelligence would be able to make creative connections across domains, unlike today's specialists.

Now, I have 14 seconds left according to this, but I'm going to go 30 seconds over to answer the questions I always get. The first one: will AGI be safe? Well, I have to say that the public can be very much misled by a certain alarmist AI narrative. We really don't know what AGI will be like if and when we manage to build it, but I can tell you one thing that I don't think is applicable, which is the science-fiction scenario where AI becomes self-aware and destroys humanity to protect itself. That's a very anthropomorphic version of artificial general intelligence, so I really do feel that Terminator images in articles about artificial intelligence should be banned, and this is the only one you're likely to see in any of my talks, where it's grayed out with the word "banned" across it. There are many serious issues to do with AI ethics that are much more urgent, but that's not the topic of this talk. However, there are some serious issues to do with AGI safety, and they're really more to do with how we specify to a very powerful AI exactly what we want it to do, in such a way that there won't be unintended negative side effects, much as there are in the story of The Sorcerer's Apprentice; that's a much more appropriate depiction of AI safety.

And then the last slide: the other question I always get is, when will AGI happen? I'm sorry to disappoint you, but although many people speculate, I really think no one knows, so I'm not going to give you a date, and don't believe any dates that you hear people put on this. Now, of course, somebody's going to be right, but they're just going to get lucky.
So we really don't know, and this is how I tend to characterize the situation. Imagine a timeline: we probably need a certain number of conceptual breakthroughs to achieve artificial general intelligence, and we don't even know what those breakthroughs are or how many there are. Moreover, we really don't know how long it will take to achieve those breakthroughs or what the intervals between them are, and between them there will be periods of steady progress. I think we're probably in a period of steady progress at the moment, awaiting some unknown sequence of conceptual breakthroughs before we achieve AGI. So the long and the short of it is: if you are worried about AGI safety, you don't need to panic just yet. I'll leave it there; thank you very much. [Applause]
Info
Channel: Trinity College Dublin
Views: 3,175
Keywords: Trinity College Dublin, Trinity, TCD, University, University of Dublin, Dublin, Ireland
Id: FwKu9Tj2dRs
Length: 28min 8sec (1688 seconds)
Published: Tue Nov 20 2018