(atmospheric music swells) - This is a fascinating
story we have for you of a senior Google engineer who says one of the company's
artificial intelligence systems has become a sentient being. - He believed one of the company's artificial
intelligence chatbots had become sentient. - Engineer Blake Lemoine, says a chatbot project he was
working on called LaMDA, can express thoughts and feelings equivalent to that of a child. - Google has rejected claims that one of its programs
had advanced so much that it had become sentient. - That's I think the big issue, right? Is that a lot of people
get bogged down deciding, well, what does sentient actually mean? - Hey everyone, it's David Bombal back
with a very special guest. Mike, welcome. - Oh, thanks for having me. - So, Mike, I've seen a lot of your videos
on YouTube's Computerphile, millions of views on some of
the topics that you've done, but can you just introduce
yourself to the audience for people who might not
have seen those videos or don't know what you're doing, 'cause you were telling me you're offline, YouTube isn't your main
thing; you do more than that. - That's right. Yeah. So actually, in some sense,
YouTube is an aside for me. It's just something I did 'cause
I thought it would be fun. I'm an academic at Nottingham,
Associate Professor, and I work teaching security, and I research AI and computer vision. It just happened that we
have some ties at Nottingham to Brady and Sean who do things like Numberphile and Computerphile. And so Computerphile was kind
of fledgling, (indistinct). It had, it was a bit established when I, when I started doing them and it kind of just took off really, I think because I did topics
on security and AI and things, people thought those were interesting and so I get a lot of views on those now, but it is still a bit peculiar when people say hello to me, because I just turn up
and do normal things that the rest of the time. - [David] So you get stopped on the street and-
- It has happened. - Yeah. My wife is never
impressed when that happens. She just thinks this is ridiculous. But you know, it happens
from time to time. I do really enjoy it. And I get lots of emails
from people saying, "Thanks for your videos. I enjoy them." And that's why I do it. That's what it's for, for me. - And I loved what you said offline, that you met someone who said that they started computer science because of you, that's what they-
- I've had a couple of emails like that and that's, those are the best emails, right? 'Cause I want people to
learn about computer science. I love computers. I'm a massive geek. Basically, I program for fun. And the more people do that,
the more it's a win for me. So, you know, if I can encourage a few
people by doing videos, that's what I really want to do. - That's fantastic. I, one of the videos I watched, obviously in preparation
for this interview is this recent video that
you've put out about AI. - Yeah. - And that's gonna be the topic that we wanna talk about today. So let me lead with this. I get these kind of emails all the time. "David, is it worth me
studying cybersecurity?" "David, is it worth me studying computers because AI gonna take all the jobs away?" And I think movies over
the years, like, you know, there's been so many of these movies where the robots take over and, this, can you talk about this? And you can go into the
details if you like, but I think this sort of recent event that you spoke about in your video, hasn't helped the conversation at all. So can you tell us about that? Yeah? And what your thoughts are
about, you know, what happened. - Yeah. So no, I absolutely agree that it didn't help the conversation at all, okay? I think in my video that was kind of what I tried to end with, basically. You know, in some sense the nuances of where this AI is don't interest me that much. All I know is it's not where
they're suggesting it is, at least that's, you
know, that's what I think. I think, I suppose at the moment AI is very application driven, right? So a lot of it is supervised. There is, there is other ways of doing it, but a lot of it's supervised, which means that you have
some kind of training set with some inputs and some outputs that you're trying to
get the model to learn. And then you just train the
model until that happens. That can work really, really, really well. And so for my own research, I do this on things
like image segmentation, where I'm trying to
find objects in images, and you know, medical image segmentation and things like this. But, you know, in practice, if I then take that network and try and run it on
street scenes, it won't work because it's not trained on street scenes. It doesn't know what they are. It hasn't got any ability to
go, "Oh, it's a street now," you know, and take what
it's learned somewhere and apply it somewhere else. You know, retraining a network is really the only way to do it. And that involves even more data, right. So I don't think at the moment,
it's realistic to suggest that there's going to be
some general intelligence that can just do all of our jobs, right? You know, you've seen GitHub Copilot that just produces text, code? And sometimes it will
produce a useful function and sometimes it'll produce
a function full of bugs that you've gotta then spend time fixing. And have you actually saved any time? I dunno, the jury's out, I think. So I wouldn't worry at the moment. I'm not worried. I mean, maybe designing things to replace myself is a huge mistake, but I don't think we're there yet. - So I mean, tell us, just for
people who haven't seen it, haven't seen your video and like haven't read
perhaps what's going on. There's this Google person now- - LaMDA? - Yeah-
- Yep. - So what is LaMDA and what
was he basically saying? - Google LaMDA is a, it's what we call a large language model. So it's basically a very,
very large neural network designed in a certain way. They're all designed in a very similar way and it has more parameters in it than we've ever seen
really in a model, right? Or GPT-3, which is also very, very big. And so really what this brings to the table is not so much something new that we've never seen before in AI. It's just that it's huge, you know, orders of magnitude bigger than the kind of networks I would use to do complex imaging tasks. And what they've basically done is they've trained this
model to read a sentence and then predict what
the next word will be. And so you could imagine, that if you wanted to do this by hand and you had infinite resources, you could just look at every sentence that's ever
been written by humans and work out for any
given, let's say, 10 words, what the next word will always be. And if you did that and you had that list of
all the possible inputs, you'd do pretty well
at generating sentences because at the end of the day, that's what people say, right? This is, you've got it on record as what they've said in the past, you can just say those things again. And so this model is one of those, this model is one where
you put in some sentences. So you might put in a sentence that says, what do you think about quantum physics? And then what the model will do is predict the next likely word. And it will probably say, well I'm gonna start by
saying, "What I think is," and then generate some plausible
text on quantum physics because people have written
about quantum physics before and that data is in the training set. What it hasn't done is learned
what quantum physics is or connect it to an internet resource that has information on quantum
physics and looked it up. So, in some sense, it's
a bit like the, you know, in Star Trek, you've got
the computer you can talk to and they often ask the computer
to do things like, you know, put the shields up or whatever. - Computer dim lights. - It's like that, but it's not connected to any
kind of anything on the ship. So it just talks to you and talks back, but it never actions anything. It never, it never has, you know, it's just going from
the text in the training set. And I think that's something
that's perhaps lost a bit in when it's discussed. It's not connected to anything. It doesn't even have a memory, basically, and so it can't reflect on past experience because it has no place
to store past experience, it has no record of those events. And so when it produces sentences, that look really, really interesting, they're actually just really
interesting sounding sentences, you know? And I think so anyway,
I mean, if I sort of, I digress slightly, but you know, in this particular
case, what happened was, someone from Google, who I think was in the ethics department, I don't think he was actually responsible for developing this AI, basically said, "Look at
this chat I've had with it. Don't you think it's sentient?" basically is what he said. And the answer I think for me, and pretty much everyone
who understands these models was, "No, no it's not." And what, and I think the thing that
bothered me most about it was not particularly
one person saying this 'cause he's very entitled to
his opinion, right, you know, I think it was that the media
took it massively seriously and it was all over everywhere:
"Is this the next thing?" And that bugs me somewhat because I don't think it
helps the conversation like you say, right? People start, people who don't know what a big language model is, are gonna be a bit worried about this. And there's really no reason
at this time to be worried. And that bothers me slightly, which is why I do my videos to
try and tell people about it. - I mean, the problem is the
movies predict this happening. And then people see this stuff in the news and it's like, "It's the end of the
world, and end of my job. Arnold and the robots
are gonna take over." So it really doesn't have to be like, it doesn't, it really doesn't help. And I like what you said, I mean, in your video, which I'll link below, you said something that
I thought was hilarious. You said, "Can Python
functions get lonely?" - Yeah. - Can you explain what
you were saying by that? - Yeah. So the, one of the comments in
the original chat transcript between this researcher and his friend, and his colleague, and this LaMDA AI, was, "Do you get lonely?" And it's spouted off a whole paragraph about how lonely it is. And it doesn't make any sense because it's a function call, right? So you put in your words
at the top, it runs what is essentially a
big transformer network, which is pre-trained on all this data, and then it spits out words at the bottom, which you read, and then
it stops executing, right? Because there's no kind of ongoing process like there is in my, I
mean, I like to think that when I'm not immediately
saying something to you, I'm still, there's something
going on in there, right? Maybe, I mean, you know,
I can't prove it to you, but this is not the case, you know? And it's just like, when you run, I mean, I made a joke about it,
but when you run, you know, reverse string in Python, you don't worry that it gets
lonely the rest of the time because it's not executing. That's just some code that executed. It finished executing and it
just lies dormant in memory doing absolutely nothing of interest. And that's, for me, kind of
what this model is doing. If they developed a model that
was always on in some way, like maybe it was always doing something and it had memory and it had storage, I could, I still probably would think I would need some convincing
that it had any kind of, you know, higher level thought process, but at least it would
be plausible, you know? You would sort of think, "Well, at least it's got
something going on in there," but I don't think it's designed that way. It's designed as a very,
very big reverse string. And, you know, I don't worry about those
things being sentient. - Yeah, but it's crazy. I mean, I mean the, well, I mean in my opinion,
because they, kind of, they were implying that
this AI or whatever was like a human, or
equivalent to a human, and it seems like that's quite a stretch, but in, you know, popular culture, that's what people equate to AI it seems. - Yeah. That really, that's I
think the big issue, right? Is that a lot of people
get bogged down deciding, "Well, what does sentient actually mean?" And that doesn't interest me because when anyone uses the word, they're not using it in
a different definition, they're using it in a
definition we think of as like Terminator from Skynet, you know, this researcher wasn't saying, "Oh, I think it's sentient,
but I define sentient as something like a slightly
convoluted if statement," right? He was saying, "I think it's like a person, and it's got memories, and it's got experiences,
and it gets lonely, and it needs a lawyer."-
- Feelings, and that's not- - Yeah. And with zero evidence to support this. And indeed, it's not even so much the evidence, it's that it doesn't even make sense. So I think you have to
be extremely careful using the word sentient, not because you might have
a different definition, but because everyone has the
actual same definition, right? Which is actual, you know, human-level cognitive ability. So I don't spend a lot of time worrying about what the
definition of sentience is, because if I go to someone
in a conversation and say, "This is sentience," I
think we both understand implicitly what that
means to me to say that. And so I don't, I think that arguing about the
definition is a bit silly because we actually all secretly
agree on the definition. - Yeah. I mean, I think for the
general population, I mean, I'm not into the AI piece like you are, and that's why you, I
wanna talk to you about it. You know, I just think people go off
movies and popular culture. That's sort of what people,
that's the impression they get and that's why it was so
big on the news perhaps. But can you explain AI
versus machine learning and like, what is machine learning? What is AI? And perhaps just take
us down the road now. - Yeah. Okay. - Like teach us sort of
the basics of this stuff. - Yeah. So AI is misused in the sense
that it's now a catch-all term. And I will admit, I do
that to an extent myself. And it's partly because I'm lazy, right? I think it's because, it means I don't have to define the
exact thing that I'm doing. At any given time, car engines are slightly different, but combustion engines all do much the same thing, even though one's got more cylinders and one's got fewer cylinders, and one has a turbo and one doesn't. I don't go on about those details. I just say, "I've got a car and it goes." So AI, I think, is a catch-all that
includes machine learning. So you've got AI as a big
kind of thing of stuff with loads of stuff in it. And even my maze solving video
where I just do very simple, looking around the corridors of the maze would be defined in
some sense as AI, right? But Dijkstra's algorithm that we use to do network routing and things, and other similar algorithms, you could define them in some ways as AI, because they adapt to messages coming in and they change weights
and paths and things. But we wouldn't go as far as
to say they were, you know, anywhere, you know, "smart"
in some sense, right? So I think AI is quite a broad term. And then there are things
like genetic algorithms, evolutionary algorithms, which
do slightly different things. They are arguably less popular, or less prevalent perhaps would
be the right way to put it, but they also come under
the umbrella of AI. So AI is a very big umbrella term, which basically encompasses most, most things where you could imagine it was sort of intelligent. And then in that, you've
got machine learning. And machine learning is just the idea that you want to try
and program a computer without having to program it, essentially. You wanna give it some input examples or some other mechanism
from which to learn, and it comes up with its own
rules for what it's gonna do. So a decision tree is a good
example of a very simple, conceptually simple
machine learning approach, where you have some kind of data and every time you make a decision, you just split it in two. So maybe you're trying to
analyze financial data, to decide whether people get
a new credit card, right? So the first decision you make is "Have they ever defaulted
on a credit card?" Yes goes this way, no goes this way. And then the next decision is, "Okay, what's the current credit limit?" It's above 7,000 it goes this way, below 7,000 goes this way. And you just split this data
into two, and two, and two, until at the end, you get the actual nodes that
have the decisions on, right? And it's machine learning, because what you can do is you can, you can basically create this tree, but actually change the
numbers and values in it and the decisions based on the data. So you can say, "Well, actually, maybe 7,000
doesn't work that well. We're gonna have it at 6,500 and change the thresholds and things. And you can do this all automatically in the training process. So that's the kind of
thing we're talking about with machine learning. Now what happens of course, is there's a big push in deep learning, which I can also talk about, but-
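(As an illustration of the kind of decision tree Mike describes, here is a minimal sketch using scikit-learn's DecisionTreeClassifier; the credit-card features, numbers and labels are invented, and in practice the training process chooses its own split thresholds from the data.)

```python
from sklearn.tree import DecisionTreeClassifier

# Invented training data: [has_defaulted_before (0/1), current_credit_limit]
X = [[0, 9000], [0, 5000], [1, 7500], [1, 2000], [0, 12000], [1, 6000]]
y = [1, 1, 0, 0, 1, 0]   # 1 = approve a new card, 0 = decline

tree = DecisionTreeClassifier(max_depth=2)
tree.fit(X, y)                       # training picks the split questions and thresholds

print(tree.predict([[0, 6500]]))     # decision for a new, unseen applicant
```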
- Yeah, that'd be great. - Yeah, but-
- 'Cause I mean, we just hear the, I just
hear these buzzwords, I mean, preparing for this interview, just like buzzword after
buzzword after buzzword. And I think a lot of us, you know, who are not in this sort of
field, but are interested in it. So yeah, if you can define as much of that as you can- - Yeah. Yeah. Sure, so- - and cut through all of the buzzwords for us. - So you've got, yeah, sorry, you've got AI, right? Which is right here, and in some subset of that
is machine learning, which includes what I would kind of call traditional machine learning, like support vector machines, decision trees, random forests, right? These are all, linear regression even, where you are just fitting
a line to some data. And then we have slightly more complicated things like artificial neural networks, which, I mean, kind of take some inspiration from our brains, but I would be very careful saying that, do you know what I mean? I think, you know, to suggest it's like our brain is iffy. And that's what-
- Yeah, yeah. Sure. But people don't
- keep in mind, because that's what they're called. And then what we've
basically done recently is we've made them much, much bigger. And we've introduced other terms
like convolutional networks and transformers and things, but for the sake of,
you know, this sentence they're just bigger, deeper networks that can learn more impressive functions. So they can map that input to
that output more effectively. 'Cause that's what you wanna try and do. You've got some data, you've got some predictions
you need to make on that data. And your hope is that
once you've trained it, some new data comes along and you can make some
good predictions, right? I mean, I, let's think through
an example, suppose I want to do MRI segmentation for
medical imaging, right? So I have 50 patients, some of whom unfortunately
have some kind of illness, some of whom don't,
and I train the network to try and find the
ones that have illness, my hope is that when I then sort of fix that network in place, and bring in some new patients,
it will be able to say whether they have that illness or not. That's the idea. And will have done that by
basically reconfiguring itself based on the examples I
gave it to begin with. - So to use a technology example, it could be something like
spotting, is this a virus or is it just-
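(To make the supervised recipe concrete before Mike expands on it, here is a hedged sketch: a model is fitted to labelled examples and then asked about new ones, just as with the MRI patients or David's virus example. The feature values and the use of scikit-learn's RandomForestClassifier are illustrative assumptions, not how any real antivirus works.)

```python
from sklearn.ensemble import RandomForestClassifier

# Invented features per executable: [system calls made, size in KB, packed (0/1)]
X_train = [[120, 40, 1], [15, 800, 0], [200, 35, 1], [8, 1200, 0]]
y_train = [1, 0, 1, 0]        # 1 = known malware, 0 = known benign

model = RandomForestClassifier(n_estimators=10, random_state=0)
model.fit(X_train, y_train)   # learn from the labelled examples

# A new, never-seen sample: the hope is the learned patterns still apply.
print(model.predict([[150, 38, 1]]))
```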
- That's exactly right. And in fact, you know, modern antiviruses will include some kind of machine
learning element probably. So, you know, you might
have features derived from, so what we usually put
into the front of a network is something we call features, which is our way of just
saying input data, right? So sometimes you've crafted those features like you've chosen, what you think is interesting
features to give the network and sometimes you'll
just shove something in, like, you know, an antivirus you could, you could choose things like how many system calls
does it make or, you know, how many bytes is executable or how many of this particular character does it have in the executable? And you could choose those features 'cause you think they
are indicative sometimes of malware or not malware. You could stick them in some kind of a machine learning approach with a load of examples and then say, "Right now, change your weights and change your rules, internally, so that on this training set, your prediction is as
accurate as possible," right? And so let's say you do that. You have a hundred thousand
malware and regular samples. You give it to your AI and you just over and over again, say, "Right, you got that one
wrong, reconfigure yourself so that next time you get a
bit better at predicting it." You do that over and over
again, and the hope is then, that when a new virus comes
along that you've never seen those same sort of shall we say suspicious
things exist in it, and the network flags that up. That's the idea. - So the training data is like the, the stuff you give it initially, which would be this a hundred thousand like virus and not virus. - Yes. Yes. - And then you, when you
say weight, you like, it's basically saying like, if it makes like a hundred
system calls rather than 10, or you set some kind of threshold, is that right?
- Yeah. So it, okay. So in a decision tree
or something like that, that's what would happen. There would be some kind
of threshold decision basis at some point during it. For a neural network it's a
little bit more complicated. What you actually do is you treat all these
weights just as numbers, and you just calculate mathematical functions
based on those numbers. So what you might do is multiply all of those
numbers that come in, by some weights, let's say you
multiply one of 'em by two, and one of 'em by negative
four, and one of 'em by a half. And then you add them all
up, and what that does is take a different amount of each one, and then you repeat that
process over and over again, to try and basically learn a complicated mathematical function. That's really the only thing it does. You know, you are essentially trying to
fit a really complicated curve through the data, essentially, so that you can distinguish
between real and fake malware, or, you know, regular
executables and malware. And, and so the weight, when I say weight, what I'm really talking about
is the parameters of my model, which influence its mathematical function. - So then you would adjust the weights and the mathematical functions based on the result that it gets, did it correctly determine
that this was malware or-
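(A rough plain-Python sketch of the weighted sum Mike has just described, plus the kind of adjustment he explains next: compare the output with the desired label and nudge each weight according to its influence on the error. All the numbers are invented, and a real network does this over many layers with backpropagation.)

```python
import math

def predict(features, weights):
    total = sum(f * w for f, w in zip(features, weights))  # multiply inputs by weights, add up
    return 1 / (1 + math.exp(-total))                      # squash to (0, 1): 1-ish = malware

features = [2.0, -1.0, 0.5]     # made-up inputs for one sample
weights  = [0.1,  0.3, -0.2]    # made-up starting parameters
label    = 1.0                  # this sample is "definitely malware"

for step in range(100):
    error = label - predict(features, weights)             # e.g. wanted 1.0, got 0.7 -> 0.3
    weights = [w + 0.1 * error * f                         # nudge in proportion to influence
               for w, f in zip(weights, features)]

print(round(predict(features, weights), 3))                # now much closer to 1.0
```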
- Yeah, exactly. So, so let's suppose we
were doing malware, right? So we think one, an output of one means
it's definitely malware and an output of zero means
it's definitely not malware. An output of 0.5 is not very useful to us because we don't know. What we do is we put in, a piece of malware or
many pieces of malware, we run through let's say
our deep neural network, or whatever it is we are running, and it produces a value between zero and one. And then we say, "Well, look, you gave us a value of 0.7, but actually it was malware this time, so you've got an error of 0.3. I wanted you to produce a value 0.3 higher for that one than you did. So can you adjust your
mathematical function to next time when I put that malware in produce a value of one
and not a value of 0.7?" Now, if you do that
for one malware sample, it's gonna be the worst
machine learning ever, because you're just gonna
give it something else and it's gonna go, "I dunno
what you mean," right? Yeah, because it's this is nonsense. So you have to give it a lot of data. And, and I guess the,
what you're trying to do is calculate the best
average mathematical function that does the best job it
can in the general case of all of these malwares, right? Massively optimizing one
malware is not useful because it's not gonna generalize; it's not gonna apply in real
world to some new malware. So you put in 10, 20, a hundred different
malwares at the same time, and all of them are trying
to go to one or go to zero, and you are trying to change the weights simultaneously do all of
those at the same time, that's what machine learning does basically for a neural network. The process for actually doing this, it's complicated to describe,
but it's fairly intuitive. What you do is you, if you, because all these weights were
involved in the calculation, you put your features in for your malware, you go all the way forward
through the deep learning or the network, then you
calculate your error, and then you go backwards adjusting the weights based
on what you just found out, essentially. And so if a weight doesn't have
any impact on the decision, because let's say it just
sets everything to zero, you won't adjust that weight
'cause it's not useful. You will only adjust the ones because you're calculating the influence that each of these
weights has on the error, you adjust all the ones that
have the biggest impact. And so the network will kind of try and find its
way towards a good function, you know, and we use a process called stochastic gradient
descent often to train this. So what we're doing is we're picking random
malwares and putting them in, and that will, it will
often get them wrong, right? Because it's never seen
any of these things before. And so over time, maybe you just nudge it
slightly in a better direction. And then over many thousands of looks it slowly converges on something that actually makes reasonable decisions. That's, you know, that's the idea. So it's a long process. - And is this what you
would call supervised, or is it, this? - Is, yeah, this is definitely supervised. So supervised is where you
have your labeled (indistinct) what we, you know, our,
we have labels for data. So we're putting our data in, we have some labels against
which we can compare, and that means that we have some idea of how right or wrong the
network is in any given case, and that's very, very useful. And the major-, despite
what people might say, the majority of deep
learning or machine learning is supervised learning because
it gets results the quickest. If I want to detect some illness in MRI, having examples of that illness is gonna be much, much easier. - So, Mike, supervised learning,
if I understand it right, is you giving it examples
of, like you said, actual malware or actual, like
in your MRI scans problems, and then you're supervising
that it got it right, and then you're correcting it? - Yes. And it makes
things much easier, right? So the majority of machine
learning is supervised because it is simpler and easier to do. If you work in applied
areas like me, where you're trying to get things to
work really, really well, if you work in industry, a lot
of what you're trying to do is just minimize that error term. You're trying to get as
close to good predictions in the, for the majority of cases. So getting some examples is gonna get you to converge on that
much, much more quickly. This is, you know, distinct from something
like weakly supervised or unsupervised learning. And there's lots of different variants. So unsupervised learning is
you don't have any labels. Like maybe the data is too big or the data is too hard to annotate or no one can agree on
what the labels are, and so the best you're gonna be able to do is kind of partition the
data into plausible groups. So you can say, "Well, look. We don't know exactly
what all these things are, but we know that this group
is distinct from this group," and that's unsupervised. So an example would be suppose
you work for an online shop and you have a load of data on what different customers have bought. One thing you might do is
start trying to group customers into some kind of plausible groups based on roughly the things they, and they're not all gonna have
bought the exact same thing, so it's not gonna be trivial,
but they might have bought, so someone's buying
mostly dog related stuff and someone's buying
mostly technical gadgets. And then what you can do is say, "Well, look, I put all these
people in the tech group, and this guy bought this really nice new microphone for his camera, so I'm gonna recommend that now to other people in the group." And maybe I get a few hits and I sell a few cameras that way. But you can get much
more complicated in this, but that is an example of
perhaps unsupervised learning, where you don't need to have some kind of label for everyone. You don't need to have
labeled me ahead of time as a tech enthusiast, you just need to look at
the stuff I've been buying and know it's the same as
all these other people, and know that that's interesting, right? Rather than we know exactly what it means. - That's a great example. So in other words, you didn't tell the machine
who the people were. It discovered that based on
the patterns in the data, right? - Yeah. And it didn't really
even discover who they were. It mostly just grouped them, and that allowed us to make decisions based on the fact they were grouped. Now, as it happens, I've given this group a
label of tech enthusiasts, but of course you don't
need to even know that. You just need to know that on average, they buy more TVs than everyone else, so maybe send them emails
about TVs, you know, it's that kind of idea. You can still do supervised learning and other forms of learning
with stuff like marketing and recommender systems and things. But you might imagine that that could be one way you would do it. And I think it's a good example. - The problem I see, like from listening to you,
is reality versus the movies or reality versus the news cycle, because you always hear
about Google doing like, like teaching a machine to play chess or whatever the games are, and it just like magically gets this done. And it teaches itself kind of like, not even knowing what
the rules of the game are-
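(Before Mike's answer, a toy sketch of the reward-driven idea he describes next: play, get a win/lose signal, and nudge your estimates accordingly. This is a deliberately tiny made-up example, nothing like how real game-playing systems such as chess or Go engines are actually built.)

```python
import random

win_probability = {"a": 0.2, "b": 0.7}   # hidden from the learner (invented numbers)
value = {"a": 0.0, "b": 0.0}             # the learner's current guess for each move

for game in range(1000):
    # Mostly play the move that looks best so far, occasionally explore at random.
    move = random.choice(["a", "b"]) if random.random() < 0.1 else max(value, key=value.get)
    reward = 1.0 if random.random() < win_probability[move] else 0.0   # win or lose
    value[move] += 0.1 * (reward - value[move])   # nudge the estimate toward what happened

print(value)   # move "b" should end up looking like the better choice
```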
- Yeah. So that is a, that's something called reinforcement
learning a lot of the time. Reinforcement learning is
still supervised learning. It's just that you get
the labels as you go from playing the game. So the way it works is, you know, what you might do, is you
play a random game of chess, where you literally move at random, right? And you lose. And so you get a strong
suggestion that maybe next time don't do that, right? That was stupid. So now you move slightly less
at random than you did before, but it's still pretty bad, and you lose again but learn a bit, and this is basically how they train it. So what you do is you play
millions and millions and millions of games of chess. And every time it goes well, you just learn a little something about what was better than that time and what the time before. We're still talking about a network which is a big mathematical
function, right? So we're still talking about
something that has weights, that you adjust, so that when
you put an input state in, you get the best desirable
output state, which in this case, of course, is you won more
often than you didn't. For me, I mean, these are fascinating, 'cause they're trained in a very different way to the way I would train a network. I'd come up with labeled data and put it in, and use the examples. With reinforcement learning you have to start trying
to give it rewards, which is where it gets
its labeled data from. So is it better that you go 25 moves in chess before you lose? Or is it better that you checkmate regardless of how long it takes, right? You know, 'cause you might end
up in a stalemate, you know, there's things with playing
chess where you might say, "Well look, these other
goals are also important," or something like this. And so you can spend a lot of time thinking about different ways
you could train the network, which I think is really interesting. - Perhaps I'm misinterpreting it, but it sounds like the
hype cycle versus reality, there's a big disconnect. Like the people have this vision that the robots are
gonna take over, but you, you don't think that's gonna happen like
anytime soon, right? - Yeah. I mean, I, well, the funny thing is like, I did a lecture once
where I said to everyone, "You know the SHA-1 hash function is absolutely fine," right? And then the next, the
next day, Google released their two PDFs that had
the same SHA-1 hash, right? Now that's embarrassing when
that happens as a lecturer, you know? (David laughing) So I, you know, I don't wanna say, you know-
- You don't want to predict, yeah-
- I don't want to predict it could never happen. What I would say is that the, something that's really,
really good at Go, or something that's really,
really good at chess is really, really good at
chess, and that is it, right? It will do nothing else, right? As far as I can tell, human chess players are
also good at other things. And we don't have that
generalizability yet. And I don't-
- The AI thing, sorry, sorry to interrupt. Is this AGI the difference between like specialized knowledge and AI?-
- And I mean again we could get bogged down in
what the definition means, but I think artificial general intelligence, to most people watching, is just something that is kind of a bit like a human, right? And certainly is very, very general. So you could say, "Right, this now is a
totally different game, learn to play it," and it
would go off and play it. And it would still
remember how to play chess and it could play all the games, you know, and it's just super, super impressive. That doesn't exist. Will it exist? Nah, I dunno. I mean, I think that if we keep
making these models bigger, we'll probably get to a
point within a few decades where they are very impressive
at a lot of different tasks, but I still am not convinced
yet that we've got any real strategy to get past the idea of just you need to like
have a load of data, right? Or a load of, play a load of games. My daughter can have a go at playing a semi-coherent game of chess, just having been told the rules of chess. I mean, she didn't, you know, let's say she's not gonna
be winning any competitions, right? Not yet, but she didn't need to play a
million games against herself to work out what to do, right? There's something that she's doing, that is much, much more impressive than what this AI is doing. That isn't to say the AI
isn't incredibly impressive. It's just very different. I do think that the hype cycle is very different to what we
actually see on the ground, which is that basically a lot
of the time, I mean, you know, aside from playing games
and reinforcement learning and large language models, the majority of what people are doing is trying to find objects, segment images, and these things are mostly done in the supervised way, and they don't generalize
but we don't care because we were trying to find those specific objects so that's good. And if we needed 'em to do something else, we'll retrain them to do something else. - Yeah. 'cause my next question, I think you've already given us the answer and maybe you can just elaborate is what is AI really good at, compared to you know, it just seems
like it's like automation, automation has its place,
but you still, it takes like, it's just correct me if I'm wrong but it seems to take away like low
level tasks that are boring and monotonous, or
difficult for a human to do and then humans can
concentrate on other things. What is AI really good at? And where do you see it going? - Yeah. So yes, automation is exactly it, you're right on that, but with the caveat that you've gotta have found a good way to train it to automate. It won't just automate stuff. You can't just stick it on a production line and say "automate that for me," because it won't know what to do. So yeah, from my point of view, what AI's really good at
is, so before I worked in machine learning and deep learning, I was just a normal computer vision researcher, right? And so, you know, this was like early 2010, something like that, before deep learning really appeared in about 2014. And before that we didn't have it, right? There were some networks, but no one was really
paying attention to them and everyone was just doing normal stuff; what I would describe as image processing. So if I wanted to find
something in an image, what I would be trying to do is come up with rules in my head about what I needed to do to that image to find those objects, and then I would implement
those rules in code. So I'd say, okay, first of all, the goal is we're trying to find, you know, something in MRIs. So first find all the bright pixels. Now find all the bright pixels
that form a continuous blob that's of this size, you know, and I, I start and I try and design an algorithm to find whatever it was I was finding through these if statements
and rules, right? It's just code. And what machine learning lets me do is not worry about these new rules because the problem you have, if you do, if you do it by just coding,
is you get stuck in edge cases. You get stuck on the, you solve 90% of the issues pretty quickly because 90% of the images are trivial. And then that 10%, you
just will never solve because they're just, they don't apply the normal
rules that everything else does. And you know, if you are looking at sort
of medical diagnosis AI, or program, that's a huge problem, that you're just gonna miss 10% because you couldn't
deal with the edge cases. And so from my point of view, coming from image analysis,
that was what it let us solve. Because its mathematical function is very, very complicated, it can learn the edge cases, if you give it sufficient numbers of them. So actually, a lot of the time when I work with biologists or medics and they present me images, I'll say, "These are all very nice, but have you got any worse ones? Have you got any really bad ones?" Because the more bad stuff we give it, the better it will get at working when those things come along. If you train your AI on a 3- or a 7-Tesla MRI
scanner, which is super clear, it won't work when you run it on a 1.5. You know, so maybe you want to get samples from all the different scanners. You know what I mean? It's these kind of
decisions, it actually means that the problem is no longer one of, which if statements do I need
to write to get this to work, it's now, what kind of data and
how do I present the data to this network to get it to work, right? And that, so it becomes much more about
the input and output problem than it becomes about
what you do in the middle, which it just learns. - That's great. I mean, I just wanted to see
if I understand those terms. I see terms like artificial intelligence, machine learning, neural
networks, and deep learning. We've covered all of those. Is that right? - Yeah. So I mean, to go into some deep learning, what I would say in terms of
a definition of deep learning is, you know earlier I said that you might derive features
for your problem, right? So I suppose you're trying to sell cars. What you might do is you might come up with some properties of cars that are relevant
to its purchase price. So you might say, "Okay, how many cylinders has it got? How many, how much horsepower has it got? Has it got leather seats, right? Has it got air conditioning?" And you would have all these features and you would come up with a list of, let's say a hundred different
properties of a car, and you would stick them in some AI, decision tree, neural
network, doesn't matter, and then it would spit
out a value for you, and you would train it
on a bunch of examples, and you would hopefully have
a system that could really nicely predict the value of cars, right? Now, the problem is that suppose I've missed out a feature that's absolutely crucial
to the value of cars. Suppose I forgot to put in the engine size and it turns out that
90% of the car's value is on how big the engine is, right? And so I've given it bad data then, right? And then I have to go back
and have to put data in again, and I have to train it all again. And, you know, it's a waste of time. And what will actually happen if you tried to implement a system where you'd missed out features, is it would never work
as well as you hoped. And a car would come along that looked good on the
features I did give it, but actually had a really small engine and it would massively overvalue
it or something like this, or undervalue it and you give away a really
nice car for almost free. What deep learning does is something called
representation learning. That's (indistinct) because it's deeper, it has the power to
also learn the features as well as the decision
based on those features. So you might say, "Well, I can't bother to
decide all these features so I'm just gonna dump the raw specs or a picture of the car in at the front, and have it determine for me, the value." And it would be looking at the size, the model, shape, the color,
the size of the wheels, and it would do all this and it would extract the features
first inside the network, and then it would use
that to make the decision. So deep learning is often described as just the same network, but deeper. But actually it's a different, I think, a different paradigm where you're basically no longer
handcrafting what you put in, you're just shoving all of it in and it works out what's
useful and what's not. - And so you've explained
neural networks already. Is that right? - Yeah. I mean, so a neural network. Yeah. So we talked about how a neural network
calculates a weighted sum. So it takes some features at
one layer and it weights them and then it calculates the sum
of those for the next layer. And we have something called
an activation function in there as well, which allows the, it basically makes the function
a lot more complex, right? It makes it nonlinear. It makes it learn more powerful things. Modern deep networks actually
have additional operations like convolutions and pooling operations, which work on grids of data often, right? It doesn't have to, but
you know, often they do. So what you might do is, instead of calculating a weighted sum of all the features, you might slide a filter over the image to calculate filter responses at every location, and so it's like a sort
of a map of activations. And then you might repeat that
process over and over again. So what deep networks
are capable of doing, convolutional networks, is determining features
across the whole image, or across the whole of the data stream and then repeating that
process over and over again. That's how they develop their
representation learning. They use the filters to
create interesting information before they make a decision. - You teach security at university, but you're doing a lot
of the AI stuff as well. I think the question a lot
of people will be asking, including myself, is, "Do I need to learn some
kind of programming language? And which language would it be? Which would you recommend? And do I need to learn like a whole bunch of math?" Because it sounds like math, or maths as we say in the UK, is something that you have to learn, is that right? To get into-
- You, you know, having some idea of what's
going on mathematically helps from an intuition
point of view, right? 'Cause I understand the
back propagation process, which is how the actual
weights are adjusted. And that allows me to
understand what would happen if I connect two bits of network together in a weird shape or something like this. But, in practice actually, day-to-day running of a deep network doesn't really involve any maths. And there is some
disagreement in the community about whether you really
need to know math at all. You know, I sort of go back and forth. I sometimes think it's useful and I sometimes think it's not. I certainly don't think people who don't like math should be put off from having a go, because I'm always an advocate for having a go at something. You might really enjoy it. Right? What I would say is that actually running a neural network doesn't require a lot of maths. It just requires a bit
of Python basically. So that's the language you normally use. Python, and I have a love/hate
relationship with Python. And I think that sometimes I just wanna
declare what my types are and stop having runtime errors half an hour into something. But what they've done is they've got a lot of libraries like TensorFlow and PyTorch that sit in Python, and then they very quickly
go down into C and CUDA for fast matrix multiplications, which is all the stuff that goes on behind the scenes in
these neural networks. So they're very, very quick because they're not implemented
end-to-end in Python, but Python gives you a very convenient and nice way of doing all this, you know, load in the images, it just
appears as a kind of array. You know, you might have a list of images that you use for your data set, and then you put that into
a network and so on, right? You, you know, a lot of it's
just inputting outputting lists and dictionaries
like the rest of Python, and so it makes things quite easy to use. And Python for me
is a nice enough language, in the sense that it's
fairly easy to pick up particularly if you
already know a language. It's often a language people recommend you start with anyway, because it's fairly relaxed about syntax and just you making a total mess of it. So that's, you know, that's always good. But going from knowledge of Python to having implemented a deep network will not take you very long. You won't understand everything the first time, but you can give it a go
up on what's going on, and then you can make
a change to the network and maybe improve your
performance slightly. - Do you have to write it from scratch or like it's TensorFlow or something that like Google have created the-
- Exactly, they do a huge amount of heavy lifting, right? Which is one of the
reasons why you can kind of get away with not having all
this mathematical background. So, I mean, I use PyTorch
mainly, and in PyTorch, it handles all of the
weights and learning for you. So you say, "I want my network to
have this many layers and I want my layers to be like this. And I want it to take
an image of this size and turn it into a 10-class
classification problem," where I'm picking cats
and dogs and airplanes, or what have you. And then it just trots off and does it, and it just goes, it goes,
puts the images in it, it retrains the network,
and it puts the images in, and it retrains the
network, and it iterates. And you can watch your learning rate, so you watch your loss function go down as it gets better and
better every iteration. And so eventually you
can then just deploy it in some sort of production
code or whatever. And maybe, maybe test it first! (David and Mike both laugh) But, you know, it does a huge amount. There's a lot of mathematics behind the scenes, not all of it particularly complicated, but there's definitely a lot of it. And it's all massively parallelized on a GPU, and, you know, so you can
actually get away with a few dozen lines of code to get a pretty nifty
neural network going. - Let me say that's good
to hear because, you know, when you start talking about the ins and outs, it's like, this sounds so complicated. So it's like PyTorch is just a library or something that you would import and then just, you just
send some commands to it. Yeah? - Torch started off as a
machine learning library in, well, it was written in C presumably, but, and CUDA, but it was for Lua. And again, that's another
language I have a, should we say a very strong,
mixed opinions about, however, since then, TensorFlow
came along in Python, I think it was seen as Python's more convenient for
the majority of developers, and so PyTorch, spawned
off Torch basically, and is now the dominant library for this. So TensorFlow is Google,
and PyTorch is Facebook AI, or Meta AI, I suppose it is now. - And that's the one you
would start with, yeah, if you were starting? - I, yeah. So this is a, people have
different opinions on this. I think that the, - Just give us your opinion
because you know, we, I just, sorry to interrupt, I
just want to put it this way. I like to have paths, like when I talk to experts
like yourself, it's like, "Okay, I'm new now. How do I go from knowing nothing to at least getting started?" So if there's anything you can help me with-
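(As a hedged getting-started sketch of the kind of PyTorch loop Mike goes on to describe, assuming PyTorch is installed: define a small network, give it data, and watch the loss go down. The layer sizes, random stand-in data and training settings are all invented for illustration.)

```python
import torch
import torch.nn as nn

# A tiny made-up network: 4 input features -> 16 hidden units -> 3 classes.
model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 3))

# Random stand-in data: 100 samples with 4 features each, labels 0-2.
x = torch.randn(100, 4)
y = torch.randint(0, 3, (100,))

loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for epoch in range(20):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)   # forward pass and error
    loss.backward()               # the library works out the weight gradients for you
    optimizer.step()              # nudge the weights
    print(epoch, loss.item())     # the loss should drift downwards as it trains
```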
- Yeah, Well, I mean, I tell you what. - like knowledge, whatever would be great. - Yeah. Yeah. I would start with PyTorch personally. My, from a research point of
view, PyTorch is more flexible, which helps me, but it also doesn't require
a lot of lines of code to get running. And it also does a nice thing where it doesn't hide
away all of the details. There's just enough detail in there that you can kind of type
away and it will kind of work, but you do see the network
going forward, and learning, and optimizing the weights,
and things like this. There's a few lines of code, that do that, that you can kind of
look at and go, "Hmm." And then you kind of pick
these things up, right? It's not a case that you
just type PyTorch dot train, and pass it your input data
and then it just does it and you have no idea what
happened, which I like, because that wouldn't be fun, right? But also you wouldn't learn anything. So I like PyTorch from
that, for that reason. It also has a load of examples. So if you go on the, if you go on the GitHub repository for PyTorch or Torchvision,
you've got all of the core networks from the literature in there. And you've also got some
examples of simple data problems and things like this that
you can run from end-to-end, and just basically run the file and it will start training a network. And then you can delve in and see what it is it's actually doing. - Do you need, you, I
think you mentioned a GPU, do you need specific hardware or can you just run the single laptop? - You need, you really
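(On the hardware question, a small hedged sketch: PyTorch code usually checks for a CUDA GPU and falls back to the CPU, which is fine for small experiments like the ones above.)

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = torch.nn.Linear(4, 2).to(device)   # move the model's weights to that device
x = torch.randn(8, 4).to(device)           # inputs must live on the same device
print(device, model(x).shape)
```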
need a, so PyTorch is, uses CUDA, right? So you really could do with using a, I don't know if PyTorch support
OpenCL; I can't remember. Ideally you would have
access to a CUDA enabled GPU that would make this
process much, much faster. So as I mentioned, the
back end of PyTorch, and most of these deep learning libraries is written in C, and CUDA, and it's just massively
parallelized matrix multiplications most of the time. And that is something
doing on a CPU, right? You can, for very small
networks, run it on a CPU. So if you download the
simplest PyTorch example and you run it on a CPU, it will run okay and you'll be able to see what happens. Anything with images, anything where the dimensionality is high, you're gonna be waiting half an hour for it just to finish one pass and we won't get anything done. One other thing you might
like to try is Google Colab. So Google Colab is, is Google's public,
Jupyter notebook-style, laboratory environment that
actually provides limited time, but fair use access to GPUs
to have a go at these things. It's a great place to go. And you can also download
loads of Colab notebooks, existing implementations to test them out. That's a great place to start. You know, I'm a big fan of Google Colab. I think that as a platform
it's really, really useful and you can actually pay, I mean, I'm not, I don't work for Google Colab, you can pay a small
subscription to get access to, higher access or more preference access, more, should we say higher
priority access to GPUs. That's what we, you know, you can get. So it's like fair use normally. So if you, if you use it a lot, you might have to wait for
a half a day or something. - I mean, in the best case scenario, I'd come and attend one of your classes, but not everyone's gonna
be able to do that. Do you have books, or online courses, or stuff that you would
personally recommend- - Yeah, so-
- or suggest? - So, I mean, what I always
recommend to people is Andrew Ng's Coursera course on machine learning; it's a great place to start. Now, it's lower level. So Andrew Ng is very well known in the machine learning community. He's, you know, he's done
a load of great work. His Coursera course is really good. It's quite mathematical, right? So that isn't necessarily a problem. You just have to go in
knowing that's gonna happen. But what it does do, is it gives you a lot of information on stuff that we haven't
really talked about. So things like watching
your learning rate, your loss function go down, right? So if you, if you draw
a graph of your loss, which is your error at
the end of your network, over time what should happen,
is it gets better and better, it goes down. But it might not go down. It might do sort of do this. A lot of machine learning is
understanding what that means and what you could try and
do to rectify that problem. You know, for your first
day of machine learning it's not important, but it, over time, some of the concepts that you talk about in this machine learning
course will come in handy. And there's a book by Yoshua
Bengio called "Deep Learning," which also, again, has a lot of maths in it, but it covers a lot of the core concepts. Personally, I'm a kind of, I've always been a kind of
learn by doing kind of person. - Yeah, exactly. Exactly. - So what I like to do is just get on the PyTorch
or the TensorFlow tutorials and just start running some
stuff and see what happens. And if you know, Python
or you know any language that's even plausibly similar to Python, you're gonna have a great time doing that. - I think, especially for
a lot of the audience, if they're starting out with this, let's say there's younger people
who are starting their careers. And I spoke about this
in the beginning about, people are worrying that this
will take their jobs away, but I'm assuming there's,
whenever I see the hype cycle, there seems to be a lot
of demand for AI skills. - Huge, huge demand. - Yeah. - Yeah. There's a huge demand. So I would say there's,
there's kind of, you know, you've got your different
levels of sort of data analyst, right? So you've got people who are
pretty good at a spreadsheet up to people who are working, trying to train self-driving
cars and things. I suppose, if I'm being sort of a bit, bit random in my choices
of job description and, you know, you got
anywhere in between, there's huge demand everywhere. So, you know, if you have any
kind of data analysis ability, if you can look at a table of data and start to pick out patterns and start to work out what's going on and make predictions on that data, that's a really useful skill to have in lots and lots of jobs. And it's a very, very, very, very popular thing that people have. So a lot, we have a lot of graduates who graduate with a few
modules in machine learning and a few modules in data
analysis and things like this and they're in a really strong position. These things are not, you know, you can learn these things yourself. So, you know, you can go in, I've got a data analysis course. It's not very long, obviously, 'cause you know, YouTube videos, but I have some data analysis videos, there are lots of data analysis videos. - On your YouTube channel. Yeah? - Yeah. On our YouTube channel Computerphile, we have like a 10 part
series on data analysis, which is just kind of like a taster, but you can have a go at that. There's lots of stuff on data analysis. Data analysis and modeling
and machine learning in some ways go hand in hand. It's often good to have a little bit of a look at both of them because you know, cleaning
data, for example, like you get, if you get a spreadsheet of data that doesn't make any sense, it's unwise just to stick that straight into a neural network
and see what you get out because there could be
some complete, you know, it could be missing values,
there could be errors, there could all just have
hugely different scales of data. These are all things to think about. So some knowledge of
how to prepare that data for let's say a downstream
task like machine learning is a really useful thing
to know how to do as well. - I love that you're
- I love that you're teaching at the university, you're teaching security, cybersecurity-type stuff, but you're also doing AI. Do you see that as a really good mix? And I'm assuming, based on what you've just said, you know, it's a really good idea if you are into cyber, or want to get into cyber, to add this to your skillset. - Yeah. I mean, I would be hard
pressed to find any career that wouldn't be at
least helped a little bit by knowing some data analysis
and some machine learning, because it just comes up a lot, right? And also, as you know, we already spoke about how people can be misled by
the hype cycle, right? And you will be much
more resistant to this if you understand how these things work and that's gonna put
you in a good position. Yeah. So, as it happens, I teach security. I find it really interesting, so I try and cling onto that module with, you know, a vice grip, and not let anyone else have it. I also teach cryptography at university as well. - We need to get you back for some more interviews, man. - Yeah, right. So yeah, by all means. But those are subjects I don't actively research day-to-day, but I do find very, very interesting, and I do have some collaborations, 'cause there are actual security researchers working at Nottingham and lots of places, and we have good collaborations with them. There is obviously
machine learning involved in quite a lot of security, because it's one of many
strategies for detecting malware or for anomaly detection, or, you know, any smart system that's doing
something where, hopefully, you don't have to program all the rules yourself. So yeah, it does help. I think I've got a project, an undergraduate student starting, who's gonna look at malware detection with a bit of machine learning as well. And so she can bring the knowledge of the malware, I can bring the knowledge of, mostly, the AI, you know, and it'll be great.
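(Editor's note: Mike's student project isn't shown here, but as a hedged illustration of machine-learning-based anomaly detection, the sketch below uses scikit-learn's IsolationForest on made-up traffic features; the feature choices are hypothetical.)

import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Made-up "normal" behaviour: bytes sent, connection count, payload entropy
normal = rng.normal(loc=[500.0, 20.0, 4.0], scale=[50.0, 5.0, 0.3], size=(1000, 3))

model = IsolationForest(contamination=0.01, random_state=0).fit(normal)

# Two new observations: one ordinary, one wildly different
new = np.array([[510.0, 22.0, 4.1],
                [9000.0, 300.0, 7.9]])
print(model.predict(new))   # 1 = looks normal, -1 = flagged as anomalous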
- Mike, I always like to ask this question. If you were talking to your younger self, let's say you were 18, or, you know, I dunno, not everyone who watches these videos is 18, but let's say they were
25, 30, or whatever. What would you advise someone to do based on, you know, what you've seen? - I think, if
you're really interested in a career in cybersecurity, or a career in machine learning, it's worth noting that
not everyone has a degree that does those things
and that's fine, right? It's also fine if you do have a degree. I see people saying, "Well, you don't need
a degree for this," or, "You do need a degree for this." I actually think, learn the skills, right? And then you get a job
based on your experience and it's, you know, and you're
gonna have a great time. I think that, again, it's not one of these
debates I like to get into, because everyone has their own career path that they wanna follow. If you did a degree in something completely different and you've worked in a job you're not really enjoying and you wanna try something new, I think that's absolutely fine: have a go. There are so many resources online that there weren't 20, 30 years ago: you know, people doing interviews and videos on different topics that you can just watch and learn from. And, as I say, I'm a very hands-on person. If I wanna try and learn a skill, I'm just gonna try and do it, and it will probably go
really wrong the first time. So I think it's practice, and this is true of coding as well; I'm a big, big believer that coding is mostly practice. People say, "Well, how did you
know that was gonna be a bug?" 'Cause I've seen it so many times before, you know, like, because
it happens all the time. I think, yeah, that
would be what I would do: find something you love
doing and do more of that. You know, I program at home for fun and it's partly 'cause I find it fun. And also sometimes I
wanna learn something new. I did a video a year or two
ago on the Enigma machine. I don't need to program the
Enigma machine for my job. I just thought it was super interesting, and I just sat at home and did it. And I learned quite a lot, actually, about the whole process and the history of it, by just having to implement the thing.
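(Editor's note: the code from Mike's Computerphile video isn't reproduced here, but below is a stripped-down, illustrative Enigma-style rotor cipher in Python. It uses the commonly published wirings for Enigma I rotors I-III and reflector B, and deliberately ignores the plugboard, ring settings, and the double-stepping quirk; with the same starting positions, the same function both encrypts and decrypts.)

import string

ALPHABET = string.ascii_uppercase
ROTORS = [
    "EKMFLGDQVZNTOWYHXUSPAIBRCJ",   # rotor I
    "AJDKSIRUXBLHWTMCQGZNPYFVOE",   # rotor II
    "BDFHJLCPRTXVZNYEIWGAKMUSQO",   # rotor III
]
REFLECTOR = "YRUHQSLDPXNGOKMIEBFZCWVJAT"  # reflector B (an involution)

def enigma(text, positions):
    pos = list(positions)            # rotor positions; rightmost steps fastest
    out = []
    for ch in text.upper():
        if ch not in ALPHABET:
            out.append(ch)
            continue
        # Step the rotors like an odometer (simplified stepping)
        pos[2] = (pos[2] + 1) % 26
        if pos[2] == 0:
            pos[1] = (pos[1] + 1) % 26
            if pos[1] == 0:
                pos[0] = (pos[0] + 1) % 26
        c = ALPHABET.index(ch)
        # Forward through the rotors, right to left
        for i in (2, 1, 0):
            c = (ALPHABET.index(ROTORS[i][(c + pos[i]) % 26]) - pos[i]) % 26
        c = ALPHABET.index(REFLECTOR[c])
        # Back through the rotors, left to right, via the inverse wiring
        for i in (0, 1, 2):
            c = (ROTORS[i].index(ALPHABET[(c + pos[i]) % 26]) - pos[i]) % 26
        out.append(ALPHABET[c])
    return "".join(out)

cipher = enigma("HELLOWORLD", positions=[0, 0, 0])
print(cipher)
print(enigma(cipher, positions=[0, 0, 0]))   # same settings decrypt it again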
And so I think, yeah, crack on and learn would be what I would do. - I love that. I mean, and I just have to
say this: you are Doctor Mike, you've got a PhD. - Yeah. - Is that right? - Yeah. - In what, what was it? - In computer vision. - So I mean, what I
really love about this and this is just my opinion, so I don't wanna put you on
the spot, but I love that you, as someone with a PhD, are not excluding people who perhaps never
had that opportunity. And I love that you're
encouraging everyone, you know, just to go for it. Don't let your limitations or-
- Yeah. - Or lack of resources, stop you. So. - I mean, as it happens, like, I was a pretty average
student at school, right? I mean, I didn't do much. There wasn't much in terms
of computer science at school when I was younger. It was, you know, "Let's use Microsoft Word," and, "Let's try that out." And so I barely did any computing at all; I could only program a tiny bit when I arrived at university. Loads of people arrive at university with huge programming experience, and loads of people arrive with no programming experience. And we always say to them, "You'll all be the same
in the end," right? Like that's the whole point of a degree. And it's the whole point of what we teach. I think it's never too late to get into computers and learn about programming and stuff. I try and teach people to program all the time. I mean, not all of them are interested, which is annoying, but, you know, if it was up to me, all my family would be able to program, 'cause I'd be giving them extra lessons, but some of them wanna do other things, apparently. But yeah, I
don't wanna be a gatekeeper because that's not gonna get more people doing cool computer stuff. There are some things where a massive specialism
is important, right? You know, I'm not proposing to go into a hospital and start surgery on people because you need a lot of
training to do these things. But I also think that if
someone wanted to be a surgeon they should crack on and do the training. You know, I think you
can learn those skills and we require, if you're
gonna work at university, we usually require a PhD and that's something that
universities require, but there's a great deal I don't know about the real world and industry that people who are watching
will know way more about the me and that's also fine, right? You know, everyone's
got their own expertise. So I like to learn from those people and hope I can teach 'em a bit about the things I know about. - I love that. I love that. Another thing, I mean, I said 18, but I get a lot of pushback
sometimes on these videos and I'm not sure if you've
heard this question before, "Am I too old to start learning AI?" - No, no. I mean, consider also that
the majority of academics who are using AI aren't
18-year-old fresh graduates. They are researchers who've been doing it for decades, because, you know, we've all had to learn it from scratch as well. Like I say, deep learning only really took off around 2014, so it's been a mad rush since then. There's loads of scope to learn. And to get a little bit going, I don't think it takes that many hours, you know, if you wanna do something around your job or whatever your current life situation is; I think it's doable. - I love that. Any closing thoughts? - No, I think, I hope people
found it interesting, right? And I'm happy to come back and talk about more topics in detail. But I think that, you know, I love my job and telling
people about stuff that I think is interesting. So I would encourage those
people to go off and look into it in a bit more detail and have a go. Just download a PyTorch tutorial and start running it, and you'll train a deep network. And then when someone goes, "All this deep learning's a bit scary," you can go, "Well, actually, I did that last week and it wasn't that difficult." Yeah. That's what I'd suggest. - So for everyone watching, please put in the comments below the topics that you would like us to discuss. We definitely wanna try and get Mike back, so let us know what you
want us to talk about. Computerphile has a
lot of fantastic videos that Mike has created. So go and have a look at those. I'll link some of those below.
Please give us your feedback. Mike, thanks so much. - Thanks so much. Lovely to be here. (upbeat music plays)