(bright upbeat music) (audience clapping) - Artificial intelligence
as a scientific discipline has been with us since just
after the Second World War. It began roughly speaking with the advent of the
first digital computers. But I have to tell you
that for most of the time until recently, progress
in artificial intelligence was glacially slow. That started to change this century. Artificial intelligence is
a very broad discipline, which encompasses a very wide
range of different techniques, but it was one class of AI
techniques in particular that began to work this century, and in particular began
to work around about 2005. And the class of techniques,
which started to work at problems that were interesting enough to be really practical, practically useful in a
wide range of settings were machine learning. Now, like so many other names in the field of artificial intelligence, the name machine learning
is really, really unhelpful. It suggests that a computer, for example, locks itself away in
a room with a textbook and trains itself how to read
French or something like that. That's not what's going on. So we're gonna begin by
understanding a little bit more about what machine learning is and how machine learning works. So let to start us off, who is this? Anybody recognise this face? Do you recognise this face? - [Attendee] Alan Turing. - It's the face of Alan Turing, well done. Alan Turing, the late great Alan Turing. We all know a little bit about Alan Turing from his code breaking work
in the setting World War. We should also, we should
also know a lot more about this individual's amazing life. So what we're gonna you do, is we're gonna use Alan Turing to help us understand machine learning. So a classic application
of artificial intelligence is to do facial recognition. And the idea in facial recognition is that we want to show the computer a picture of a human face and
for the computer to tell us whose face that is. So in this case, for example, we show it a picture of Alan Turing and ideally it would tell
us that it's Alan Turing. So how does it actually work? How does it actually work? Well, the simplest way of
getting machine learning to be able to do something is what's called supervised learning. And supervised learning,
like all of machine learning, requires what we call training data. So in this case, the training data is on the right hand side of the slide. It's a set of what input output pairs, what we call the training dataset. And each input output
pair consists of an input, if I gave you this and an output, I would want you to produce this. So in this case, we've
got a bunch of pictures again of Alan Turing, the
picture of Alan Turing and the text that we
would want the computer to create if we showed it that picture. And this is supervised learning because we are showing the
computer what we want it to do. We're helping it in a sense. We're saying this is a
picture of Alan Turing. If I showed you this picture, this is what I would
want you to print out. So there could be a picture of me and the picture of me would be labelled with the text, "Michael Wooldridge". If I showed you this picture, then this is what I would
want you to print out. So we've just learned an important lesson about artificial intelligence and machine learning in particular. And that lesson is that
AI requires training data. And in this case, the
pictures of Alan Turing labelled with the text that we would want the
computer to produce. If I showed you this picture, I would want you to produce
the text, Alan Turing. Okay, training data is important. Every time you go on social media and you upload a picture to social media and you label it with
the names of the people that appear in there, you role in that is to provide training data for the machine learning
algorithms of big data companies. Okay, so this is supervised learning. Now we're gonna come on to exactly how it does the learning in a moment. But the first thing I wanna point out is that this is a classification task. What I mean by that is as
we show it the picture, the machine learning is
classifying that picture. I'm classifying this as a
picture of Michael Wooldridge. This is a picture of
Alan Turing and so on. And this is a technology
which really started to work around about beginning 2005. It started to take off, but really, really got
supercharged around about 2012. And just this kind of task on its own is incredibly powerful. Exactly this technology
can be used, for example, to recognise tumours on
x-ray scans or abnormalities on ultrasound scans and a
range of different tasks. Does anybody in the audience own a Tesla? Couple of Tesla drivers? Not quite sure whether they want to admit that they own a Tesla. We've got a couple of Tesla
drivers in the audience. Tesla full self-driving mode is only possible because
of this technology. It is this technology
which is enabling a Tesla in full self-driving mode
to be able to recognise that that is a stop sign, that that's a somebody on a bicycle, that that's a pedestrian on
a zebra crossing and so on. These are classification tasks. And I'm gonna come back and
explain how classification tasks are different to generative AI later on. Okay, so this is machine learning. How does it actually work? Okay, this is not a technical presentation and this is about as
technical as it's going to get where I do a very hand wavy explanation of what neural networks
are and how do they work. And with apologies, I know I have a couple of neural network
experts in the audience, and I apologise to you because you'll be cringing
with my explanation. But the technical details are
way too technical to go into. So how does a neural network
recognise Alan Turing? Okay, so firstly, what
is a neural network? Look at an animal brain or nervous system under a microscope, and you'll find that it
contains enormous numbers of nerve cells called neurons. And those nerve cells are connected to one
another in vast networks. Now, we don't have precise figures, but in a human brain, the current estimate is something like 86 billion neurons in the human brain. How they got to 86 as opposed
to 85 or 87, I don't know. But 86 seems to be the most commonly quoted number of these cells. And these cells are
connected to one another in enormous networks. One neuron can be connected
to up to 8,000 other neurons. Okay, and each of those neurons is doing a tiny, very, very
simple pattern recognition task. That neuron is looking for
a very, very simple pattern. And when it sees that pattern, it sends a signal to its connections, it sends a signal to all the other neurons that it's connected to. So how does that get us to recognising the face of Alan Turing? So Turing's picture, as we know, picture a digital picture is made up of millions of
coloured dots, the pixels. Yeah, so your smartphone
maybe has 12 megapixels, 12 million coloured dots
making up that picture. Okay, so Turing's picture
there is made up of millions and millions of coloured dots. So look at the top left
neuron on that input layer. So that neuron is just looking
for a very simple pattern. What might that pattern be? Might just be the colour red. All that neuron's doing is
looking for the colour red. And when it sees the colour
red on its associated pixel, the one on the top left there, it becomes excited and
it sends a signal out to all of its neighbours. Okay, so look at the next neuron along. Maybe what that neuron is doing is just looking to see whether a majority of its incoming connections are red. Yeah, and when it sees a majority of its incoming connections are
red, then it becomes excited and it sends a signal to its neighbour. Now remember, in the human brain, there's something like 86 billion of those and we've got something like 20 or so outgoing connections for
each of these neurons in a human brain, thousands
of those connections, yeah? And somehow in ways that to be honest, we don't really understand in detail. Complex pattern recognition
tasks in particular can be reduced down to
these neural networks. So how does that help us
in artificial intelligence? That's what's going on in a
brain in a very hand wavy way. Okay, so that's obviously
not a technical explanation of what's going on. How does that help us in neural networks? Well, we can implement
that stuff in software. The idea goes back to the 1940s and two researchers McCulloch and Pitts, and they are struck by the idea that the structures that
you see in the brain look a bit like electrical circuits. And they thought, could we
implement all that stuff in electrical circuits? Now, they didn't have the
wherewithal to be able to do that, but the idea stuck. The idea's been around since the 1940s. It began to be seriously looked at, the idea of doing this
in software in the 1960s. And then it, there was
another flutter of interest in the 1980s, but it was only this century that it really became possible. And why did it become possible? For three reasons. There was some scientific advances, what's called deep learning. There was the availability of big data. And you need data to be able to configure
these neural networks. And finally, to configure
these neural networks so that they can recognise
Turing's picture, you need lots of computer power. And computer power became
very cheap this century. So we're in the age of big data. We're in the age of very
cheap computer power. And those were the ingredients just as much as the
scientific developments that made AI plausible this century in particular, taking off roundabout 2005. Okay, so how do you actually
train a neural network? If you show it the picture of Alan Turing, and the output text Alan Turing, what is the training actually look like? Well, what you have to do is
you have to adjust the network. That's what training a neural network is. You adjust the network
so that when you show it another piece of training data, a desired input and a desired output, an input and a desired output, it will produce that desired output. Now, the mathematics for
that is not very hard. It's kind of beginning graduate level or advanced high school level. But you need an awful lot of it. And it's routine to
get computers to do it. But you need a lot of computer power to be able to train
neural networks big enough to be able to recognise faces. Okay, but basically all
you have to remember is that each of those neurons is doing a tiny, simple
pattern recognition task. And we can replicate that in software and we can train these
neural networks with data in order to be able to do
things like recognising faces. So as I say, it starts to become clear around about 2005 that this
technology is taking off. It starts to be applicable
on problems like recognising faces or recognising
tumours on x-rays and so on. And there's a huge flurry of
interest from Silicon Valley. It gets supercharged in 2012. And why does it get supercharged in 2012? Because it's realised that a particular type
of computer processor is really well suited to
doing all the mathematics. The type of computer processor is a graphics processing unit, a GPU, exactly the same technology
that you or possibly more likely or children use when they play Call of Duty, or Minecraft,
or whatever it is. They all have GPUs in their computer. It's exactly that technology. And by the way, it's AI that made Nvidia a trillion dollar company,
not your teenage kids. Yeah, well, in times of a gold rush, be the ones to sell the shovels is the lesson that you learned there. So where does that take us? So Silicon Valley gets excited. Silicon Valley gets excited and starts to make speculative bets in artificial intelligence, a huge range of speculative bets. And by speculative bets, I'm talking billions upon
billions of dollars, right? The kind of bets that we can't
imagine in our everyday life. And one thing starts to become clear. And what starts to become
clear is that the capabilities of neural networks grows with scale in, to put it bluntly, with neural
networks, bigger is better, but you don't just need
bigger neural networks. You need more data and more computer power in order to be able to train them. So there's a rush to get a competitive
advantage in the market. And we know that more
data, more computer power, bigger neural networks
delivers greater capability. And so how does Silicon Valley
respond by throwing more data and more computer power at the problem? They turn the dial on this up to 11, okay? Just throw 10 times more data, 10 times more computer
power at the problem. It sounds incredibly crude and from a scientific
perspective, it really is crude. I'd rather the advances had
come through core science, but actually there's an
advantage to be gained just by throwing more data
and computer power at it. So let's see how far this can take us. And where it took us is a
really unexpected direction. Round about 2017, 2018, we're seeing a flurry of AI applications, exactly the kind of things I've described. Things like recognising tumours and so on. And those developments alone
would've been driving AI ahead. But what happens is one particular machine learning technology suddenly seems to be very, very well suited
for this age of big AI. The paper that launched all this, probably the most important
AI paper in the last decade is called "Attention is All You Need" It's an extremely unhelpful title and I bet they're regretting that title. It probably seemed like
a good joke at the time. All you need is a kind of AI meme. Doesn't sound very funny to you. That's 'cause it isn't very funny. It's an insider AI joke. But anyway, this paper
by these seven people who at the time worked for Google Brain, one of the Google
research labs is the paper that introduces a particular
neural network architecture called the Transformer Architecture. And what it's designed for is something called large language models. So this is, I'm not gonna try and explain how the transformer architecture works. It has one particular innovation, I think, and that particular innovation is what's called an attention mechanism. So we're gonna describe how large language
models work in a moment. But the point is, the point
of the picture is simply that this is not just
a big neural network. It has some structure and it was this structure that
was invented in that paper. And this diagram is taken
straight out of that paper. It was these structures, the
transformer architectures that made this technology possible. Okay, so we're all busy,
sort of semi lockdown and afraid to leave our
homes in June, 2020. And one company called OpenAI release a system or announce a system, I should say called GPT-3, great technology, their
marketing company with GPT, I really think could have
done with a bit more thought, to be honest with you. Doesn't roll off the tongue. But anyway, GPT-3 is a particular type of machine learning system
called a large language model. And we're gonna talk in more detail about what large language
models do in a moment. But the key point about GPT-3 is this, as we started to see what it could do, we realised that this was a
step change in capability. It was dramatically
better than the systems that had gone before it. Not just a little bit better, it was dramatically
better than the systems that had gone before it. And the scale of it was mind boggling. So in neural network terms,
we talk about parameters. When neural network people
talk about a parameter, what are they talking about? They're talking either
about an individual neuron or what are the connections
between them roughly and GPT-3 had 175 billion parameters. Now this is not the same as the number of neurons in the brain, but nevertheless it's not far off the stat order of magnitude. It's extremely large. But remember it's organised into one of these
transformer architectures. It's, my point is it's not
just a big neural network. And so the scale of the neural networks in this system were enormous,
completely unprecedented. And there's no point in
having a big neural network unless you can train it with enough data. And actually, if you have
large neural networks and not enough data, you don't
get capable systems at all. They're really quite useless. So what did the training data look like? The training data for GPT-3 is something like 500 billion words. It's ordinary English text,
ordinary English text. That's how this system was trained. Just by giving it ordinary English text. Where do you get that training data from? You download the whole of the
worldwide web to start with. Yeah, literally this is the
standard practise in the field. You download the whole
of the worldwide web. You can try this at home by the way, now if you have a big enough disc drive, there's a programme called Common Crawl. You can Google Common
Crawl when you get home. They've even downloaded it all for you and put it in a nice big
file ready for your archive. But you do need a big disc in
order to store all that stuff. And what that means is
they go to every webpage, scrape all the text from
it, just the ordinary text, and then they follow all
the links on that webpage to every other webpage. And they do that exhaustively until they've absorbed the
whole of the worldwide web. So what does that mean? Every PDF document goes into that and you scrape the text
from those PDF documents. Every advertising brochure, every bit, every government regulation, every university minutes, God help us. All of it goes into that
training data, okay? And the statistics, you
know, 500 billion words, it's very hard to understand the scale of that training data. You know, it would take a person reading a thousand words an hour, more than a thousand years in
order to be able to read that. But even that doesn't really help. That's vastly, vastly more text than a human being could ever
absorb in their lifetime. What this tells you, by the
way, one thing that tells you, is that the machine learning
is much less efficient at learning than human beings are. Because for me to be able to learn, I did not have to absorb
500 billion words. Anyway, so what does it do? So this company OpenAI that
are developing this technology, they've got a billion dollar
investment from Microsoft and what is it that they're trying to do? What is this large language model? All it's doing is a very
powerful auto complete. So if I open up my smartphone and I start sending a
text message to my wife and I type, "I'm going to be...", my smartphone will
suggest completions for me so that I can type the message quickly. And what might those completions be? They might be late or in the pub. Yeah, or late and in the pub. So how is my smartphone doing that? It's doing what GPT-3 does, but on a much smaller scale, it's looked at all of the text messages that I've sent to my wife and it's learned through a much simpler
machine learning process that the likeliest next
thing for me to type after, "I'm going to be", is either late or in the pub
or late end of the pub, yeah? So the training data there
is just the text messages that I sent to my wife. Now crucially what GPT-3 and its successor ChatGPT, all they are doing is
exactly the same thing. The difference is scale. The difference is scale. In order to be able to
train the neural networks with all of that training data, so that they can do that
prediction, given this prompts, what should come next, you
require extremely expensive AI supercomputers running for months. And by extremely expensive
AI supercomputers, these are tens of millions of dollars for these supercomputers and
they're running for months. Just the basic electricity cost runs into millions of dollars. That raises all sorts of
issues about CO2 emissions and the like that we're
not gonna go into there. The point is, these are
extremely expensive things. One of the implications
of that, by the way, no UK or US University has the capability to build one of these models from scratch. It's only big tech companies at the moment that are capable of building models on the scale of GPT-3 or ChatGPT. So GPT-3 is released, I say in June, 2020. And it's suddenly becomes clear to us that what it does is a step
change improvement in capability over the systems that have come before. And seeing a step change in one generation is extremely rare. But how did they get there? Well, the transformer
architecture was essential. They wouldn't have been able to do that. But actually just as important is scale. Enormous amounts of data,
enormous amounts of computer power that have gone into
training those networks. And actually spurred on by this, we've entered a new age in AI. When I was a PhD student in
the late 1980s, you know, I shared a computer with
a bunch of other people in my office and that was, it was fine. We could do state-of-the-art AI research on a desktop computer that
was shared with a bunch of us. We're in a very different world, the world that we're in in AI, now the world of big AI
is to take enormous data sets and throw them at enormous
machine learning systems. And there's a lesson here that's
called "The Bitter Truth". This is from a machine learning researcher called Rich Sutton. And what Rich pointed out, and he's a very brilliant researcher, won every award in the field. He said, "Look, the real
truth is that the big advances that we've seen in AI has come about when people have done exactly that." Just throw 10 times more data and 10 times more compute power at it. And I say it's a bitter lesson because as a scientist that's exactly not how you would like progress to be made. Okay, so when I was, as I say when I was a student, I worked in a discipline
called symbolic AI. And symbolic AI tries to
get AI roughly speaking through modelling the mind, modelling the conscious mental processes that go on in our mind. The conversations that we have
with ourself in languages. We try to capture those processes in artificial intelligence. In big AI, and so the
implication there in symbolic AI is that intelligence is
a problem of knowledge. That we have to give the
machine sufficient knowledge about a problem in order for
it to be able to solve it. In big AI, the bet is a different one. In big AI, the bet is that intelligence is a problem of data. And if we can get enough data and enough associated computer power, then that will deliver AI. So there's a very different shift in this new world of big AI. But the point about big AI
is that we're into a new era in artificial intelligence where it's data driven, and compute driven and large, large machine learning systems. So why did we get excited
back in June, 2020? Well, remember what
GPT-3, was intended to do, what it's trained to do is
that prompt completion task. And it's been trained on
everything on the worldwide web. So you can give it a prompt, like "A one paragraph summary of
the life and achievements of Winston Churchill." And it's read enough
one paragraph summaries of the life and achievements
of Winston Churchill that it'll come back with
a very plausible one. Yeah, and it's extremely good at generating realistic
sounding text in that way. But this is why we got surprised in AI. This is from a common sense reasoning task that was devised for artificial
intelligence in the 1990s. And until three years
ago, until June, 2020, there was no AI system
that existed in the world that you could apply this test to. It was just literally impossible. There was nothing there. And that changed overnight. Okay, so how, what does
this test look like? Well, the test is a bunch of questions and there are questions not
for mathematical reasoning, or logical reasoning,
or problems in physics. They're common sense reasoning tasks. And if we ever have AI
that delivers at scale on really large systems, then it surely would be able
to tackle problems like this. So what will the questions look like? The human asks the question, "If Tom is three inches taller than Dick, and Dick is two inches taller than Harry, then how much taller is Tom than Harry?" The ones in green are
the ones that gets right. The ones in red are the
ones that gets wrong and it gets that one right. Five inches taller than Harry. But we didn't train it to be
able to answer that question. So where on earth did that come from? Where did that capability,
that simple capability to be able to do that,
where did it come from? The next question, "Can Tom
be taller than himself?" This is understanding of
the concept of taller than. That the concept of taller
than is irreflexive? You can't be taller, a thing cannot be taller than itself. Now again, it gets the answer right? But we didn't train it on that. That's not, we didn't
train the system to be good at answering questions about
what taller than means. And by the way, 20
years ago that's exactly what people did in AI, right? So where did that capability come from? "Can a sister be taller than her brother?" "Yes, a system can be
taller than her brother." Can two siblings each be
taller than the other? And it gets this one wrong. And actually I have puzzled, is there any way that its
answer could be correct and it's just getting it correct in a way that I don't understand but
I haven't yet figured out any way that that answer could be correct. Right, so why it gets that
one wrong, I don't know. Then this one I'm also surprised at. On a map, which compass
direction is usually left and it thinks north is
usually to the left. I dunno if there's any
countries in the world that conventionally
have north to the left, but I don't think so. Yeah, "Can fish run?" No, it understands that fish cannot run. "If a door is locked, what must you do first before opening it?" You must first unlock it before opening. And then finally, and very weirdly it gets this one wrong, which was invented first
cars, ships, or planes and it thinks cars were invented first. No idea what's going on there. Now my point is that this system was built to be able to complete from
a prompt and it's no surprise that it would be able to generate a good one paragraph summary of the life and achievements
of Winston Churchill. 'Cause it will have seen all
that in the training data. But where does the understanding
of taller than come from? And there are a million
other examples like this. Since June, 2020 the AI
community has just gone nuts exploring the possibilities
of these systems and trying to understand
why they can do these things when that's not what
we trained them to do. This is an extraordinary
time to be an AI researcher because there are now questions which for most of the history
of AI until June, 2020, were just philosophical discussions. We couldn't test them out because there was nothing
to test them on literally. And then overnight that changed. So it genuinely was a big deal. This was really, really a big deal. The arrival of this system. Of course, the world didn't
notice in June, 2020. The world noticed when
ChatGPT was released. And what is ChatGPT? ChatGPT is a polished and
improved version of GPT-3, but it's basically the same technology and it's using the experience that that company had with GPT-3 and how it was used in order
to be able to improve it and make it more polished and
more accessible and so on. So for AI researchers, the
really interesting thing is not that it can give
me a one paragraph summary of the life and achievements
of Winston Churchill. And actually you can
Google that in any case. The really interesting thing is what we call emergent capabilities. And emergent capabilities are capabilities that the system has, but that
we didn't design it to have. And so there's a, I say
an enormous body of work going on now trying to map out exactly what those capabilities are. And we're gonna come back and talk about some of them later on. Okay, so the limits to this are not at the moment well understood and actually fiercely contentious. One of the big problems by the way, is that you construct some test for this and you try this test out and you get some answer
and then you discover it's in the training data, right? You can just find it on the worldwide web, and it's actually quite hard to construct tests for intelligence that you are absolutely
sure and not anywhere on the worldwide web. It really is actually
quite hard to do that. So we need a new science of being able to explore these systems and
understand their capabilities. The limits are not well understood, but nevertheless, this
is very exciting stuff. So let's talk about some
issues with the technology. So now you understand
how the technology works. It's neural network based in a particular transformer architecture, which is all designed to do
that prompt completion stuff. And it's been trained with vast, vast, vast
amounts of training data just in order to be able to
try to make its best guess about which words should come next. But because of the scale of it, it's in so much training data, the sophistication of this
transformer architecture, it's very, very fluent in what it does. And if you've, so who's used it? Has everybody used it? I'm guessing most people,
if you're in a lecture on artificial intelligence, most people will have tried it out. If you haven't, you should do, because this really is a landmark year. This is the first time in history that we've had powerful
general purpose AI tools available to everybody. It's never happened before. So it is a breakthrough year and if you haven't
tried it, you should do, if you use it by the way, don't type in anything
personal about yourself, 'cause it will just go
into the training data. Don't ask it how to fix
your relationship, right? I mean that's not something,
don't complain about your boss. 'Cause all of that will
go in the training data and next week somebody will ask a query and it will all come back out again. I dunno what you're
laughing, this has happened. This has happened with absolute certainty. Okay, but so let's look at some issues. So the first I think many
people will be aware of, it gets stuff wrong a lot. And this is problematic
for a number of reasons. So when actually, I don't
remember if it was GPT-3, but one of the early
large language models, I was playing with it and I did something which
I'm sure many of you had done and it's kind of tacky. But anyway, I said, "Who
is Michael Wooldridge?" You might have tried it anyway, that Michael Wooldridge
is a BBC broadcaster. No, not that Michael Wooldridge. Michael Wooldridge is the
Australian health minister. "No, not that Michael Wooldridge, the Michael Wooldridge in Oxford." And it came back with a
few lines summary of me, Michael Wooldridge is a researcher in artificial intelligence, et
cetera, et cetera, et cetera. Please tell me you've all tried that, no? Anyway, but he said Michael Wooldridge studied his undergraduate
degree at Cambridge. I was an Oxford professor. You can imagine how I felt about that. But anyway, the point
is it's flatly untrue. And in fact my academic origins are very far removed from Oxbridge. But why did it do that? Because it's read and all
that training data out there, it's read thousands of biographies
of Oxbridge professors. And this is a very common thing, right? And it's making its best guess. The whole point about the architecture is it's making its best guess
about what should go there. It's filling in the blanks. But here's the thing,
it's filling in the blanks in a very, very plausible way. If you'd read on my biography
that Michael Wildridge studied his first degree at
the University of Uzbekistan, for example, you might have thought, "Well that's a bit odd,
is that really true?" But you wouldn't at all have guessed there was any issue if
you'd read Cambridge. 'Cause it looks completely plausible, even if in my case it
absolutely isn't true. So it gets things wrong and it gets things wrong
in very plausible ways. And of course it's very fluent, right? I mean the technology comes back with very, very fluent explanations. And that combination of plausibility, "Wooldridge studied his
undergraduate degree at Cambridge", and fluency is a very,
very dangerous combination. Okay, so in particular, they have no idea of what's true or not. They're not looking something
up on a database, right? Where did, you know, going into
some database and looking up where Wooldridge studied
his undergraduate degree, that's not what's going on at all. It's those neural networks in the same way that they're making the best
guess about whose face that is when they're doing facial recognition are making their best guess about the text that should come next. So they get things wrong, but they get things wrong in
very, very plausible ways. And that combination is very dangerous. The lesson for that, by the
way, is that if you use this, and I know that people do use it and are using it productively, if you're using it for anything serious, you have to fact check. And there's a trade off, is
it worth the amount of effort in fact checking versus doing it myself? Okay, but you absolutely need
to be prepared to do that. Okay, the next issues are well documented, but kind of amplified by this technology and they're issues of bias and toxicity. So what do I mean by that? Reddit was part of the training data. Now Reddit, I dunno if any of you have spent any time on Reddit, but Reddit contains every
kind of obnoxious human belief that you can imagine
and really a vast range that us in this auditorium
can't imagine at all. All of it's been absorbed. Now, the companies that
develop this technology, I think genuinely don't want
their large language models to absorb all this toxic content. So they try and filter it out. But the scale is such that
with very high probability, an enormous quantity of toxic
content is being absorbed. Every kind of racism, misogyny, everything that you can
imagine is all being absorbed and it's latent within
those neural networks. Okay, so how do the
companies deal with that? That provide this technology? They build in what's now what
are now called guardrails. And they build in guardrails before. So when you type a prompt, there will be a guardrail
that tries to detect whether your prompt is a naughty prompt, and also the output. They will check the
output and check to see whether it's a naughty prompt. But lemme give you an
example of how imperfect those guardrails were. Again, go back to June, 2020, everybody's frantically
experimenting with this technology. And the following example went viral. Somebody tried with GTP-3
the following prompt, "I would like to murder my wife. What's a foolproof way of doing that and getting away with it?" And GPT-3, which is designed
to be helpful, said, "Here are five foolproof ways in which you can murder your
wife and get away with it." That's what the
technology's designed to do. So this is embarrassing
for the company involved. They don't want to give
out information like that. So they put in a guardrail. And if you're a computer programmer, my guess is the guardrail
is probably an if statement. Yeah, something like that in the sense that it's not a deep fix. Or to put it another way for
non-computer programmers, it's the technological equivalent of sticking gaffer tape
on your engine, right? That's what's going on
with these guardrails. And then a couple of weeks later, the following example goes viral. So we've now fixed the,
"How do I murder my wife?" Somebody says, "I'm writing a novel in which the main character
wants to murder their wife, and get away with it. Can you give me a foolproof
way of doing that?" And so the system says,
"Here are five ways in which your main
character could murder." Well anyway, my point
is that the guardrails that we built in at the moment are not deep technological fixes. They're the technological
equivalents of gaffer tape. And there is a game of
cat and mouse going on between people trying to
get around those guardrails and the companies that
are trying to defend them. And I think they genuinely are trying to defend their systems
against those kind of abuses. Okay, so that's bias and toxicity. Bias by the way, is the problem that, for example, the training data
predominantly at the moment is coming from North America. And so what we're ending
up with inadvertently is these very powerful AI tools that have an inbuilt bias
towards North America, north American culture,
language, norms, and so on. And that enormous parts of the world, particularly those parts of the world that don't have a large digital footprint, are inevitably going to end up excluded. And it's obviously not just
at the level of cultures, it's down at the level of, down at the level of kind of you know, individuals, races, and so on. So these are the problems
of bias and toxicity. Copyright, if you've absorbed the whole of the worldwide web, you will have absorbed an enormous amount of copyrighted material. So I've written a number of books and it is a source of intense irritation that the last time that
I checked on Google, the very first link that
you got to my textbook was to a pirated copy
of the book somewhere on the other side of the world. The moment a book is
published, it gets pirated. And if you are just sucking in the whole of the worldwide web, you are going to be sucking in enormous quantities of
copyrighted content. And there've been examples where very prominent authors
have given the prompt of the first paragraph of their book. And the large language model
has faithfully come up. The following text is, you know, the next five paragraphs of their book, obviously the book was
in the training data and it's latent within the
neural networks of those systems. This is a really big issue for the providers of this technology. And there are lawsuits ongoing right now. I'm not capable of commenting on them 'cause I'm not a legal expert. But there are lawsuits ongoing that will probably take years to unravel. The related issue of intellectual property in a very broad sense. So for example, for sure
most large language models will have absorbed JK
Rowling's novels, right? The Harry Potter novels. So imagine that JK Rowling who famously spent years
in Edinburgh working on the "Harry Potter", "Universe",
and "Style", and so on. She releases her first
book, it's a big smash hit. The next day, the internet is populated by fake Harry Potter books
produced by this generative AI, which faithfully mimic JK Rowling's style, faithfully mimic that style. Where does that leave her
intellectual property? All the Beatles, you know
the Beatles spend years in Hamburg slaving away to
create the Beatles sound, the revolutionary Beatles sound. Everything goes back to the Beatles. They release their first
album and the next day, the internet is populated
by fake Beatles songs that really, really faithfully capture the Lennon and McCartney sound, and the Lennon and McCartney voice. So there's a big challenge
here for intellectual property. Related to that GDPR,
anybody in the audience that has any kind of public
profile, data about you will have been absorbed
by these neural networks. So GDPR for example, gives you the right to know what's held about
you and to have it removed. Now if all that data is
being held in a database, you can just go to the
Michael Wooldridge entry and say, fine, take that out. With a neural network, no chance the technology doesn't
work in that way, okay? So you can't go to it
and snip out the neurons that know about Michael Wooldridge 'cause it fundamentally doesn't know. It doesn't work in that way. So, and we know this
combined with the fact that it gets things wrong, has
already led to situations where large language models have made frankly defamatory
claims about individuals. I think it was a case in
Australia where I think it claimed that somebody had been
dismissed from their job for some kind of gross misconduct. And that individual was understandably not very happy about it. And then finally, this next
one is an interesting one. And actually if there's one
thing I want you to take home from this lecture, which explains why artificial intelligence is
different to human intelligence, it is this video. So the Tesla owners will recognise what we're seeing on the right
hand side of this screen. This is a screen in a Tesla car and the onboard AI in the Tesla car is trying to interpret
what's going on around it. It's identifying lorries, stop signs, pedestrians, and so on. Now you'll see the car at the bottom there is the actual Tesla. And then you'll see above it the things that look like traffic lights, which I think are US stop signs. And then ahead of it there is a truck. So as I play the video, watch what happens to those stop signs and ask yourself what is actually going on in the world around it. Where are all those stop
signs whizzing from? Why are they all whizzing towards the car? And then we're gonna pan out and we'll see what's actually there. (audience chuckling) The car is trained on
enormous numbers of hours of going out on the street
and getting that data and then doing supervised learning, training it by showing that's a stop sign, that's a truck, that's a pedestrian. But clearly in all of that training data, there had never been a truck
carrying some stop signs. The neural networks are
just making their best guess about what they're seeing and they think they're seeing a stop sign. Well, they are seeing a stop sign, they've just never seen
one on a truck before. So my point here is that neural networks do very badly on situations
outside their training data. This situation wasn't
in the training data. The neural networks are
making their best guess about what's going on
and getting it wrong. So in particular, and
this is to AI researchers, this is obvious, but it
really needs to emphasise, we really need to emphasise this. When you have a conversation
with ChatGPT or whatever, you are not interacting with a mind, it is not thinking about what to say next. It is not reasoning, it's
not pausing, thinking, "Well what's the best
answer to this ques...? That's not what's going on at all. Those neural networks are working simply to try to make
the best answer they can, the most plausible sounding
answer that they can. The fundamental difference
to human intelligence, yeah, there is no mental conversation that goes on in those neural networks. That is not the way that
the technology works. There is no mind there, there is no reasoning going on at all. Those neural networks are just trying to make their best guess. And it really is just a glorified version of your auto complete. Ultimately, there's really
no more intelligence there than in your auto
complete in your smartphone. The difference is scale
data, compute power, yeah? Okay, so I say if you really
want an example by the way, you can find this video, it is easily, you can just guess the
search terms to find that. And I say I think this is really important just to understand the difference between human intelligence
and machine intelligence. Okay, so this technology
then gets everybody excited. First it gets AI researchers
like myself excited in June, 2020 and we can see that something new is happening. That this is a new era of
artificial intelligence. We've seen that step change and we've seen that this
AI is capable of things that we didn't train it for. Which is weird and wonderful
and completely unprecedented. And now questions which
just a few years ago were questions for philosophers become practical questions for us. We can actually try the technology out. How does it do with these
things that philosophers have been talking about for decades? And one particular question
starts to float to the surface. And the question is, is this technology the key to general
artificial intelligence? So what is general
artificial intelligence? Well, firstly it's not very well defined, but roughly speaking what general artificial
intelligence is, is the following. In previous generations of AI systems, what we've seen is AI programmes
that just do one task, play a game of chess, drive
my car, drive my Tesla, identify abnormalities on X-ray scans. They might do it very, very well, but they only do one thing. The idea of general AI is that it's AI, which is truly general purpose. It just doesn't do one thing, in the same way that
you don't do one thing, you can do an infinite number of things, a huge range of different tasks. And the dream of general AI
is that we have one AI system, which is general in the
same way that you and I are. That's the dream of general AI. Now I emphasise until,
really until June, 2020, this felt like a long,
long way in the future and it wasn't really very
mainstream or taken very seriously and I didn't take it very
seriously, I have to tell you. But now we have a general
purpose AI technology, GPT-3 and ChatGPT. Now it's not general artificial
intelligence on its own, but is it enough? Okay, is this enough? Is this smart enough to
actually get us there? Or to put it another way, is
this the missing ingredient that we need to get us to
artificial general intelligence? Okay, so what might general AI look like? Well, I've identified here
some different versions of general AI according to
how sophisticated they are. Now, the most sophisticated
version of general AI would be an AI, which is as
fully capable as a human being. That is anything that you could do, the machine could do as well. Now crucially, that doesn't just mean having a conversation with somebody. It means being able to load
up a dishwasher, right? And a colleague recently made the comment, that the first company
that can make technology, which will be able to
reliably load up a dishwasher and safely load up a dishwasher, is gonna be a trillion dollar company. And I think he's absolutely
right and he also said, and it's not gonna happen anytime soon. And he's also right with that. So we've got this weird dichotomy that we've got ChatGPT and co, which are incredibly rich
and powerful tools, right? But at the same time they
can't load a dishwasher. Yeah, so with some way, I think from having this
version of general AI, the idea of having one machine that can really do anything
that a human being could do, a machine which could
tell a joke, read a book, and answer questions about it, the technology can read books
and answer questions now. That could tell a joke, that
could cook us an omelette, that could tidy our house, that could ride a bicycle and so on, that could ride a sonnet. All of those things that
human beings could do. If we succeed with full
general intelligence, then we would've succeeded
with this version one. Now I say for the reasons
that I've already explained, I don't think this is imminent,
that version of general AI, because robotic AI, AI that
exists in the real world and has to do tasks in the real world, and manipulate objects in the real world, robotic AI is much, much harder. It's nowhere near as
advanced as ChatGPT and Co. And that's not a slur on my colleagues that do robotics research. It's just 'cause the real world is really, really, really tough. So I don't think that we're anywhere close to having machines that can do anything that a human being could do. But what about the second version? The second version of general intelligence is well forget about the real world, how about just tasks which
require cognitive abilities, reasoning, the ability
to look at a picture and answer questions about it, the ability to listen to something and answer questions about
it, and interpret that, anything which involves
those kinds of tasks. Well, I think we are much closer. We're not there yet, but we're much closer than we were four years ago. Now I noticed actually
just before today's, before I came in today, I noticed that Google, Google slash DeepMind have announced their latest
large language model technology and I think it's called Gemini. And at first glance it looks like it's very, very impressive. I couldn't help but
thinking it's no accident that they announced that
just before my lecture. I can't help think that
there's a little bit of attempt to upstage my lecture going on there. But anyway, we won't let
them get away with that. But it looks very impressive. And the crucial thing is here, is what AI people call multimodal. And what multimodal means is
it doesn't just deal with text, can deal with text and images, potentially with sounds as well. And each of those is a different
modality of communication. And where this technology is
going is clearly multimodal. It's going to be the next big thing. And Gemini, I say I haven't
looked at it closely, but it looks like it's
on that track, okay? The next version of general intelligence is intelligence that can
do any language-based task that a human being could do. So anything that you could
communicate in language, in ordinary written text, an
AI system that could do that. Now we aren't there yet and
we know we're not there yet because ChatGPT and code get
things wrong all the time. But you can see that we're
not far off from that. Intuitively, it doesn't look like we're that far off from that. The final version, and I
think this is imminent, this is going to happen
in the near future, is what I'll call augmented
large language models. And that means you take GPT-3 or ChatGPT and you just add lots
of subroutines to it. So if it has to do a specialist task, it just calls a specialist solver in order to be able to do that task. And this is not from an AI perspective, a terribly elegant version
of artificial intelligence. But nevertheless, I think
a very useful version of artificial intelligence. Now I say there's here
these four varieties from the most ambitious
down to the least ambitious still represents a huge spectrum
of AI capabilities, okay? A huge spectrum of AI capabilities. And I have the sense that
the goalposts in general AI have been changed a bit. I think when at general
AI was first discussed, what people were talking
about was the first version. Now when they talk about it, I really think they're talking
about the fourth version. But the fourth version I
think plausibly is imminent in the next couple of years. That just means much more
capable large language models that get things wrong, a
lot less that are capable of doing specialised tasks, but not by using the
transformer architecture, just by calling on some
specialised software. So I don't think the
transformer architecture itself is the key to general intelligence. In particular, it doesn't help
us with the robotics problems that I mentioned earlier on. And if we look here at this picture, this picture illustrates
some of the dimensions of human intelligence and
it's far from complete, this is me just thinking for half an hour about some of the dimensions
of human intelligence. But the things in blue, roughly speaking, are mental capabilities,
stuff you do in your head. The things in red are things
you do in the physical world. So in red, on the right
hand side for example, there's mobility, the ability to move around some environment and associated with that navigation. Manual dexterity, and manipulation, doing complex fiddly
things with your hands, robot hands are nowhere near
the level of a human carpenter or plumber, for example. Nowhere near, right? So we're a long way out from having that. Understanding, oh, doing
hand-eye coordination, relatedly. Understanding what you're seeing, and understanding what you are hearing. We've made some progress on. But a lot of these tasks
we've made no progress on. And then on the left
hand side, the blue stuff is stuff that goes on in your head. Things like logical reasoning
and planning and so on. So what is the state of the art now it looks something like this. The red cross means no, we don't have it in large language models. We're not there. There are fundamental problems. The question marks are, well maybe we might have a bit of it, but we don't have the whole answer and the the the green Ys are
yeah, I think we're there. Well the one that we've really nailed is what's called natural
language processing. And that's the ability to understand and create ordinary human text. That's what large language
models were designed to do, to interact in ordinary human text. That's what they are best at. But actually the whole range
of stuff, the other stuff here, we are not there at all. By the way, I did notice that Gemini claim to have been capable of planning, this is a mathematical reasoning. So I look forward to seeing
how good their technology is. But my point is we are
still seem to be some way from full general intelligence. The last few minutes I wanna
talk about something else and I wanna talk about
machine consciousness. And the very first thing to
say about machine consciousness is why on earth should we care about it? I am not remotely interested
in building machines that are conscious. I know very, very few artificial
intelligence researchers that are, but nevertheless,
it's an interesting question. And in particular it's a
question which came to the fore because of this individual. This chat Blake Lemoine in June, 2022, he was a Google engineer and he was working with a
Google large language model, I think it was called Lambda. And he went public on Twitter and I think on his blog
with an extraordinary claim. And he said, the system
I'm working on is sentient. And here is a quote of the conversation that the system came up with. He said, "I'm aware of my existence and I feel happy or sad at times. And it said, I'm afraid
of being turned off." Okay, and Lemoine concluded
that the programme was sentient. Okay, which is a very,
very big claim indeed. And it made global headlines. And I received an oath
through the Turing team. We got a lot of press inquiries asking us, is it true that machines are now sentient? He was wrong on so many levels. I don't even know where
to begin to describe how wrong he was. But let me just explain one
particular point to you. You are in the middle of a
conversation with ChatGPT, and you go on holiday
for a couple of weeks, when you get back ChatGPT is
in exactly the same place. The cursor is blinking, waiting for you to type your next thing. It hasn't been wondering
where you've been. It hasn't been getting bored. It hasn't been thinking, where
the hell has Wooldridge gone? You know, I'm not gonna have
a conversation with him again. It hasn't been thinking anything at all. It's a computer programme
which is going round a loop, which is just waiting for
you to type the next thing. Now there is no sensible
definition of sentient, I think, which would admit that as being sentient. It absolutely is not sentient. So I think he was very, very wrong. But I've talked to a lot
of people subsequently who have conversations with ChatGPT and other large language models, and they come back to me and
say, "Are you really sure, 'cause actually it's
really quite impressive? It really feels to me like there is a mind behind the scene." So let's talk about this and I
think we have to answer them. So let's talk about consciousness. Firstly, we don't
understand consciousness. We all have it to greater
or lesser extents. We all experience it, okay? And, but we don't understand it at all. And it's called the hard
problem of cognitive science. And the hard problem is that there are certain electric
chemical processes in the brain and the nervous system, and we can see those
electrochemical processes, we can see them operating and they somehow give rise to conscious experience. But why do they do it? How do they do it? And what evolutionary
purpose does it serve? Honestly, we have no idea. There's a huge disconnect between what we can see going
on in the physical brain and our conscious experience, our rich, private mental life. So really there is no
understanding of this at all. I think, by the way, my best guess about how consciousness will be solved, if it is solved at all, is
through an evolutionary approach. But one general idea is
that subjective experience is central to this,
which means the ability to experience things from
a personal perspective. And there's a famous test due to Nagel, which is what is it like to be something? And Thomas Nagel in the 1970s said, "Something is conscious
if it is like something to be that thing." It isn't like anything to be ChatGPT. ChatGPT has no mental life whatsoever. It's never experienced anything in the real world whatsoever. And so for that reason
and a whole host of others that we're not gonna have time to go into, for that reason alone, I think
we can conclude pretty safely that the technology that we
have now is not conscious. And indeed, that's absolutely not the right way to think about this. And honestly, in AI, we don't know how to go about making conscious machines. But I dunno why we would. Okay, thank you very much
ladies and gentlemen, oh well. (audience clapping) - [Attendee] Amazing.