When I was a boy, I wanted to maximise
my impact on the world, and I was smart enough
to realise that I am not very smart, and that I would have to build a machine that learns to become
much smarter than myself, such that it can solve all the problems
that I cannot solve myself, and I can retire. And my first publication
on that dates back 30 years, to 1987: my diploma thesis, where I already tried to solve
the grand problem of AI: not only to build a machine that learns a little bit here,
learns a little bit there, but also learns to improve
the learning algorithm itself, and the way it learns the way it learns, and so on recursively, without any limits except the limits of logic and physics. And I'm still working
on the same old thing, and I'm still pretty much
saying the same thing, except that now
more people are listening. Because the learning algorithms that we have developed on the way to this goal are now on 3,000 million smartphones. And all of you have them in your pockets. What you see here are the five most valuable companies
of the Western world: Apple, Google, Facebook,
Microsoft and Amazon. And all of them are emphasising that AI, artificial intelligence, is central to what they are doing. And all of them are making heavy use of the deep learning methods that my team has developed since the early nineties, in Munich and in Switzerland; especially something called
"the long short-term memory". Has anybody in this room ever heard
of the long short-term memory, or the LSTM? Hands up, anybody ever heard of that? Okay. Has anybody never heard of the LSTM? Okay.
I see we have a third group in this room: [those] who didn't
understand the question. (Laughter) The LSTM is a little bit like your brain: it's an artificial neural network
which also has neurons, and in your brain, you've got
about 100 billion neurons. And each of them is connected to roughly 10,000
other neurons on average, which means that you have got a million billion connections. And each of these connections has a "strength" which says how much this neuron over here influences that one over there at the next time step.
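To make that concrete, here is a toy sketch (an editorial illustration, not the speaker's code) of neurons influencing one another through weighted connections from one time step to the next:

```python
import numpy as np

rng = np.random.default_rng(0)

n = 8                        # a tiny "brain" with 8 neurons
W = rng.normal(size=(n, n))  # connection strengths, random in the beginning
x = rng.normal(size=n)       # neuron activations at the current time step

# One time step: each neuron's new activation is a weighted sum of all
# neurons' activations at the previous step, squashed into a fixed range.
x_next = np.tanh(W @ x)
```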
And in the beginning, all these connections are random, and the system knows nothing; but then, through a smart learning algorithm, it learns from lots of examples
to translate the incoming data, such as video through the cameras,
or audio through the microphones, or pain signals through the pain sensors. It learns to translate that
into output actions, because some of these neurons are output neurons that control speech muscles and finger muscles. And only through experience can it learn to solve all kinds of interesting problems, such as driving a car or doing the speech recognition on your smartphone.
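Here is an equally toy sketch of what "learning from lots of examples" means: adjusting the connection strengths so that inputs get mapped to the desired outputs. This is a generic gradient-descent illustration, not the specific algorithm behind the systems in the talk:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy task: learn to map 3 input signals to 2 output "actions".
X = rng.normal(size=(100, 3))         # 100 example inputs
W_true = rng.normal(size=(3, 2))
Y = X @ W_true                        # the outputs a teacher would give

W = np.zeros((3, 2))                  # the system knows nothing at first
lr = 0.1
for _ in range(200):                  # learn from lots of examples
    pred = X @ W                      # current input-to-output mapping
    grad = X.T @ (pred - Y) / len(X)  # gradient of the mean squared error
    W -= lr * grad                    # nudge the connection strengths
```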
Because whenever you take out your smartphone - an Android phone, for example - and you speak to it, and you say: "OK Google, show me the shortest way to Milano," then it understands your speech, because there is an LSTM in there which has learned to understand speech. Every ten milliseconds, 100 times a second, new inputs come in from the microphone and are then translated, after thinking, into letters, which are then sent as a query to the search engine. And it has learned to do that by listening to lots of speech from women, from men, all kinds of people.
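Schematically, the streaming loop he describes looks like this (the recognizer object and its methods are hypothetical placeholders; production systems add acoustic features, beam search, and much more):

```python
def transcribe_stream(recognizer, get_audio_frame):
    """Feed 10 ms microphone frames through a trained LSTM recognizer."""
    state = recognizer.initial_state()
    letters = []
    while True:
        frame = get_audio_frame()      # new input every 10 ms
        if frame is None:              # end of the utterance
            break
        state, letter = recognizer.step(frame, state)
        if letter is not None:         # emit a letter once it is confident
            letters.append(letter)
    return "".join(letters)            # the text sent to the search engine
```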
And that's how, since 2015, Google speech recognition has become much better than it used to be. The basic LSTM cell looks like that: I don't have the time to explain it, but at least I can list the names of the brilliant students in my lab who made that possible.
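For reference, a minimal version of that standard LSTM cell - the common variant with input, forget, and output gates; an editorial sketch that omits the many published refinements - can be written in a few lines:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step. W has shape (4*n, len(x)+n); b has shape (4*n,)."""
    n = h_prev.shape[0]
    z = W @ np.concatenate([x, h_prev]) + b
    i = sigmoid(z[0*n:1*n])       # input gate: let new information in?
    f = sigmoid(z[1*n:2*n])       # forget gate: keep the old memory?
    o = sigmoid(z[2*n:3*n])       # output gate: reveal the memory?
    g = np.tanh(z[3*n:4*n])       # candidate new content
    c = f * c_prev + i * g        # cell state: the long-term memory
    h = o * np.tanh(c)            # hidden state: the short-term output
    return h, c
```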
And what are the big companies doing with that? Well, speech recognition
is only one example; if you are on Facebook -
is anybody on Facebook? Do you sometimes click on the translate button, because somebody sent you something in a foreign language and then you can translate it? Is anybody doing that? Yeah. Whenever you do that, you are waking up, again,
a long short-term memory, an LSTM, which has learned to translate
text in one language into text in another language. And Facebook is doing that four billion times a day, so every second, 50,000 sentences are being translated by an LSTM working for Facebook; and another 50,000 in the next second; then another 50,000.
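The per-second figure is just the daily figure divided by the number of seconds in a day:

```python
per_day = 4_000_000_000          # LSTM translations per day
per_second = per_day / 86_400    # seconds in a day
print(round(per_second))         # 46296, i.e. roughly 50,000
```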
And to see how much this thing is now permeating the modern world, just note that almost 30 percent of the awesome computational power for inference in all of these Google data centers, all over the world, is used for LSTM. Almost 30 percent. If you have an Amazon Echo, you can ask a question and it answers you. And the voice that you hear
is not a recording; it's an LSTM network which has learned from training examples to sound like a female voice. If you have an iPhone and you're using QuickType, it's trying to predict
what you want to do next given all the previous context
of what you did so far. Again, that's an LSTM which has learned to do that, so it's on a billion iPhones.
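In outline, such a predictive keyboard scores every candidate next word given the context so far. This is a hypothetical sketch reusing an `lstm_step` like the one above; `embed` and `W_out` stand for trained parameters:

```python
import numpy as np

def suggest_next_word(context_ids, embed, W_out, lstm_step_fn, h, c):
    # Run the previous context through the LSTM, one word at a time.
    for word_id in context_ids:
        h, c = lstm_step_fn(embed[word_id], h, c)
    logits = W_out @ h                    # a score for every vocabulary word
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                  # softmax over candidate next words
    return int(np.argmax(probs))          # the top suggestion shown to you
```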
You are a large audience, by my standards; but when we started this work, decades ago, in the early '90s, only a few people were interested in that, because computers were so slow and you couldn't do much with them. And I remember I gave a talk
at a conference, and there was just
one single person in the audience, a young lady. I said, "Young lady, it's very embarrassing, but apparently today I'm going to give this talk just to you." And she said, "OK, but please hurry:
I am the next speaker!" (Laughter) Since then, we have
greatly profited from the fact that every five years
computers are getting ten times cheaper, which is an old trend that has held since at least 1941, when this man, Konrad Zuse, built the first working program-controlled computer in Berlin. It could do, roughly, one operation per second. One! And then, ten years later,
for the same price, one could do 100 operations; 30 years later, 1 million operations for the same price; and today, after 75 years, we can do a million billion times as much for the same price.
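Those numbers are all instances of the same rule - ten times more compute per price every five years:

```python
def factor(years):
    return 10 ** (years / 5)     # 10x more compute per price every 5 years

print(factor(10))                # 100.0      -> 100 operations
print(factor(30))                # 1000000.0  -> a million operations
print(f"{factor(75):.0e}")       # 1e+15      -> a million billion
```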
And the trend is not about to stop, because the physical limits are much further out there. Rather soon, in not so many years or decades, we will for the first time
have little computational devices that can compute as much as a human brain; and that's a trend that doesn't break. 50 years later, there will be
a little computational device, for the same price, that can compute as much as all
10 billion human brains taken together. And there will not be only one of those devices, but many, many, many. Everything is going to change. Already in 2011,
computers were fast enough such that our deep learning methods for the first time could achieve
a superhuman pattern-recognition result. It was the first superhuman result
in the history of computer vision. And back then, computers were
20 times more expensive than today. So today, for the same price, we can do 20 times as much. And just five years ago, when computers were 10 times
more expensive than today, we could already win, for the first time,
medical imaging competitions. What you see behind me
is a slice through the female breast and the tissue that you see there
has all kinds of cells; and normally you need a trained doctor,
a trained histologist who is able to detect
the dangerous cancer cells, or pre-cancer cells. Now, our stupid network knows nothing about cancer,
knows nothing about vision. It knows nothing in the beginning: but we can train it to imitate
the human teacher, the doctor. And it became as good, or better,
than the best competitors. And very soon, all of medical diagnosis
is going to be superhuman. And it's going to be mandatory, because it's going to be
so much better than the doctors. After this, all kinds of medical
imaging startups were founded focusing just on this,
because it's so important. We can also use LSTM to train robots. One important thing I want to say is, that we not only have systems that slavishly imitate
what humans show them; no, we also have AIs
that set themselves their own goals. And like little babies,
invent their own experiment to explore the world and to figure out
what you can do in the world. Without a teacher. And becoming more and more general
problem solvers in the process, by learning new skills
on top of old skills. And this is going to scale:
we call that "Artificial Curiosity"; a more recent buzzword is "PowerPlay": learning to become a more and more general problem solver by learning to invent, like a scientist, one new interesting goal after another. And it's going to scale.
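A toy illustration of the underlying idea - learning progress as intrinsic reward; an editorial sketch, not the actual PowerPlay algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)

# The agent owns a world model (here: a single weight w predicting y from x).
# Its intrinsic "curiosity" reward is the IMPROVEMENT of the model's
# prediction error, so it is drawn to experiences it can still learn from.
w, lr = 0.0, 0.1
for step in range(20):
    x = rng.normal()
    y = 3.0 * x                        # the world's true, unknown rule
    err_before = (y - w * x) ** 2
    w += lr * (y - w * x) * x          # the model learns from the observation
    err_after = (y - w * x) ** 2
    reward = err_before - err_after    # learning progress = curiosity reward
    # A full curious agent would pick the actions and experiments that are
    # expected to maximize this reward.
```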
And I think, in not so many years from now, for the first time, we are going to have an animal-like AI - we don't have that yet - on the level of a little crow, which already can learn
to use tools, for example, or a little monkey. And once we have that, it may take just a few decades to do the final step
towards human level intelligence. Because technological evolution is about a million times faster
than biological evolution, and biological evolution
needed 3.5 billion years to evolve a monkey from scratch. But then, it took just a few tens
of millions of years afterwards to evolve human-level intelligence. We have a company called Nnaisense, like the French word for birth, "naissance", but spelled in a different way, which is trying to make this a reality and build the first
true general-purpose AI. At the moment, almost all research in AI
is very human-centric, and it's all about making human lives
longer and healthier and easier and making humans
more addicted to their smartphones. But in the long run, AIs are going to -
especially the smart ones - are going to set themselves
their own goals. And I have no doubt, in my mind, that they are going to become
much smarter than we are. And what are they going to do? Of course, they are going to realize what we realized a long time ago; namely, that most of the resources in the solar system, or in general, are not in our little biosphere. They are out there in space. And so, of course, they are going to emigrate. And of course, they are going to use trillions of self-replicating robot factories to expand in the form of a growing AI bubble, which, within a few hundred thousand years, is going to cover the entire galaxy with senders and receivers, such that AIs can travel the way they are
already traveling in my lab: by radio, from sender to receiver. Wireless. So what we are witnessing now is much more than just
another Industrial Revolution. This is something
that transcends humankind, and even life itself. The last time something so important happened was maybe 3.5 billion years ago, when life was invented. A new type of life is going to emerge
from our little planet and it's going to colonize
and transform the entire universe. The universe is still young:
it's only 13.8 billion years old; it's going to become much older than that,
many times older than that. So there's plenty of time
to reach all of it, or all of the visible parts, totally within the limits
of light speed and physics. A new type of life is going
to make the universe intelligent. Now, of course, we are not going to remain
the crown of creation, of course not. But there is still beauty in seeing yourself
as part of a grander process that leads the cosmos from low complexity
towards higher complexity. It's a privilege to live at a time when we can witness
the beginnings of that and where we can contribute
something to that. Thank you for your patience. (Applause)