- [Alex] Okay, good morning everyone. Let's change topic a little bit: now I'm going to talk more about applications of quantum computing, in particular in machine learning. This is going to be a machine learning talk, and to prepare it I did this exercise: I took the last period of 365 days and looked at what was in the news about machine learning. You know, every once in a while you hear some news that something has been done, but doing this retrospective really shocked me, because so much has been done this year. We've gone from generating high-resolution images of faces of people that do not exist, to using machine learning in medicine to help predict diseases, to using it as a tool to make actual discoveries in other areas of research. And the list goes on and on: we now have presenters that are not real, we have AI creating art, we have AI writing coherent text, and even a couple of days ago we had a machine competing in a debate competition against humans. Seeing this with a bit of perspective makes you think: wow, a couple of years from now there's nothing that deep learning is not going to be able to do, right? Well, actually, in this talk I want to convince you that this may not be quite the case. Let me give you an example.
Suppose you are an expert in deactivating bombs, okay, something that most of you probably are. And well, you want to innovate, you want to make your work easier by using machine learning. So you develop this deep learning algorithm that takes information about the particular bomb, for instance, a very important thing, as everyone knows: the color of the cables. And then you use the algorithm to predict which cable you have to cut, and the answers you get are something like this, okay. Cool, you train it, it works fine, perfect. But then it goes into a real application: you are faced with a real bomb that explodes if you cut the wrong cable, and you just get this information. Well, I don't know about you, but I would like a bit more information, right? Maybe: how sure are you about this? Were you more or less equally sure, just a little bit less, that it wasn't the blue cable? I'd just want to know a bit more. And it turns out that questions like "how sure is the algorithm about a specific prediction" are very difficult to answer in the standard framework of deep learning. The reason is that deep learning as we know it now is mostly based on optimization, on calculus, and these questions do not fit that framework well. People are aware of this, and there are other frameworks in which machine learning can operate that are a more natural fit for these sorts of questions, like probability theory. And that's what I'm going to talk about: the Bayesian approach
to machine learning. The formal version, and I don't want to scare people here, just uses these sorts of theorems about probability distributions. This is essentially Bayes' theorem, which tells you the probability of some event occurring given that we have some previous information A, and how to compute it from other information that is more easily accessible. And this has a very nice application in machine learning: you can think of the probability of a label, given that I know some data and have some previous experience from training. And I can compute that from quantities that are more accessible in my data set, for example.
Now the kinds of answers we get are still maybe not completely convincing, but at least we're getting a bit more information about the solution being output. With something like this I would be a bit more convinced about cutting the right cable, you know. Okay, so one approach is doing it on classical computers. People have been working on this, doing research, and this kind of Bayesian training of deep neural networks can be done. Here I have to warn you that the boring math comes now, but I will try to keep it simple. Essentially, the way of doing Bayesian training of deep neural networks is thanks to an analogy between each layer in the network and something that is
called a Gaussian process. The important thing about Gaussian processes is that we assume there is a global Gaussian distribution underlying the outputs of each layer. And then we want to compute what is called the posterior distribution, which is essentially the probability distribution of some label y* given some input x* and some training set with instances and labels. And if we assume this Gaussian process, it has this form here: it's just a normal distribution, a Gaussian distribution, with a mean and a variance given by this formula.
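For reference, here is the standard Gaussian process regression posterior, my reconstruction of the formula on the slide:

```latex
p(y_* \mid x_*, X, y) \;=\; \mathcal{N}\!\left( k_*^{\top} K^{-1} y,\;\; k_{**} - k_*^{\top} K^{-1} k_* \right)
```

where K is the covariance matrix over the training inputs, k_* collects the covariances between x_* and the training inputs, and k_{**} = k(x_*, x_*). Note the K^{-1}: that inversion will matter in a moment.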
The important thing here, and I wonder if I can point with this... no, no, okay, anyway: this K here is an important quantity called the covariance matrix. It's essentially a matrix that you build out of your data by applying what is called a covariance function to each combination of data points. And the very, very nice thing is that for each layer you can compute this covariance matrix using just the information from the previous layers, so you can do this training in a recursive way, as in the sketch below.
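As an illustration, here is a minimal classical sketch of that layer-to-layer recursion for a ReLU network, using the known closed form for the Gaussian expectation of ReLU activations; the weight and bias variances and the first-layer kernel are my assumptions, not values from the talk:

```python
import numpy as np

def relu_layer_kernel(K_prev, sigma_w2=1.0, sigma_b2=0.1):
    # One step of the covariance recursion: the covariance matrix of layer l
    # from that of layer l-1, for ReLU activations (arc-cosine kernel form).
    diag = np.sqrt(np.diag(K_prev))            # per-point standard deviations
    norm = np.outer(diag, diag)                # sqrt(K_ii * K_jj)
    cos_t = np.clip(K_prev / norm, -1.0, 1.0)  # correlations, clipped for safety
    theta = np.arccos(cos_t)
    # E[relu(u) relu(v)] for (u, v) jointly Gaussian with covariance K_prev:
    expectation = norm * (np.sin(theta) + (np.pi - theta) * np.cos(theta)) / (2 * np.pi)
    return sigma_b2 + sigma_w2 * expectation

# Assumed first-layer kernel built from raw inputs X (8 points, 3 features).
X = np.random.randn(8, 3)
K = 0.1 + X @ X.T / X.shape[1]
for _ in range(4):          # recurse through four hidden layers
    K = relu_layer_kernel(K)
```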
Awesome, so this thing exists; so why isn't everyone using it? Well, it turns out that it's not super hard, it's not NP-hard, to compute this inverse. Remember that we need this covariance matrix, but raised to the power of minus one: we have to invert that big matrix. And this inversion, yeah, it's not super hard, but still, for very big datasets, for a large number of points, the number of operations one has to do grows with the third power of the number of data points, and at some point this becomes intractable. So what can we do? Here is the point where quantum computing can help. Why don't we do something like this: we encode these vectors y and this k* into quantum states, and we interpret our matrix as a quantum operator. Can we now do something easier? Well, it turns out that yes. Luckily, there is this
algorithm by Harrow, Hassidim and Lloyd (HHL), developed in 2009, that allows us to do exactly this. You have a linear system of equations, A x = b, and there exists a quantum algorithm that retrieves the solution, this vector x, which has a very similar form to this K to the minus one times y. So we can do that part, on the one hand.
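To make concrete what HHL produces, here is a hedged classical emulation: it works in the eigenbasis of A and inverts each eigenvalue, returning the normalized vector proportional to A⁻¹b that the algorithm would encode in a quantum state (the toy matrix and vector are mine):

```python
import numpy as np

def hhl_emulated(A, b):
    # Classical stand-in for the HHL output state |x> ~ A^{-1} b,
    # computed via the same spectral picture the algorithm exploits.
    # No quantum speedup here, of course.
    lam, V = np.linalg.eigh(A)      # A must be Hermitian, as HHL requires
    beta = V.conj().T @ b           # expand b in the eigenbasis of A
    x = V @ (beta / lam)            # invert each eigenvalue
    return x / np.linalg.norm(x)    # quantum states are normalized

K = np.array([[2.0, 0.5],
              [0.5, 1.0]])          # toy 2x2 covariance matrix
y = np.array([1.0, -1.0])
print(hhl_emulated(K, y))
```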
And on the other hand, we also have quantum algorithms to perform this inner product in an efficient way. So we can do this. And these are the sorts of results we were connecting in order to have an end-to-end quantum algorithm for this Bayesian training of deep neural networks. What we did, essentially, and this is in this paper over here that we released about half a year ago, requires just two ingredients. First, the recursive formula for the covariance matrix of a layer as a function of the covariance matrix of the previous layer. And second, and it's true that this is not a trivial thing, the initial covariance matrix, the covariance matrix of the first layer, encoded as a quantum state. In principle you could, for instance, compute it classically and then prepare such a state; in any case, at least in this project, we don't care too much about that step. Given these two things, what we were able to do is build an approximation of the covariance matrix of the last layer. We build this, and then we also built the time-evolution operator under this approximation. This could be encoded into a quantum circuit, or simulated via Hamiltonian simulation, and applied in the HHL algorithm to do the matrix inversion. So essentially, we take this state encoding the initial covariance matrix, we apply the time-evolution operator that allows us to do the matrix inversion in a quantum way, and we compute the inner products to obtain the parameters of the distribution that we want to fit the data to.
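Putting the pieces together, here is a purely classical sketch of that pipeline, reusing relu_layer_kernel from the sketch above; the two np.linalg.solve calls stand in for the quantum matrix inversion and inner-product estimation, and the data is a toy example of mine:

```python
import numpy as np

X = np.random.randn(8, 3)              # toy training inputs
y = np.sign(np.random.randn(8))        # toy training labels
x_star = np.random.randn(1, 3)         # toy test point

Z = np.vstack([X, x_star])             # joint inputs, test point last
K = 0.1 + Z @ Z.T / Z.shape[1]         # assumed first-layer kernel
for _ in range(4):                     # recursion up to the last layer
    K = relu_layer_kernel(K)

K_train, k_star = K[:-1, :-1], K[:-1, -1]
alpha = np.linalg.solve(K_train, y)    # the inversion HHL would replace
mean = k_star @ alpha                  # posterior mean (an inner product)
var = K[-1, -1] - k_star @ np.linalg.solve(K_train, k_star)
print(mean, var)
```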
That was the more theoretical part, but we also did some sort of experiments. Bear in mind that these are experiments done by theoreticians, so they may not satisfy real experimentalists. We coded the core part of the algorithm, the HHL part, implementing it in various frameworks, in Rigetti's Forest and in IBM Q. And we did simulations of runs of the algorithm, inverting matrices as big as 4 by 4, running the protocols in these simulators under different kinds of noise. In this figure I have both gate noise, which is an X operator applied after every gate of the circuit with some probability, and you can see this is awful, essentially because the number of gates in the circuit is quite big, so even for low probabilities you get a lot of X operators acting on your state. And then we have measurement noise, which is just a readout error when you do measurements, and this is not that bad.
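As a sketch of how one might reproduce those two noise channels in a present-day simulator, here is a Qiskit Aer noise model; the gate set and the probabilities are my assumptions, not the ones used in the paper:

```python
from qiskit_aer.noise import NoiseModel, ReadoutError, pauli_error

p_gate, p_meas = 0.01, 0.05    # assumed error probabilities

noise_model = NoiseModel()

# Gate noise: an X flip applied after every gate with probability p_gate.
bit_flip = pauli_error([("X", p_gate), ("I", 1 - p_gate)])
noise_model.add_all_qubit_quantum_error(bit_flip, ["u1", "u2", "u3"])
noise_model.add_all_qubit_quantum_error(bit_flip.tensor(bit_flip), ["cx"])

# Measurement noise: a symmetric readout error on every measurement.
readout = ReadoutError([[1 - p_meas, p_meas],
                        [p_meas, 1 - p_meas]])
noise_model.add_all_qubit_readout_error(readout)
```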
Not only that, we also did runs on real quantum computers, both IBM's and Rigetti's, and in the case of IBM we got particularly nice results. In particular, here I'm reporting the probability of success under a swap test, just not to make too much fuss about it. This translates into a fidelity with the desired target state, and in the case of IBM we get fidelities of about 78%, which is brilliant.
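For reference, the standard swap test relation between the success probability and the fidelity of the two compared states, added here for completeness, is:

```latex
P_{\mathrm{success}} \;=\; \frac{1}{2} + \frac{1}{2}\,\lvert\langle \psi \mid \phi \rangle\rvert^{2}
\qquad\Longrightarrow\qquad
F \;=\; \lvert\langle \psi \mid \phi \rangle\rvert^{2} \;=\; 2\,P_{\mathrm{success}} - 1
```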
So yeah, that's essentially all I wanted to tell you. Just to wrap up quickly: the takeaway message is that not all machine learning is deep learning. There are other frameworks, other ways of doing machine learning, that may be more useful for particular applications. In this context, Bayesian deep learning based on Gaussian processes is useful, you can train very large networks, but it's also classically hard. Nevertheless, for the hard parts we can resort to quantum computing and have some sort of hybrid classical-quantum algorithm to do the full training. And in this respect, the experiments that we have conducted are encouraging, as I said, especially on the IBM platform, but there's still a lot to be done: the matrices that we could invert on real computers were not bigger than two by two, so it would probably take less time to do that by hand. But anyway, all the tools are there. We did everything open source, all the code, so it's available for generalization or modification, and I guess it's a matter of time before we see applications of these algorithms in more realistic scenarios. And that's all, thank you very much.