SERTAC KARAMAN: OK,
welcome again, everyone. It's an honor to kick
off this session. So in this session,
we have a number of distinguished panelists, and
I'll introduce them one by one. I'll keep my introductions short
so that they can get some time to do their talks. First, I'd like to invite on
stage Angelia Nedich, who's from ASU, and will kick
us off with the first one. ANGELIA NEDICH:
Thank you, Sertac. [SIDE CONVERSATION] ANGELIA NEDICH: So I choose
to talk about optimization and the things that have
been done in LIDS in terms of large-scale decentralized
optimization network systems, maybe because I was
closer to that area, and I could relate to most
of the work that was done. So what I want to
start with is, like, where today's challenges are. I took this off
the NSF web page. There are three relevant ones-- they kind of go along with things we have heard today. This is one of the 10 challenges that NSF is talking about. And one of them is
a future science, data-driven autonomous
systems, which really talks about
how we're going to be collecting a lot of data,
using data, managing data. And it's going to be
one of the directions in which research will be
driven through machine learning, artificial intelligence, the
Internet of Things, and so on. [SIDE CONVERSATION] ANGELIA NEDICH: Another
thing that caught my eye there was talking about-- they use the word
convergence research, which I couldn't get
what they mean by that. But if you look further
in the description, what they mean by that is what
actually is happening at MIT is having people from
different backgrounds, from different expertise,
to be able to solve these future problems--
challenging problems, they don't perceive
one discipline running the whole scenery. But there will be interplay
of multiple decisions, areas of research. They're going to
require convergence in terms of merging ideas. That was the word that
I couldn't figure out what they meant by. And they expect that
combining different ideas from different disciplines
is going to eventually lead research and development in
the next 20 or so many years. So when you look back at what
were the past challenges, over the last maybe 30, 40
years, things were similar, actually. Questions were similar. There were also issues of dimension and complexity, except these systems
in the past were a bit smaller and less complex. So they were also dealing
with data at the time. But the data was collected in
a smaller scale and slower, and the devices were slower,
with smaller processing capabilities. And the internet was actually
just at the beginning. So if you remember,
internet in the late '90s, it was very slow, when
you had to open a screen, or you had to wait quite a long
time to get your page open. So what I see and the
difference is just the scale and the speed at
which the things happen. And in complexity
of the systems that are now networks of networks
that we're heading to. So when I look
back is like, what are the things that have been
done into that direction. One thing that
definitely stands out is the Lagrangian methods that
were studied by Bertsekas even from the early '70s. And then these methods
actually were part of-- some kind of splitting methods were part of the thesis of one of Bertsekas' students, Eckstein, which also leads to the ADMM method, which, with one recent paper with multiple authors, just regained a lot of attention. So the message here is that ADMM became popular recently. But ADMM has a long tradition going way back in the past, and it has roots at LIDS. Another aspect, or
a research direction had to do with parallel
optimization, which is also a particular topic I have here-- also from a student of Professor Bertsekas. And it was already looking at some parallel methods. But the concept here is deeper. It goes into
monotone operators. Paul Tseng was another
student of Dimitri, and he worked really
extensively on all kinds of optimization methods. He worked in parallel
block coordinates, incremental asynchronous
decompositions, and so on. Tom Luo was also part of LIDS, a contemporary of Paul's. He worked with-- his advisor was
Professor Tsitsiklis. But he also worked on
optimization methods on large scale. Actually, his aspect was
more on the complexity, from what I gather from some
sequence of papers at the time. Now let's say if you look at
current machine learning-- there are different communities
of machine learning. And a lot of different people
have different interpretations of what machine learning is. But there is one
particular stylized problem with optimization, which
runs under the name machine learning. And in principle, it's
something that in LIDS was known as an incremental method. You have a sum of functions. You want to minimize it. And it's complicated, because the sum involves a lot of terms. So you don't want to compute the gradient of the entire objective. You just go one term at a time. Whether you do it cyclically or pick the terms in some other order, these are known as incremental methods.
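As a rough illustration of the idea (the least-squares components and the step size below are hypothetical, just to make the sketch runnable):

```python
import numpy as np

# Minimal sketch of a cyclic incremental gradient method for
# minimizing f(x) = sum_i f_i(x), stepping on one component at a time.
# The components f_i(x) = 0.5*(A[i] @ x - b[i])**2 are made up for
# illustration; any finite sum of differentiable terms works the same way.
rng = np.random.default_rng(0)
A = rng.standard_normal((100, 5))   # one row per component f_i
b = rng.standard_normal(100)

def grad_component(x, i):
    # gradient of the single component f_i at x
    return (A[i] @ x - b[i]) * A[i]

x = np.zeros(5)
step = 1e-2
for epoch in range(50):
    for i in range(len(b)):          # cycle through the components
        x -= step * grad_component(x, i)
```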
And actually, there is work by Bertsekas and Tsitsiklis in that direction. Paul Tseng had papers
that are out after the thesis. My thesis was on
that topic, even though the title of my thesis never spells out the word incremental. It's hiding under
a different name. Then Professor Ozdaglar, with, I think, Mert Gurbuzbalaban, who was a postdoc here, and Professor Parrilo, also followed up with a sequence of papers on incremental aggregated methods. And also, I think, the paper that shows that random reshuffling methods can beat the basic cyclic method. Basically, by reshuffling, you're breaking the curse of bad orders-- bad patterns. So basically, I see that this sequence of machine learning papers, which are in the community of machine learning-- it seems that sometimes, they may be left out a bit. OK, and one more aspect
that I wanted to discuss is also another
kind of area that is these days very
attractive, which has to do with
decentralized computations. Sometimes it runs under the
name multi-agent systems, or optimization in networks-- the names are different. In different places, people give it different names. It starts with work by Athans,
Bertsekas, and Tsitsiklis. And actually, Professor
Tsitsiklis' thesis has the model in place. And it's also in the book by Bertsekas and Tsitsiklis, Parallel and Distributed Computation, which I think regained recognition. As Ben was just mentioning, it
fell into a forgotten state. And then all of a sudden,
people remember the book. And I think for that work, you
received a von Neumann Theory Prize. Then Asu and I
picked up that work-- maybe we started a
couple years earlier, but the publication
appeared in 2009. That's the first paper
that was published. We continued-- borrowed the
ideas of this distributed kind of information
exchange, but we addressed different problems,
like, basically, machine learning problems set up in the multi-agent setting.
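Schematically, the kind of multi-agent update being described couples a consensus (weighted averaging) step with a local gradient step. In generic notation (not the notation on the slides), agent i holds an estimate x_i and repeats

```latex
x_i^{k+1} \;=\; \sum_{j=1}^{n} w_{ij}\, x_j^{k} \;-\; \alpha_k\, \nabla f_i\!\left(x_i^{k}\right),
```

where the weights w_{ij} respect the communication graph and f_i is the local objective held by agent i.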
After that point, I think that Olshevsky-- Alex-- I don't know. I just saw him earlier. His thesis also taps into
the aspects of these, actually, aggregation
strategies, which are really a consensus or agreement protocol hiding under another name. And then after that, I am
mentioning three of us, because after that,
we have students who have followed and
worked on various aspects of these problems-- on directed graphs and
all kinds of things. And I think I would like to
also draw attention to what Alex is doing. He's actually looking
at interesting aspects that, actually, some
of these strategies demonstrate network-independent
scalability. So what you have
in these algorithms is that there is a
transient behavior that depends on the
network structure, even when the network
structure can be time-varying. So there's this
transient time when, if you analyze the
algorithm, you can see the effects of the graphs. And after the transient time,
the effect of the graph disappears. It doesn't affect the
final asymptotic scaling of the protocols. And that's about it. Thank you. [APPLAUSE] SERTAC KARAMAN:
Thank you Angelia. I'd like to invite
our second speaker to the stage, Asu
Ozdaglar, who is a faculty member at EECS at MIT-- department head at EECS as well. [INTERPOSING VOICES] ASU OZDAGLAR: Perfect. It's wonderful to
be in this event. I was so excited. It feels like I'm
almost at my wedding. I know everybody. So I just want to
talk to you briefly about my journey with
minmax problems and games, and the very huge effect
of LIDS in this journey. So I'll go back
15 years, and I'll see how things picked
up throughout the time as we're doing
research in this space. So let me start with
the formulation. We're interested in a minmax
problem for our function with two variables--
very standard. And these arise in a
multitude of applications. And I was writing
the applications, and I was laughing. This is what applications
looked like 15 years ago for LIDS students. Of course, there's
robust optimization that has a lot of research these
days over the past 10 years. Basically, this is the
backbone of that formulation. y is a parameter
that we don't know. We're trying to minimize the
cost function over x, assuming the worst possible value of y. So that's what they
mean by minmax. And then the other one that
we were very interested in-- this is part of my PhD
thesis, Angelia's PhD thesis. And I'll talk about the
connections in a minute. We're interested in solving
constrained optimization problems. So we have a primal
problem with constraints. We're minimizing some f
of x, subject to a bunch of constraints. And one very natural way of
solving such a constraint problem is, you relax
the constraints. You form a Lagrangian function. And then you formulate
a dual problem, whereby you minimize the
Lagrangian function over x, you find a dual function-- beautiful
concave function, no matter how crazy the primal problem is. So you try to solve this possibly nonsmooth concave problem. But the point is, this is also a minmax problem.
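As a schematic reminder of the construction being described (generic notation, not the notation on the slides): for a primal problem of minimizing f(x) subject to g(x) <= 0, one forms

```latex
L(x,\mu) \;=\; f(x) + \mu^\top g(x), \qquad
q(\mu) \;=\; \min_x L(x,\mu), \qquad
\max_{\mu \ge 0}\; q(\mu),
```

so solving the dual is exactly a max-min (equivalently, minmax) problem in the pair (x, mu), with q concave no matter how complicated f and g are.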
So how do we solve this problem? I'm very interested in computing a saddle point. That's the solution
we're interested in. So the saddle point
is you take f of x, y. You fix y star and minimize over x. That's your x star. You fix x star and maximize over y. That's your y star. So this is the solution
concept we're interested in. And a method we love very
dearly in optimization is gradient
descent-ascent, which is the simultaneous iterations. You're trying to minimize
over x, maximize over y-- simultaneously take steps along the gradients: minus the gradient with respect to x for the x update, and plus the gradient with respect to y for the y update.
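In symbols, one common way to write the gradient descent-ascent iteration for a smooth f(x, y), with a step size eta (a generic sketch, not the exact notation used in the talk), is

```latex
x_{k+1} \;=\; x_k - \eta\, \nabla_x f(x_k, y_k), \qquad
y_{k+1} \;=\; y_k + \eta\, \nabla_y f(x_k, y_k).
```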
So you try to do this simultaneously. And it's magical that, while doing this simultaneously, this will converge to the saddle point. And I would like
to also give you a little bit of history of
where this method emerges. This is one of the things
that I found as a student. This is toward the
end of my PhD-- fascinating. So while we like this
in optimization a lot, the foundations of
this method is actually mathematical economics. So if you look at
Samuelson's work, or actually a beautiful gem
that I would recommend everyone, this is a book that we
loved with Angelia-- Arrow, Hurwicz, Uzawa,
who are economists. And the book is Studies in Linear and Non-Linear Programming. So that was the first time this
emerged as continuous time versions of these methods, and
proves some convergence results under strict convexity
assumptions-- strong convexity. Uzawa, another economist,
focused on a discrete time version. Showed convergence
to a neighborhood under, again, strong convexity. And then a bunch of other
very, very strong papers-- Gol'shtein, Maistroskii,
Korpelevich, which also introduced
extragradient method. Please look at the
dates-- '77, '58. So this is where these
methods were studied. And we got very interested
in this, Angelia and I. We got these papers. And then this is essentially
I'm talking about 15 years ago. And I was looking over my notes. I keep all the papers. I have stacks of
papers in my office. I was going over my notes,
looking at these papers. And I found this. So this is Gol'shtein's paper. And those who know how I
work will recognize the pinks and yellows. And there are some notes. I was looking there's some--
assumes bounded subgradients. That's Angelia's handwriting. And then I go to
these assumptions, and I think we put
something very restrictive. So we were never happy
reading these papers. We always found
something to pick on. And so I remember,
actually, this was again when we were working
on these problems 15 years ago. Get stacks of these
papers-- coffee shop. Lots of coffee with it. And then we tried to say, OK,
this restrictive assumption is strong convexity. This bothered us, because
the dual problems-- I showed you the Lagrangian. That's linear in mu,
the Lagrange multiplier. So this won't work for
duality, which we were interested in at that time. So we were sort of so-- you remember-- completely
excited about this notion of showing convergence
for ergodic averages. So instead of
thinking about data that's generated
by the algorithm, we were taking time
averages of all data that's generated so far. And that allowed us, for
these convex problems, without strong convexity,
to be able to get per-iterate convergence
rate estimate. So I wrote, very high level,
one of the results-- again '09. This is very misleading. This was 2005. I remember this
in a coffee shop. So this is basically
thinking about converging to the saddle point through
a gradient descent-ascent. But I also put in blue
the main assumption we were working with. Subgradients are
uniformly bounded. So we assumed, actually,
that these minmax problems are over compact sets. So we have convergence
rate estimates. It goes to a
neighborhood, where you can choose the
approximation quality by playing with the step size. You have 1 over k
convergence rate. So we were fascinated by that. So I'm going to fast forward 15 years. This is the paper. We left it-- I don't think I
looked at minmax after that for a couple of years. Then comes machine learning. The pictures got much
more interesting. So we see pigs here. So on the left, there's
some picture of a pig. The state-of-the-art classifier
classifies it to be 91% pig. And then you add a very
little imperceptible noise. And then that
classifier classifies it to be an airliner. So this is the joke in
machine learning that says, OK, machine learning
makes pigs fly. [LAUGHTER] Basically, if you look at
what we are thinking about, in here, the standard
generalization without thinking about these
adversarial perturbations is we want to minimize over
the model parameters some loss. We have the data x, y. In the case of classification,
x are examples, y are labels, drawn according to
some distribution D. We want to choose our model to
minimize this expected loss. So the robust
version is nothing, but you want to do this against
perturbations of the input data. How to define that perturbation
is a very interesting question. But Aleksander Madry
in a recent paper assumed these are l-infinity perturbations on x. So we're thinking about this robust training, where we're minimizing the same cost against the maximum perturbation.
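Written out schematically (with epsilon a perturbation budget introduced here just for illustration), the robust training problem being described is

```latex
\min_{\theta}\;
\mathbb{E}_{(x,y)\sim \mathcal{D}}
\left[\;
\max_{\|\delta\|_\infty \le \epsilon}\;
\ell\big(\theta,\, x + \delta,\, y\big)
\right],
```

which is again a minmax problem, now typically nonconvex in the model parameters theta.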
So I'm running out of time. Another fascinating question-- generative adversarial networks. So we're basically thinking
about designing some generator neural network which
maps random white noise into high-dimensional objects with structure. These are fake images. These are cars generated by training on samples from a true distribution. And there's a
discriminator that's trying to distinguish
between these two. Very interesting, and
you write the problem down using different
kinds of formulations. And you run into
another minmax problem. So the point is these
are all minmax problems, but with very different flavor. So objective functions, we
were fighting strong convexity. These are not convex-concave-- they're nonconvex. So what happens with that? You look at many
empirical papers. There are training oscillations,
these issues of mode collapse. This is the true distribution. You look at the steps
of the algorithms, so you basically
oscillate between modes of the true
distribution over there. And even on the simple
bilinear case-- because this is what got me very concerned. We have a bilinear convex-concave problem, and GDA diverges. And then we were like, we
worked on this problem. We made it converge. What's happening? So this is basically, when
you don't have compact sets, this will actually diverge. Even if you have
compact sets, it was not the convergence of the
point that we were getting, but rather the function value. So the considerations
have changed. And then last year,
I went to a talk given by one of our
faculty, Costas Daskalakis. He put this algorithm,
which I had not seen before, optimistic gradient
descent-ascent. This is a GDA with a
little negative momentum. And he was showing, basically,
all the beautiful things this algorithm was doing. And then there's
the 1 over 2 there. And somebody asked Costas,
why is it 1 over 2? He said, it should be 1 over 2. And this actually was studied
before in the online convex optimization literature by Sasha. And I was fascinated-- why is that 1 over 2? So my fascinations
are very weird. So we started basically
looking at this with a student of mine, Sarath,
sitting at the back, as well as our postdoc. So we actually took our GDA,
as well as the extra gradient that I just talked about,
and actually showed that these are approximations of
the very seminal proximal point method studied by Rockafellar,
as well as Dimitri, in the '80s. And you can actually see from that lens what's happening with all of these algorithms-- why that 1/2, and many other insights.
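A tiny numerical sketch of the phenomenon being described, on the hypothetical bilinear objective f(x, y) = x*y (the step size and iteration count are arbitrary, and the optimistic update is written in one common form, using the previous gradient as the negative-momentum correction):

```python
import numpy as np

def grads(x, y):
    # f(x, y) = x * y, so df/dx = y and df/dy = x
    return y, x

eta = 0.1
x, y = 1.0, 1.0                      # plain GDA iterate
u, v = 1.0, 1.0                      # optimistic GDA iterate
gu_prev, gv_prev = grads(u, v)       # previous gradients for the optimistic update

for _ in range(200):
    # Plain gradient descent-ascent: spirals outward on this bilinear problem.
    gx, gy = grads(x, y)
    x, y = x - eta * gx, y + eta * gy

    # Optimistic GDA: step along 2*(current gradient) - (previous gradient).
    gu, gv = grads(u, v)
    u, v = u - eta * (2 * gu - gu_prev), v + eta * (2 * gv - gv_prev)
    gu_prev, gv_prev = gu, gv

print("GDA distance from (0,0): ", np.hypot(x, y))   # grows over iterations
print("OGDA distance from (0,0):", np.hypot(u, v))   # shrinks toward zero
```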
So I'd like to, in my 30 seconds, go very quickly to the games. So I won't have time
to say much about it. But before going
to that, of course, one cannot move to
games without stopping, if you're a LIDS student, at the
distributed decision problem. So this was an active area
in the '80s-- distributed resource allocation. John mentioned influential works by Bob Gallager, Dimitri Bertsekas. These sorts of minimum delay routing algorithms formed the foundation
of much of the research in the '90s, 2000s, on
utility-based resource allocation. We keep highlighting
this work on distributed parallel processing. This is the paper. This is our Bible paper. If you want to
work in this area, this is the first
paper you read. And if you look at the title,
it's not like these days, we write a paper per
word of this title. This is "Distributed
Asynchronous Deterministic Stochastic" everything. And it's basically
the foundation for the extensive
literature-- thousands of papers on control
coordination and optimization of multi-agent systems
that Angelia talked about. Let me say very quickly
where we came to games here. So early 2000s, this
is when I basically started on the faculty, we were
seeing many influential papers here coming from
computer science, where we're thinking
about resource allocation with strategic agents. This is no longer a single
objective function optimization problem. But rather, we're asking
the question of: there are many agents that want to
do the best for themselves. So you can think about this in
the context of transportation traffic. I want to go home. I cannot care less about the
whole delay, but only my delay. You can think about it in the
context of communication data networks, where
you're thinking about source-based architectures. This was the influential work
by Roughgarden and Tardos, followed by Johari
and Tsitsiklis. We came to it from
basically trying to understand economic
markets, pricing, capacity, investment decisions with these
congestion effects and network effects. There's a number of papers. And the other
highlight these days, very much following
that work, is prices-- tolls are one way of regulating these flows. How about information? The next horizon is the use
of these GPS-based apps that actually promises to provide
decentralized solutions to this problem. Let me say, like, 30 seconds
on another passion of LIDS and mine in particular,
which is the social networks. And there's a ton
of things we've been thinking about in that
context over the past eight to 10 years. One is learning and
information aggregation, opinion dynamics over social networks. Herding-- this is, why do these
cute penguins herd completely disregarding what they know
about where they need to go? Munther and I spent two years
trying to understand this-- very interesting. Polarization in opinions--
how do communities form? We're engineers. We would like to do some
design-- targeted interventions-- in the context of
social networks. We have a limited budget. Statically or dynamically,
how do we use this budget to be able to control
these networks? These days, it goes
much into using the data that you get from
the graphs to be able to learn something
about the underlying graph or dynamics. And the other very
interesting area is, of course, design of online
platforms, digital platforms, review systems,
which is very much a combination of the system
aspects, human aspects, and data. So I will skip the other two. I want to give a little bit more
on large-scale network games here, but I am very
much out of time. I just want to start with
a few concluding thoughts. I love this quote from
John, and this was in one of the LIDS magazines. I think what is great about
being a LIDS student-- it has been for me, is
that it provides a degree in "mathematics of everything." So students are basically
not just getting a degree in a particular thesis. But they learn the tools that
allow them to go and apply it to whatever problem
they find interesting in a systematic
and insightful way. And this research has
diversified over the past 15 years. Its scope has been
significantly broadened. And so there is one thing
that has not changed. It's the LIDS-type of work. And that's basically providing
creative systematic solutions that will have very
long-lasting impacts. So thank you. [APPLAUSE] SERTAC KARAMAN: So
our third speaker is Benjamin Recht, who is a
faculty member at UC Berkeley. I've been reading Ben's
paper for a long time, and I always thought that
he must be a LIDS alum. He's an MIT alum. It turns out he's
not a LIDS alum. He's going to tell us where
he's got a degree at MITN. But it turns out
that he actually has walked the
corridors of Building 35 quite a bit during his time. So please welcome Benjamin. BENJAMIN RECHT: Thank you. [APPLAUSE] OK, good. I have this dirty secret that
I am actually a Media Lab alum. I have-- [LAUGHTER] I know. I have a PhD in architecture. It's very interesting. [LAUGHTER] And honestly, weird
stuff was happening there that I didn't
realize at the time. Look, I was 22, and I was
really dumb and didn't know. And actually, it was great. So I didn't know what I was
doing when I came to MIT. I didn't even know really
what I wanted to do, which made the Media Lab a good fit. The Media Lab is very interesting. You choose your own path. And I thought I was going
to do one thing, which maybe at the banquet
after a couple of drinks, I'll tell you what that was. But I fell into
something else, which is I found my way to Building 35. I do think one thing
that's really fascinating is, you type Building 35
into Google Image search, it is very hard
to find an image. There's, like, this one. And this was also
in the intro slides. And nobody else has
ever taken a picture of this hideous building. [LAUGHTER] The Wiesner building, that
appears all over the place, but sadly. So I managed to find a
way into Building 35. And I thought about
it, and I think I took four courses,
I think, at MIT that have had just a profound
impact on everything I do. And they were 6.432, which
is Detection and Estimation. I took it with Greg Wornell, but it had also been developed by Alan
Willsky and Jeff Shapiro. 9.520, we'll just
leave that aside. Although I will say
that, at the time, my TA was this
guy Sasha Rakhlin. He seemed very smart. And then I took 6.24-something-- I don't know. Megretski offered this course
once on complex systems. And I got to take it,
and it was amazing. And then 6.253 with Dimitri. And these four laid the
intellectual foundation for everything I've done since. I mean, it's neat that
it was very welcoming. I walked in. Alan let me hang out in
his group meetings, which were incredible. And I really do feel like LIDS
became my intellectual home, even though it took me a
little while to find it. So yeah, three of them
were LIDS courses. I have a typo. That's all right. So now, one thing I didn't take
a course in when I was at MIT was reinforcement learning,
because at the time, no one cared. Fair enough. We cited it many times. But then, all of a
sudden people used reinforcement
learning to solve Go, and everyone got excited again. And then they tried
to push this stuff into all sorts of technology. You go from Go being the hardest
problem there is to, I can now solve anything. I'm going to solve all
the difficult problems in whatever field you have. I don't care what they are. We will solve them with RL. And the main problem
that we had there is that games and the real world are very different places-- here, we mean actual board games, not games in the economic sense. There, everything's
very well-specified, very well-structured. But you throw these things
out in a complex environment, and things become complicated. Things become very complicated. And this is where we have
to bring in new notions about robustness,
about trustability, about scalability. And so my group and I have been
thinking about those issues in reinforcement learning
probably for about five years now. It's been really fascinating. It's interesting to see
how many of these ideas were seeded through ideas
I learned here at LIDS. So I have a different definition
of reinforcement learning. I tried to figure
out what it meant. And Ben said it's
very complicated. There's a large
community of people. I think the problems
that captured my excitement and my imagination
could be summed up like this. That reinforcement
learning is the study of how you use past
data to enhance some future manipulation
of a dynamic system. Now, everybody in the room
would say, wait a minute. That's not
reinforcement learning. Come up with some
other name for it. Like, what are we
talking about here? So, right, maybe I have my
own spin on these things. That was the view that we
entered into this thing. I think reinforcement
learning, or whatever it is, the reason why
people are excited is it's finding this way
to merge machine learning with systems and control. And what is machine learning? Machine learning
does have this idea of using data to do decisions. Although to be fair, they
claim to do decisions. But most of the time, they
only care about prediction. It is a little bit of a weird
sleight of hand that we play. And the idea is that you have so
much complexity, that really, I want to mitigate that
complexity with data, and use data as a proxy. Summarize things nicely,
and be able to use that to deal with very
complex environments, sensors, and models. And then control, which
I think everybody here is much more comfortable with,
is all about using feedback to mitigate uncertainty. So we deal with uncertainty
by using feedback. And now this is a
way to deal with, in the same sense, environment
sensors and models that are uncertain. So it seems like we
should be able to mitigate both complexity and
uncertainty at the same time. And that's how we would take
dynamics, some detailed models, robustness ideas from
control, and merge them with these new and powerful
ideas in machine learning. And so how do you do it? Well you go back and you find
some textbooks that maybe can lead you along the way. Of course, they're all
written by Dimitri. [LAUGHTER] And so we started digging into
this, some of the thoughts that we were having
was, OK, look. This does look
like these problems that come up in dynamic
programming and optimal control. And so we went back and we
got this fantastic book. It is true, dynamic programming
was offered when I was here. And it was a mistake
that I didn't take it. That's my bad. But I came back. We read through that book. We went through volume 2, which
actually, that's really where all the good stuff happens. Volume 2 is really good. And this is the
right cover, right? This is edition 4? This has a lot of really
great stuff in it. I went back to the
neurodynamic programming. It's amazing to see how much--
and Ben pointed this out, how many of the
algorithms that people use today are in that book. And now, as we've
all pointed out, Dimitri has his latest, merging
ideas from all three of these, and then taking
on all the things that have happened in
the 20 years since. And so bringing
those to the table, it does seem like
the way that you merge these things is with this
kind of optimal control type view. We view things as, OK, we're
solving dynamic programs, and we're solving dynamic
programs with uncertainty. And essentially, we can
always just write things as, this is the optimization
problem we'd like to solve. We'd like to minimize some cost subject to dynamics, and we want to find a policy that actually solves this.
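One generic way to write down the kind of problem being described (the symbols here are illustrative, not the ones on the slide) is the stochastic optimal control problem

```latex
\min_{\pi}\;
\mathbb{E}\left[\sum_{t=0}^{T} c\big(x_t, u_t\big)\right]
\quad \text{subject to} \quad
x_{t+1} = f\big(x_t, u_t, w_t\big), \qquad
u_t = \pi_t\big(x_0, \ldots, x_t\big),
```

with c a stage cost, f the (possibly unknown) dynamics, and w_t the disturbance.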
And again, Ben pointed out, from a LIDS background, I minimize. We have an objective, and
we're going to solve it. The unbounded rewards problem, I
think, we'll get to afterwards. Let me just skip it. I'm going to skip
over these details, which aren't that important. The interesting thing is that
deep reinforcement learning, which many people have heard
of-- which is not the same as deep exploration, is just
you take all of these methods that you could derive
from that framework. And you just put a
neural net in the middle. And the neural net can
maybe deal with inputs that come from cameras. It could also just be used as
a generic function approximator if you have nonlinear dynamics. And all of these ideas, there's
really no new algorithmic ideas that have come out. All the ideas were there. It's just that the computers got faster and people got excited, which is great. But I will say, most of these algorithms don't really work. And by work, I mean in
a very technical way. I don't know. I need a good
technical definition. Because we're so-- as
I admitted to you all, I did my PhD at the Media Lab. And the motto at the Media
Lab was, it worked yesterday. Everybody knows demo or die. But the key thing about demo
or die is it worked yesterday. And this is the thing. When we mean work, when
we mean work as engineers, we mean I'm going to be
able to throw this out into an uncertain
environment, and not have it do something stupid every day. When really think about the,
like, level of robustness that we need, it's
much more than just having some kind of demo that
will play out once in your lab. And so I think some
of the future things that we're really
excited about is, I think merging non-parametric
prediction, ideas from classification,
ideas from taking high-dimensional sensing,
put them together, and throwing these
into control loops. So some of the things
that we've been looking at is how to actually use
really complex sensors. Most of the stuff
we learned in 6.432 was very low-dimensional, with very nice, well-specified models. And now you have to deal with
these really complex cameras that are throwing millions
of time series at you every second, every pixel,
and using forecasting in very clever ways. And Dimitri has great stuff
on this in his new book. And it's really, like,
how do you incorporate these uncertain weird
sensors into really trustable and scalable
autonomous systems. One last thing-- I threw this word-- so the reinforcement learning
versus the control theory thing, everybody
claims their camps. And I just coined this new one,
actionable intelligence-- you guys can take it
or leave it, which is this is this thing
where we want to take data, and we want to use it to
enhance the future manipulations of dynamical systems. I did want to close with
just one thing that's really been also captivating our group. We haven't made progress. We're still doing
a lot of reading. I think that this
is the future, is to realize that all machine
learning systems these days are these sorts of actionable
intelligence systems. Machine learning is
built to do prediction. But then we use it and
we show it to people, and then they interact with it. So we try to sell
you stuff on Amazon or we recommend songs
to you on Spotify, or we recommend
YouTube videos to you. And then the people
interact with them. The companies
retrain on the data that they're surveilling
you with every day. And then you interact again. And now you have this very
complex feedback loop. And all of a sudden,
you went from something that was simply a
prediction problem into something that's now a
complex interacting feedback system. And so some of these
social problems and these social
networks issues actually now have this interplay back
with control, reinforcement learning, and machine learning. And they're really
fascinating problems about how to make these systems
more understandable and better for society. All right, with that,
I'll yield my time. [APPLAUSE] SERTAC KARAMAN: And
our next speaker is Luca Carlone, who's a
professor in the Aeronautics Department here at MIT. LUCA CARLONE: Hi, everyone. As Sertac mentioned,
I'm Luca Carlone. I'm an assistant professor in
Aero Astro, and I'm a PI in LIDS. My group works in the broad
area of robotics and autonomous systems. And today, I want to tell
you about the key ingredient of autonomy, which is
called spatial perception. And I want to relate somehow the
state-of-the-art in perception with foundational
contributions done in LIDS. So you can imagine
that, in order to navigate safely a
self-driving car, or in general a robot, needs to
use sensor data to understand the surrounding environment. For instance, consider
a self-driving car navigating an intersection. For the car to drive
without collision, the car has to understand
where the lane boundaries are, understand crossings, detect
and localize other vehicles, potentially track the
speed of other vehicles, detect traffic lights
and traffic signs, and potentially reason over
the future intentions of other vehicles. These are all spatial
perception problems. In other words,
spatial perception is about using the
sensor data to get an internal model of
the external world that you can use for
control and decision-making. Spatial perception is not only
crucial for self-driving cars. But it turns out that it's
fundamental for many robotics applications, from domestic
robots to industrial robotics, to drones used for
infrastructure inspection, and search and
rescue, for example. And even for applications that
are not typically associated with robotics, such as
vision augmented reality. So an initial question
is, for robotics, how to formulate these
perception problems. It turns out that
the popular model is to formulate perception
as an optimization problem. In this optimization
problem, the problem is searching over a potentially
large set of candidate models of the world. And it's searching
for the model which is minimizing the mismatch
between the sensor data and what the model is predicting
about what the sensor really should look like. So I'm showing here this example
regarding sensor [INAUDIBLE] images, just for-- because it's more intuitive. But the same theory applies
for any type of sensor data. In general, this is called maximum likelihood estimation in estimation theory.
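In generic notation (a sketch, not the exact formulation on the slide): if z_1, ..., z_m are the measurements and h_i(x) is what a candidate model x predicts for measurement i, then under independent Gaussian noise the maximum likelihood estimate is the model minimizing the total mismatch,

```latex
\hat{x} \;=\; \arg\min_{x \in \mathcal{X}} \; \sum_{i=1}^{m} \big\| h_i(x) - z_i \big\|^{2},
```

which is exactly the "minimize the mismatch over all candidate models" picture described here.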
So you can imagine that here, if we put
plot all potential models and all potential
explanations of the world, and for each model, we plot
the corresponding mismatch with respect to the sensor
data, the model minimizing the mismatch with respect
to the sensor data will be a good
explanation of reality. For example, this model
would be a good explanation of the small house
painted in the picture, while other potential models
which would achieve a larger mismatch, would be poor explanations of the data, such as this tall building here. So as humans, we solve
these perception problems all the time. And it's even tough
for us to realize how difficult of a problem this is. So let's take one second
to watch this video. So this is a real video, which
includes an optical illusion. So most of you
should have thought at the beginning of the
video that the video included five concrete traffic posts. And then you realized,
by the end of the video, just because of the
change in perspective, that 3 of the posts are
not physically there. But they are just
painted on the road. So this example is interesting,
because at the same time, it's showcasing a failure
mode of our perception system, but it's also showcasing a key
strength of our perception. So just as [INAUDIBLE]
what happened here. At the beginning
of the video, you got stuck with a poor
explanation of reality. You thought that there were
physical posts standing on the road. And then when confronted
with more data, you ended up refining
the model and realizing that, essentially, a better
explanation for the video was that some of the posts
were not physically there. This turns out to be a
fundamental and very difficult challenge for robotics. For robotics, and
for robot perception, it's very difficult for
an algorithm to realize it got stuck in a poor
explanation of reality. And it's even tougher to get
a much better model which is a global minimizer
of this mismatch function. In technical language, this
means that the state-of-the-art in robotics is mostly relying
on local optimization methods, which are able to find
local minimizers of this cost function. But in general, they are not able to get to the global minimizers, which are the right explanations for the data. So over the last
few years, my group has really been interested
in tackling this question. And what I have
been working on is what we call certifiable
perception algorithms. But the basic idea is to design
algorithms that not only get local minimizers, but
they're able to get good models with explanation
for the sensor data. And they're either
able to certify that the model the
algorithm computed is the best possible
model, given the data. In other words, it's
a global optimizer. Or to just declare failure if
they cannot find such a model. I'm showing an intuitive
explanation here. This is an object
detection problem. On the left, you can
see an algorithm which is not a certifiable algorithm. And the algorithm is predicting
the location of the car to be the one in yellow. Clearly, as humans,
we see that that's not the actual position of the car. So the algorithm is failing. And even worse, the
algorithm is not realizing that there
was a failure here. So it's giving this
solution without declaring-- failing without notice. On the other hand,
we are proposing certifiable
perception algorithms, which are not only able to compute better explanations-- better models for the sensor data-- but are also able to certify
that the one that is computed is the best possible
explanation of the data. So as a group, we
are currently working on a number of these algorithms
applied to object detection in images, object
detection in LIDAR data, and in general localization and
mapping problems for robotics. The interesting thing
for me is that, while all these contributions
are quite new, the foundations really trace back to seminal contributions in LIDS. For example, one of the key
insights behind these methods is that it is even
more convenient. Instead of minimizing the
original function, which is shown in white
here, it is better to replace the original function
with one which is easier to minimize, like the one in
green that I am showing here. This is what is typically
called in optimization a convex programming relaxation. And it turns out that
Professor Bertsekas has been one of the
people establishing the foundation of
convex programming, and pushing the
importance of this field at a time in which convex
programming was really overshadowed by
alternative approaches-- other approaches like linear
and integer programming. A second insight
in the approaches that I'm proposing here
is that it turns out to be convenient to assume
that most of the measurements that you collect for your
sensor have bounded noise. And this is something
that, in control theory, is known as
set-membership estimation. And this turns out
to be, I believe, chapter 6 of Dimitri's thesis,
and [INAUDIBLE] is indeed on set-membership
estimation and control. So in hindsight, really a lot
of work happening more recently is about how to extend
fundamental results established in the field of control
theory, established by Dimitri and others, to cope with
more difficult spaces, like 3D rotations, for
example, or to cope with off nominal
data and outliers. So to keep with
this foundation, we are now able to obtain a pretty
impressive demonstration. Unfortunately, it
is not playing-- to obtain really
impressive demonstration in which we can just use images. The video is now playing,
so I'll narrate it for you. But in this video,
what it's showing is that we're able to analyze
algorithm without taking images from a standard camera,
and are able to reconstruct a 3D model of the environment
in real time on the fly. So these kinds of algorithms
are using pretty much a standard deep learning
algorithm to segment images into objects. For example, in the
image, you can see a desk in yellow, ground,
walls and so on. And then these
kinds of algorithms are running multiple a
large-scale optimization problem to reconcile all the
2D images into a single 3D representation of the
world, which you see here. So these kinds of capabilities
are very important for a robot, essentially, to navigate
in some unknown room, and also to just execute
high-level tasks. But you can imagine that the
same capabilities are also important beyond robotics. For example, you can use these
kind of mapping techniques to help a blind
person navigate a room or reach a desired object. So we conclude with
the last slide, saying that while
perception is a challenging problem with a single
robot, in the future, we envision multiple
agents, multiple robots, to be deployed at the same time,
and sharing the same space. So for example, it is
predicted that, by 2030, you're going to have 21 million
self-driving cars in the United States. And enabling these cars to
communicate with each other creates huge opportunities. For example, you can imagine
that in this picture, if the cars are communicating
with each other, and the first car is detecting
an accident, in a fraction of a second, the car can
inform all the other cars about the accident, allowing
them to slow down in time. While there are
opportunities connected to this capability of
communicating among the robots, there are, of course,
fundamental challenges. First of all, if we
transmit all the sensor data from all the robots-- from all the cars in this case,
it's just too much information being transmitted, leading to
saturation of the bandwidth. The second issue is that, if a
car is receiving all the sensor data from all the other cars,
just a single car does not have enough computation even
to process the sensor data. So luckily, again, foundational
contributions done in LIDS come to the rescue in this case. If we trace back, and we
go to foundational work on distributed and parallel
algorithms done by John and Dimitri, we
realize that there are a number of tools that we
can use with the basic idea that, rather than
exchanging and centralizing all the data at a single
agent, we can just split the computation
among multiple agents such that they agree-- they converge on a single
explanation or a shared explanation of the world. And these, of course,
are not simply applied to self-driving cars, but also
to multi-robot deployment. Here I'm showing, for
example, a recent effort we are doing within DARPA's
subterranean challenge in deploying multiple
robots to get a 3D reconstruction of
an underground cave that you see as a top
view on the right. So we'll conclude here by
saying that, in this day of celebration, it's important
to remember that we stand on the shoulders of giants. And that often, the first step
to build a self-driving car, I suggest reading a
good paper from LIDS Thank you for your time. [APPLAUSE] SERTAC KARAMAN:
And now, I'd like to invite our last speaker on
stage, Cathy Wu, who is also a LIDS faculty member,
as well as a faculty member in the Civil
Engineering department. [SIDE CONVERSATION] CATHY WU: Hi, everyone. I'm
the final speaker, I believe, for this panel. So this is a really good segue
from Luca and Ben's talk. I'm going to be talking
about reinforcement learning in the context
of urban systems. And so why do we care about
studying control or autonomy in urban systems? Well, building one car that
drives itself is hard enough, but we also want to get a
better sense of understanding the impact of integrating these decision systems into the urban system. And this may also
have implications for other types of
technology that we're putting into slices
of societal systems, such as social
networks and whatnot. And so as we look at
controlling automated vehicles, their connectivity,
they have a lot of influences on
the urban system. They have influences on
the traffic, the traffic infrastructure. They may have implications
for disaster planning, for other aspects of
transportation planning, for land use, for policy,
for incentive design. And one observation is that we
have a variety of techniques for solving-- this clickers quite hard to use. We have a lot of
techniques spread across many different
mathematical areas for addressing these
different problems. And one of the
frontiers and hopes is whether
reinforcement learning can one day be a unifying
methodology for these problems. But we're still very,
very early days. So as we look at some
of these problems-- and I'll focus on the connected
automated vehicles context, we do hit a lot of longstanding
control challenges. And I'll focus on some. I'll divide these into
systems and data challenges. And so in terms of
systems challenges, one component, like we saw from
Luca's talk, is hard enough. Now we're combining these
with many other components-- heterogeneous control
signals, heterogeneous actors. We have delayed
rewards and costs. We have limited
performance guarantees as we look at these
more complex systems. Also, as we change
the system slightly-- if we add a new type of vehicle
or if we change the network slightly, then the solutions are
very sensitive to these model specifications. There are also
humans involved in almost every aspect
of this system, and they're very
challenging to model. They're heterogeneous. These systems are large scale. They're high-dimensional. There's a lot of
computational costs. And then there are the
corresponding data restrictions where, as we look at
more complex systems, the data is harder to
collect and harder to test. So I'm going to give 2 snippets
of recent work from my group that I'm excited
about and building on. And it's really, really
just the beginning for how do you
think about autonomy in these very complex systems. So I'm going to talk
about one that's focused on high
dimensional control that is more methodological. And then I'm going to
talk about some work that is trying to gain some
insights in this domain. All right, so we are-- like Luca said, we are sitting
on the shoulders of giants. Dimitri's books have educated
three decades at least of us, of many in the room. And so we're building upon a lot
of strong, strong foundations. So in the context
of urban systems, the agent that we work with
in reinforcement learning may be the vehicle, the
automated vehicle, and it's interacting with the
environment-- in this case the other vehicles,
the rules, the humans. And the automated
vehicle makes decisions which may include accelerations
or tactical maneuvers and so on. Overall, where the
agent is trying to optimize for its
reward, which in our case-- because we're concerned with
the urban system as a whole, we are concerned with
the average velocity of the entire system, or more
complex objectives as well. So we're optimizing
this objective-- this cumulative reward
over time, and we are optimizing
over some parameter theta that corresponds to the
weights of a neural network. And that's where
the deep comes in. We've seen a lot of success in
a number of game and physics domains. And so now can we bring
these techniques and insights to more complex,
societally-relevant systems? All right, so one aspect of
this is with urban systems, we have really, really
high-dimensional problems. So let's first take a look
at high-dimensional control. And it's not just urban systems. There are many systems that
are high-dimensional in nature. So what can RL do? How can we improve these methods
for high-dimensional control? So I'm going to focus on
advantage actor-critic. And so there's a long history
of this method from the '80s, and also a lot of theoretical
development from folks from LIDS. And so the advantage
actor-critic is the basis for a wide class
of methods, policy gradient methods that are
widely used today. And the method is quite simple. We take the objective
that we saw. We approximate the gradient. And then we update
the parameters according to this gradient. And the name of the
game is, how do we estimate this gradient well? And what that means is
how do we estimate it in a low variance and
unbiased or low-biased manner. And so the advantage
actor-critic. So actor-critic refers
to the actor being the policy, the critic
being the value function. And the advantage is: actually, instead of taking the value function directly, we can subtract some sort of reference point. And this has an interpretation as the advantage of this quantity.
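Written schematically (generic notation, with b(s) the baseline), the gradient estimator being described is

```latex
\nabla_\theta J(\theta) \;=\;
\mathbb{E}\Big[\, \nabla_\theta \log \pi_\theta(a_t \mid s_t)\,
\big( Q^{\pi}(s_t, a_t) - b(s_t) \big) \Big],
```

and with b(s) = V^pi(s) the term in parentheses is the advantage A^pi(s, a) = Q^pi(s, a) - V^pi(s).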
And there's actually a variance reduction interpretation for this difference, which is
because variance is actually one of the greatest challenges
with this class of methods. And we expect this
variance to be exacerbated by high-dimensional
control, where basically, our estimate of
the value function is going to be more
challenging to estimate as we have more vehicles and
more dimensions to control. OK, so the intuition
that I like to use is that, well, the
variance of a difference is the sum of the variances of the two terms minus 2 times the covariance of the two terms.
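That is, for two random quantities A and B,

```latex
\operatorname{Var}(A - B) \;=\; \operatorname{Var}(A) + \operatorname{Var}(B) - 2\,\operatorname{Cov}(A, B).
```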
So if we now take a look at this difference here, if we want to
minimize the variance, then we want to maximize the
covariance between these two terms. So this actually
leaves us close to what is a widely used
result. Before we get there, we might try this toy exercise. If we actually fit this
reference point, also called a baseline, very, very well-- to the Q function itself-- we fail, because we actually destroy the gradient. This term will be 0. We have no gradient. We update in no direction. And the problem is that
this is introducing bias. So the state-of-the-art
bias-free baseline is also known as a state baseline, where
you fit the state value function, rather than the state-action value function. And this is a really seminal
work from Greensmith, Bartlett, and Baxter in 2004. And so our work
is really asking, can we actually maintain this bias-free nature? But can we incorporate
action information, which is really
important when we have high-dimensional control. And so I don't have much
time to go into this. But the main insight
is that we're working with stochastic policies. These are probability
distributions over the actions. And so probability
distributions can be factored along the action dimension. And now we actually see
that there's basically this nice interaction between
this log and this product. It allows us to actually
derive bias-free state-action baselines. This allows us to have
an advantage that is across both states and actions. And then we basically
can derive the optimal. We can derive the benefit
over the state baseline. And we actually can see
that this works quite well in practice as well. And so we have
some benefit over-- OK, so I think, since
I'm short on time, I will just fly through. Now we're also interested in
taking reinforcement learning and understanding how we can
employ this body of technique to understand the impact
of automated vehicles on the greater urban system. In particular, we're
exploring the impact of controlling
vehicle kinematics to influence traffic congestion. These problems are actually
too hard for us right now. We're working towards this. What we do is break this
down into smaller pieces that we call
traffic LEGO blocks. We can take a look
at one of these, this single-lane circular track,
and we can train a controller. It was shown about 10 years ago
that a simple setup like this actually produces traffic jams
solely based on human driving, not based on lane changes or
traffic lights or anything. So I should show you
the learn controller that we developed using these
advantage actor-critic methods. Right now, the
controller is off, so that we can actually see
traffic jams forming here. Now the controller
is switched on, and we actually see this vehicle
can take on a different driving profile, and actually
eliminates that backwards-propagating wave that we just saw. It eliminates the traffic jam, closes this gap. And actually using
systems theory, we can actually characterize
that this result is actually near optimal. And this is something that we
usually cannot do with pure reinforcement
learning techniques. We need actually
some systems theory to allow us to bound the performance of these techniques. We can show that these
techniques, this controller actually generalizes, and
does not require memory. OK, so I'll just stop there. We can show this for now
a variety of other setups as well. But I won't go into that. OK, thank you. SERTAC KARAMAN:
Thank you so much. [APPLAUSE] And so now you heard
from Ben van Roy first in his keynote talk,
and then our five panelists. And now I'm going to
start out with a couple of standard questions. So I'm hoping that some of
the more exciting and unusual questions will come
from the audience. I'll start out
with two questions. And then we'll start--
there's two microphones, and you can take them
and get the question. So I'd like to start out with
my first question regarding the past. So I think the last time
we've done a conference like this was the Paths Ahead,
and it was pretty much exactly 10 years ago. And I was just
looking at the agenda. And I could see that, for
example, systems control and optimization is
actually split up into two different sessions. There was another
session of learning. And looking back 10 years
ago, what do you think has happened in the 10 years
that you found surprising that it emerged or reemerged? Or looking back 10 years
ago, would you actually see yourselves doing
the kinds of things that we're doing today? [INAUDIBLE] BENJAMIN VAN ROY: So in my view,
the context has changed a lot. And context matters a lot. The scale of computation that
is available to do analysis, as well as the amount of data we
are gathering constantly today, is astronomical
compared to 10 years ago-- because of cloud computing, and at first because of the internet,
but even more so because of cellular penetration. Everybody in the world
suddenly has a smartphone now. And so with that change
of context, the algorithms we use for learning and systems
and control will also evolve. I think the trend
is toward using simpler and simpler algorithms,
but where nuances of the design might matter a lot. And also, this relates
to what Ben is saying. At some level, the
problem of control encapsulates everything
going on now. But the term control
conjures up approaches that were designed with a
different context in mind-- a context when control
was a popular term to use. So that's my take. SERTAC KARAMAN: Any others? Yeah, Luca? LUCA CARLONE: I
just want to add-- just exciting times, I would say, over the last 10 years. Of course, like one month
from now, one year from now, you can say the same
thing about the future. But it seems the best
time to do research on the topics we're working
on right now-- in my case, robotics and autonomy. Thinking about going
back, like, 10 years, I was thinking that we went from the DARPA Grand and Urban Challenges in 2005 and 2007, being an
with autopilots right now, and self-driving cars
being widely adopted. I think there are
hundreds of thousands of self-driving cars
driving right now, at least on limited roads. And if you look at the
progress in robotics, it's just very
exciting what happened, Amazon buying Kiva
Systems for $700 million, iRobot selling the Roomba. Most of you guys probably
have a Roomba at home, iRobot selling 25
million units of Roomba. The thing that I want to add
is that it's interesting, because of course,
there was progress on the algorithmic side,
and a lot of research that was done over
the last 10 years and before is now
transitioning to products. So there is a lot of
good research happening. But the thing that is
interesting on my side is also to realize how
unexpected sometimes are the sources of progress or
the breakthroughs in a field. For example, you realize that
most of the machine learning revolution is driven by the
fact that, right now, we have a huge amount of data which
is a consequence, if you think about data, of the
internet, Facebook, and all these kinds of services. We have a huge
amount of computing. But that, again, started
more as something that was promoted by
the gaming industry to have very good
real-time rendering for games. And in robotics,
it's even more so. Despite the progress on
the algorithmic side, sometimes a lot of the better sensors that we develop and use for robots were just due to better cameras being designed, for example, for mobile phones. So it's interesting to me that
there is this unexpected connection between different efforts across different research areas sharing technologies, coming
together with very good and well-designed
algorithms developed by the research community. SERTAC KARAMAN: Thank you. And I think that came
up in many of the talks, the topic of machine learning-- reinforcement learning especially. But I wonder, how does the audience see that? So I think it came up in many of the talks that you all alluded to. And how do you see the
emergence, or the reemergence, of machine learning and-- I think that when I
was a graduate student, I was sitting in the audience. And I was looking
at these talks, and I was trying to pick up,
like, a PhD topic for myself. And for the students here, what do you recommend for the future, for the next 10 years? What do you think
they can focus on? What do you think we'd be
talking about in LIDS@90? Yeah, Angelia, please go ahead. ANGELIA NEDICH: Yeah I
was thinking, especially when dealing with this data, that some things are already emerging, like security and safety. When you have data, and you are using it to build things like autonomous systems, there is a potential of somebody hacking into the systems. So you have to have a way of protecting your shared information. And also, I would think, some of the privacy of the data as well. So those are some of the aspects that I see emerging -- some of this forensic-type data analysis showing up. SERTAC KARAMAN: Asu
and Cathy, maybe? We'll start with Asu. ASU OZDAGLAR: You're
asking tough questions. [LAUGHTER] SERTAC KARAMAN: I warned you. ASU OZDAGLAR: I know. Yeah, clearly, machine learning
is right now very popular. It's hard to find
somebody who does not work on machine learning. There are a lot of interesting problems at the convergence of optimization, statistics, and computation. In terms of moving
forward, I think what will be most important is this: machine learning has impressed us with all these successes in vision, natural language processing, cats and dogs, classification problems. What I think the next stage will be is going into more and more applications -- safety-critical applications deployed in problems where we would like to make sure these are robust and can be used as an engineering technology. So how do we get there,
I think, probably is the next big question. And the other one
is basically, when you apply these
in applications-- of these societal
applications, of course, these will be very much
interacting with humans. And information will
be coming from humans. Information from
humans is very tricky. So how do you actually deal with the biases, still be able to learn, and then bring the human and societal aspects together with robustness to adversarial effects in machine learning systems, to be able to make it into an engineering technology? SERTAC KARAMAN: Cathy? CATHY WU: I just want to
emphasize this last point on the human aspect. I think that as we are
maturing these technologies, and seeing machine learning
being really effective at very well-defined tasks,
we still have very little understanding about
how they interact with humans. We have very little
understanding of humans. [LAUGHS] And so a more concerted effort on modeling or understanding humans is one thing I would like to see a lot more progress on in 10 years. And then I think that,
in the last 10 years, in addition to the maturing of machine learning techniques, there is a lot to say about how accessible
the community has made it. I think you mentioned
that every high school-- or maybe not every high school. Many high schoolers can
download these packages and play with cart-pole and
they can get their feet wet. And I think that
something could also be done for making some
other types of methods more accessible as
well, because they do have a lot to contribute,
but they're harder to get into.
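As a concrete illustration of that accessibility, here is a minimal cart-pole loop; it assumes the open-source Gymnasium package (a maintained successor to OpenAI Gym) and is not anything specific to the panel.

```python
# Minimal cart-pole example with a random policy, assuming Gymnasium is
# installed (pip install gymnasium). This is the kind of "get your feet wet"
# experiment mentioned above, not anything from the panel itself.
import gymnasium as gym

env = gym.make("CartPole-v1")
obs, info = env.reset(seed=0)

total_reward = 0.0
for _ in range(200):
    action = env.action_space.sample()                 # random action: 0 or 1
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
    if terminated or truncated:                        # pole fell or time limit hit
        obs, info = env.reset()

print("total reward over 200 random steps:", total_reward)
env.close()
```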
SERTAC KARAMAN: OK, maybe in the remaining 15 minutes I'll allow questions from the audience. Are there any questions in the audience? I think there are a few. I've seen Devavrat over here. Maybe we'll start there, and
go one in each direction. OK, there's one over there. AUDIENCE: What are the current landscape and future emerging cross-pollinations between control theory and three other disciplines -- one, non-equilibrium statistical physics; two, dynamical systems and chaos control; and three, manifold learning and topology? SERTAC KARAMAN: Anything? [INAUDIBLE] CATHY WU: The questions
just got harder. [LAUGHTER] SERTAC KARAMAN: I know that
when I get to the audience, it will be harder. But I guess statistical physics is one thing that was mentioned. It actually came up, I think, in one of the talks -- if not statistical physics exactly, people mentioned it one way or another. And there were a couple of others. What were they? AUDIENCE: Chaos control. SERTAC KARAMAN: Chaos. And then the third one was? AUDIENCE: Manifold
learning and the topology. SERTAC KARAMAN: For any
of them, any takers? BENJAMIN VAN ROY: I wish I understood
all those topics, but-- [LAUGHTER] But similarly with my expertise
in genetic algorithms-- [LAUGHTER] I'm not really competent in all those topics. CATHY WU: We need more
bridges to more people. SERTAC KARAMAN: Yeah. OK, maybe I'll continue
with another question, and we'll try to come back. Was there a question from here? Yeah. Was there a question? Yeah. AUDIENCE: Just to
play devil's advocate, being a student of Dimitri
who moved on to the discrete side-- by discrete, I mean real computer science, rather than in-between-- I wonder if there is not an element here of, when you have a hammer, the whole world is a nail. In some ways, control theory has had big successes-- for example, jets. And jets are a good example of something where we don't imitate nature. It took a long time
until people understood we better have airplanes, rather
than imitate birds flying. But I observed, in my department, 30 years ago, people doing vision with Hessians. Luckily, I still remember what a Hessian is. And it seems to me this is going nowhere, because in our brain, we do not compute Hessians. So it seems to me that
the problem of vision will be solved when we
start imitating the brain. And this is what in
reality happened. Really, I think until
we got deep learning, we didn't have really
a good visual system. So at this point, it
seems to me that we succeeded imitating to a
very low level of the brain. But the brain interacts
with cognition, with reasoning, and all
sorts of stuff, that goes. And it seems to me that the next
breakthrough in understanding will come from brain science-- understanding how this
high-level reasoning and lower-level
reasoning interact, rather than from control. SERTAC KARAMAN: Yeah,
I guess that was a comment or a question. [INTERPOSING VOICES] LUCA CARLONE: I completely
disagree with that, unfortunately-- with the point saying that to do engineering we should study what the human brain is doing. First of all, I have the
belief that the human brain and human performance is a
proof of concept of something that we are not able to do
in many cases with machines. But it is not necessarily the
best solution for the problems that we have to solve. Also because, in defense
of the human brain, the human brain has
power constraints, and so it is operating in
challenging conditions, because it is working on
a very tight power budget. So I don't think that
whatever we do with machines should just aim to approach that upper bound, but can eventually cross that bound. And you can think
about an example of that-- if you think
about robotic arms, robotic arms are just
imitating what a human is doing in terms of manipulation. But right now, they're
outperforming the precision of humans in every [INAUDIBLE]
matched manipulation-- not manipulation. Let's say welding tasks. And self-driving cars, the same. Eventually,
self-driving cars are projected to drive much
better than a human, just because there
is no constraint on the type of
sensors that you put on the robot, the
amount of computation that you put on the robot. So I think that, again, we
can draw a lot of inspiration from humans. But we do not have
to feel constrained about seeing that as
the only viable model to get intelligence. SERTAC KARAMAN:
Any other takers? Otherwise, I'll move
to the next question. AUDIENCE: I have a question. SERTAC KARAMAN: Peter? Maybe we'll take one from
Peter, and then from you. [LAUGHTER] Seniority rules. AUDIENCE: Well, first, I'm going
to make a very rash prediction. That self-driving cars will have
considerably fewer accidents than human drivers,
because they'll never drive inebriated,
amongst other things. However, there will be a crash
on the California freeway that will involve more
than 1,000 cars when the system is automated. The second thing-- so
I'll move from the sublime to the ridiculous. There is a very old
paper by Jurgen Moser, which basically is at the heart of the penalty function method. It's never really been exploited, and it's a substitute for Lagrangian methods, and it's also a substitute for using gradient descent. And I have no idea. But you like to look at old papers. I do too. And so I think it's something that people looking at this might take a look at. It's an obscure paper, I mean. But it's worth looking at. It's from the 1950s.
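For context on the method being referenced, here is a small sketch of a quadratic penalty approach to equality-constrained minimization; it is an illustration of the general idea, not the formulation in the paper Peter mentions.

```python
# Illustrative quadratic penalty method (not the specific formulation from the
# paper mentioned above): solve  minimize f(x) subject to g(x) = 0  by
# minimizing  f(x) + mu * ||g(x)||^2  for increasing penalty weights mu,
# instead of forming a Lagrangian.
import numpy as np

def quadratic_penalty(f_grad, g, g_jac, x0, mus=(1.0, 10.0, 100.0, 1000.0),
                      iters=2000):
    x = np.asarray(x0, dtype=float)
    for mu in mus:                               # gradually stiffen the penalty
        step = 0.1 / mu                          # smaller steps as the problem stiffens
        for _ in range(iters):                   # plain gradient descent on the penalized objective
            grad = f_grad(x) + 2.0 * mu * g_jac(x).T @ g(x)
            x = x - step * grad
    return x

# Example: minimize ||x||^2 subject to x1 + x2 = 1 (exact solution [0.5, 0.5]).
f_grad = lambda x: 2.0 * x
g = lambda x: np.array([x[0] + x[1] - 1.0])
g_jac = lambda x: np.array([[1.0, 1.0]])
print(quadratic_penalty(f_grad, g, g_jac, x0=[0.0, 0.0]))  # approaches [0.5, 0.5]
```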
SERTAC KARAMAN: And it's another, I guess, comment. Any comments? Otherwise, I'll move
on to finally Devavrat. The floor is yours. AUDIENCE: First of all,
excellent talk, panel. Thank you. In many of the
settings, especially let's say over the
past few years, I was interacting
with retailers. And a retailer is an organization which is primarily driven by humans. And then you will tell them what is the right set of decisions to make. And they say, well, here is the right decision to make. And when humans get involved,
especially as part of the decision loop-- rather than just providing the input, so to speak-- it's very complicated
and challenging. So I would love to
hear the panel's view on what might be a
good way to, let's say, take a beautiful model like an MDP, or a variation of that with some game-theoretic behavior. But what might be the right way
to think about getting humans into the loop as we think
about decision systems, especially thinking
about decision systems within organizations? Now again, I am
cognizant of the fact that if you want to have good health, you should stop smoking. But most smokers don't do that. And democracy is at an
interesting place right now, despite the fact that we
know what's good or not. SERTAC KARAMAN:
Thank you, Devavrat. ASU OZDAGLAR:
Beautiful question. [LAUGHS] Whenever you talk about humans, one immediately thinks about game-theoretic models. But the problems you're talking about are so complicated, with so many different factors and humans, that trying to think about them as a multi-agent MDP sometimes does not give us much tractability in terms of addressing the problem. That being said, I'm
still a strong believer that there may be reduced
models that actually can somehow bring the strategic
motives into the loop without having
full-fledged game-theoretic models between so many agents. So still, I think some combination of these ideas with reduced representations would be the way to go. But I may be biased. CATHY WU: Another
perspective may be-- I think instead of
a reduced model, I think potentially reduced
numbers of stakeholders. Being strategic in pinpointing who to make decision recommendations to may facilitate-- my hope is it may facilitate-- some transition of research into more practical settings. Say, in the city context, instead of needing to convince every citizen of the city, you can convince the mayor or a few key individuals. Some decision is then translated into policy. It just then becomes a rule, and that can-- ASU OZDAGLAR: Let me
also add-- by the way, I think another very
exciting direction would be to be able to combine
the data and empirical work together with reduced models. So we're thinking about non-machine-learning models for data coming from humans. So is there a reduced way of representing the motives or information in such a way that the machine learning algorithm takes that into account, instead of just thinking about it as IID data? So I think that would be
a very exciting direction. SERTAC KARAMAN: Yeah. Go ahead, Ben. [INTERPOSING VOICES] BENJAMIN RECHT: I'll say this, and I think this is a challenge for the entire room of LIDS folks. I think one of the most challenging aspects
of interacting with people is that things stop
being quantitative. And something I've been
seeing a lot in my group, and my interactions with
other people on campus, is how exactly do
quantitative people like us build nice bridges
to qualitative research? I actually don't think it's just that we're going to go fix stuff. Because I feel
like there's a lot that we can learn from that
kind of qualitative aspect. And to me, that's
a grand challenge. How exactly do we take this kind of systems thinking and, when we're bringing it into more social systems, interact with the unknown
and the unknowable? I think that's a
grand challenge. BENJAMIN VAN ROY: OK,
so let me address this, but also speak to the previous
question about a projection into the future of what might be
big and where things might go. But I think our view
of machine learning today is typically pretty narrow. It's like you have this data set, and you fit a model to that data set, and that kind of stuff. But a machine should
be able to learn from all sources of information. And I think that part of that
is interacting with humans and learning from them,
just like students learn from teachers at school, or small children learn from their parents. There are algorithms
that can do that, as well as learning
from empirical data. And I think that that's a real-- I think that one trend we're
going to see going forward is greater abstraction,
because there's so much data collected from
so many different problems and different things. And you have access to all that, and you have access to so much computation. So abstraction is
going to be lifted to higher and higher levels. So like, John Tsitsiklis
gave this nice talk this morning, where he
talked about how LIDS has a tradition in abstraction. And probably in the
early days of LIDS, taking a class of problems,
like inventory problems, and abstracting that, and
saying we're coming up with ideas that are going to
be used to solve all inventory problems, was mind-boggling. There, each inventory problem is a separate problem, but the LIDS approach is to abstract away from that. And then the next layer
is coming up with ideas that are relevant
to many problems, but where the researcher has to think about how to map them to each problem. But I think pushing
forward, there's going to be greater and
greater abstraction, where machine
learning algorithms will be designed not for a
specific class of problems. But you could even think
of each class of problems as being a data point, and
it's trying to generalize across classes of problems. And so what you're working on
will be at a much higher level. And the machine
learning algorithm could do things, like
collect empirical data, talk to the human, learn
what the human wants, learn from the
human's experience-- do all of that kind of stuff. SERTAC KARAMAN: Thank you. I think we have one more
question from Sanjoy. AUDIENCE: This is a little
bit of a view from afar. It seems to me that there are fundamental problems in this circle of problems that are being talked about, like control with a vision sensor in the feedback loop. It seems to me that, fundamentally, that problem is open. How should we represent images,
when in some abstract sense it should be topological invariants, invariant components, et cetera, and track that? And there is the issue that Chomsky raises, poverty of stimulus. As I understand it, right now we are using lots and lots of
training data in order to do, let's say, pattern
recognition in a broad sense. But the issue of feedback is the
issue of poverty of stimulus. How would you do this,
and with what kind of data, where the amount
of data that you need is limited in some sense? And this is related to what
happened in systems theory. This is invariant thinking. For example, what things
can we do with feedback, and what things can we not do? There are essential constraints. So my suggestion
is we need to look at a major problem like
vision in a feedback loop in a systematic,
more fundamental way. SERTAC KARAMAN: Thank you,
Sanjoy, for the comment. Does anybody want
to say anything? BENJAMIN RECHT: I think
we all agree, right? [LAUGHTER] ASU OZDAGLAR: Me too. SERTAC KARAMAN:
OK, so that said, I think we're out of time. So let's thank our panelists. [APPLAUSE]