LIDS@80: Session 1 Panel Discussion

SERTAC KARAMAN: OK, welcome again, everyone. It's an honor to kick off this session. So in this session, we have a number of distinguished panelists, and I'll introduce them one by one. I'll keep my introductions short so that they can get some time to do their talks. First, I'd like to invite on stage Angelia Nedich, who's from ASU, and who will kick us off with the first talk.
ANGELIA NEDICH: Thank you, Sertac. [SIDE CONVERSATION]
ANGELIA NEDICH: So I chose to talk about optimization and the things that have been done in LIDS in terms of large-scale decentralized optimization in network systems, maybe because I was closer to that area, and I could relate to most of the work that was done. So what I want to start with is, like, where today's challenges are. I took this off the NSF web page. There are a few relevant ones-- they kind of go along with things we have heard today. This is one of the 10 challenges that NSF is talking about. And one of them is future science with data-driven autonomous systems, which really talks about how we're going to be collecting a lot of data, using data, managing data. And it's going to be one of the directions in which research will be driven, through machine learning, artificial intelligence, the Internet of Things, and so on. [SIDE CONVERSATION]
ANGELIA NEDICH: Another thing that caught my eye there was-- they use the term convergence research, and I couldn't get what they meant by that. But if you look further in the description, what they mean by that is what is actually happening at MIT: having people from different backgrounds, with different expertise, be able to solve these future problems-- challenging problems. They don't perceive one discipline running the whole scene. But there will be an interplay of multiple disciplines, areas of research. They're going to require convergence in terms of merging ideas. That was the word that I couldn't figure out what they meant by. And they expect that combining different ideas from different disciplines is going to eventually lead research and development in the next 20 or so years. So when you look back at what the past challenges were, over the last maybe 30, 40 years, things were similar, actually. Questions were similar. There were also questions of dimension and complexity, except the systems in the past were a bit smaller and less complex. So they were also dealing with data at the time. But the data was collected at a smaller scale and more slowly, and the devices were slower, with smaller processing capabilities. And the internet was actually just at the beginning. So if you remember the internet in the late '90s, it was very slow-- when you opened a page, you had to wait quite a long time to get your page open. So what I see as the difference is just the scale and the speed at which things happen, and the complexity of the systems, which are now the networks of networks that we're heading to. So when I look back, it's like, what are the things that have been done in that direction? One thing that definitely stands out is the Lagrangian methods that were studied by Bertsekas even from the early '70s. And then these methods-- some kind of splitting methods-- were part of the thesis of one of Bertsekas' students, Eckstein, which also leads to the ADMM method, which, with one recent paper with multiple authors, just regained a lot of attention. So the message here is that ADMM became popular.
But ADMM has a long tradition, going way back in the past, and it has roots at LIDS. Another aspect, or research direction, had to do with parallel optimization, which is also a particular topic I have here-- also a student of Professor Bertsekas-- and it was already looking at some parallel methods. But the concept here is deeper. It goes into monotone operators. Paul Tseng was another student of Dimitri, and he worked really extensively on all kinds of optimization methods. He worked on parallel block coordinate, incremental, asynchronous decompositions, and so on. Tom Luo was also part of LIDS and a contemporary of Paul. He worked with-- his advisor was Professor Tsitsiklis. But he also worked on large-scale optimization methods. Actually, his angle was more on the complexity, from what I gather from a sequence of papers at the time. Now, if you look at current machine learning-- there are different communities of machine learning, and a lot of different people have different interpretations of what machine learning is. But there is one particular stylized optimization problem which runs under the name of machine learning. And in principle, it's something that in LIDS was known as incremental methods. You have a sum of functions. You want to minimize it. And it's complicated, because the sum involves a lot of elements. So you don't want to compute a gradient of the entire objective. You just go one by one. Whether you do it cyclically or choose the order some other way, it's known as incremental methods. And actually, there is work by Bertsekas and Tsitsiklis in that direction. Paul Tseng had papers that came out after the thesis. My thesis was on that topic, even though the title never spells out the word incremental. It's hiding under a different name. Then Professor Ozdaglar, with, I think, Mert Gurbuzbalaban, who was a postdoc here, and Professor Parrilo, also followed up with a sequence of papers on incremental aggregated methods. And there is also a paper, I think, that shows that random reshuffling methods can beat the basic cyclic method. Basically, by reshuffling, you're breaking the curse of bad orders-- bad patterns. So basically, I see that this sequence of papers, which are in the machine learning community, are not as-- it seems that sometimes they may be left out a bit. OK, and one more aspect that I wanted to discuss is another kind of area that is very attractive these days, which has to do with decentralized computations. Sometimes it runs under the name of multi-agent systems, or optimization in networks. In different places, people give it different names. It starts with work by Athans, Bertsekas, and Tsitsiklis. And actually, Professor Tsitsiklis' thesis has the model in place. And it's also in the book by Bertsekas and Tsitsiklis, Parallel and Distributed Computation, which I think regained recognition. As Ben was just mentioning, it fell into a forgotten state. And then all of a sudden, people remembered the book. And I think for that work, you received a von Neumann Theory Prize. Then Asu and I picked up that work-- maybe we started a couple of years earlier, but the publication appeared in 2009. That's the first paper that was published. We continued-- borrowed the ideas of this distributed kind of information exchange, but we addressed different problems, like, basically, machine learning problems set up in the multi-agent setting.
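As a rough, minimal sketch of the consensus-based distributed optimization idea described above-- assuming a hypothetical quadratic objective at each agent and a complete-graph averaging matrix, neither of which comes from any specific paper mentioned here-- each iteration mixes neighbors' estimates and then takes a local gradient step:

```python
import numpy as np

# Hypothetical setup: n agents, each holding a private objective
# f_i(x) = 0.5 * (x - b_i)^2; the network wants to minimize the average of the f_i.
n, iters, step = 5, 200, 0.05
b = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # private data; the network optimum is b.mean()
W = np.full((n, n), 1.0 / n)              # doubly stochastic mixing matrix (complete graph)
x = np.zeros(n)                           # each agent's local estimate

for _ in range(iters):
    mixed = W @ x                         # consensus step: combine neighbors' estimates
    x = mixed - step * (mixed - b)        # local gradient step on each agent's own f_i

print(x, "network optimum:", b.mean())    # all estimates end up near the network optimum
```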
After that point, I think that Olshevsky-- Alex, I don't know, I just saw him earlier-- his thesis also taps into aspects of these aggregation strategies, which are really hiding a consensus or agreement protocol under another name. And then after that, I am mentioning three of us, because after that, we have students who have followed and worked on various aspects of these problems-- on directed graphs and all kinds of things. And I think I ought to also draw attention to what Alex is doing. He's actually looking at interesting aspects-- that, actually, some of these strategies demonstrate network-independent scalability. So what you have in these algorithms is that there is a transient behavior that depends on the network structure, even when the network structure can be time-varying. So there's this transient time when, if you analyze the algorithm, you can see the effects of the graphs. And after the transient time, the effect of the graph disappears. It doesn't affect the final asymptotic scaling of the protocols. And that's about it. Thank you. [APPLAUSE]
SERTAC KARAMAN: Thank you, Angelia. I'd like to invite our second speaker to the stage, Asu Ozdaglar, who is a faculty member in EECS at MIT-- department head of EECS as well. [INTERPOSING VOICES]
ASU OZDAGLAR: Perfect. It's wonderful to be at this event. I was so excited. It feels like I'm almost at my wedding. I know everybody. So I just want to talk to you briefly about my journey with minmax problems and games, and the very huge effect of LIDS on this journey. So I'll go back 15 years, and I'll see how things picked up over time as we were doing research in this space. So let me start with the formulation. We're interested in a minmax problem for a function of two variables-- very standard. And these arise in a multitude of applications. And I was writing down the applications, and I was laughing. This is how applications looked 15 years ago for LIDS students. Of course, there's robust optimization, which has seen a lot of research over the past 10 years. Basically, this is the backbone of that formulation. y is a parameter that we don't know. We're trying to minimize the cost function over x, assuming the worst possible value of y. So that's what they mean by minmax. And then the other one that we were very interested in-- this is part of my PhD thesis, Angelia's PhD thesis. And I'll talk about the connections in a minute. We're interested in solving constrained optimization problems. So we have a primal problem with constraints. We're minimizing some f of x, subject to a bunch of constraints. And one very natural way of solving such a constrained problem is, you relax the constraints. You form a Lagrangian function. And then you formulate a dual problem, whereby you minimize the Lagrangian function over x, you find a dual function-- a beautiful concave function, no matter how crazy the primal problem is. So you try to solve this possibly nonsmooth concave problem. But the point is, this is also a minmax problem. So how do we solve this problem? I'm very interested in computing a saddle point. That's the solution we're interested in. So the saddle point is, you take f of x, y. You fix y star and minimize over x. That's your x star. You fix x star and maximize over y. That's your y star. So this is the solution concept we're interested in. And a method we love very dearly in optimization is gradient descent-ascent, which does the simultaneous iterations.
You're trying to minimize over x, maximize over y-- simultaneously take steps along minus the gradient with respect to x and plus the gradient with respect to y. So you try to do this simultaneously. And it's magical that, while doing this simultaneously, this will converge to the saddle point. And I would like to also give you a little bit of history of where this method emerged. This is one of the things that I found as a student. This was toward the end of my PhD-- fascinating. So while we like this in optimization a lot, the foundations of this method are actually in mathematical economics. So if you look at Samuelson's work, or actually a beautiful gem that I would recommend to everyone-- this is a book that we loved, with Angelia-- Arrow, Hurwicz, Uzawa, who are economists. And the book is Studies in Linear and Nonlinear Programming. So that was the first time this emerged, as continuous-time versions of these methods, and it proves some convergence results under strict convexity assumptions-- strong convexity. Uzawa, another economist, focused on a discrete-time version and showed convergence to a neighborhood under, again, strong convexity. And then a bunch of other very, very strong papers-- Gol'shtein, Maistroskii, Korpelevich, which also introduced the extragradient method. Please look at the dates-- '77, '58. So this is where these methods were studied. And we got very interested in this, Angelia and I. We got these papers. And this was, essentially-- I'm talking about 15 years ago. And I was looking over my notes. I keep all the papers. I have stacks of papers in my office. I was going over my notes, looking at these papers. And I found this. So this is Gol'shtein's paper. And those who know how I work will recognize the pinks and yellows. And there are some notes. I was looking-- there's one that says, assumes bounded subgradients. That's Angelia's handwriting. And then I go to these assumptions, and I think we put, something very restrictive. So we were never happy reading these papers. We always found something to pick on. And so I remember, actually, this was again when we were working on these problems 15 years ago. Get stacks of these papers-- coffee shop. Lots of coffee with it. And then we tried to say, OK, this restrictive assumption is strong convexity. This bothered us, because of the dual problems-- I showed you the Lagrangian. That's linear in mu, the Lagrange multiplier. So this won't work for duality, which we were interested in at that time. So we were sort of so-- you remember-- completely excited about this notion of showing convergence for ergodic averages. So instead of thinking about the iterates that are generated by the algorithm, we were taking time averages of all the iterates generated so far. And that allowed us, for these convex problems, without strong convexity, to be able to get per-iterate convergence rate estimates. So I wrote, at a very high level, one of the results-- again, '09. This is very misleading. This was 2005. I remember this in a coffee shop. So this is basically thinking about converging to the saddle point through gradient descent-ascent. But I also put in blue the main assumption we were working with: subgradients are uniformly bounded. So we assumed, actually, that these minmax problems are over compact sets. So we have convergence rate estimates. It goes to a neighborhood, where you can choose the approximation quality by playing with the step size. You have a 1 over k convergence rate. So we were fascinated by that. So I'm going to fast-forward 15 years from this paper.
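For reference, the setup described above can be written in standard notation (generic step size $\eta$; this is not the exact statement of the paper being discussed). The saddle point satisfies

$$
f(x^*, y) \;\le\; f(x^*, y^*) \;\le\; f(x, y^*) \quad \text{for all } x, y,
$$

gradient descent-ascent takes the simultaneous steps

$$
x_{k+1} = x_k - \eta\, \nabla_x f(x_k, y_k), \qquad y_{k+1} = y_k + \eta\, \nabla_y f(x_k, y_k),
$$

and the ergodic (time-averaged) iterates $\hat{x}_k = \frac{1}{k}\sum_{i=1}^{k} x_i$ and $\hat{y}_k = \frac{1}{k}\sum_{i=1}^{k} y_i$ are the quantities for which the $O(1/k)$ estimate mentioned here holds, under the bounded-subgradient, compact-set assumption.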
We left it-- I don't think I looked at minmax after that for a couple of years. Then comes machine learning. The pictures got much more interesting. So we see pigs here. So on the left, there's some picture of a pig. The state-of-the-art classifier classifies it to be 91% pig. And then you add a very little, imperceptible noise. And then that classifier classifies it to be an airliner. So this is the joke in machine learning that says, OK, machine learning makes pigs fly. [LAUGHTER] Basically, if you look at what we are thinking about here, the standard generalization, without thinking about these adversarial perturbations, is we want to minimize over the model parameters some loss. We have the data x, y. In the case of classification, x are examples, y are labels, drawn according to some distribution D. We want to choose our model to minimize this expected loss. So the robust version is nothing but: you want to do this against perturbations of the input data. How to define that perturbation is a very interesting question. But Aleksander Madry in a recent paper assumed these are l-infinity perturbations on x. So we're thinking about this robust training, where we're minimizing the same cost against the maximum perturbation. So I'm running out of time. Another fascinating question-- generative adversarial networks. So we're basically thinking about designing some generator neural network which maps random noise into high-dimensional objects with structure. These are fake images. These are cars generated by training with samples from a true distribution. And there's a discriminator that's trying to distinguish between these two. Very interesting, and you write the problem down using different kinds of formulations. And you run into another minmax problem. So the point is, these are all minmax problems, but with a very different flavor. So, objective functions: we were fighting strong convexity, and these are not even convex-concave-- they're non-convex. So what happens with that? You look at many empirical papers. There are training oscillations, these issues of mode collapse. This is the true distribution. You look at the steps of the algorithms, and you basically oscillate between modes of the true distribution over there. And even on the simple bilinear case-- because this is what got me very concerned-- we have a bilinear convex-concave problem, and GDA diverges. And then we were like, we worked on this problem. We made it converge. What's happening? So this is basically, when you don't have compact sets, this will actually diverge. Even if you have compact sets, it was not the convergence of the iterates that we were getting, but rather of the function values. So the considerations have changed. And then last year, I went to a talk given by one of our faculty, Costas Daskalakis. He put up this algorithm, which I had not seen before: optimistic gradient descent-ascent. This is GDA with a little negative momentum. And he was showing, basically, all the beautiful things this algorithm was doing. And then there's the 1 over 2 there. And somebody asked Costas, why is it 1 over 2? He said, it should be 1 over 2. And this actually was studied before in the online convex optimization literature by Sasha. And I was fascinated: why is that 1 over 2? So my fascinations are very weird. So we started basically looking at this with a student of mine, Sarath, sitting at the back, as well as our postdoc.
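For reference, one common way of writing optimistic gradient descent-ascent for the x block (the y block is symmetric with a plus sign); conventions differ on where the factor of 2-- equivalently, the 1/2-- is placed, which is the coefficient being discussed here:

$$
\text{GDA:}\quad x_{k+1} = x_k - \eta\, \nabla_x f(x_k, y_k),
$$
$$
\text{OGDA:}\quad x_{k+1} = x_k - 2\eta\, \nabla_x f(x_k, y_k) + \eta\, \nabla_x f(x_{k-1}, y_{k-1}),
$$

so OGDA is GDA plus a small negative-momentum correction built from the previous gradient.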
So we actually took OGDA, as well as the extragradient method that I just talked about, and actually showed that these are approximations of the very seminal proximal point method studied by Rockafellar, as well as Dimitri, in the '80s. And you can actually see from that lens what's happening with all of these algorithms-- why that 1/2, and many other insights. So I'd like to, in my 30 seconds, go very quickly to the games. So I won't have time to say much about it. But before going to that, of course, one cannot move to games without stopping, if you're a LIDS student, at the distributed decision problem. So this was an active area in the '80s: distributed resource allocation. John mentioned influential works by Bob Gallager, Dimitri Bertsekas. These sorts of minimum-delay routing algorithms formed the foundation of much of the research in the '90s and 2000s on utility-based resource allocation. We keep highlighting this work on distributed parallel processing. This is the paper. This is our bible paper. If you want to work in this area, this is the first paper you read. And if you look at the title, it's not like these days, where we write a paper per word of this title. This is "Distributed Asynchronous Deterministic and Stochastic" everything. And it's basically the foundation for the extensive literature-- thousands of papers on control, coordination, and optimization of multi-agent systems that Angelia talked about. Let me say very quickly how we came to games here. So, early 2000s-- this is when I basically started on the faculty-- we were seeing many influential papers coming from computer science, thinking about resource allocation with strategic agents. This is no longer a single-objective-function optimization problem. Rather, we're asking the question of: there are many agents that want to do the best for themselves. So you can think about this in the context of transportation traffic. I want to go home. I couldn't care less about the total delay, only my delay. You can think about it in the context of communication data networks, where you're thinking about source-based architectures. This was the influential work by Roughgarden and Tardos, followed by Johari and Tsitsiklis. We came to it from basically trying to understand economic markets, pricing, capacity, investment decisions with these congestion effects and network effects. There are a number of papers. And the other highlight these days, very much following that work: prices-- tolls-- are one way of regulating these flows. How about information? The next horizon is the use of these GPS-based apps, which actually promise to provide decentralized solutions to this problem. Let me say, like, 30 seconds on another passion of LIDS, and mine in particular, which is social networks. And there's a ton of things we've been thinking about in that context over the past eight to 10 years. One is learning and information aggregation, opinion dynamics over social networks. Herding-- this is, why do these cute penguins herd, completely disregarding what they know about where they need to go? Munther and I spent two years trying to understand this-- very interesting. Polarization in opinions-- how do communities form? We're engineers. We would like to do some design, targeted interventions, in the context of social networks. We have a limited budget. In a static or dynamic setting, how do we use this budget to be able to control these networks?
These days, it goes much more into using the data that you get from the graphs to be able to learn something about the underlying graph or dynamics. And the other very interesting area is, of course, the design of online platforms, digital platforms, review systems, which is very much a combination of the system aspects, human aspects, and data. So I will skip the other two. I wanted to give a little bit more on large-scale network games here, but I am very much out of time. I just want to close with a few concluding thoughts. I love this quote from John, and this was in one of the LIDS magazines. I think what is great about being a LIDS student-- it has been for me-- is that it provides a degree in the "mathematics of everything." So students are basically not just getting a degree in a particular thesis topic. They learn the tools that allow them to go and apply them to whatever problem they find interesting, in a systematic and insightful way. And this research has diversified over the past 15 years. Its scope has been significantly broadened. And there is one thing that has not changed: it's the LIDS type of work. And that's basically providing creative, systematic solutions that will have very long-lasting impacts. So thank you. [APPLAUSE]
SERTAC KARAMAN: So our third speaker is Benjamin Recht, who is a faculty member at UC Berkeley. I've been reading Ben's papers for a long time, and I always thought that he must be a LIDS alum. He's an MIT alum. It turns out he's not a LIDS alum. He's going to tell us where he got his degree at MIT. But it turns out that he actually walked the corridors of Building 35 quite a bit during his time. So please welcome Benjamin.
BENJAMIN RECHT: Thank you. [APPLAUSE] OK, good. I have this dirty secret that I am actually a Media Lab alum. I have-- [LAUGHTER] I know. I have a PhD in architecture. It's very interesting. [LAUGHTER] And honestly, weird stuff was happening there that I didn't realize at the time. Look, I was 22, and I was really dumb and didn't know. And actually, it was great. So I didn't know what I was doing when I came to MIT. I didn't even know really what I wanted to do, which made the Media Lab a natural fit. The Media Lab is very interesting. You choose your own path. And I thought I was going to do one thing, which maybe at the banquet, after a couple of drinks, I'll tell you what that was. But I fell into something else, which is that I found my way to Building 35. I do think one thing that's really fascinating is, if you type Building 35 into Google Image search, it is very hard to find an image. There's, like, this one. And this was also in the intro slides. And nobody else has ever taken a picture of this hideous building. [LAUGHTER] The Wiesner Building, that appears all over the place. But sadly. So I managed to find a way into Building 35. And I thought about it, and I think I took four courses at MIT that have had just a profound impact on everything I do. And they were 6.432, which is Detection and Estimation, which I took with Greg Wornell, but which had also been developed by Alan Willsky and Jeff Shapiro. 9.520-- we'll just leave that aside. Although I will say that, at the time, my TA was this guy Sasha Rakhlin. He seemed very smart. And then I took 6.24-something-- I don't know. Megretski offered this course once on complex systems. And I got to take it, and it was amazing. And then 6.253 with Dimitri. And these four laid the intellectual foundation for everything I've done since. I mean, it's neat-- it was very welcoming. I walked in.
Alan let me hang out in his group meetings, which were incredible. And I really do feel like LIDS became my intellectual home, even though it took me a little while to find it. So yeah, three of them were LIDS courses. I have a typo. That's all right. So now, one thing I didn't take a course in when I was at MIT was reinforcement learning, because at the time, no one cared. Fair enough. We cited it many times. But then, all of a sudden, people used reinforcement learning to solve Go, and everyone got excited again. And then they tried to push this stuff into all sorts of technology. You go from Go being the hardest problem there is to, I can now solve anything. I'm going to solve all the difficult problems in whatever field you have. I don't care what they are. We will solve them with RL. And the main problem that we had there is that games and the real world are very different places-- because here, we mean actual board games, not games in an economic sense. There, everything's very well-specified, very well-structured. But you throw these things out into a complex environment, and things become complicated. Things become very complicated. And this is where we have to bring in new notions about robustness, about trustability, about scalability. And so my group and I have been thinking about those issues in reinforcement learning probably for about five years now. It's been really fascinating. It's interesting to see how many of these ideas were seeded through ideas I learned here at LIDS. So I have a different definition of reinforcement learning. I tried to figure out what it meant. And Ben said it's very complicated. There's a large community of people. I think the problems that captured my excitement and my imagination could be summed up like this: reinforcement learning is the study of how you use past data to enhance some future manipulation of a dynamic system. Now, everybody in the room would say, wait a minute. That's not reinforcement learning. Come up with some other name for it. Like, what are we talking about here? So, right, maybe I have my own spin on these things. That was the view with which we entered into this thing. I think reinforcement learning, or whatever it is-- the reason why people are excited is it's finding this way to merge machine learning with systems and control. And what is machine learning? Machine learning does have this idea of using data to make decisions. Although, to be fair, they claim to make decisions, but most of the time, they only care about prediction. It is a little bit of a weird sleight of hand that we play. And the idea is that you have so much complexity that, really, I want to mitigate that complexity with data, and use data as a proxy. Summarize things nicely, and be able to use that to deal with very complex environments, sensors, and models. And then control, which I think everybody here is much more comfortable with, is all about using feedback to mitigate uncertainty. So we deal with uncertainty by using feedback. And now this is a way to deal with, in the same sense, environments, sensors, and models that are uncertain. So it seems like we should be able to mitigate both complexity and uncertainty at the same time. And that's how we would take dynamics, some detailed models, robustness ideas from control, and merge them with these new and powerful ideas in machine learning. And so how do you do it? Well, you go back and you find some textbooks that maybe can lead you along the way. Of course, they're all written by Dimitri.
[LAUGHTER] And so we started digging into this, and some of the thoughts that we were having were, OK, look. This does look like these problems that come up in dynamic programming and optimal control. And so we went back and we got this fantastic book. It is true, dynamic programming was offered when I was here. And it was a mistake that I didn't take it. That's my bad. But I came back. We read through that book. We went through volume 2, which actually-- that's really where all the good stuff happens. Volume 2 is really good. And this is the right cover, right? This is edition 4? This has a lot of really great stuff in it. I went back to the neuro-dynamic programming book. It's amazing to see how much-- and Ben pointed this out-- how many of the algorithms that people use today are in that book. And now, as we've all pointed out, Dimitri has his latest, merging ideas from all three of these, and then taking on all the things that have happened in the 20 years since. And so, bringing those to the table, it does seem like the way that you merge these things is with this kind of optimal control type view. We view things as, OK, we're solving dynamic programs, and we're solving dynamic programs with uncertainty. And essentially, we can always just write things as, this is the optimization problem we'd like to solve. We'd like to minimize some cost subject to dynamics, and we want to find a policy that actually solves this. And again, Ben pointed out, from a LIDS background, I minimize. We have an objective, and we're going to solve it. The unbounded rewards problem, I think, we'll get to afterwards. Let me just skip it. I'm going to skip over these details, which aren't that important. The interesting thing is that deep reinforcement learning, which many people have heard of-- which is not the same as deep exploration-- is just: you take all of these methods that you could derive from that framework, and you just put a neural net in the middle. And the neural net can maybe deal with inputs that come from cameras. It could also just be used as a generic function approximator if you have nonlinear dynamics. And of all these ideas, there's really no new algorithmic idea that has come out. All the ideas were there. It's that the computers got faster and people got excited, which is great. But I will say, most of these algorithms don't really work. And by "work," I mean in a very technical way. I don't know. I need a good technical definition. Because we're so-- as I admitted to you all, I did my PhD at the Media Lab. And the motto at the Media Lab was, it worked yesterday. Everybody knows demo or die. But the key thing about demo or die is, it worked yesterday. And this is the thing. When we mean work-- when we mean work as engineers-- we mean I'm going to be able to throw this out into an uncertain environment and not have it do something stupid every day. When you really think about the, like, level of robustness that we need, it's much more than just having some kind of demo that will play out once in your lab. And so I think some of the future things that we're really excited about are merging non-parametric prediction, ideas from classification, ideas from high-dimensional sensing, putting them together, and throwing these into control loops. So some of the things that we've been looking at are how to actually use really complex sensors. Most of the stuff we learned in 6.432 had very low-dimensional, very nice, well-specified models.
And now you have to deal with these really complex cameras that are throwing millions of time series at you every second, every pixel, and using forecasting in very clever ways. And Dimitri has great stuff on this in his new book. And it's really, like, how do you incorporate these uncertain, weird sensors into really trustable and scalable autonomous systems? One last thing-- I threw this word out-- so the reinforcement learning versus the control theory thing, everybody claims their camps. And I just coined this new one, actionable intelligence-- you guys can take it or leave it-- which is this thing where we want to take data, and we want to use it to enhance the future manipulations of dynamical systems. I did want to close with just one thing that's really been also captivating our group. We haven't made progress. We're still doing a lot of reading. I think that this is the future: to realize that all machine learning systems these days are these sorts of actionable intelligence systems. Machine learning is built to do prediction. But then we use it and we show it to people, and then they interact with it. So we try to sell you stuff on Amazon, or we recommend songs to you on Spotify, or we recommend YouTube videos to you. And then the people interact with them. The companies retrain on the data that they're surveilling you with every day. And then you interact again. And now you have this very complex feedback loop. And all of a sudden, you went from something that was simply a prediction problem into something that's now a complex interacting feedback system. And so some of these social problems and these social network issues actually now have this interplay back with control, reinforcement learning, and machine learning. And these are really fascinating problems about how to make these systems more understandable and better for society. All right, with that, I'll yield my time. [APPLAUSE]
SERTAC KARAMAN: And our next speaker is Luca Carlone, who's a professor in the Aeronautics Department here at MIT.
LUCA CARLONE: Hi, everyone. As Sertac mentioned, I'm Luca Carlone. I'm an assistant professor in AeroAstro, and I'm a PI in LIDS. My group works in the broad area of robotics and autonomous systems. And today, I want to tell you about a key ingredient of autonomy, which is called spatial perception. And I want to somehow relate the state of the art in perception to foundational contributions done in LIDS. So you can imagine that, in order to navigate safely, a self-driving car, or in general a robot, needs to use sensor data to understand the surrounding environment. For instance, consider a self-driving car navigating an intersection. For the car to drive without collision, the car has to understand where the lane boundaries are, understand crossings, detect and localize other vehicles, potentially track the speed of other vehicles, detect traffic lights and traffic signs, and potentially reason over the future intentions of other vehicles. These are all spatial perception problems. In other words, spatial perception is about using the sensor data to get an internal model of the external world that you can use for control and decision-making. Spatial perception is not only crucial for self-driving cars. As it turns out, it's fundamental for many robotics applications, from domestic robots to industrial robotics, to drones used for infrastructure inspection and search and rescue, for example.
And even for applications that are not typically associated with robotics, such as virtual and augmented reality. So an initial question is, for robotics, how do we formulate these perception problems? It turns out that the popular model is to formulate perception as an optimization problem. In this optimization problem, we are searching over a potentially large set of candidate models of the world. And we are searching for the model which minimizes the mismatch between the sensor data and what the model predicts the sensor data should really look like. So I'm showing here this example regarding sensor [INAUDIBLE] images, just because it's more intuitive. But the same theory applies to any type of sensor data. In general, this is called maximum likelihood estimation in estimation theory. So you can imagine that here, if we put on the x-axis of a plot all potential models, all potential explanations of the world, and for each model we plot the corresponding mismatch with respect to the sensor data, the model minimizing the mismatch with respect to the sensor data will be a good explanation of reality. For example, this model would be a good explanation of the small house shown in the picture, while other potential models that achieve a larger mismatch would be poor explanations of the data, such as this tall building here. So as humans, we solve these perception problems all the time. And it's even tough for us to realize how difficult a problem this is. So let's take one second to watch this video. So this is a real video, which includes an optical illusion. So most of you probably thought, at the beginning of the video, that the video included five concrete traffic posts. And then you realized, by the end of the video, just because of the change in perspective, that three of the posts are not physically there. They are just painted on the road. So this example is interesting, because at the same time, it's showcasing a failure mode of our perception system, but it's also showcasing a key strength of our perception. So just as [INAUDIBLE] what happened here. At the beginning of the video, you got stuck with a poor explanation of reality. You thought that there were physical posts standing on the road. And then, when confronted with more data, you ended up refining the model and realizing that, essentially, a better explanation for the video was that some of the posts were not physically there. This turns out to be a fundamental and very difficult challenge for robotics. For robotics, and for robot perception, it's very difficult for an algorithm to realize it got stuck in a poor explanation of reality. And it's even tougher to get to a much better model which is a global minimizer of this mismatch function. In technical language, this means that the state of the art in robotics is mostly relying on local optimization methods, which are able to find local minimizers of this cost function. But in general, they are not able to get to global minimizers, which are the right explanations for the data. So over the last few years, my group has really been interested in tackling this question. And what I have been working on is what we call certifiable perception algorithms. The basic idea is to design algorithms that not only get local minimizers, but are able to get good models that explain the sensor data. And they're able to either certify that the model the algorithm computed is the best possible model, given the data-- in other words, that it's a global optimizer.
Or to just declare failure if they cannot find such a model. I'm showing an intuitive explanation here. This is an object detection problem. On the left, you can see an algorithm which is not a certifiable algorithm. And the algorithm is predicting the location of the car to be the one in yellow. Clearly, as humans, we see that that's not the actual position of the car. So the algorithm is failing. And even worse, the algorithm is not realizing that there was a failure here. So it's giving this solution without declaring it-- failing without notice. On the other hand, we are proposing certifiable perception algorithms, which are not only able to compute better explanations, better models for the sensor data, but are also able to certify that the one that is computed is the best possible explanation of the data. So as a group, we are currently working on a number of these algorithms, applied to object detection in images, object detection in LIDAR data, and, in general, localization and mapping problems for robotics. The interesting thing for me is that, while all these contributions are quite new, the foundations really trace back to seminal contributions in LIDS. For example, one of the key insights behind these methods is that, instead of minimizing the original function, which is shown in white here, it is more convenient to replace the original function with one which is easier to minimize, like the one in green that I am showing here. This is what is typically called, in optimization, a convex programming relaxation. And it turns out that Professor Bertsekas has been one of the people establishing the foundations of convex programming, and pushing the importance of this field at a time when convex programming was really overshadowed by other approaches, like linear and integer programming. A second insight in the approaches that I'm proposing here is that it turns out to be convenient to assume that most of the measurements that you collect from your sensor have bounded noise. And this is something that, in control theory, is known as set-membership estimation. And this turns out to be, I believe, chapter 6 of Dimitri's thesis, and [INAUDIBLE] is indeed on set-membership estimation and control. So in hindsight, really a lot of the work happening more recently is about how to extend fundamental results established in the field of control theory, established by Dimitri and others, to cope with more difficult spaces, like 3D rotations, for example, or to cope with off-nominal data and outliers. So building on these foundations, we are now able to obtain pretty impressive demonstrations in which we can just use images. Unfortunately, the video is not playing, so I'll narrate it for you. But what this video shows is that we're able to run algorithms that take images from a standard camera and reconstruct a 3D model of the environment in real time, on the fly. So these kinds of algorithms are using pretty much a standard deep learning algorithm to segment images into objects. For example, in the image, you can see a desk in yellow, ground, walls, and so on. And then these kinds of algorithms are solving a large-scale optimization problem to reconcile all the 2D images into a single 3D representation of the world, which you see here.
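Schematically, the estimation problem and the relaxation described in this part of the talk can be written as follows (generic notation, not a specific model from the slides): the maximum likelihood estimate minimizes the mismatch between measurements $z_i$ and the sensor predictions $h_i(x)$ over the candidate models $\mathcal{X}$,

$$
\hat{x} \;=\; \arg\min_{x \in \mathcal{X}} \; \sum_i \big\| z_i - h_i(x) \big\|^2 ,
$$

and a convex relaxation replaces this nonconvex problem with a convex one whose optimal value lower-bounds the original; when the relaxed solution is feasible for the original problem and the two optimal values coincide, that match serves as the certificate that the computed model is a global optimizer.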
So these kinds of capabilities are very important for a robot, essentially, to navigate in some unknown room, and also just to execute high-level tasks. But you can imagine that the same capabilities are also important beyond robotics. For example, you can use these kinds of mapping techniques to help a blind person navigate a room or reach a desired object. So I'll conclude with the last slide, saying that while perception is a challenging problem for a single robot, in the future, we envision multiple agents, multiple robots, being deployed at the same time and sharing the same space. So for example, it is predicted that, by 2030, you're going to have 21 million self-driving cars in the United States. And enabling these cars to communicate with each other creates huge opportunities. For example, you can imagine that in this picture, if the cars are communicating with each other, and the first car detects an accident, in a fraction of a second, the car can inform all the other cars about the accident, allowing them to slow down in time. While there are opportunities connected to this capability of communicating among the robots, there are, of course, fundamental challenges. First of all, if we transmit all the sensor data from all the robots-- from all the cars, in this case-- it's just too much information being transmitted, leading to saturation of the bandwidth. The second issue is that, if a car is receiving all the sensor data from all the other cars, a single car just does not have enough computation even to process the sensor data. So luckily, again, foundational contributions done in LIDS come to the rescue in this case. If we trace back, and we go to the foundational work on distributed and parallel algorithms done by John and Dimitri, we realize that there are a number of tools that we can use, with the basic idea that, rather than exchanging and centralizing all the data at a single agent, we can just split the computation among multiple agents such that they agree-- they converge on a single explanation, or a shared explanation, of the world. And these, of course, are not applied simply to self-driving cars, but also to multi-robot deployments. Here I'm showing, for example, a recent effort we are doing within DARPA's Subterranean Challenge, deploying multiple robots to get a 3D reconstruction of an underground cave, which you see as a top view on the right. So I'll conclude here by saying that, on this day of celebration, it's important to remember that we stand on the shoulders of giants. And that often, as the first step to building a self-driving car, I suggest reading a good paper from LIDS. Thank you for your time. [APPLAUSE]
SERTAC KARAMAN: And now, I'd like to invite our last speaker on stage, Cathy Wu, who is also a LIDS faculty member, as well as a faculty member in the Civil Engineering department. [SIDE CONVERSATION]
CATHY WU: Hi, everyone. I'm the final speaker, I believe, for this panel. So this is a really good segue from Luca's and Ben's talks. I'm going to be talking about reinforcement learning in the context of urban systems. And so why do we care about studying control or autonomy in urban systems? Well, building one car that drives itself is hard enough, but we also want to get a better sense of the impact of integrating these decision-making systems into the urban system. And this may also have implications for other types of technology that we're putting into slices of societal systems, such as social networks and whatnot.
And so as we look at controlling automated vehicles and their connectivity, they have a lot of influences on the urban system. They have influences on the traffic, the traffic infrastructure. They may have implications for disaster planning, for other aspects of transportation planning, for land use, for policy, for incentive design. And one observation is that we have a variety of techniques for solving-- this clicker is quite hard to use. We have a lot of techniques spread across many different mathematical areas for addressing these different problems. And one of the frontiers and hopes is whether reinforcement learning can one day be a unifying methodology for these problems. But we're still in very, very early days. So as we look at some of these problems-- and I'll focus on the connected automated vehicles context-- we do hit a lot of longstanding control challenges. And I'll focus on some. I'll divide these into systems and data challenges. And so in terms of systems challenges, one component, like we saw from Luca's talk, is hard enough. Now we're combining these with many other components-- heterogeneous control signals, heterogeneous actors. We have delayed rewards and costs. We have limited performance guarantees as we look at these more complex systems. Also, as we change the system slightly-- if we add a new type of vehicle or if we change the network slightly-- then the solutions are very sensitive to these model specifications. There are also humans involved in almost every aspect of this system, and they're very challenging to model. They're heterogeneous. These systems are large scale. They're high-dimensional. There are a lot of computational costs. And then there are the corresponding data restrictions where, as we look at more complex systems, the data is harder to collect and harder to test. So I'm going to give two snippets of recent work from my group that I'm excited about and building on. And it's really, really just the beginning of how to think about autonomy in these very complex systems. So I'm going to talk about one that's focused on high-dimensional control, which is more methodological. And then I'm going to talk about some work that is trying to gain some insights in this domain. All right, so we are-- like Luca said, we are sitting on the shoulders of giants. Dimitri's books have educated at least three decades of us, of many in the room. And so we're building upon a lot of strong, strong foundations. So in the context of urban systems, the agent that we work with in reinforcement learning may be the vehicle, the automated vehicle, and it's interacting with the environment-- in this case, the other vehicles, the rules, the humans. And the automated vehicle makes decisions, which may include accelerations or tactical maneuvers and so on. Overall, the agent is trying to optimize its reward, which in our case-- because we're concerned with the urban system as a whole-- is the average velocity of the entire system, or more complex objectives as well. So we're optimizing this objective-- this cumulative reward over time-- and we are optimizing over some parameter theta that corresponds to the weights of a neural network. And that's where the deep comes in. We've seen a lot of success in a number of game and physics domains. And so now, can we bring these techniques and insights to more complex, societally relevant systems? All right, so one aspect of this is, with urban systems, we have really, really high-dimensional problems.
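Written out, the objective Cathy describes is the standard policy optimization problem (generic notation; the discount factor $\gamma$ and horizon $T$ are illustrative):

$$
\max_{\theta} \; J(\theta) \;=\; \mathbb{E}_{\tau \sim \pi_\theta}\!\Big[ \sum_{t=0}^{T} \gamma^{t}\, r(s_t, a_t) \Big],
$$

where the policy $\pi_\theta$ is parameterized by the weights $\theta$ of a neural network, the state $s_t$ collects the vehicles and their surroundings, the actions $a_t$ are accelerations or tactical maneuvers, and the reward $r$ encodes, for example, the average velocity of the whole system.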
So let's first take a look at high-dimensional control. And it's not just urban systems. There are many systems that are high-dimensional in nature. So what can RL do? How can we improve these methods for high-dimensional control? So I'm going to focus on advantage actor-critic. There's a long history of this method from the '80s, and also a lot of theoretical development from folks from LIDS. And so the advantage actor-critic is the basis for a wide class of methods, policy gradient methods, that are widely used today. And the method is quite simple. We take the objective that we saw. We approximate the gradient. And then we update the parameters according to this gradient. And the name of the game is, how do we estimate this gradient well? And what that means is, how do we estimate it in a low-variance and unbiased, or low-bias, manner? And so the advantage actor-critic. So actor-critic refers to the actor being the policy and the critic being the value function. And the advantage is that, actually, instead of taking the value function directly, we can subtract some sort of reference point. And this has an interpretation as the advantage of this quantity. And there's actually a variance-reduction interpretation for this difference, which matters because variance is actually one of the greatest challenges with this class of methods. And we expect this variance to be exacerbated by high-dimensional control, where, basically, the value function is going to be more challenging to estimate as we have more vehicles and more dimensions to control. OK, so the intuition that I like to use is that, well, the variance of a difference is the sum of the variances of the corresponding terms minus 2 times the covariance of the two terms. So if we now take a look at this difference here, if we want to minimize the variance, then we want to maximize the covariance between these two terms. So this actually leads us close to what is a widely used result. Before we get there, we might try this toy exercise. If we actually fit this reference point, also called a baseline, very, very well to the Q function itself, we fail, because we actually destroy the gradient. This term will be 0. We have no gradient. We update in no direction. And the problem is that this is introducing bias. So the state-of-the-art bias-free baseline is also known as a state baseline, where you fit the state value function rather than the state-action value function. And this is a really seminal work from Greensmith, Bartlett, and Baxter in 2004. And so our work is really asking, can we actually maintain this bias-free nature, but incorporate action information, which is really important when we have high-dimensional control? And so I don't have much time to go into this. But the main insight is that we're working with stochastic policies. These are probability distributions over the actions. And so probability distributions can be factored along the action dimension. And now we actually see that there's basically this nice interaction between this log and this product. It allows us to actually derive bias-free state-action baselines. This allows us to have an advantage that is across both states and actions. And then we basically can derive the optimal baseline. We can derive the benefit over the state baseline. And we actually can see that this works quite well in practice as well. And so we have some benefit over-- OK, so I think, since I'm short on time, I will just fly through.
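In symbols, the gradient estimator and the variance argument sketched in this passage are, in standard actor-critic notation,

$$
\nabla_\theta J(\theta) \;\approx\; \mathbb{E}\big[ \nabla_\theta \log \pi_\theta(a \mid s)\, \big( Q^{\pi}(s, a) - b(s) \big) \big],
\qquad
\mathrm{Var}(A - B) \;=\; \mathrm{Var}(A) + \mathrm{Var}(B) - 2\,\mathrm{Cov}(A, B),
$$

so a baseline $b$ that covaries strongly with $Q^{\pi}$ reduces variance; a state-only baseline $b(s)$ keeps the estimator unbiased (the Greensmith, Bartlett, and Baxter result), while naively setting $b = Q^{\pi}(s, a)$ zeroes out the signal, which is why an action-dependent baseline has to be constructed more carefully, through the factored-policy argument described here.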
Now we're also interested in taking reinforcement learning and understanding how we can employ this body of techniques to understand the impact of automated vehicles on the greater urban system. In particular, we're exploring the impact of controlling vehicle kinematics to influence traffic congestion. These problems are actually too hard for us right now. We're working towards this. What we do is break this down into smaller pieces that we call traffic LEGO blocks. We can take a look at one of these, this single-lane circular track, and we can train a controller. It was shown about 10 years ago that a simple setup like this actually produces traffic jams solely based on human driving-- not based on lane changes or traffic lights or anything. So I should show you the learned controller that we developed using these advantage actor-critic methods. Right now, the controller is off, so that we can actually see traffic jams forming here. Now the controller is switched on, and we actually see that this vehicle can take on a different driving profile, and it actually eliminates that backwards-propagating wave that we just saw. It eliminates the traffic jam, closes this gap. And actually, using systems theory, we can characterize that this result is actually near optimal. And this is something that we usually cannot do with pure reinforcement learning techniques. We actually need some systems theory to allow us to bound the performance of these techniques. We can show that these techniques-- this controller-- actually generalize, and do not require memory. OK, so I'll just stop there. We can show this for a variety of other setups as well. But I won't go into that. OK, thank you.
SERTAC KARAMAN: Thank you so much. [APPLAUSE] And so now you've heard from Ben van Roy first in his keynote talk, and then our five panelists. And now I'm going to start out with a couple of standard questions. So I'm hoping that some of the more exciting and unusual questions will come from the audience. I'll start out with two questions. And then we'll start-- there are two microphones, and you can take them and ask your question. So I'd like to start out with my first question regarding the past. So I think the last time we did a conference like this was Paths Ahead, and it was pretty much exactly 10 years ago. And I was just looking at the agenda. And I could see that, for example, systems, control, and optimization were actually split up into two different sessions. There was another session on learning. And looking back 10 years, what do you think has happened in the 10 years that you found surprising, that emerged or reemerged? Or looking back 10 years, would you actually have seen yourselves doing the kinds of things that we're doing today? [INAUDIBLE]
BENJAMIN VAN ROY: So in my view, the context has changed a lot. And context matters a lot. The scale of computation that is available to do analysis, as well as the amount of data we are gathering constantly today, is astronomical compared to 10 years ago-- because of cloud computing, also first because of the internet, but even more so because of cellular penetration. Everybody in the world suddenly has a smartphone now. And so with that change of context, the algorithms we use for learning and systems and control will also evolve. I think the trend is toward using simpler and simpler algorithms, but where nuances of the design might matter a lot. And also, this relates to what Ben is saying.
At some level, the problem of control encapsulates everything going on now. But the term control conjures up approaches that were designed with a different context in mind-- a context when control was a popular term to use. So that's my take.
SERTAC KARAMAN: Any others? Yeah, Luca?
LUCA CARLONE: I just want to add-- just exciting times, I would say, over the last 10 years. Of course, like, one month from now, one year from now, you can say the same thing about the future. But it seems the best time to do research on the topics we're working on right now-- in my case, robotics and autonomy. Thinking about going back, like, 10 years, I was thinking that we went from the DARPA Urban Challenge in 2005, 2007, being an academic exercise back then, to Tesla selling cars with autopilot right now, and self-driving cars being widely adopted. I think there are hundreds of thousands of self-driving cars driving right now, at least on limited roads. And if you look at the progress in robotics, it's just very exciting what happened-- Amazon buying Kiva Systems for $700 million, iRobot selling the Roomba. Most of you guys probably have a Roomba at home-- iRobot selling 25 million units of the Roomba. The thing that I want to add is that it's interesting, because, of course, there was progress on the algorithmic side, and a lot of research that was done over the last 10 years and before is now transitioning to products. So there is a lot of good research happening. But the thing that is interesting on my side is also to realize how unexpected the sources of progress, or the breakthroughs in a field, sometimes are. For example, you realize that most of the machine learning revolution is driven by the fact that, right now, we have a huge amount of data, which is a consequence, if you think about data, of the internet, Facebook, and all these kinds of services. We have a huge amount of computing. But that, again, started more as something that was promoted by the gaming industry to have very good real-time rendering for games. And in robotics, it's even more so. Despite the progress on the algorithmic side, sometimes, like, a lot of the better sensors that we develop and we're using for robots were just due to better cameras being designed, for example, for mobile phones. So it's interesting to me that there is this lucky connection between different efforts across different research areas-- sharing technologies, coming together with very good and well-designed algorithms developed by the research community.
SERTAC KARAMAN: Thank you. And I think that came up in many of the talks, the topic of machine learning-- reinforcement learning especially. But I wonder how the audience sees that. So I think it came up in many of the talks that you all alluded to. And how do you see the emergence, or the reemergence, of machine learning and-- I think that when I was a graduate student, I was sitting in the audience. And I was looking at these talks, and I was trying to pick up, like, a PhD topic for myself. And for the students here, what do you recommend for the future, for the next 10 years? What do you think they can focus on? What do you think we'd be talking about at LIDS@90? Yeah, Angelia, please go ahead.
ANGELIA NEDICH: Yeah, I was thinking, especially when dealing with this data, it's already emerging-- like, security and safety. So when you have data, if you are using it to build things that are-- you have autonomous systems, there is a potential of somebody hacking into the systems.
So you have to have a way of protecting your shared information. And also, I would think, some of the privacy of the data as well. So those are some of the aspects that I see emerging-- some of this forensic-type data analysis showing up. SERTAC KARAMAN: Asu and Cathy, maybe? We'll start with Asu. ASU OZDAGLAR: You're asking tough questions. [LAUGHTER] SERTAC KARAMAN: I warned you. ASU OZDAGLAR: I know. Yeah, clearly, machine learning is very popular right now. It's hard to find somebody who does not work on machine learning. There are a lot of interesting problems at the convergence of optimization, statistics, and computation. In terms of moving forward, I think the most important thing will be-- machine learning has impressed us with all these successes in vision, natural language processing, cats and dogs, classification problems. What I think the next stage will be is going into more and more applications, where safety-critical systems are deployed, where we would like to make sure these are robust and can be used as an engineering technology. So how do we get there, I think, is probably the next big question. And the other one is, basically, when you apply these in societal applications, of course, these will be very much interacting with humans, and information will be coming from humans. Information from humans is very tricky. So how do you actually deal with the biases, still be able to learn, and then bring the human-societal aspect together with these robustness and adversarial effects in machine learning systems, to be able to make it into an engineering technology? SERTAC KARAMAN: Cathy? CATHY WU: I just want to emphasize this last point on humans. I think that as we are maturing these technologies, and seeing machine learning being really effective at very well-defined tasks, we still have very little understanding about how they interact with humans. We have very little understanding of humans. [LAUGHS] And so I think a more concerted effort on modeling humans, or understanding them, is one thing I would like to see a lot more progress on in 10 years. And then I think that, in the last 10 years-- in addition to the maturing of machine learning techniques, there is a lot to say about how accessible the community has made it. I think you mentioned that every high schooler-- or maybe not every high schooler-- many high schoolers can download these packages and play with cart-pole, and they can get their feet wet. And I think something could also be done to make some other types of methods more accessible as well, because they do have a lot to contribute, but they're harder to get into. SERTAC KARAMAN: OK, maybe I'll allow, in the remaining 15 minutes-- are there any questions from the audience? I think there are a few. I see Devavrat over here. Maybe we'll start there, and go one in each direction. OK, there's one over there. AUDIENCE: What are the current landscape and future emerging cross-pollinations between control theory and three other disciplines-- one, non-equilibrium statistical physics; two, dynamical systems and chaos control; and three, manifold learning and topology? SERTAC KARAMAN: Anything? [INAUDIBLE] CATHY WU: The questions just got harder. [LAUGHTER] SERTAC KARAMAN: I know that when I get to the audience, it will be harder. But I guess statistical physics is one thing that was mentioned.
It actually came up, I think, in one of the talks. If not statistical physics exactly, people mentioned it one way or another. And there were a couple of others. What were they? AUDIENCE: Chaos control. SERTAC KARAMAN: Chaos. And then the third one was? AUDIENCE: Manifold learning and topology. SERTAC KARAMAN: For any of them, any takers? BENJAMIN VAN ROY: I wish I understood all those topics, but-- [LAUGHTER] But similarly to my expertise in genetic algorithms-- [LAUGHTER] I'm not really competent in all those topics. CATHY WU: We need more bridges to more people. SERTAC KARAMAN: Yeah. OK, maybe I'll continue with another question, and we'll try to come back. Was there a question from here? Yeah. Was there a question? Yeah. AUDIENCE: Just to play devil's advocate, being a student of Dimitri who moved on to the discrete-- by discrete, I mean real computer science, rather than in between-- I wonder if there is not an element here of, when you have a hammer, the whole world is a nail. In some way, control theory only had a big success in-- for example, jets. And jets are a good example of something where we don't imitate nature. It took a long time until people understood we had better build airplanes rather than imitate birds flying. But I observed, in my department, 30 years ago, people doing vision with Hessians. Luckily, I still remember what a Hessian is. And it seems to me this is going nowhere, because in our brain, we do not compute Hessians. So it seems to me that the problem of vision will be solved when we start imitating the brain. And this is what in reality happened. Really, I think until we got deep learning, we didn't have a really good visual system. So at this point, it seems to me that we have succeeded in imitating a very low level of the brain. But the brain interacts with cognition, with reasoning, and all sorts of stuff that goes with it. And it seems to me that the next breakthrough in understanding will come from brain science-- understanding how this high-level reasoning and lower-level reasoning interact-- rather than from control. SERTAC KARAMAN: Yeah, I guess that was a comment or a question. [INTERPOSING VOICES] LUCA CARLONE: I completely disagree with that-- unfortunately-- with the point of saying that to do engineering we should study what the human brain is doing. First of all, I have the belief that the human brain and human performance are a proof of concept of something that we are not able to do in many cases with machines. But it is not necessarily the best solution for the problems that we have to solve. Also because, in defense of the human brain, the human brain has power constraints, and so it is operating in challenging conditions, because it is working on a very tight power budget. So I don't think that whatever we do with machines should just aim to get to that upper bound; it can eventually cross that bound. And you can think about an example of that: if you think about robotic arms, robotic arms are just imitating what a human is doing in terms of manipulation. But right now, they're outperforming the precision of humans in every [INAUDIBLE] manipulation-- not manipulation, let's say welding tasks. And self-driving cars, the same. Eventually, self-driving cars are projected to drive much better than a human, just because there is no constraint on the type of sensors that you put on the robot, or the amount of computation that you put on the robot. So I think that, again, we can draw a lot of inspiration from humans.
But we do not have to feel constrained to seeing that as the only viable model for intelligence. SERTAC KARAMAN: Any other takers? Otherwise, I'll move to the next question. AUDIENCE: I have a question. SERTAC KARAMAN: Peter? Maybe we'll take one from Peter, and then from you. [LAUGHTER] Seniority rules. AUDIENCE: Well, first, I'm going to make a very rash prediction: that self-driving cars will have considerably fewer accidents than human drivers, because they'll never drive inebriated, amongst other things. However, there will be a crash on the California freeway that will involve more than 1,000 cars when the system is automated. The second thing-- so I'll move from the sublime to the ridiculous. There is a very old paper by Jurgen Moser which basically is at the heart of the penalty function method. It's never really been exploited, and it's a substitute for Lagrangian methods, and it's also a substitute for using gradient descent. And I have no idea. But you like to look at old papers. I do too. And so I think it's something that people looking at this might take a look at. It's an obscure paper, I mean. But it's worth looking at. It's from the 1950s. SERTAC KARAMAN: And that's another, I guess, comment. Any comments? Otherwise, I'll finally move on to Devavrat. The floor is yours. AUDIENCE: First of all, excellent talks, panel. Thank you. In many of the settings-- especially, let's say, over the past few years, I was interacting with retailers. And a retailer is an organization which is primarily driven by humans. And then you will tell them what is the right set of decisions to make. And they say, well, here is the right decision to make. And when humans get involved, especially as part of the decision loop-- rather than just providing the input, so to speak-- it's very complicated and challenging. So I would love to hear the panel's view on what might be a good way to, let's say, take a beautiful model of an MDP, or a variation of that with some game-theoretic behavior. But what might be the right way to think about getting humans into the loop as we think about decision systems, especially decision systems within organizations? Now again, I am cognizant of the fact that if you want to have good health, you should stop smoking. But most smokers don't do that. And democracy is at an interesting place right now, despite the fact that we know what's good or not. SERTAC KARAMAN: Thank you, Devavrat. ASU OZDAGLAR: Beautiful question. [LAUGHS] Whenever-- if you talk about humans, one immediately thinks about game-theoretic models. But the problems you're talking about are so complicated, with so many different factors and humans, that trying to think about a multi-agent MDP sometimes does not give us much tractability in terms of addressing the problem. That being said, I'm still a strong believer that there may be reduced models that can somehow bring the strategic motives into the loop without having full-fledged game-theoretic models between so many agents. So still, I think some combination of these ideas with reduced representations would be the way to go. But I may be biased. CATHY WU: Another perspective may be-- instead of a reduced model, I think potentially reduced numbers of stakeholders.
Potentially being strategic in pinpointing who to make recommendations of decisions to may facilitate-- my hope is that it may facilitate-- some transition of research into more practical settings, where, say, in the city context, instead of needing to convince every citizen of the city, you can convince the mayor or a few key individuals, and some decision is then translated into policy. It just then becomes a rule, and then that can-- ASU OZDAGLAR: Let me also add-- by the way, I think another very exciting direction would be to be able to combine data and empirical work together with reduced models. So if we're thinking about machine learning for data coming from humans, is there a reduced way of representing the motives and information in such a way that the machine learning algorithm takes that into account, instead of just treating it as IID data? I think that would be a very exciting direction. SERTAC KARAMAN: Yeah. Go ahead, Ben. [INTERPOSING VOICES] BENJAMIN RECHT: I'll say this, and I think this is a challenge for the entire room of LIDS folks. I think one of the most challenging parts, and challenging aspects, of interacting with people is that things stop being quantitative. And something I've been seeing a lot in my group, and in my interactions with other people on campus, is: how exactly do quantitative people like us build nice bridges to qualitative research? I actually think it's not just that we're going to go fix stuff, because I feel like there's a lot that we can learn from that kind of qualitative aspect. And to me, that's a grand challenge. How exactly do we take this kind of systems thinking and, when we're bringing it into more social systems, interact with the unknown and the unknowable? I think that's a grand challenge. BENJAMIN VAN ROY: OK, so let me address this, but also speak to the previous question about a projection into the future of what might be big and where things might go. I think our view of machine learning today is typically pretty narrow. It's like you have this data set, and you fit a model to that data set, and that kind of stuff. But a machine should be able to learn from all sources of information. And I think that part of that is interacting with humans and learning from them, just like students learn from teachers in school, or small children learn from parents. There are algorithms that can do that, as well as learn from empirical data. And I think that's a real-- I think one trend we're going to see going forward is greater abstraction, because there's so much data collected from so many different problems and different things, and you have access to all that, and you have access to so much computation. So abstraction is going to be lifted to higher and higher levels. So, like, John Tsitsiklis gave this nice talk this morning, where he talked about how LIDS has a tradition in abstraction. And probably in the early days of LIDS, taking a class of problems, like inventory problems, and abstracting that, and saying we're coming up with ideas that are going to be used to solve all inventory problems, was mind-boggling-- each one was a separate inventory problem, but the LIDS approach was to abstract away from that. And then the next layer is coming up with ideas that are relevant to many problems, but where the researcher has to think about how to map them to each problem.
But I think, pushing forward, there's going to be greater and greater abstraction, where machine learning algorithms will be designed not for a specific class of problems. You could even think of each class of problems as being a data point, and the algorithm is trying to generalize across classes of problems. And so what you're working on will be at a much higher level. And the machine learning algorithm could do things like collect empirical data, talk to the human, learn what the human wants, learn from the human's experience-- do all of that kind of stuff. SERTAC KARAMAN: Thank you. I think we have one more question, from Sanjoy. AUDIENCE: This is a little bit of a view from afar. It seems to me that there are fundamental problems in this circle of problems being talked about, like control with a vision sensor in the feedback loop. It seems to me that, fundamentally, that problem is complete. How should we represent images, when in some abstract sense it should be topological invariance, active components, et cetera, and track that? And I think the issue that Chomsky raises, poverty of stimulus-- as I understand it, right now we are using lots and lots of training data in order to do, let's say, pattern recognition in a broad sense. But the issue of feedback is the issue of poverty of stimulus. How would you do this, and with what kind of data, where the amount of data that you need is limited in some sense? And this is related to what happened in systems theory. This is invariant thinking. For example, what things can we do with feedback, and what things can we not do? There are essential constraints. So my suggestion is we need to look at a major problem like vision in a feedback loop in a systematic, more fundamental way. SERTAC KARAMAN: Thank you, Sanjoy, for the comment. Does anybody want to say anything? BENJAMIN RECHT: I think we all agree, right? [LAUGHTER] ASU OZDAGLAR: Me too. SERTAC KARAMAN: OK, so that said, I think we're out of time. So let's thank our panelists. [APPLAUSE]
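As a rough illustration of the advantage actor-critic setup Cathy Wu describes for the single-lane circular track earlier in the session, here is a minimal sketch in PyTorch. The environment (ToyRingEnv), its state, reward, and network sizes are invented placeholders, not the simulator or controller used in that work; the sketch only shows the shape of a one-step A2C update: sample an acceleration from the policy, estimate the advantage with the critic, and update both heads.

```python
# A minimal, illustrative sketch (an editor's example, not the speakers' code) of an
# advantage actor-critic (A2C) update for a single automated vehicle on a ring road.
# ToyRingEnv, its state/reward, and the network sizes are invented placeholders.
import torch
import torch.nn as nn


class ToyRingEnv:
    """Hypothetical stand-in for a single-lane circular-track traffic simulator.

    State: (gap to lead vehicle, ego speed, lead speed). Action: ego acceleration.
    """

    def __init__(self):
        self.state = torch.tensor([10.0, 5.0, 5.0])

    def reset(self):
        self.state = torch.tensor([10.0, 5.0, 5.0])
        return self.state

    def step(self, accel):
        gap, v_ego, v_lead = self.state
        v_ego = (v_ego + 0.1 * accel).clamp(0.0, 15.0)        # simple kinematics
        gap = (gap + 0.1 * (v_lead - v_ego)).clamp(0.1, 50.0)
        self.state = torch.stack([gap, v_ego, v_lead])
        # Toy reward: keep speed up while discouraging harsh accelerations,
        # a crude proxy for smoothing out stop-and-go waves.
        reward = v_ego - 0.1 * accel.abs()
        return self.state, reward


class ActorCritic(nn.Module):
    """Shared body with a policy head (mean acceleration) and a value head."""

    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(3, 64), nn.Tanh())
        self.mu = nn.Linear(64, 1)
        self.value = nn.Linear(64, 1)

    def forward(self, s):
        h = self.body(s)
        return self.mu(h), self.value(h)


env, net = ToyRingEnv(), ActorCritic()
opt = torch.optim.Adam(net.parameters(), lr=3e-4)
state, gamma = env.reset(), 0.99

for t in range(1000):
    mu, value = net(state)
    dist = torch.distributions.Normal(mu, 0.5 * torch.ones_like(mu))
    action = dist.sample()
    next_state, reward = env.step(action.squeeze())
    with torch.no_grad():
        _, next_value = net(next_state)
    # One-step advantage estimate: r + gamma * V(s') - V(s).
    advantage = reward + gamma * next_value - value
    actor_loss = -(dist.log_prob(action) * advantage.detach()).sum()
    critic_loss = advantage.pow(2).sum()
    opt.zero_grad()
    (actor_loss + 0.5 * critic_loss).backward()
    opt.step()
    state = next_state
```

In the real setting described in the talk, the environment would be a microscopic traffic simulator with many human-driven vehicles, and the reward would reflect system-level throughput rather than a single vehicle's speed; the systems-theoretic near-optimality guarantee mentioned in the talk is separate from, and not shown in, this sketch.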
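For readers unfamiliar with the terminology in the audience comment about penalty functions versus Lagrangian methods, the contrast can be written in the standard textbook form below. This is generic background only, not a summary of the specific paper mentioned in the question.

```latex
% Standard textbook forms (not taken from the paper referenced in the question):
% a constrained problem can be attacked through its Lagrangian or through an
% unconstrained quadratic penalty whose weight \rho is driven upward.
\min_{x} f(x) \quad \text{subject to} \quad h(x) = 0
\qquad\Longrightarrow\qquad
\begin{cases}
L(x,\lambda) = f(x) + \lambda^{\top} h(x) & \text{(Lagrangian / dual methods)}\\[4pt]
\min_{x}\; f(x) + \dfrac{\rho}{2}\,\lVert h(x)\rVert^{2}, \;\; \rho \to \infty & \text{(quadratic penalty)}
\end{cases}
```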
Info
Channel: MIT Laboratory for Information and Decision Systems
Views: 833
Rating: 5 out of 5
Id: CDTRbSK2kck
Length: 84min 21sec (5061 seconds)
Published: Wed Dec 04 2019