The Science Behind InterpretML: LIME

Captions
>> Today on this special Build edition of the AI Show, we get to hear from Marco Ribeiro, Senior Researcher on the Microsoft Research team. When you build machine learning models, it's important to know what they're doing. Marco, the creator of LIME, or Local Interpretable Model-Agnostic Explanations, will talk about the research, how it works, and why it's important. Make sure you tune in. [MUSIC]

>> My name is Marco, I'm a Senior Researcher at Microsoft Research. The focus of my research is helping humans interact with machine learning models meaningfully. That includes interpretability; for example, I wrote the LIME paper and a bunch of others. I'll talk a little bit about LIME in this video, but before that I thought a little history would be interesting.

LIME was the first paper I wrote in my PhD, and at the time interpretability wasn't as hot as it is now; there were almost no papers coming out on the subject. This is how I got into it. I was doing an internship at a large tech company, trying to use machine learning for a task. I trained the model and it worked really well in cross-validation, but whenever we sent it to user testing, it sucked, and it took me forever to figure out why. After a long time of debugging and trying to understand, I figured out that my model had learned to distinguish between how we had collected positive and negative data, rather than picking up on what it should have been picking up on in the real task. It had learned a shortcut: it detected how we collected the data, and that didn't translate into the real-world use case.

Two things about this experience really bothered me. The first was that cross-validation accuracy didn't translate into real-world performance. Maybe this is obvious for people doing machine learning in the real world, but as a researcher, if you take a machine learning course, you learn that cross-validation is a good way to evaluate models: the model has not looked at that data and it works well, so therefore it's good. But models are really good at picking up quirks or spurious correlations in datasets, and that can really mess up performance when your real data does not have those quirks. The thing that bothered me most, though, was that I could not figure out what my model was doing. Here I was, a PhD student, supposed to be an expert, and it took me the longest time to understand how my model was making predictions.

So I came back from my internship, and at the time I was working on a completely different topic, distributed systems and machine learning, but I decided that this bothered me enough that I wanted to work on it. I looked at the literature to see what people had done in terms of understanding how models make predictions, saw a bit of a gap, and decided to change topics and work on that. That's how I got into interpretability in general.

So why should we bother with interpretability at all? Interpretability has many uses; sometimes you're even forced to do it due to regulation. But the use case that's dearest to my heart is the one I was just describing: you're training your model and you want to figure out, does it work? Is it doing something sensible? How can I improve it? Understanding what the model is doing really helps with all of that. It helps you avoid putting a really bad model into production, because you catch things that don't make sense ahead of time.
Machine learning models are really good at a particular kind of overfitting I was just mentioning: picking up on spurious correlations, picking up on how we collected the data, picking up on things that will not generalize. If you understand the model, you can see those things happening, and usually it's obvious when you see it. Even when you don't have situations like that, if you understand why a model is making mistakes, it's easier to figure out how to fix them: whether you need more data, whether you need to change the model architecture and how to change it, and so on.

We had some experiments in the LIME paper where people with no machine learning expertise had to decide which of two models would generalize better, and even people with no ML expertise could do really good model selection. They could also do some feature engineering: they could look at what the model was doing, say "the model should not be doing this," and remove words (it was a text task) that the model should not have been using. So even people with no machine learning expertise at all can already do something; how much more can the people who are actually training these models and applying them? The actual developers of models, I think, can gain a lot. Understanding what's going wrong already gets you almost halfway to fixing it, and I think interpretability is really useful for that, among many other uses.

I've been talking about LIME, but I haven't even told you what it is. I'll give you the key ideas behind the technique we used in the paper, and they are as follows. The first idea is that we're not going to try to explain everything at once. If you have a model that's really complicated, it's a very hard task to explain everything it's doing. Let's say I'm trying to predict the risk of defaulting on a loan; it's a complicated problem. Maybe you think, oh, if credit score is super high or super low, it's easy, just look at credit score. But maybe in the middle it's complicated, and then you realize it interacts with time. If your credit score is low because you're an immigrant and haven't had time to open accounts, but you work at Microsoft, maybe that should be taken into account (this is actually a problem I faced as an immigrant getting a credit card approved). So you don't just look at credit score. This is just a small example, but if you try to explain all of that at once for a good model, you're going to get into trouble: you're going to hide too much, or the explanation becomes super complicated. What LIME does instead is try to explain a single prediction. Let's say the model thinks that I, Marco, am low risk; we want to understand why that is, and not try to explain every single person in the dataset at once.

The other idea is that we're going to treat the model as a black box, and there's a trade-off. If you treat the model as a black box, you cannot exploit its internals and you need to do other things; but the gain is that you can explain any model, and we made that decision. But if the model is a black box, how can you understand it? How can you explain it? I think the only way is by perturbing the example and seeing what the black-box model does. Now, there are a lot of ways you could perturb the example.
There are a lot of things you could do, but we did a very simple thing with LIME, and this is it. Let's say we want to understand why I'm low risk. What you do is change different features. For example, you say Marco doesn't work at Microsoft, he's unemployed, and then you see: does the risk prediction change? If it changes to high risk, that's a good indication that the model is using the fact that I work at Microsoft. But you don't want to change only one feature at a time; remember the case where credit score was important, but you had to take into account whether a person is an immigrant, whether their accounts are new, and so on. So maybe I change my job and my credit history a bit, and maybe I change a lot of other things at once. You perturb a bunch of different times and see what the model does. After many perturbations, you learn a simple explanation, a linear model that says something like: working at Microsoft lowers the risk by 0.2, being an immigrant raises it by 0.05, and so on.

This explanation describes the model for people who are close to me, people who are similar to me. The explanation has to be really faithful, really good, for people who look like me, and maybe for people who look like me, zip code doesn't matter. So the explanation is not saying that zip code is irrelevant to the model; it's just saying that for people in the neighborhood of Marco, what matters is where you work, being an immigrant, and so on. Combining the two ideas, the gist is this: treat the model as a black box, perturb the example you want to explain, and then learn an explanation that really describes the model in that region, in that neighborhood, instead of trying to explain everything at once. That's a very simplified version of what LIME is and does.

I hope I've given you a little bit of the background and convinced you to at least consider interpretability: can it help me understand why my model is making predictions? I think it can, and LIME can be a useful tool in your interpretability tool belt. There are many others as well. Signing off, thanks. [MUSIC]
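To make the perturb-and-fit idea from the talk concrete, here is a minimal Python sketch of a LIME-style explanation for tabular data. It is not the official lime or InterpretML implementation: the toy loan dataset, the random-forest "black box," the Gaussian perturbations, and the exponential proximity kernel are all illustrative assumptions chosen for this example.

```python
# Minimal LIME-style sketch (illustrative only, not the official implementation).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

# Toy loan data: columns = [credit_score, years_of_credit_history, employed]
X = np.column_stack([
    rng.normal(650, 80, 1000),
    rng.exponential(8.0, 1000),
    rng.integers(0, 2, 1000).astype(float),
])
# Hypothetical label: 1 = low risk
y = ((X[:, 0] > 640) | ((X[:, 2] == 1) & (X[:, 1] < 3))).astype(int)

black_box = RandomForestClassifier(random_state=0).fit(X, y)  # the model we want to explain
feature_scale = X.std(axis=0)

def explain_instance(x, predict_proba, n_samples=5000, kernel_width=1.0):
    """Explain one prediction by fitting a weighted linear surrogate around x."""
    # 1. Perturb the instance: sample points in a neighborhood of x.
    Z = x + rng.normal(size=(n_samples, x.size)) * feature_scale
    # 2. Query the black box on the perturbed points.
    p = predict_proba(Z)[:, 1]  # probability of "low risk"
    # 3. Weight perturbations by proximity to x (closer points matter more).
    dist = np.sqrt((((Z - x) / feature_scale) ** 2).sum(axis=1))
    weights = np.exp(-(dist ** 2) / kernel_width ** 2)
    # 4. Fit an interpretable model on the weighted neighborhood.
    surrogate = Ridge(alpha=1.0).fit((Z - x) / feature_scale, p, sample_weight=weights)
    return surrogate.coef_  # local effect of each feature around x

# Example instance: low credit score, short history, but employed
marco = np.array([600.0, 1.5, 1.0])
for name, coef in zip(["credit_score", "history_years", "employed"],
                      explain_instance(marco, black_box.predict_proba)):
    print(f"{name}: {coef:+.3f}")
```

In practice you would typically reach for the lime package or InterpretML's black-box explainers rather than rolling your own, but the moving parts are the same as in the talk: perturb the instance, query the black box, weight the samples by proximity, and fit a simple surrogate whose coefficients serve as the local explanation.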
Info
Channel: Microsoft Developer
Views: 1,469
Rating: 5 out of 5
Keywords: Microsoft, Developer, build2020, responsibleml
Id: g2WtL45-PFQ
Length: 10min 16sec (616 seconds)
Published: Sat May 16 2020