Explainable AI Cheat Sheet - Five Key Categories

Captions
As AI starts to make more and more decisions in our daily lives, in business, in finance, and in government, it is becoming increasingly important to make sure these systems are making the right decisions, and for the right reasons. Explainable AI is a set of tools and methods that allow us to understand AI and machine learning models and their predictions and decisions. Explainable AI is especially important in high-stakes decisions: think medical diagnosis, legal proceedings, or large financial decisions. Another reason for the rise of explainability and interpretability techniques is the move by regulators to hold companies more and more accountable for the decisions made by their AI and machine learning systems. This video is an accessible and gentle introduction to explainable AI methods; we will look at some of the major categories of explainability techniques in machine learning.

What animal do you see in this picture? You've probably seen computer vision models that are able to detect animals in photos like this. You might say that this is a husky, and that's probably correct; it looks like one to me at least, and we can have a model that detects this as a husky. But think about what happens when we impose an explainability method called a saliency map on this image. A saliency map shows you the areas most responsible for the decision of the model. In this toy example, it's the snow in the background that led the model to say this is a husky, rather than anything about the dog itself. This is a classic case in machine learning where a model makes the right decision but for the wrong reasons, and explainability methods allow us to find some of these faults and make our models do better. The second map is a bit more like what we'd expect a proper model to look at: I tend to tell huskies apart by the patterns around their eyes, for example, and here the map is more yellow around the eyes, so that is where the model focused, or which regions of the image were most responsible for the model generating this prediction.

We will go over five major categories of explainable AI, and we've made this into an image, a cheat sheet that anybody can use; we'll link to the image in the description. We start our explainable AI journey at the top. The first question we ask is whether we want to explain a model's decisions. This is an important branching point, because there are explainability methods that focus on a model and its internal representations more than on its predictions or decisions. If we answer yes, we branch to the right, and then we ask the second important question: is the model we're working with interpretable by design? If we answer yes, that leads us to the first group: models that are interpretable by design, like logistic regression, linear models, and k-nearest neighbors. You can look at the weights and coefficients of these models, and that is a form of explainability. This is especially important in the case of high-stakes decisions: the more interpretable by design a model is, the better for those decisions. It is a complex world, though, and a lot of problems are better served by models that are more complex and not as interpretable by design as these linear methods; this is especially the case in computer vision and natural language processing.
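To make this first group concrete, here is a minimal sketch (not from the video) of what reading an interpretable-by-design model can look like with scikit-learn; the built-in dataset is only a convenient stand-in for illustration.

    # Sketch: an interpretable-by-design model (logistic regression).
    # The built-in dataset is a stand-in used only for illustration.
    from sklearn.datasets import load_breast_cancer
    from sklearn.linear_model import LogisticRegression
    from sklearn.preprocessing import StandardScaler

    data = load_breast_cancer()
    X = StandardScaler().fit_transform(data.data)
    model = LogisticRegression().fit(X, data.target)

    # Each coefficient is the change in the log-odds of the positive class for a
    # one-standard-deviation increase in that feature: the explanation is built in.
    ranked = sorted(zip(data.feature_names, model.coef_[0]),
                    key=lambda pair: abs(pair[1]), reverse=True)
    for name, coef in ranked[:5]:
        print(f"{name}: {coef:+.3f}")

Reading the largest coefficients directly like this is exactly the kind of inspection that is not available for a deep neural network, which is why the cheat sheet branches into the other groups.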
If our model is less interpretable than these linear models, the next branching question is whether we need a method that works on all types of models. If we answer yes, we are led to the second group of explainability methods: model-agnostic explainability methods. The leading example to look at here is SHAP, SHapley Additive exPlanations. This image, from the shap Python library, gives a sense of how the method presents an explanation. On the left are the model's inputs: age is 65, sex is female, blood pressure is 180, and BMI is 40. You give these to a black-box model and its output is 0.4; this could be some diagnosis, perhaps something related to diabetes, or something else. What SHAP does is first calculate a base rate for the property we're predicting, which is roughly the average of that value in the dataset we have. It then gives you an additive explanation of how each feature pushed the prediction away from that base rate: for example, the BMI of 40 increased the value of the prediction, pushing it a little to the right; the blood pressure pushed it a little more to the right; the sex pushed it to the left; and then age brought it all the way back to the right. Notice that this is not how the model actually calculated its output; it is an explanation. Regardless of the model that was used to generate the prediction, this explanation can be added on top to give us insight into how this single prediction can be explained given all the examples we've seen previously. That's a quick look at SHAP, our second major group of explainability methods after the models that are interpretable by design (a minimal code sketch of the shap library appears below).

The next category is model-specific methods. Gradient saliency methods are tied to neural networks, and integrated gradients is one form of gradient saliency map. To break that name down: it is a map because it highlights different regions of the input, say an image, although you can use it on tabular data or text as well; saliency means the map tells you how important each pixel was in the model making that decision; and the gradient is a property of neural networks, an important mathematical signal used to train them that can also be used to judge the importance of features. There is a whole family of methods that rely on gradients. Three or four years ago it was popular to use raw gradients, but now there are more advanced methods, like integrated gradients, that generate these explanations.
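Here is a minimal sketch of the model-agnostic idea using the shap package; the model and dataset are stand-ins rather than the medical example in the figure, and the exact output shapes can differ between shap versions.

    # Sketch: model-agnostic SHAP explanation of a single prediction.
    # The random forest and dataset are stand-ins for any black-box model.
    import shap
    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier

    data = load_breast_cancer()
    model = RandomForestClassifier(n_estimators=100, random_state=0).fit(data.data, data.target)

    # KernelExplainer only needs a prediction function, so it works for any model.
    background = data.data[:50]                    # used to estimate the base rate
    explainer = shap.KernelExplainer(model.predict_proba, background)

    sample = data.data[:1]
    shap_values = explainer.shap_values(sample)    # per-feature pushes for this one prediction

    print("base value (average prediction):", explainer.expected_value)
    print("feature contributions:", shap_values)

The base value plus the per-feature contributions adds up (approximately) to the model's output for this sample, which is the additive property described above.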
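And here is a rough, stripped-down sketch of the idea behind integrated gradients, written in plain PyTorch: average the input gradients along a straight path from a baseline (for example an all-black image) to the real input. Libraries such as Captum provide maintained implementations; this sketch only illustrates the core computation, and the model and input are assumptions supplied by the caller.

    # Sketch: integrated gradients for a single input, in plain PyTorch.
    # `model` is assumed to be any differentiable classifier; `x` is one input tensor.
    import torch

    def integrated_gradients(model, x, target_class, baseline=None, steps=50):
        if baseline is None:
            baseline = torch.zeros_like(x)          # e.g. an all-black image
        total_grad = torch.zeros_like(x)
        for i in range(1, steps + 1):
            # Interpolate between the baseline and the real input.
            point = (baseline + (i / steps) * (x - baseline)).detach().requires_grad_(True)
            score = model(point.unsqueeze(0))[0, target_class]
            total_grad += torch.autograd.grad(score, point)[0]
        # Average gradient along the path, scaled by how far the input is from the baseline.
        return (x - baseline) * total_grad / steps

Calling integrated_gradients(model, image, predicted_class) returns a per-pixel importance map, which is what the saliency overlays in the husky example visualize.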
Next we have another group of methods that are very useful for making explanations based on examples from the dataset; here you can talk about adversarial examples, influence functions, or counterfactual examples. Let's look at adversarial examples. Consider a model that detects traffic signs. In the first image, the model correctly identifies a stop sign: the bounding box is in the right place, and it identifies the sign with 99% confidence. On the right you see a stop sign with a small perturbation, and that was enough to fool the model into thinking it is a sports ball. This is an adversarial attack, an adversarial example that showcases a case where the model fails. Here is another case, for a similar computer vision model that detects and identifies traffic signs: just by adding these patches of black and white, this example is able to fool a lot of computer vision models into thinking this is not a stop sign but rather a 45-mile-per-hour speed limit sign. It is another famous adversarial example. That's our fourth category of explainable AI methods, and it concludes the groups of methods that explain a model's decisions (a small code sketch of a gradient-based adversarial attack appears further down).

Let's now circle all the way back and look at methods that explain model internals. Here we can talk about things like feature visualization and activation maximization; probes are used in natural language processing. I'm not going to go too deeply into many of these: I intend to do a bunch more videos about this topic, because I spend much of my time in this area more than the others, and I intend to read some of these papers on the channel in the future. The background image we used here comes from feature visualization. Feature visualization is a method where you focus on specific neurons, especially in computer vision models, and use the behavior of the model to generate images that tell you about the ideas or concepts that tend to activate that neuron. Neurons early in the model tend to be edge detectors, identifying where the edges are; deeper in the model you can isolate neurons that detect textures or patterns, then parts of things, maybe flowers or patches, and then specific objects as well. More recently, in 2021, we saw another feature visualization example from a paper by OpenAI: a neuron whose activations we can visualize and which seems to be correlated with anime, so if we optimize and create images using that neuron, they tend to look like anime. There is also a group-photo neuron, there are things like paintings, and you can see these dream-like, hallucination-like abstractions of concepts like crying, happy, sleepy, or shocked. I think this is extremely fascinating, and it's an area where I spend a lot of time just reading.
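To make feature visualization a bit more concrete, here is a stripped-down sketch of the gradient-ascent loop behind it: start from noise and repeatedly nudge the image so that one chosen unit activates more strongly. Real feature visualization work (like the OpenAI results mentioned above) adds regularization and image transformations to get the clean, dream-like pictures; this bare version only shows the core idea, and the model, layer, and channel passed in are up to the caller.

    # Sketch: feature visualization by activation maximization, reduced to its core.
    # `model` is a trained convolutional network, `layer` a module inside it,
    # `channel` the unit whose preferred input we want to render.
    import torch

    def visualize_channel(model, layer, channel, steps=200, lr=0.05):
        model.eval()
        activations = {}
        handle = layer.register_forward_hook(lambda m, inp, out: activations.update(out=out))

        image = torch.rand(1, 3, 224, 224, requires_grad=True)   # start from noise
        optimizer = torch.optim.Adam([image], lr=lr)
        for _ in range(steps):
            optimizer.zero_grad()
            model(image)
            # Maximize the mean activation of the chosen channel (minimize its negative).
            loss = -activations["out"][0, channel].mean()
            loss.backward()
            optimizer.step()

        handle.remove()
        return image.detach().clamp(0, 1)

For a torchvision ResNet, for example, something like visualize_channel(model, model.layer4, channel=42) would render a rough picture of the textures or object parts that unit responds to.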
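Circling back to the fourth group for a moment: the stop-sign attacks in the video are physical patch attacks, but the simplest way to see the adversarial-example idea in code is the fast gradient sign method (FGSM). This is a minimal sketch that assumes `model` is a differentiable image classifier; it illustrates the general idea, not the specific attacks from the papers shown in the video.

    # Sketch: fast gradient sign method (FGSM), the simplest adversarial attack.
    # `model`: any differentiable classifier; `image`: one input tensor in [0, 1];
    # `label`: the true class as a 0-dim long tensor.
    import torch
    import torch.nn.functional as F

    def fgsm_attack(model, image, label, epsilon=0.03):
        image = image.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(image.unsqueeze(0)), label.unsqueeze(0))
        loss.backward()
        # Nudge every pixel a tiny step in the direction that increases the loss.
        adversarial = image + epsilon * image.grad.sign()
        return adversarial.clamp(0, 1).detach()

A model that confidently recognizes the original image will often misclassify fgsm_attack(model, image, label), even though the two images look nearly identical to a person; that failure case is itself a kind of explanation of what the model has and has not learned.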
That concludes this quick look at the explainable AI cheat sheet, the first version of it. We'll keep updating it; you can find it at ex.pegg.io, and if we come up with any updated versions they will be at that URL. Some very interesting resources include the Interpretable Machine Learning book by Christoph Molnar; there's a free online version of the book, and there's a link below. The nice thing is that most of these groups have dedicated chapters in the book, so it has been a very useful resource for putting this together, and I love how the author explains the methods but also mentions their advantages and disadvantages, so it's a valuable resource. Other resources I've used to come up to speed include a video on explainability by Isabelle Augenstein, the NLP Highlights podcast episode with Sameer Singh, and an interesting talk on YouTube called "Please Stop Doing Explainable ML" by Cynthia Rudin. I think they're all linked below; I found them very interesting. These are some of the key resources that will probably interest you if you want to learn more about explainable AI. I hope you've enjoyed this introduction to explainable AI, a very high-level overview. We plan to continue updating the cheat sheet at that URL, and I plan to do more paper-reading and article-reading videos on the channel, with a special focus on explainability but on other topics as well. Thank you for watching.
Info
Channel: Jay Alammar
Views: 40,326
Id: Yg3q5x7yDeM
Length: 14min 8sec (848 seconds)
Published: Thu Apr 29 2021