Shapley Values Explained | Interpretability for AI models, even LLMs!

Video Statistics and Information

Captions
Hello, and welcome to this AI Coffee Break! Today, we will talk about a way to better see how machine learning models make their predictions. This is important because we have more and more AI models around us. They are increasingly complex and are used in applications such as healthcare, finance, or showing you ads. But do we know on which parts of my browsing history some AI based its decision to show me this ad? Or why did a model predict that I have a 40% chance of having diabetes? This is where interpretability comes in: it helps us understand how these models work and why they make certain predictions. In this video, we will give a short, high-level introduction to Shapley values, a method that works for ANY model. We will first show some code, and if you keep watching, you will see what model interpretation has to do with games. Yes, games.

But first, let's thank our sponsor of today, AssemblyAI! Just last month, they released the Universal-1 automatic speech recognition (ASR) model. It offers more than 92.5% accuracy with only 30.4 seconds of latency thanks to its effective parallelization during inference. Accuracy is important when it comes to understanding my Eastern European accent pronouncing technical words like "GAN", "GPT model", or "RLHF". Other than accuracy, the most useful part to me is that it was pre-trained on 12.5M hours of multilingual audio data (that's ~3 petabytes!). I am personally most excited about Universal-1's code-switching capabilities, namely that it can transcribe different languages in the same sentence. Just look: this is so useful for multilingual speakers like me! Check out Universal-1 yourself in AssemblyAI's playground; it's very simple to use. Also, know that AssemblyAI offers two tiers for Universal-1: "Best", the most accurate tier, and "Nano", a fast, lightweight offering that is less expensive. Nano is perfect for batch processing of audio that does not need the highest quality of speech-to-text. Check out AssemblyAI and their new model with the link in the description below! Now, back to the video.

Imagine you have a model that takes inputs such as age, sex, and body mass index and predicts the probability of diabetes. You want to know how much each of these inputs contributes to the model's prediction. Shapley values can tell you exactly how much each input contributes to the model's prediction of, let's say, a 40% probability of diabetes. And unlike other ML interpretability methods, Shapley values have numerous advantages. They are model-agnostic, meaning they can be applied to any model and to any modality, like text, images, and so on. And the values are meaningful (unlike those output by methods based on gradients or attention, where the numbers are hard to interpret): positive values are for features that push the outcome up, while negative values are for features that push the outcome down. Even better, these values are a fair distribution of the model's prediction among the input features. Specifically, if we take the value for age, add the value for sex (effectively subtracting it, because that value is negative), and add the values for blood pressure and BMI, we get the model's prediction ... up to the so-called base rate. The base rate is what the model outputs when all inputs are zero, but more about this later. So the overall idea is that we start from this base value, add the Shapley values for each input, and get the model's prediction.
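To make that additivity property concrete, here is a minimal sketch with made-up numbers (illustrative only, not taken from the video): starting from the base value, adding each feature's Shapley value recovers the 40% prediction.

```python
# Illustrative, made-up numbers for the diabetes example (not from the video).
base_value = 0.20          # what the model outputs with no feature information
shapley_values = {
    "age":  0.15,          # pushes the predicted risk up
    "sex": -0.05,          # pushes the predicted risk down
    "bp":   0.02,
    "bmi":  0.08,
}

# Additivity ("local accuracy"): base value plus all Shapley values gives back
# the model's prediction for this particular input.
prediction = base_value + sum(shapley_values.values())
print(f"{prediction:.2f}")  # 0.40 -> the 40% diabetes probability
```

This is exactly what the force plots shown later visualize: red (positive) contributions push the prediction up from the base value, blue (negative) ones pull it back down.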
Now, what does this look like for more complicated models, such as a LLaMA 2 language model? Let's see how we can use the SHAP library to interpret the model's predictions. First, we need to install the SHAP library; then we load the model and the tokenizer, and we define a function that takes a sentence and returns the model's prediction. We then use the SHAP library to explain the model's prediction for a given sentence. We can see the input here and the output here, and the SHAP library returns the Shapley values for each token in the sentence; we can use these values to understand how the model makes its predictions. Because this is a language model that predicts token after token, we get a set of Shapley values over the input tokens for each predicted token. This is a force plot, where we start from the base value: the probability the model assigns to the output word "studying" when there is absolutely no input to the model. Then we add the red contributions, subtract the blue contributions (because they are negative), and we get the logit of the model predicting this output token. Neat! We link to this code in the description below if you want to play with it. For that, we minimally modified the shap library to make it work for modern language models such as LLaMA 2, so we provide that package as well. Maybe soon, the SHAP library will support them by default.

Now, let's get a little into the theory behind the code we've just seen and explain how Shapley values are computed. Shapley values stem from far before deep learning was cool, namely from 1953, when Lloyd Shapley was thinking about how to fairly distribute the winnings of a game among the players. So, let's start with an example game: a one-sided soccer game, where we have a team of robot players that cooperate and try to score as many goals as they can. Based on how well the players do in the game, we want to reward them appropriately. But how do we determine how much each player contributed towards the outcome? Well, first we need to determine the base value, that is, the outcome of the game when nobody is playing. Then we can determine the contribution of each player by looking at all possible coalitions of players and seeing how much the outcome changes when we add a player to the coalition. Then we can reward them accordingly.

To get to the formula behind all this, let's first switch to machine learning. In ML, the players are inputs or features, for example word tokens. The outcome of the game is the model's prediction, for example the probability that this sentence expresses positive sentiment. The importance of an input is based on how much it contributed towards this prediction, and this is what we want to calculate now. To compute the Shapley value for a player, let's say this one, we do the following: we look at what the model predicts when this player is active versus when it is inactive. The so-called marginal contribution of the player is the difference between these two predictions, which is zero in this case, because the presence or absence of the token "my" did not change anything. But you know, a player may only look unimportant because Messi and Ronaldo are on the team, so to really determine the effect of the token of interest on the outcome, we need to look at all possible coalitions of players and see how much the outcome changes when we add and remove the other players from the coalition as well. Now, for this particular coalition, the token "my" does show a marginal contribution.
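The SHAP workflow described at the start of this passage (load a model and tokenizer, build an explainer, explain a sentence) looks roughly like the sketch below. This is not the video's exact notebook: the video relies on a minimally modified shap package to handle LLaMA 2, so the sketch instead follows the pattern of shap's documented text-generation example, with GPT-2 standing in as a small causal language model.

```python
import shap
import transformers

# GPT-2 as a small, assumed stand-in for a causal LM like LLaMA 2.
tokenizer = transformers.AutoTokenizer.from_pretrained("gpt2")
model = transformers.AutoModelForCausalLM.from_pretrained("gpt2")

# Mark the model as a decoder and fix deterministic generation settings,
# following the pattern used in shap's text-generation documentation.
model.config.is_decoder = True
model.config.task_specific_params = {
    "text-generation": {"do_sample": False, "max_length": 30}
}

# shap builds a text masker from the tokenizer and explains each generated
# token in terms of the input tokens.
explainer = shap.Explainer(model, tokenizer)
shap_values = explainer(["After finishing my homework tonight I will keep"])

# For every generated token: red input tokens push its score up, blue ones
# push it down, starting from the base value.
shap.plots.text(shap_values)
```

The text plot conveys the same information as the force plot in the video: per output token, the contribution of each input token relative to the base value.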
And we do this exhaustively: for all possible teams, we sum up all the marginal contributions, normalize by a factor that takes care of the combinatorial effect, and we get the Shapley value phi for the token of interest. Done. Now we have the Shapley value for the token "my", and what we did for that token we can do for all other tokens in the sentence, to get the Shapley values for all of them. Together, they tell us how much each token contributed to the model's prediction. They are positive if they contributed towards increasing this probability, negative if they decreased it, and zero if they did not change anything.

Now, maybe you should know that while Shapley values are awesome because they work for any model and have these wonderful properties, in practice they do have some problems. Namely, we said before that we need to evaluate all possible teams that the token "my" can join, and even in this case, with just 3 possible teammates for "my", the number of possible coalitions is 2 to the power of 3, so 8. In other words, the number of coalitions grows exponentially with sequence length! So, in practice, we need to approximate the Shapley values with Monte Carlo sampling and compute fewer of them, or as many as needed to get the first digits of the Shapley values right (in the same way in which we do not need to compute all digits of pi).

The other problem is that Shapley values assume that the input features are independent and that they can be safely combined or ablated. But in reality this is not the case, as some input features are correlated: for example, "New York" is composed of two tokens, and if we form a coalition with just "York" but delete "New", we basically have a degenerate team, with split teammates that should never be split. In practice, it is hard to determine these correlations and keep tokens together. The shap library, for example, handles this by first clustering the inputs and then either ablating an entire cluster or keeping it intact. Of course, there are lots of extensions that try to do this even better.

Now, this was our short introduction to the huge topic of ML interpretability, and if you want more details, go out and explore. I have a thesis to submit now, so I need to go, but I'll let Ms. Coffee Bean put a link in the description to a great starting reference on this topic. See you in our next video. Okay, bye!
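To spell out the computation described above, here is a minimal brute-force sketch of exact Shapley values over a toy, hypothetical value function (standing in for a real model's prediction on a subset of tokens). It enumerates every coalition, which is exactly the exponential blow-up that forces libraries like shap to approximate.

```python
from itertools import combinations
from math import factorial

def exact_shapley(players, value_fn, i):
    # phi_i = sum over coalitions S (not containing i) of
    #         |S|! * (n - |S| - 1)! / n!  *  ( v(S + {i}) - v(S) )
    # This enumerates all 2^(n-1) coalitions that player i can join.
    others = [p for p in players if p != i]
    n = len(players)
    phi = 0.0
    for size in range(len(others) + 1):
        for coalition in combinations(others, size):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            marginal = value_fn(set(coalition) | {i}) - value_fn(set(coalition))
            phi += weight * marginal
    return phi

# Toy additive "game" with hypothetical scores: each present token adds a fixed
# amount to the output. A real value function would be the model's prediction
# with the missing tokens masked out.
token_score = {"I": 0.1, "love": 0.6, "my": 0.0, "dog": 0.2}

def value_fn(coalition):
    return sum(token_score[t] for t in coalition)

tokens = ["I", "love", "my", "dog"]
print({t: round(exact_shapley(tokens, value_fn, t), 3) for t in tokens})
# {'I': 0.1, 'love': 0.6, 'my': 0.0, 'dog': 0.2}: in this additive game each
# token's Shapley value is just its own contribution, and "my" gets zero
# because adding it never changes the output, as in the example above.
```

With 4 tokens there are only 2^3 = 8 coalitions per token, but the count doubles with every additional token, which is why shap approximates the values in practice (for example via sampling) and groups correlated tokens into clusters rather than ablating them independently.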
Info
Channel: AI Coffee Break with Letitia
Views: 3,975
Keywords: illustrated Shapley Values, annotated SHAP, SHAP explained, how does interpretability work, ML interpretability fundamentals, why does AI say this?, internals of LLMs, neural network, AI, artificial intelligence, machine learning, visualized, visualizations, deep learning, easy, beginner, explained, high-level explanation, basics, research, computer science, women in ai, algorithm, example, aicoffeebean, aicoffeebreak, animated, animation, letitia parcalabescu, game theory for ML
Id: 5-1lKFvV1i0
Length: 9min 59sec (599 seconds)
Published: Mon May 06 2024