SHAP values for beginners | What they mean and their applications

Captions
Have you ever wondered how your machine learning models work? I'm not talking about the model architecture, or which features are most important in general. I mean how the model has made a specific prediction. To understand this, look no further than SHAP. It is a powerful Python package that will allow you to understand and debug your models.

Hi, I'm Connor, and welcome to A Data Odyssey. Today we're going to understand how to interpret SHAP values and explore the applications of SHAP. If you want more, make sure to wait till the end of the video, where I'll explain how you can get access to a Python SHAP course. Let's get started.

So, SHAP values are used to explain individual model predictions. They tell us how each feature has contributed to that prediction; specifically, how each feature has increased or decreased the prediction. To understand this, let's suppose HR has asked you to predict the annual bonus of all the employees in your company. You build the model using a dataset of 1,000 employees. The dataset has five features, including experience, which is the number of years of experience, and degree, which is a binary feature indicating if the employee has a degree or not.

Now, HR may be curious about how this model works. They may even single out a specific employee and question how their predicted bonus was determined. You may be tempted to use a method like feature importance to answer some of those questions, and sure, this can tell us how important each feature is to the model's predictions in general. But what about individual predictions? Feature importance also cannot tell us if features tend to increase or decrease the prediction. Or, if we had a classification problem, it would not tell us how the features change the probability of a positive prediction. This is where SHAP comes in.

Here we have a SHAP waterfall plot for one of the employees. There's a lot of information here, so let's break it down. E[f(X)] is the average predicted bonus across all 1,000 employees in our dataset. f(x)
is the predicted bonus for this specific employee. The SHAP values are all the values in between: they tell us how each feature has contributed to the prediction when compared to the average prediction. Lastly, the numbers on the y-axis are the feature values. "31 = experience" tells us that this employee has 31 years of work experience.

We can see that the SHAP value for degree is 16.91. We say that, because this employee has a degree, the predicted bonus is 16.91 higher than the average predicted bonus. Now, that's a bit of a mouthful; to simplify things, we can say that the feature has increased the prediction. Keep in mind it is the feature's value, in the context of the other feature values, that has led to the SHAP value for that feature. The SHAP value for degree can change depending on which employee you're looking at, even if all the employees you look at have a degree.

Hopefully this interpretation is clear when we have a continuous target variable, but what about classification problems? Let's look at another example. We build a model used to predict if a mushroom is poisonous or edible. Now we can use SHAP to understand how each feature has changed the predicted probability that a mushroom is poisonous. More specifically, we interpret the SHAP values in terms of log odds. For example, this mushroom's smell, or odor, has increased the predicted log odds by 0.89. In other words, its smell means it is more likely that we predict this mushroom to be poisonous.

You can see how SHAP values tell us which features are most important to an individual prediction. We can also combine or aggregate SHAP values from multiple predictions. This includes the force plot, mean SHAP plot, beeswarm plot, and dependence plots. These plots can tell us how the model works as a whole.

At this point you may be asking yourself: why even bother? Is it not enough that the model is making accurate predictions? Do we really need to understand how it is making those predictions? Well, SHAP has some key benefits. The
first is debugging. SHAP allows you to take a closer look at incorrect predictions and understand which features have caused the error. We can also find cases where the model may perform well on a dataset but would perform poorly on new data in production. One example of this comes from a model used to power a mini automated car. The model was not working correctly, and we used SHAP to figure out why. It turned out the model was using background pixels, pixels of objects in the background, to make predictions. So when we moved the car to a new location, the objects changed and the predictions became unreliable. If you want to read more about this application, check out the article linked in the description.

The second is that SHAP can provide the basis for human-friendly explanations. You may be cautious about the prediction that a mushroom is edible, and rightly so: that prediction can have serious consequences. SHAP can be used to provide an explanation and increase trust in the model's prediction.

The last is data exploration. A dataset will contain all sorts of hidden patterns. These include non-linear relationships and interactions. Black-box models are really good at finding these patterns. We can train a model on the dataset, and it will use these hidden relationships to make predictions. When we interpret the model, we learn what it is using to make those predictions. Sometimes we can learn something completely new in this way. SHAP becomes a tool for data exploration, and the knowledge we gain can go beyond the model, for example, building better model features for simpler models like linear regression.

That's it! I hope you enjoyed this brief introduction to the SHAP package. If you want to learn more, check out one of these videos: the first looks at the theory behind SHAP, and the second looks at the Python SHAP package. You can also access my Python SHAP course for free by signing up to the newsletter in the description. This will equip you with the knowledge and skills
needed to explain any machine learning model using SHAP.
Info
Channel: A Data Odyssey
Views: 33,877
Id: MQ6fFDwjuco
Length: 7min 6sec (426 seconds)
Published: Sun Mar 12 2023