Bias in AI: Finding & Fixing Anomalies in Machine Learning Algorithms

Captions
Welcome, everybody, to an exciting and important conversation about bias in AI. We'll be diving into things in just a moment, but to kick off I want to do a quick introduction of what brings us here today. My name is Jacob Smith, and I'm joined by my right hand, Ryan Robinson, who is probably not actually on my right side. Say hi to the people, Ryan. "Hey everyone." We're here representing an exciting community called Collider, which is essentially a community of technical professionals: data engineers, software engineers, and other professionals using cutting-edge technology to solve real-world problems. It's all about collaboration and problem solving, and if that sounds like you, we'd love you to be involved. Join our Slack channel, connect with us wherever we're at; we'll fill you in in the chat in just a moment on how you can stay involved.

We're also here representing an awesome company called Ultimetric. Digital transformation is the name of the game, which basically means we help companies stay ahead of the curve with the ever-changing landscape of technology, and one of our speakers is representing Ultimetric as well, which I'll get into in just a second.

So again, today we're talking about bias in AI and machine learning algorithms: how to identify where prejudiced assumptions may sneak into the equation, and how to mitigate them so you can be sure you're making good decisions with your algorithms. We're joined today by a couple of very smart and knowledgeable gentlemen, Anthony Martin from the Ultimetric team and Chris Penny from Mythics Inc., who will be diving into some specifics about what I just mentioned; they can introduce it in a bit more detail. I also encourage you to check out their most recent presentation with our team, which was all about how AI enables computer vision, a really exciting session from a month or two ago; I just dropped the link in the chat, so be sure to check that out as well.

Before I pass it off, I want to encourage everybody: this is intended to be interactive and collaborative, so please feel free to drop questions in the chat as we go. We're going to leave time at the end not only for lingering questions but also to ask folks to share how what you learned today might apply to your own projects. We'd really love to hear from whoever's interested in sharing, so it's something to think about as we go along. And without further ado, Anthony and Chris, take it away.

Hey, I'll do a quick introduction, guys. My name is Chris; I'm with a systems integrator, Mythics Inc., and I'm based out in Austin. The name of the game for us is digital transformation as well, but the space we play in is mostly migrating customers from on-premise workloads to cloud-based workloads on Oracle, Google, and a few other big cloud vendors. I'm excited to speak on this topic; I've got some experience with bias and AI. The goal of this presentation is, first, to talk about the multiple contexts in which you might hear about bias in AI, because it's not just systematic prejudice; the term gets thrown around a lot, so we'll discuss the other contexts in which it's used, and then we'll dive into some code and some questions. So I'll pass it off to you, Anthony, on that note.
Hey, welcome, guys. Again, I'm deputed to Ford Motor Company as an employee of Ultimetric, and Ryan and Jacob are my colleagues. I work in the electric vehicle space and I'm heavily involved in big data, so day in and day out I work with electric vehicle data, doing data analysis and gleaning insights, and I've picked up a good deal of experience in machine learning and AI. As Chris said, I'm very excited to talk about this topic. Without further ado, and as Jacob said, we want this session to be as interactive as possible, so if you have any questions, please do stop us and we can address them as and when you have them. Chris, take it away.

Awesome. Before we dive into AI and bias, let's make sure we have a common language, so that when we refer to these terms we're all on the same page and singing the same song. The first slide asks: what is machine learning? I know we've got a multidisciplinary, software-heavy crowd, so I wanted to provide an interpretation that I think makes intuitive sense for a lot of you: an AI model, or a machine learning model, is really just a software model that makes a prediction. But the way the software model is designed moves away from rule-based models toward actually learning from data, learning from experience. In that sense, most of machine learning, or at least the machine learning we're going to talk about today, is based on pattern recognition: what patterns are recognized in the data.

Another interpretation is based on a definition I like from a famous British statistician: "all models are wrong, but some are useful." What it's referring to is that models are really just an approximation of reality. A model is contingent on two things: the data it's fed, because that's where it extracts and learns the patterns, where it does the pattern recognition; and the model class, the way in which the model sees reality. This is important because, when we get into the bias topic later, we'll talk about how any kind of inconsistency, redundancy, or noise in either of those two areas will propagate and percolate when the model makes predictions. You can go to the next slide, Anthony.

This is another important thing to remember, just to make sure we're on the same terminology: machine learning models are useful, or most useful, for making predictions when the future looks like the past. The past in this sense just means the data the model was trained on. This brings up some interesting questions. One of them, for example: what happens in rapidly changing fields, like the finance sector, or a social media app where trends rapidly evolve and change based on some larger context? Another is: what happens when the future looks different from the past? These are all interesting questions we'll dive into later in this presentation. Before we jump into that, though, I want to mention one term in machine learning: IID, which means independent and identically distributed. This is an assumption we make with models, and it is essentially this: we assume that each row, or each instance, in the data is independent of the other examples in that data set,
and we assume that it comes from the same distribution, that it's drawn from the same probability distribution. When these assumptions are not met, as modelers, engineers, or data scientists, there are certain special techniques and other processes we have to apply in light of those assumptions being violated. You can go to the next slide, Anthony.

With that common-language understanding, let's talk about what bias is from a human perspective. We think of bias as favoring someone unjustly, or, inversely, discriminating against someone. That's how we all tend to define bias. It's worth understanding why this happens: nobody wants to be a bad person, nobody likes to be called a discriminating person, so why does it happen? It's because of the laziest organ in our human anatomy, the brain. That bugger is lazy. After a certain age it doesn't want to learn, or it doesn't have the same curiosity to learn from new experiences. Instead, when it sees, smells, or feels something, the first thing it does is associate it with an experience the brain remembers from the past. Second, even though we have something like a 576-megapixel camera, two of them, in our eyes, our brains cannot process all the information we receive, so they simplify. How often do you bump into someone in a mall, or bang into the edge of the bed? We can see it, yet we still hurt ourselves at times, because we oversimplify things. The brain is unable to process all the information it has; it selects a few pieces of information here and there and simplifies the big, complex world into a smaller narrative so it can be processed easily. So that's what happens: we have experiences from the past, and our brain simplifies. When those two things come together, at times, unknowingly, we end up being biased. If any one of you thinks you are not biased, you're wrong; every single human being is biased in some way or another. It's just that at times we don't know it, or we know it and hold it back until we get more experience. Where people get into trouble is when they do not realize they are biased and then speak out more quickly than they think.

Now let's take the same principle and apply it to machine learning, or artificial intelligence. Artificial intelligence and machine learning do the same thing: pattern recognition. The model wants to understand the data. You can give it a complex set of data, but it wants to simplify it in a mathematical space, if you will, so it can tell a story and predict something from it. Given 200 parameters, it wants to simplify, learn from them, and then predict, for example, whether he or she is a good or bad person. Does that mean the machine is discriminating? No, that's not the nature of a machine; a computer cannot discriminate. As Chris pointed out, the way you train a machine learning model is to take a point in time in history and represent it in terms of data, from a human point of view.
So all the data is like this: I see history as something and I represent it in a format I can. Given that I'm a flawed person, my brain is flawed in its thinking and I'm a biased person, so when I put that data together, it's biased in some way, and I might not even realize it. All the model does is reflect some of the biases we have in the data. There are a few other places from which bias can propagate, for example systemic bias from the model itself or from the tuning we do to the model, and there are other factors too, but a machine by itself is not capable of discriminating against someone; it just reflects what it has learned from history. We may want our machines to assume the world is ideal, as if we were in the Garden of Eden, and to predict something nice and beautiful without discriminating against anybody, but we as humans have to be very conscious about providing such data to the computer so it can learn from it. If you just give it data from history, which again is our perspective of what history was and will carry some bias, the machine is simply going to resonate it. One thing any ML or AI engineer should realize is that any data they get is, in some shape or form, biased; they should recognize that fact and build from there.

So then, what does bias mean to a machine? To a machine, it's just an error. For example, suppose we want the model to predict that a white male with a beard and a hat is Jacob Smith; if the model ends up predicting him as Chris, we might say it's biased, but to the model it's just a prediction error. So what are the different types of prediction errors? As highlighted here in the deck, there are three forms: reducible error, irreducible error, and bias. The reducible error is essentially the variance, so any model we build has these two factors, a bias factor and a variance factor, and it's up to the engineer or data scientist building the algorithm, based on the use case, to determine whether that use case warrants more weight on the bias side or whether they want the model to be highly variant. Bias and variance are inversely proportional: if you increase the bias, you can reduce the variance, and if you increase the variance, the bias is reduced.

So what is bias itself? Put in ML or AI terms, bias relates to how well the model learns the features in the data set and is then able to predict. When there is high bias, the model is favoring a particular class. For example, we might have cats, dogs, birds, whatever species, and in every single instance our model is biased toward predicting whatever input we give it as a cat. It's basically about how well the model understood the functional form of the independent data and is then able to predict. Variance is where the model fails to generalize the data enough: it learned very well on the training and test data, but when you put it into practice in the real world, it makes bad predictions. For example, say you build a model just for Americans, based on their
features: you predict, say, where someone could potentially live based on their slang, their beliefs, and whatnot. If you build such a model and then give it the features of an Indian like me as input, the model is of course not going to be able to predict well, because the model learned very well about American culture, the regions, everything, but it learned so well that it's unable to predict anything outside the features it was given. That's the variance factor. So it's up to the engineer, based on the use case, to assess what they want to control. The ideal case is to bring both the bias and the variance down to the lowest numbers possible, but that may not always be possible, and in some cases we may want the model to be a little more biased, or in other cases a little more highly variant compared to the bias factor. So with that, Chris, do you want to add anything?

Yeah, the next slide shows the trade-off. To jump in where Anthony mentioned error: error can be decomposed, and there are a lot of different decompositions, but a useful framework is systematic error and random error. The random error is the noise in the data, and it's more or less irreducible unless you have a bigger data set; with smaller data sets you're likely to have more noise simply because there are fewer samples. What we want to focus on is systematic error, and this is what Anthony was talking about: systematic error is composed of two things, bias and variance, and there's usually a trade-off, so when we increase one we're usually decreasing the other, and vice versa. Optimal model performance sits at low bias and low variance. It's important to remember that these two quantities are not necessarily linearly related, so increasing one doesn't mean you get a proportionate linear decrease in the other; the relationship is more complicated than that. But ideally we tune the model so that it has low bias and low variance, because that gives us the lowest model error and the best generalization performance, and by generalization performance I mean the best performance on new, unseen data. Want to go to the next one, Anthony?

Here's an example of some of the different bias and variance ratios. You can see three models; they're simple models in two dimensions so we can visualize them. On the left is the underfitting model, which is a high-bias model. Underfitting essentially means the model is making too many assumptions and is too simple, so in learning the relationships it assumes very simple relationships; that's what it encodes, and that's what it projects onto new data. Remember, models are just approximations of reality, so if the approximation of reality is too simple, it's not going to perform well. Then we can have overfitting, which is the case on the right: there you have high variance and low bias. In this case the hypothesis, which is the model, is too complex, and that means, in a nutshell, that it's learning some of the noise.
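For anyone who wants to see the underfitting versus overfitting contrast the speakers describe in code, here is a minimal sketch. It is not from the presentation; it assumes scikit-learn and a synthetic one-dimensional data set, with polynomial degree standing in for model complexity.

```python
# Sketch: high-bias (underfit) vs. high-variance (overfit) models on synthetic data.
# Assumes scikit-learn; the data set and degrees are illustrative, not from the talk.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 1, 60)).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.2, 60)   # signal + noise
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

for degree in (1, 4, 15):   # too simple, about right, too complex
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    train_err = mean_squared_error(y_train, model.predict(X_train))
    test_err = mean_squared_error(y_test, model.predict(X_test))
    # Degree 1 tends to show high train AND test error (high bias);
    # degree 15 tends to show low train but high test error (high variance).
    print(f"degree={degree:2d}  train MSE={train_err:.3f}  test MSE={test_err:.3f}")
```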
If you remember, on the previous slide we talked about systematic error and random error, the random error being that noise. If our model is picking up on that noise, which is very specific to that data set, think of it as idiosyncrasies, and it tries to project that onto new data, obviously there's going to be a problem and there's going to be some error. So the best model is really a compromise: low bias and low variance.

The next question, naturally, is how you tune the bias-variance trade-off, and Anthony, feel free to jump in; I've labeled a few options on the left. There are two things you can do. One is tuning the feature space, the number of predictors: by adding extra predictors you increase variance and decrease bias, and by decreasing the number of predictor variables available to you, you increase bias and decrease variance. The other is regularization. That technique is somewhat outside the scope of this presentation, but for the curious reader I encourage you to look into it; most of these models have regularization built in, and you can tune that parameter to control the bias-variance trade-off (see the sketch after this section). Want to jump in, Anthony?

Yeah, the one thing I want to add is feature engineering. To break it down in plain English: we have to be very conscious about what goes into the model. We might get tons and tons of data, but as engineers we need to assess which data elements we really need for the use case. Having a lot of data generally helps build a better, more advanced model; with every new feature or column you add, you can explain or tell the story a little better. But unknowingly, we can also end up creating bias. For example, say we want to measure the success of people in America. How do you do that? If you use salary, the amount of money people make, as the measure of success, that's fine, but it should be based on the position they hold, the years of experience they have, their education, and the money they make. The moment you add the gender feature, which is not at all necessary in this case, the model can end up biased toward a conclusion like "male colleagues make more money than female colleagues," and that sort of gender bias creeps in automatically. So an engineer should understand the data set they're getting and carefully pick and choose the features, the data elements, they want to use to build the model; some of that is part of feature engineering.

Real quick, to piggyback off that: you do a lot of this in the exploratory data analysis phase. That's one of the first phases of any ML project, where you explore the data, see what values exist, and even bookmark that, hey, this predictor variable could be problematic down the road and will have to be debugged later. We go into some of those debugging techniques later in the presentation. Want to go to the next slide? This is really the same chart as before, but maybe a little better; it shows that the relationship between bias and variance isn't linear.
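As a hedged illustration of the regularization knob Chris mentions, here is a small sketch that sweeps the inverse regularization strength C of a logistic regression with scikit-learn's validation_curve. The synthetic data set and the parameter range are assumptions for demonstration only, not taken from the talk.

```python
# Sketch: using regularization strength to move along the bias-variance trade-off.
# Assumes scikit-learn; the synthetic data and C range are illustrative only.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import validation_curve

X, y = make_classification(n_samples=500, n_features=20, n_informative=5, random_state=0)

C_range = np.logspace(-3, 3, 7)   # small C = strong regularization (more bias)
train_scores, val_scores = validation_curve(
    LogisticRegression(max_iter=1000), X, y,
    param_name="C", param_range=C_range, cv=5, scoring="accuracy",
)

for C, tr, va in zip(C_range, train_scores.mean(axis=1), val_scores.mean(axis=1)):
    # A large gap between train and validation accuracy suggests high variance;
    # low accuracy on both suggests high bias.
    print(f"C={C:8.3f}  train acc={tr:.3f}  val acc={va:.3f}")
```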
You can have low bias and low variance, and the two quantities aren't necessarily increasing or decreasing at the same rate. The most generalizable model, as you can see, is in that middle band with low variance and low bias. Want to go to the next slide? I don't think we need to touch on imbalanced classification too much here, just because we're crunched for time; I think we can get into it with the code blocks in the Google Colab code, if that's fine with you, Anthony. Yep. Okay, let's skip this one and go to responsible AI, bias in AI.

This is where we loop back to some of the things we talked about at the beginning of the presentation: what it means when an AI model can learn some of the systematic or prejudicial bias, such as gender bias, age bias, or racial bias, that's in the data set. As I mentioned before, the model is trained on data and makes an approximation of reality based on that data; it also depends on the complexity of the model and the assumptions the model makes. If there is bias in the data, sometimes it's easy to spot: as Anthony mentioned, if you have a predictor variable that's gender, it's likely there could be some bias in it, depending on what time period the data was collected. For example, if you're building a prediction model on data from the '80s or '70s, when women weren't given the same access to opportunities, a data set from that period will likely carry some bias. And even today, many data sets still carry some of that systematic bias and lack of access to opportunity, which reverberates and shows up in the data.

At the bottom I list some specific sources of bias in AI. One is measurement error: when you're recording data, there can be an error with whatever tools or systems you're using to record it, or if you're hand-coding it, there can be a hand-coding error. Another source is corrupted data. A third source, which we want to talk about a lot, is lack of representation or diversity in the data. Many data sets present an unrepresentative view of the population, and you can also have subgroups and subsections of that population underrepresented in the data. A good example: if you deep-dive into most of the early facial recognition data sets, you'll see that around 80% of the faces are Caucasian, and I think 75% or more are male. That biases the data set toward performing better on that specific subgroup and not performing as well on other subgroups, like minorities or women, or both; if you're a minority woman, you're probably going to get the worst performance of all. Anthony, do you want to jump in and add to any of that?

Yeah. On the lack of representation: it's very important to collect data from multiple, varied sources. You don't want just one person, or one team of, say, three or four Indians or three or four Americans, collecting all the data; you want diverse demographic personnel collecting the data for the model to learn from.
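As a quick, hedged illustration of checking representation before training, here is a sketch that audits how subgroups are distributed in a data set. It assumes pandas; the tiny inline dataframe and the gender and ethnicity column names are placeholders, not the speakers' data.

```python
# Sketch: auditing subgroup representation in a training set before modeling.
# Assumes pandas; the inline dataframe and column names are placeholders.
import pandas as pd

df = pd.DataFrame({
    "gender":    ["male", "male", "male", "male", "female", "male", "female", "male"],
    "ethnicity": ["A", "A", "A", "B", "A", "A", "B", "A"],
})

# Share of each subgroup; heavily skewed proportions are a warning sign that the
# model may underperform on the underrepresented groups.
print(df["gender"].value_counts(normalize=True))
print(df["ethnicity"].value_counts(normalize=True))

# A cross-tabulation can reveal intersectional gaps (e.g. very few minority women).
print(pd.crosstab(df["gender"], df["ethnicity"], normalize="all"))
```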
The diversity in this case is very important. As I said, each of us, I as an Indian versus an American versus a Japanese or Chinese person, brings our own diversity, our own understanding of what the world is. When we all collect the data together, we pretty much approximate what the real world is, whereas if just one group of people collects the data, we only cover one subgroup, which Chris touched on a moment ago. So it's very important to have a diverse group, or groups, of people collecting the data. That is the most important thing in any machine learning work: eighty percent of the work is collecting diverse, rich, clean data; the remaining twenty percent is simple model building.

And to jump in with the final main source of bias in AI: data access. A good example is that most language models today are trained on English data sets, because the internet consists mostly of English text, so these language models do a lot better on English-language data than on other languages. That's another example of how bias creeps in purely because of data access; the internet, as I mentioned, is mostly English. And on lack of representation and diversity, a lot of those early facial recognition data sets were mostly Caucasian and male. We can go to the next slide.

These are the two types of bias; this ties back into data diversity. There's systematic bias: if a data set is collected from a certain time period, it can reflect the prejudices or lack of access of that period. And there's sample bias, which comes from whoever is doing the sampling, whether it's automated or a person: if they collect data that doesn't accurately reflect or represent the group or the environment the model is expected to run on, you're going to have errors down the line, and some of those errors lead to big reputational challenges for big companies. A good example: I read a research paper about gender bias in an X-ray imaging application that detected thoracic cancer much better in men than in women. There's also policing: a lot of those early policing data sets were disproportionately sampled to include many non-Caucasians, so models built on those sets performed differently across subgroups and subpopulations, to the point where they led to inferences that almost perpetuated the systematic and sample bias. We can jump to the next slide.

So the next question, with all these woes and all these problems, is: what do you do? There is a very active area of research on this right now called model explainability and interpretability. It's essentially the solution, or at least the best one we have right now, for addressing some of these issues. I've defined interpretability based on the article "Explanation in Artificial Intelligence": it's the degree to which a human can understand the cause of a decision. Another definition is the degree to which a human can consistently predict the model's result.
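Looping back to the subgroup disparities mentioned above, such as the X-ray example, here is a hedged sketch of the kind of per-group check that would surface such a gap: evaluating the same model separately on each subgroup. Everything here is illustrative; the data is synthetic and the "sex" attribute is simulated purely to show the mechanics.

```python
# Sketch: checking whether a classifier performs unevenly across subgroups.
# Assumes scikit-learn; the data and the simulated "sex" attribute are illustrative.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=15, random_state=0)
rng = np.random.default_rng(0)
sex = rng.choice(["male", "female"], size=len(y), p=[0.8, 0.2])  # skewed on purpose

X_tr, X_te, y_tr, y_te, sex_tr, sex_te = train_test_split(
    X, y, sex, test_size=0.3, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
y_pred = model.predict(X_te)

for group in ("male", "female"):
    mask = sex_te == group
    # Recall per group: of the true positives in this group, how many did we catch?
    # A large gap between groups is the kind of disparity the X-ray example shows.
    print(f"{group:6s} n={mask.sum():4d}  recall={recall_score(y_te[mask], y_pred[mask]):.2f}")
```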
This matters because when models start misbehaving, or acting differently from how we intended, decision makers, stakeholders, and data scientists are going to have to explain why a model is making a particular prediction or decision. The image on the right, which we also go over in the code block, so I'll cover it briefly now, is a bar plot of directional feature contributions. It shows, for a particular sample, example number 182 from a data set (this is the Titanic data set, by the way), that the prediction was that the person would not survive, and the chart shows why the model arrived at that prediction. It looks like the two main reasons were, one, the gender being male, and two, the age. Systems like this are important for debugging whether a model is biased toward a certain group, a certain gender, a certain age, or a certain racial or ethnic group. You can go to the next slide.

Another active area of research, or maybe not so active anymore, is interpretable models versus black boxes. For business decisions, there's usually an incentive, a pull, toward interpretable models, because they can be easily explained and easily analyzed. If you have a governing watchdog or some kind of regulatory committee asking, "hey, why did you do this?", for example in a fraud detection application, you can easily explain the conclusion, or how you got to it, with these kinds of models. Examples are linear models, decision trees, and k-nearest neighbors. Then you have black boxes, which are harder to interpret. Ironically, the black boxes are usually the better-performing models, so you'll usually get better performance, but at the cost of interpretability: it's harder to explain what's going on simply because the models are much more complex. Examples are support vector machines, ensembles, and neural networks.

What I have on the right is a decision tree for a flower classification problem. It's a pretty basic data set; anyone who's messed around with machine learning has probably encountered it. The model predicts what kind of flower a sample is, versicolor, setosa, or virginica, and the decision tree shows you how it came to that conclusion. For example, at the first node, if the petal length is under 2.45 centimeters, it classifies the flower as setosa; if it's over that, you move down the tree. So it's a very interpretable decision-making process. Anthony, you can go to the next one if you want.

Real quick, this is more on model explainability and interpretability: some of the tools, for the curious reader. These tools are in Python, though I think they may exist in some other languages too. LIME uses local models for interpretability. Then there's SHAP, which is interesting because it uses game theory, if anyone's into economics or game theory, to determine feature contributions. And there's TensorFlow. Those are three of the tools I like to use personally.
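As a hedged sketch of the interpretable decision tree described a moment ago, here is how one might train a small tree on the iris data set and print its decision rules with scikit-learn. The depth limit is an illustrative choice and the exact thresholds it learns are not taken from the slide.

```python
# Sketch: an interpretable decision tree on the iris flower data set.
# Assumes scikit-learn; max_depth is an illustrative choice, not from the slide.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(iris.data, iris.target)

# export_text prints the learned rules (the split on each feature and the class at
# each leaf), so a human can follow exactly how each prediction is made.
print(export_text(clf, feature_names=iris.feature_names))
```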
Some of the techniques used by these libraries are directional feature contributions, partial dependence plots, and local surrogate methods; we can maybe dive into those with some of the code, and I do want to leave time for that.

The next thing is probably the last slide, so I'll run through it: MLOps. Anyone interested in deploying machine learning at an enterprise level, or within their company, is probably going to encounter this. It's somewhat of a burgeoning new branch; it's really the intersection of DevOps, software engineering, and machine learning, and it covers the entire model pipeline, including the feedback loops. If you're doing machine learning at enterprise scale, you're probably using software development paradigms like continuous integration and continuous delivery, and you're probably deploying these machine learning models via containers and a container orchestration tool like Kubernetes. This process is important for a machine learner or data scientist because, with so much automation, you have to have occasional health checks to make sure your model isn't biasing once you have a turnkey solution. It's important to monitor the health of the model, and what that means is you're going to have to retrain it: if your model's performance starts degrading over time, maybe because the data it's being exposed to differs from the data it was trained on, you'll have to retrain the model. Or if you find your model is biasing against a specific subgroup of people, performing worse on that group than on another, you'll want to retrain the model with new data and make changes. This whole process, MLOps, covers that loop of retraining your model, redeploying it, retraining it again, and redeploying it again. But Anthony, go ahead.

Yeah, that last point Chris mentioned is very important. Unlike a Java function, say adding a plus b, which is rule-based and will keep working no matter how many years pass, that's not the case with a model. A model will decay eventually; it's like an apple, at some point it will go bad. So it's very important for a data engineer, data scientist, or ML engineer to keep an eye on model performance and have a continuous feedback loop, retraining the model continuously so it stays up to date.

Yeah, and that retraining frequency will depend on what industry you're in. If you're in finance, where the data changes a lot faster, you'll probably want to retrain maybe once a week or once a day. It also depends on your financial constraints, because a lot of this is cloud-deployed, and whatever your footprint is, it consumes resources and money. Here's a simple example: say you have some kind of social media app, or a wine-tasting app, and your model was trained on the popular wines at the time, which were mostly red wines or mostly a certain kind of wine quality. For whatever reason, the popular wines of the day change maybe six months down the road; then your model's performance is probably going to degrade on the new set of wines.
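As a hedged sketch of the monitoring loop described above, here is one simple way to flag model decay in production: compare a recent window of live accuracy against the accuracy measured at deployment and raise a retraining flag when it drops past a threshold. The baseline value, window size, threshold, and function names are all assumptions for illustration, not part of the speakers' pipeline.

```python
# Sketch: a naive health check that flags a deployed model for retraining.
# The baseline, window, threshold, and the retraining hook are hypothetical.
from collections import deque
from sklearn.metrics import accuracy_score

BASELINE_ACCURACY = 0.89      # accuracy measured at deployment time (assumed)
MAX_DROP = 0.05               # tolerated degradation before retraining (assumed)
recent = deque(maxlen=500)    # rolling window of (prediction, true_label) pairs

def record_outcome(prediction, true_label):
    """Store each labelled outcome as ground truth trickles in from production."""
    recent.append((prediction, true_label))

def model_needs_retraining():
    if len(recent) < recent.maxlen:
        return False                      # not enough evidence yet
    preds, labels = zip(*recent)
    live_accuracy = accuracy_score(labels, preds)
    return live_accuracy < BASELINE_ACCURACY - MAX_DROP

# In a serving loop one might call record_outcome(...) as labels arrive and,
# periodically, kick off a retraining job when model_needs_retraining() is True.
```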
There are new features and new feature combinations, so you'll want to retrain your model on that new data set so it stays accurate. That's a simple example, but I think we should jump into the code real quick and then answer some questions after that.

All right. Again, please do interject and ask any question you may have. The data set I'm going to show is basically a financial data set: we have some information about our bank customers, and all we have to do is predict whether we can sell a particular banking product, in this case a term deposit, to the customer. I'm going to jump through all the preprocessing techniques and get straight into the machine learning. Again, I'm using a very simple model here, logistic regression; this is like the kindergartner model of machine learning, probably the first model anybody learning machine learning would build. We have this data set, it has something like 36,000 or 50,000 records, and it has only two classes, are they a potential customer or not, 0 or 1, true or false. Our goal is to build a model that correctly identifies the customers, to have the model learn the features well enough that it can predict who the target audience for this new product would be.

What I'm using here is a grid search; let me explain it, since we touched on some of this in the presentation. Although it's a logistic regression model, there are a lot of parameters that go into it, and each of those parameters affects the performance of the model in some way. So what I'm doing here is performance tuning: I pass a bunch of parameter combinations to this model and create multiple models on the same data set, to understand which is the best-performing model, which one gives me the absolute best accuracy score. If you look here, I printed the number of runs: each line you see is a new model. This parameter set ran with an accuracy of essentially zero percent, whereas with this parameter set the accuracy was 88 percent. Multiple models were created, and we chose the best model at the end of the day, which came in at about 89 percent accuracy.
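For readers who want to reproduce something like the sweep Anthony describes, here is a hedged sketch of a grid search over logistic regression hyperparameters with scikit-learn. His notebook used a bank-marketing data set; a synthetic two-class data set stands in here so the snippet runs on its own, and the parameter grid is an assumption rather than his actual grid.

```python
# Sketch: a grid search over logistic regression hyperparameters, in the spirit of
# the demo. The synthetic data set and the parameter grid are illustrative only.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = make_classification(n_samples=4000, n_features=25, weights=[0.88], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, stratify=y, random_state=0)

param_grid = {
    "C": [0.01, 0.1, 1, 10],
    "penalty": ["l1", "l2"],
    "solver": ["liblinear"],      # liblinear supports both l1 and l2 penalties
}
search = GridSearchCV(LogisticRegression(max_iter=1000), param_grid,
                      cv=5, scoring="accuracy", verbose=1)
search.fit(X_tr, y_tr)

# Each candidate parameter set is one "line" of the sweep scrolled through in the
# demo; the best combination is then evaluated on held-out data.
print("best params:", search.best_params_)
print("best CV accuracy:", round(search.best_score_, 3))
print("held-out accuracy:", round(search.score(X_te, y_te), 3))
```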
So, who among you thinks 89 percent accuracy is good? Anybody? A model that predicts things correctly almost 90 percent of the time, does that mean the model is doing well? I don't think so. Accuracy is just one aspect of understanding the model. Think about this, from the airline industry; Chris may remember this example, one of our professors gave it to us in a machine learning class. Suppose you are building a machine learning model for airport screening. Tens of thousands of people are good people, they are not terrorists, but there are just one or two people, who show up very rarely, who might be a potential threat. So we have to build a model that, out of all the people traveling through the terminal, identifies the one person who might be a threat to national security.

Sorry, I was going to say: that's an imbalanced classification problem. It's a good example of when you have a majority class, which is obviously "not a terrorist," and a minority class, which is a very, very small segment of people, and you want to predict whether a person belongs to that minority class of being a threat. That's an imbalanced classification problem; I just wanted to throw that out there.

Absolutely. So in this case, just looking at accuracy, my model would probably be able to predict with 99 percent accuracy that all the people I'm screening are not terrorists, but that remaining fraction of model error could lead to misclassifying a terrorist as a good person and letting them onto the airplane. We do not want that to happen, so accuracy alone doesn't help; you need to dig a bit deeper.

So let's look at what other factors we can use to understand how well the model performed. Here is a chart we also showed in the presentation: this model is overfitting. If you look at it, it started learning very well at the beginning, and the training loss dropped sharply to a point, but if you look at the other curve, it kept increasing a little bit, not much; it started at about 0.025 and came up to around 0.1, so it ends a little higher than where it started. The validation, which is effectively the testing at the end of the day, shows a very high loss and then zeroes in on the prediction way too quickly. This is a classic example of an overfit model: the model has learned the training data set very well, but when you present it with a real-world scenario, a different data set, its performance is not that great. This is basically high variance, and it's not good.

Likewise, I tuned the model a little on the same data set, and this is what I get. I think the label says overfit model, but this is actually the good fit; anyway, I'm not sure where the underfit model went. This is a good example of a well-fit model. The training loss is very high at the beginning of the training session, because that's when the model sees the data for the first time and tries to make sense of it, so it reduces the loss; at some point it learns something different, so the loss goes up a little, but as the iterations go on and it sees more and more data, the learning curve levels out. That gives a good sense that the model has learned from enough data points and is able to generalize well enough to predict correctly when it sees a new set of data. The same applies to the validation: the loss starts at a very high number, keeps decreasing as the model traverses the data, and at some point hits a plateau, and this plot is very close to what the model saw during training.
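The overfit-versus-good-fit plots Anthony walks through are loss curves over training iterations. Here is a hedged sketch of one way to produce such curves with scikit-learn, using an SGD-based logistic regression trained epoch by epoch; the synthetic data and epoch count are placeholders, and the actual Colab notebook may have produced its plots differently.

```python
# Sketch: tracking training vs. validation loss per epoch to diagnose over/underfitting.
# Assumes scikit-learn >= 1.1 (older versions spell the loss "log" instead of "log_loss").
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=30, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.25, random_state=0)

clf = SGDClassifier(loss="log_loss", learning_rate="constant", eta0=0.01, random_state=0)
train_curve, val_curve = [], []
for epoch in range(50):
    clf.partial_fit(X_tr, y_tr, classes=np.unique(y))
    train_curve.append(log_loss(y_tr, clf.predict_proba(X_tr)))
    val_curve.append(log_loss(y_val, clf.predict_proba(X_val)))

# A good fit: both curves decrease and then plateau close together.
# Overfitting: the training curve keeps dropping while the validation curve stalls or rises.
print(f"final train loss={train_curve[-1]:.3f}  final val loss={val_curve[-1]:.3f}")
```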
This is the kind of model we ideally want to build at the end of the day, and this is how we handle the bias-versus-variance trade-off: you build analytics like this and run multiple models. In machine learning and artificial intelligence, it's not that you build one model and deploy it; you have to build many, many models, compare their performance, pull some of this analysis out of them, and then pick the one best-fit model that meets your need. What I'm showing here is a classification problem. Chris, do you want to add anything before I move on to the regression?

I think we're good; I might want to take some questions, though, since we've really only got two minutes left. But Anthony hit the nail on the head: this is called the model development phase, specifically model selection, where you compare different models. You compare models of the same class with different parameters and different tuning, and you're probably also comparing models of different classes, and whichever one has the best performance is usually what you want to deploy. Then, as I mentioned before, you want to monitor that deployment, because with traditional software deployments things are more static, whereas with machine learning models you have the aspect of data, and of data changing, and that can cause model degradation over time, depending on how quickly the input data the model is exposed to changes. In that case you'd want to do this whole process again, and if the process is automated, that can compound the difficulty if you're not being vigilant about certain biases or certain errors. With that, we can open it up to questions.

And this is a regression problem; again, this one shows high bias and high variance, and the model we want to build is something like this. It's the same thing Chris covered in his part of the presentation. Are you able to share your code with them, maybe as a Google Colab? Because I have a code snippet too, and I'll share it in the comments. I don't think we'll have time to go over it, but both of us can share our code snippets so people can take a deeper dive into some of the code, and I see you've got comments in yours, which is good, so I think that'd be a good idea.

That's great. I'm curious, first of all, does anybody have any questions? And alternatively, does anyone have anything they'd like to share in terms of reflections, or how this information might apply to the projects you're working on? You can also drop it in the chat if you're feeling shy. Angel asks, are the slides available to look over? Yes, I don't see why not. Anthony, would you be able to share the slides? Yes, I'll share both the slides and the code snippets with you, Jacob and Ryan. Perfect, so we'll be sending a follow-up email, Angel, and for everybody, with that information. There will also be a recording of this presentation, so if you missed anything, it will be on our YouTube channel, which we'll share in just a moment.
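Returning to the model-selection step Chris described a moment ago, comparing tuned models of different classes and keeping the best performer, here is a hedged sketch of that comparison done with cross-validation in scikit-learn. The candidate models, the synthetic data, and the scoring metric are illustrative assumptions.

```python
# Sketch: model selection across different model classes via cross-validation.
# Assumes scikit-learn; the candidates, data, and metric are illustrative choices.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

candidates = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "svm (rbf)": SVC(),
}
for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="f1")
    # The best-scoring model class is the deployment candidate, but it still
    # needs monitoring after deployment, as discussed above.
    print(f"{name:20s} mean F1={scores.mean():.3f} (+/- {scores.std():.3f})")
```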
Hi, this is Stephanie; I have a question. Next month I'll be presenting virtually at a conference called Data Justice, hosted by the University of Cardiff in Wales, and my topic is bias in AI and, I guess, the ethical responsibilities of developers: what responsibilities do they have when they're building tech that ultimately becomes a tool of state violence, for example with facial recognition? I was interested in this talk because I wanted to know, from a technical aspect, how some of these approaches could be applied to that.

It's a great question: how can you apply this to addressing that problem? The first thing any data scientist should realize is that they, as a human, have bias within themselves, and the data that has been collected is biased. So first, recognize that our world is not perfect and the data I'm getting is not perfect: there are inherent issues in the data and in my way of looking at it. Acknowledging that there is an issue is very important. Once you acknowledge it, then, like I said, 80 percent of the work lies in understanding the use case: you should have good knowledge of the problem you're trying to solve, and you should understand the data. You'll be presented with a list of data sets; understand each and every data element. Once you start understanding the data, you might realize, as in the example I gave about measuring success, do we really need the gender column for the model's prediction? You don't. Then you assess which data elements you need, and if you realize you do need some other data element so your model isn't biased toward a certain group or gender, you might ask for more data elements, or at times you could choose to eliminate some records if they don't represent the world correctly. For all of this you need a good understanding of the business domain you're working in, and if you don't have it, that's fine, but you should at least work with someone who has a good grasp of the data. So that's the first thing. Chris, go ahead.

Yeah, to piggyback on that: it definitely pays a lot of dividends to work with a subject matter expert for whatever the domain is. And detecting bias is hard; removing bias is probably even harder. Some of the techniques we went over in the PowerPoint, like explainable AI, and some of those tools, for example SHAP and LIME, show you why or how a prediction was made, so if a model makes a prediction, those tools can help you debug it and say, hey, these are the particular predictors and features that led to this prediction. So that's one way: having interpretable models, having explainable models. Another way is increasing your data diversity and your data representation: if you have a very imbalanced ratio in a particular data set, for example a crime data set, with a lot of groups underrepresented or overrepresented in it, it can go a long way toward making a better model if you focus on adding additional data for those underrepresented groups and increasing the data diversity.
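As a hedged sketch of the SHAP-style debugging Chris refers to, here is roughly how one might explain a single prediction from a tree-based model. The model and data are illustrative, and the exact shape of the returned values can differ between SHAP and scikit-learn versions.

```python
# Sketch: explaining one prediction with SHAP (feature contributions per sample).
# Assumes the shap package and scikit-learn; the model and data are illustrative,
# and the structure of shap_values varies across SHAP versions.
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=8, random_state=0)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X[:1])   # contributions for a single sample

# Each value indicates how much a feature pushed this prediction up or down relative
# to the baseline; a large contribution from a sensitive feature (e.g. gender, if it
# were in the data) is the kind of signal used to debug a biased model.
print(shap_values)
```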
And then, I think we had this in the slides, though we didn't go over it much: using other metrics, too. Accuracy isn't always the best metric; there are other metrics you can use, like precision and recall, which give you a better indication of whether your model is underperforming on a certain sub-segment of people. So those are three things you can do in terms of diagnosing that bias exists and then doing something prescriptive to help build a better model that's less susceptible to that kind of bias.

Hey, this is Steve. I will say, I started collecting examples of bias in AI back in 2019, and I've been accreting this list over the past couple of years, and one thing I've noticed is that the vast majority of instances of bias seem to come from biased data sets. It seems to be an extremely tricky problem to address, particularly the thing you just mentioned, Chris, about looking out for certain groups being overrepresented or underrepresented in the data set. That can be a really difficult question when you ask how you define proper representation. Taking crime, for example: if you look at racial representation in a crime data set, and you define proper representation by, say, the proportion of different races that are currently incarcerated, you're just going to end up encapsulating a lot of the bias and racism that's already inherent in those systems. So it can be really tricky to define proper representation, and you can have really weird things leading to biased data sets. For example, I think there was a genomic study done in Norway or Finland, or somewhere like that, where they surveyed the citizens of the country, so for their purposes that should have led to a fairly even representation of the citizenry. But because of the details of how they ran the survey, you had to come in and contribute a blood sample or something like that, and it was during business hours, which meant typically the only people who volunteered were those who could afford to take time off in the middle of the day, or who worked from home, or whose spouse worked. That led to the data set being dominated by more affluent people from across the country, with pretty much no representation of people who work nine to nine and then go straight home to watch their kids.

Yeah, that's a really good example of sample bias, where it's almost like survivorship bias in that case: only the people who were affluent enough and had the time were able to participate in the study at all. It's definitely a tough problem, and it's not as easy as just removing a feature, either. If you have a predictor like age or gender and you think, okay, let me remove this, problem solved: not necessarily, because, for example, zip codes can reflect a lot of systematic bias in terms of access to particular kinds of housing. I think having interpretable and explainable models is probably the best solution we have right now, and the cool thing about SHAP and LIME is that you can use them with some of those black-box methods, like neural networks and ensembles, that provide that awesome lift in performance, while at the same time getting a better understanding of why the model makes the predictions it makes.
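Since precision and recall come up here as the better lens for imbalanced problems, like the airport-screening example earlier, here is a hedged sketch of why accuracy alone can mislead. The 1%-positive synthetic data and the majority-class baseline are illustrative assumptions, not the speakers' data.

```python
# Sketch: why accuracy misleads on imbalanced data, and what recall adds.
# Assumes scikit-learn; the 1%-positive synthetic data set is illustrative only.
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, weights=[0.99], flip_y=0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# A "model" that always predicts the majority class ("not a threat").
baseline = DummyClassifier(strategy="most_frequent").fit(X_tr, y_tr)
y_pred = baseline.predict(X_te)

print("accuracy:", round(accuracy_score(y_te, y_pred), 3))   # ~0.99, looks great
# ...but recall on the rare positive class is 0: it never catches the case that matters.
print(classification_report(y_te, y_pred, zero_division=0))
```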
But it is certainly a very difficult problem, and it's probably the biggest barrier to machine learning being more widely adopted in finance, health care, and some of the other fields where it has the most promise. That was a really good question and a really good point, thank you.

Fantastic. Just in the interest of time, thank you everybody for sticking around a few extra minutes; this was fantastic. Anthony and Chris, thank you so much for sharing your wisdom and kicking off a really big conversation in remarkably simple terms, given how large this topic is. I would love to keep this conversation alive; there's just so much more to dive into and discuss, so I encourage everybody present to keep in touch. We will definitely be covering similar topics in the near future, so we'll keep you posted. Again, we'll be sending out a follow-up email that also explains how to get connected, along with the links we've shared in the chat; those will come via email as well. I'm also thinking we could start a data engineering channel in the Slack so we can keep this conversation grooving in between presentations; that's another thought. In any case, thank you so much to everyone for attending. The session was recorded and we'll be sharing that recording in the near future. We definitely hope to see everybody soon; please don't be a stranger, we'd love to chat with you.
Info
Channel: Collider
Views: 44
Rating: 5 out of 5
Id: UhEVLV95wM0
Length: 62min 10sec (3730 seconds)
Published: Thu May 06 2021