Bayes' Theorem, Clearly Explained!!!!

Video Statistics and Information

Captions
Bayes' Theorem versus StatSquatch... Bayes' Theorem wins! StatQuest!

Hello, I'm Josh Starmer, and welcome to StatQuest. Today we're going to talk about Bayes' Theorem, and it's going to be clearly explained. Note: this StatQuest assumes that you are already familiar with conditional probability. If not, check out the Quest. That said, let's do a quick review.

In the StatQuest on conditional probability, we took a trip to StatLand and asked everyone, represented by a colorful dot, if they loved candy and/or soda. These two people loved candy and soda, these four people only loved candy, these five people only loved soda, and these three people didn't like candy and they didn't like soda. BAM! Then we calculated the probabilities for each cell in the contingency table by dividing the counts by the total number of people in StatLand, 14. BAM! Then we determined the total number and probability of people who loved soda and did not love soda, and the total number and probability of people who loved candy and did not love candy. BAM!

Then we calculated the conditional probability that someone in StatLand might not love candy but love soda, given that we already know that they love soda. We did this by dividing the five people that do not love candy but love soda by the seven people that love soda, and we got 0.71. Then, just for fun, we divided the numerator and the denominator by the total population of StatLand, 14.
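The calculation above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming the contingency-table counts described in the video (2 love both, 4 love only candy, 5 love only soda, 3 love neither, 14 total):

```python
# Contingency-table counts from the StatLand survey (as described in the video)
both = 2        # love candy and soda
candy_only = 4  # love only candy
soda_only = 5   # love only soda
neither = 3     # love neither
total = both + candy_only + soda_only + neither  # 14

loves_soda = both + soda_only  # 7 people love soda

# P(not candy & soda | soda): divide the 5 "soda but not candy" people
# by the 7 people who love soda
p_cond = soda_only / loves_soda
print(round(p_cond, 2))  # 0.71

# Dividing numerator and denominator by the population does not change it:
p_joint = soda_only / total   # P(not candy & soda), unconditional
p_soda = loves_soda / total   # P(soda), unconditional
print(round(p_joint / p_soda, 2))  # 0.71
```

Either way, the conditional probability is the probability of the event, scaled by what we already know.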
Doing this extra division did not change the result; we still got 0.71. However, now the numerator is the original, unconditional probability that someone in StatLand does not love candy but loves soda, and the denominator is the unconditional probability that someone in StatLand loves soda. All in all, the probability that someone does not love candy but loves soda, given that we know that they love soda, is equal to the probability that someone does not love candy but loves soda, divided by the probability that someone in StatLand loves soda.

So one way to think about conditional probability is the probability that an event will happen (in this case, the probability that we meet someone who does not love candy but loves soda), scaled by the knowledge we already have about the event (in this case, we know that the person loves soda).

Note: because saying "someone who does not love candy and loves soda, given that they love soda" is a little redundant, many people omit writing out the "and loves soda" part and shorten the conditional probability statement to "someone who does not love candy, given that they love soda." While this notation is standard, it makes it harder to see the relationship between the thing we want to calculate the probability for, on the left, and how we calculate it, on the right. Using the slightly redundant notation for conditional probability makes it obvious that we want the probability that an event happens, scaled by the knowledge we already have about the event. Personally, this slightly redundant notation helped me get a better understanding of Bayes' Theorem, and we'll talk about this more at the end of the StatQuest.

Now let's see what happens when we change what we already know about the event, from knowing that they love soda to knowing that they do not love candy. Now we have the probability that an event happens, which in this case is the event that we meet someone who does not love candy but loves soda, scaled by the knowledge we already have about the event: in this case,
we already know that they do not love candy. Now we plug in the numbers, do the math, and get 0.63.

Now let's compare this conditional probability, where we already know that the person does not love candy, to the conditional probability we calculated before, where we already knew that the person loves soda. In both cases we want to know the probability of the same event: meeting someone who does not love candy but loves soda. That means in both cases the numerators are the same. However, since we have different knowledge in each case, we scale the probabilities of the events differently, and ultimately we get different probabilities.

Hey, look, it's StatSquatch! StatSquatch is our friend in StatLand, and he always wants to make a bet: "I bet you one dollar that you can't solve the conditional probabilities without knowing the probability of not loving candy and loving soda." In other words, if we don't know the value for the numerators, can we still solve for the conditional probabilities?

Well, even if we don't know the probability that someone does not love candy but loves soda, we can multiply both sides of the top equation by the probability that someone loves soda. These two terms on the right cancel out, and we are left with the probability that we meet someone that does not love candy but loves soda equal to this stuff on the left side. Likewise, we can multiply both sides of the equation on the bottom by the probability of not loving candy. These two terms on the right side cancel out, and, just like before, we end up with the probability of meeting someone who does not love candy but loves soda equal to this stuff on the left side.

Now we have two things on the left side of the equal signs that are both equal to the probability that we will meet someone who does not love candy and loves soda. Now, remember that StatSquatch asked us to solve for this term and this term without this term, and because both equations are equal to the term we want to omit, both equations are equal to
each other. So let's move this up a little bit and move this over here. Now, remember, we want to solve for this term and we want to solve for this term.

We'll start with the term on the left. First, we divide both sides by the probability that someone loves soda. The probability that someone loves soda cancels out on the left side, and we have solved for this term. BAM! Now let's move the thing on the right to the left and the thing on the left to the right, and divide both sides by the probability that someone does not love candy. The probability that someone does not love candy cancels out on the left side, and we have solved for the other term. BAM!

In both cases, we won the bet with StatSquatch, because we no longer need to know the probability that someone does not love candy and loves soda. But, more importantly, we have derived Bayes' Theorem. DOUBLE BAM!!

Bayes' Theorem tells us that this conditional probability, which is based on knowing that the person loves soda, can be derived from this conditional probability, which is based on knowing that they do not love candy. Alternatively, Bayes' Theorem tells us that this conditional probability, which is based on knowing that the person does not love candy, can be derived from this conditional probability, based on knowing they love soda. In general, if we let A equal "does not love candy" and B equal "loves soda," then we can rewrite each equation into the standard formula for Bayes' Theorem. In other words, the conditional probability, given that we know one thing about an event, can be derived from knowing the other thing about the event.

Now StatSquatch says, "Dude, you derived Bayes' Theorem with just a little algebra. What's the big deal?" When we have all of the data laid out in a nice, colorful chart, or in a contingency table, then Bayes' Theorem is not that big of a deal. In fact, when you have all of the data, Bayes' Theorem isn't even a small deal. However, most of the time we don't have all of the data. In other words, StatSquatch might only tell us:
"the probability that someone does not love candy, given that they love soda, is 0.71, and, I'm not certain, but I think the probability that someone loves soda is close to 0.6, and the probability that someone does not like candy is 0.57." If this is all the data we have, then we plug the numbers into Bayes' Theorem and get approximately 0.75. That means, given this data, which includes a guess about the probability someone loves soda, the probability that someone loves soda, given that we know they do not like candy, is about 0.75.

Note: attentive viewers may notice that when we calculated the conditional probability with Bayes' Theorem, which we used because we did not have all of the information, the result is different from when we calculated the probability knowing everything. This is because StatSquatch didn't know the exact value for the probability that someone loved soda; he just took a guess. And while taking a guess might sound like a terrible thing to do, it's the only option when we have a large population. For example, it would be almost impossible to ask every single person in India if they love soda, so a lot of times we have to make a guess. Bayesian statistics is about understanding what it means to make a guess like this and all it implies. Bayes' Theorem is the basis for Bayesian statistics, which is this equation paired with a broader philosophy of how statistics should be calculated, and we'll cover all these topics in follow-up StatQuests.

However, before we go, I want to review the standard notation, so if you research Bayes' Theorem or Bayesian statistics on your own, you won't be totally lost. Like I said earlier, when most people write conditional probabilities, they do it differently from the examples I've given here. Specifically, because they know that this person does not love candy, they do not include it when stating the probability. Now the conditional probability reads "the probability someone loves soda, given that they do not love candy." Likewise, because they know
this person loves soda, they do not include it when stating the probability. Now the conditional probability reads "the probability that someone does not love candy, given that they love soda." In the end, it looks like we want to calculate the probability of two different events. However, it is important to keep in mind that in both cases there is only one event, and both conditional probabilities refer to the same yellow area in the drawing and the same yellow square in the contingency table. The only real difference between the two conditional probabilities is the given knowledge, and that's why I prefer the longer, slightly redundant way to write conditional probabilities: because the longer way makes it obvious that in both cases we are talking about the exact same thing. Small bam. No, TRIPLE BAM!!!

Now it's time for some Shameless Self-Promotion. If you want to review statistics and machine learning offline, check out the StatQuest study guides at statquest.org. There's something for everyone. Hooray! We've made it to the end of another exciting StatQuest. If you like this StatQuest and want to see more, please subscribe. And if you want to support StatQuest, consider contributing to my Patreon campaign, becoming a channel member, buying one or two of my original songs, or a t-shirt or a hoodie, or just donate. The links are in the description below. Alright, until next time, Quest on!
Info
Channel: StatQuest with Josh Starmer
Views: 26,230
Rating: 4.9407406 out of 5
Keywords: Josh Starmer, StatQuest, Machine Learning, Statistics, Data Science, Bayes, Probability
Id: 9wCnvr7Xw4E
Length: 13min 59sec (839 seconds)
Published: Sun Aug 15 2021