The Bayesian Trap

Reddit Comments

The error I think most people make is assuming that the false positive and false negative rates are connected. In fact, they are not. They may be correlated in some cases, but each is a separate statistical description of the test.

If you have the disease, a test might detect it 50% of the time. If you don't have the disease, it might falsely claim you do 0.000001% of the time. In that case, the test would detect the disease only half the time, but it would almost never produce a false positive.

When you use the numbers in the video, a 99% detection rate and a 1% false positive rate, it's easy to assume those numbers are related. It becomes sort of intuitive once you realize they are not mathematically connected.
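As a concrete illustration, here is a small Python sketch (all numbers made up, along the lines of the example above) treating the detection rate and the false positive rate as two independent dials on a test:

def confusion_counts(n_sick, n_healthy, detection_rate, false_positive_rate):
    # The two rates are separate parameters of the test; neither one
    # determines the other.
    true_positives = n_sick * detection_rate
    false_negatives = n_sick * (1 - detection_rate)
    false_positives = n_healthy * false_positive_rate
    true_negatives = n_healthy * (1 - false_positive_rate)
    return true_positives, false_negatives, false_positives, true_negatives

# Detects the disease only half the time, yet almost never raises a false alarm:
print(confusion_counts(1_000, 1_000_000, 0.50, 0.00000001))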

edit: Thanks for the gold :-)

👍︎ 67 👀︎ u/AgentSmith27 📅︎ Apr 05 2017 🗫︎ replies

This is a wonderful presentation!

👍︎ 11 👀︎ u/juliuszs 📅︎ Apr 05 2017 🗫︎ replies

Makes sense, I suppose: the chance that a positive result is correct in a specific case is low when the false positive rate is large compared to the frequency of occurrence.

👍︎ 4 👀︎ u/rddman 📅︎ Apr 05 2017 🗫︎ replies

Can anyone tell me the name of the place where he is recording?

👍︎ 5 👀︎ u/TwoPixelsRight 📅︎ Apr 05 2017 🗫︎ replies

This is also commonly referred to as the base rate fallacy.

👍︎ 3 👀︎ u/FallingDarkness 📅︎ Apr 05 2017 🗫︎ replies

Shouldn't the prior be higher, given the symptoms and the doctor already suspecting someone may have the disease?

👍︎ 3 👀︎ u/Dr_Colossus 📅︎ Apr 05 2017 🗫︎ replies

Those insects looked hella annoying

👍︎ 2 👀︎ u/BGsenpai 📅︎ Apr 05 2017 🗫︎ replies

Bayes' Theorem is also used a lot in AI (like the spam filtering he mentioned). The situation he mentioned, where there are zero observed cases of something, also happens a lot. The usual solution is to add a fudge factor so you never end up with a probability of zero. For example, in a spam filter, how do you classify a word you've never seen before? You need the fudge factor there.
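That "fudge factor" usually goes by the name additive (Laplace) smoothing. A minimal Python sketch with made-up word counts (all numbers here are assumed, just for illustration):

# Hypothetical word counts from a tiny spam training set
spam_counts = {"winner": 20, "meeting": 1}
spam_total = 100      # total words seen in spam messages
vocab_size = 1000     # assumed vocabulary size

def word_likelihood(word, counts, total, alpha=1.0):
    # Additive (Laplace) smoothing: the fudge factor alpha keeps a word
    # that was never seen in training from getting probability zero,
    # which would otherwise wipe out the whole naive Bayes product.
    return (counts.get(word, 0) + alpha) / (total + alpha * vocab_size)

print(word_likelihood("winner", spam_counts, spam_total))     # seen before
print(word_likelihood("refinance", spam_counts, spam_total))  # never seen, still nonzero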

👍︎ 2 👀︎ u/Planetariophage 📅︎ Apr 06 2017 🗫︎ replies

This is wrong (the opening's hypothetical situation specifically). And it's the same damn mistake everyone makes.

In the opening, what are the chances you have the disease? Not much less than 99%, despite the video's claim that you are much less likely than that. Why? Because you're not a random person; you're a person who went to the doctor and was suspected of having it.

If you test a whole population and then focus on the subset that tested positive, they have the low probability of really being sick, because their prior probability of having the disease is the 0.1% base rate. If you're in the doctor's office and they suspect this disease, you're clearly not one of those 0.1%-risk people. Your prior is much higher if you're showing symptoms of this disease.

This video explains why it's a bad idea to test everyone: a random person's probability of having the disease given a positive result isn't very high. Do not mistake it for meaning that if that cancer test you take comes back positive, "it's OK, I'm still probably safe" - no, you're not; it's very likely you're ill.
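The disagreement is really about the prior. A small Python sketch with the video's test numbers, where the 10% prior for a symptomatic patient is a made-up figure just for illustration:

def posterior(prior, sens=0.99, fpr=0.01):
    # P(disease | positive test) by Bayes' theorem
    return sens * prior / (sens * prior + fpr * (1 - prior))

print(posterior(0.001))  # random screening at the 0.1% base rate: ~0.09
print(posterior(0.10))   # assumed 10% prior for a symptomatic patient: ~0.92

Same test, very different posterior, purely because of the prior.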

👍︎ 7 👀︎ u/F-0X 📅︎ Apr 05 2017 🗫︎ replies
Captions
Picture this: you wake up one morning and you feel a little bit sick. No particular symptoms, just not 100%. So you go to the doctor, and she also doesn't know what's going on with you, so she suggests they run a battery of tests. After a week goes by, the results come back, and it turns out you tested positive for a very rare disease that affects about 0.1% of the population, and it's a nasty disease, horrible consequences, you don't want it. So you ask the doctor, "How certain is it that I have this disease?" and she says, "Well, the test will correctly identify 99% of people that have the disease and only incorrectly identify 1% of people who don't have the disease." So that sounds pretty bad. I mean, what are the chances that you actually have this disease? I think most people would say 99%, because that's the accuracy of the test. But that is not actually correct! You need Bayes' Theorem to get some perspective.

Bayes' Theorem can give you the probability that some hypothesis, say that you actually have the disease, is true given an event, say that you tested positive for the disease. To calculate this, you take the prior probability that the hypothesis is true - that is, how likely you thought it was that you had this disease before you got the test results - and multiply it by the probability of the event given that the hypothesis is true - that is, the probability that you would test positive if you had the disease - and then divide that by the total probability of the event occurring - that is, of testing positive at all. This last term is a combination of your probability of having the disease and correctly testing positive, plus your probability of not having the disease and being falsely identified. The prior probability that a hypothesis is true is often the hardest part of this equation to figure out and, sometimes, it's no better than a guess. But in this case, a reasonable starting point is the frequency of the disease in the population, so 0.1%. And if you plug in the rest of the numbers, you find that you have only a 9% chance of actually having the disease after testing positive, which is incredibly low if you think about it.

Now, this isn't some sort of crazy magic; it's actually common sense applied to mathematics. Just think about a sample of 1000 people. One person out of that thousand is likely to actually have the disease, and the test would likely identify them correctly as having the disease. But out of the 999 other people, 1%, or about 10 people, would falsely be identified as having the disease. So if you're one of those people with a positive test result, and everyone's just selected at random, well, you're actually part of a group of 11 where only one person has the disease. So your chances of actually having it are 1 in 11. 9%. It just makes sense.

When Bayes first came up with this theorem he didn't actually think it was revolutionary. He didn't even think it was worthy of publication; he didn't submit it to the Royal Society, of which he was a member, and in fact it was discovered in his papers after he died, having been abandoned for more than a decade. His relatives asked his friend Richard Price to dig through his papers and see if there was anything worth publishing in there, and that's where Price discovered what we now know as the origins of Bayes' Theorem. Bayes originally considered a thought experiment where he was sitting with his back to a perfectly flat, perfectly square table, and he would ask an assistant to throw a ball onto the table.
Now this ball could obviously land and end up anywhere on the table, and he wanted to figure out where it was. So what he asked his assistant to do was to throw on another ball and then tell him whether it landed to the left, to the right, in front of, or behind the first ball, and he would note that down and then ask for more and more balls to be thrown on the table. What he realized was that through this method he could keep updating his idea of where the first ball was. Now of course, he would never be completely certain, but with each new piece of evidence he would get more and more accurate, and that's how Bayes saw the world. It wasn't that he thought the world was not determined, that reality didn't quite exist; it was that we couldn't know it perfectly, and all we could hope to do was update our understanding as more and more evidence became available.

When Richard Price introduced Bayes' Theorem, he made an analogy to a man coming out of a cave; maybe he'd lived his whole life in there, and he saw the Sun rise for the first time and kind of thought to himself: "Is, is this a one-off, is this a quirk, or does the Sun always do this?" And then, every day after that, as the Sun rose again, he could get a little bit more confident that, well, that was the way the world works. So Bayes' Theorem wasn't really a formula intended to be used just once; it was intended to be used multiple times, each time gaining new evidence and updating your probability that something is true.

So if we go back to the first example, when you tested positive for a disease, what would happen if you went to another doctor, got a second opinion, and had that test run again, but let's say by a different lab, just to be sure that those tests are independent, and let's say that test also comes back as positive. Now what is the probability that you actually have the disease? Well, you can use Bayes' formula again, except this time for your prior probability that you have the disease, you have to put in the posterior probability, the probability that we worked out before, which is 9%, because you've already had one positive test. If you crunch those numbers, the new probability based on two positive tests is 91%. There's a 91% chance that you actually have the disease, which kind of makes sense. Two positive results from different labs are unlikely to be just chance, but you'll notice that probability is still not as high as the accuracy, the reported accuracy, of the test.

Bayes' Theorem has found a number of practical applications, including, notably, filtering your spam. You know, traditional spam filters actually do a kind of bad job; there are too many false positives, too much of your email ends up in spam, but using a Bayesian filter, you can look at the various words that appear in e-mails and use Bayes' Theorem to give a probability that the email is spam, given that those words appear. Now Bayes' Theorem tells us how to update our beliefs in light of new evidence, but it can't tell us how to set our prior beliefs, and so it's possible for some people to hold that certain things are true with 100% certainty, and other people to hold those same things are true with 0% certainty.
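Picking up the repeated-testing idea in Python (the video's numbers, assuming the two labs' results are independent), and checking what happens to priors of exactly 0 or 1:

def update(prior, sens=0.99, fpr=0.01):
    # One application of Bayes' rule: P(disease | positive test)
    return sens * prior / (sens * prior + fpr * (1 - prior))

p = 0.001       # base rate before any test
p = update(p)   # about 0.09 after the first positive result
p = update(p)   # about 0.91 after a second, independent positive result
print(round(p, 2))

# Priors of exactly 0 or 1 never move, no matter the evidence:
print(update(0.0), update(1.0))   # 0.0 and 1.0, unchanged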
What Bayes' Theorem shows us is that in those cases there is absolutely no evidence, nothing anyone could do, that would change their minds, and so, as Nate Silver points out in his book The Signal and the Noise, we should probably not have debates between people with 100% prior certainty and 0% prior certainty, because, well really, they'll never convince each other of anything.

Most of the time when people talk about Bayes' Theorem, they discuss how counterintuitive it is and how we don't really have an inbuilt sense of it, but recently my concern has been the opposite: that maybe we're too good at internalizing the thinking behind Bayes' Theorem. The reason I'm worried about that is because I think in life we can get used to particular circumstances, we can get used to results, maybe getting rejected or failing at something or getting paid a low wage, and we can internalize that as though we are that man emerging from the cave: we see the Sun rise every day and every day, and we keep updating our beliefs to a point of near certainty that that is basically the way that nature is, it's the way the world is, and there's nothing we can do to change it. You know, there's Nelson Mandela's quote that "Everything is impossible until it's done", and I think that is kind of a very Bayesian viewpoint on the world: if you have no instances of something happening, then what is your prior for that event? It will seem completely impossible; your prior may be 0, until it actually happens. The thing we forget in Bayes' Theorem is that our actions play a role in determining outcomes, and in determining how true things actually are. But if we internalize that something is true, and maybe we're 100% sure that it's true, and there's nothing we can do to change it, well, then we're going to keep on doing the same thing, and we're going to keep on getting the same result. It's a self-fulfilling prophecy. So I think a really good understanding of Bayes' Theorem implies that experimentation is essential. If you've been doing the same thing for a long time and getting the same result that you're not necessarily happy with, maybe it's time to change. So is there something like that that you've been thinking about? If so, let me know in the comments.

Hey, this episode of Veritasium was supported in part by viewers like you on Patreon and by Audible. Audible is a leading provider of spoken audio information, including an unmatched selection of audiobooks: original programming, news, comedy and more.
So if you're thinking about trying something new and you haven't tried Audible yet, you should give them a shot, and for viewers of this channel they offer a free 30-day trial just by going to audible.com/Veritasium. The book I've been listening to on Audible recently is called 'The Theory That Would Not Die' by Sharon Bertsch McGrayne, and it is an incredibly in-depth look at Bayes' Theorem. I've learned a lot just listening to this book, including the crazy fact that Bayes never came up with the mathematical formulation of his rule; that was done independently by the mathematician Pierre-Simon Laplace, so really I think Laplace deserves a lot of the credit for this theory, but Bayes gets naming rights because he was first. And if you want, you can download this book and listen to it, as I have, when I've just been driving in the car or going to the gym, which I'm doing again, and so if there's a part of your day that you feel is kind of boring, then I can highly recommend trying out audiobooks from Audible. Just go to audible.com/Veritasium. So as always, I want to thank Audible for supporting me, and I want to thank you for watching.
Info
Channel: Veritasium
Views: 2,677,429
Rating: 4.9445415 out of 5
Keywords: veritasium, bayes, bayes theorem, rule, prior, probability, condition probability, statistics, thomas bayes, bayes rule, bayesian, inference, math, mathematics, richard price, nate silver, sharon mcgrayne, signal and the noise, theory that would not die, trap
Id: R13BD8qKeTg
Length: 10min 36sec (636 seconds)
Published: Wed Apr 05 2017