False Positives & Negatives for COVID-19 tests | Using Bayes' Theorem to Estimate Probabilities

Captions
Suppose you get tested for SARS-CoV-2, the coronavirus that causes COVID-19. What are the chances that you actually are positive or actually are negative? Well, you might expect that that depends on the accuracy of the test, and indeed it does, and different types of tests have different types of accuracies. But what you might not expect is that it also depends on the prevalence of the disease within the population. Indeed, in this video we're going to look at Bayes' theorem, which is a very powerful little piece of mathematics that gives us some insight on what the probability that you might have or not have a disease actually is.

Now, before we get into the mathematics, I need to talk a little bit about what we mean when we say a test is accurate. In fact, test accuracy actually subdivides into two different components: there's one factor called sensitivity, and then there's another factor called specificity. What's the difference? The idea with sensitivity is whether the test is sensitive enough to detect when someone actually has the virus. If you have a test with a low level of sensitivity, you might get a lot of false negatives, because even though they may have the disease, the test is not sensitive enough to pick it up. And then specificity is the other side. Specificity says: do we know they actually have this specific disease? If you have a test with too low of a specificity, it may give you a lot of false positives, where the test tells you that you have the disease but you actually don't.

Okay, so what is the situation in terms of sensitivity and specificity for the tests available for the coronavirus? Well, it turns out there are actually two different categories of tests that we're using right now. The first are genetic tests, and these are the ones you've seen most in the news when the number of new cases is reported. The genetic test is one where it goes and swabs your nasal cavity, and then it uses something called PCR, or polymerase chain reaction, to basically rapidly expand the amount of genetic code in
the sample, and that is able to test the actual genetic markers to see whether or not you have the virus. Now, the good news is that these genetic tests have an extremely, extremely high amount of specificity; that is, if it says you're positive, you're extremely likely to actually be positive. Of course, we can never be 100% certain, maybe some sample gets confused in a lab, but nevertheless, very high degrees of specificity. Unfortunately, sensitivity isn't always as high. For example, if you get swabbed, you may actually have the disease, but it hasn't managed to pick up the virus in that particular swab, and this is more likely the case if it's early on and you have not developed as large a bloom of virus population in your mucus. And indeed, depending on the specific test, this could be better or worse. One example, for instance, that's very common in the United States is the Abbott Labs test, and one study of this showed that it might have as much as a 15% false negative rate; that is, when it tests and says you're negative but you actually are positive. Numbers like this are hugely problematic. It's hard to tell what the true numbers are in any particular case; it requires a lot of study to investigate. So those are the genetic tests, which in general have very, very high levels of specificity but perhaps somewhat lower levels of sensitivity.

The other, completely different category of tests are called antibody tests. This is a collection of your blood to see whether you have the antibodies created by your immune system to fight the disease, and it has the benefit that it can look way into the past: if you retain those antibodies for some time, even after you've gone clear, when you no longer have any symptoms and you're no longer infectious, it might still show those antibodies. Similar to genetic tests, the antibody tests also have some levels of false negatives, but what is perhaps more problematic is that they also have some amount of false positives, at least depending on which particular antibody test,
and this can be a really big problem. Indeed, one of the mechanisms by which this might happen is that it may detect something that was an antibody from some other coronavirus that doesn't necessarily cause COVID-19 specifically. And we're going to see that, at least when the prevalence of the virus is low within society, the probability that you don't actually have the disease despite testing positive in that scenario is going to be far larger than you might initially suspect.

All right, so let's do some math now. To get started on our probability computation, I want to begin with the idea of conditional probabilities, and the first thing I'll do is just give some variable names, just to make our notation a bit simpler. We'll say COVID is just going to be "yes, you actually have the disease", DF is for disease-free, plus is for when you test positive, and minus is for when you test negative.

Okay, with those labels stated, what is conditional probability? What I want to consider are things like, for example, P(COVID | +), the probability of COVID "bar" plus. What's going on? Now, the vertical bar that we have here means "given", so this is read "what is the probability that you have COVID-19 given that you tested positive?" In other words, we want to know, if you go out and you get this positive result, what is the probability that you actually have it? That's what this conditional probability is representing. There's a whole bunch of different conditional probabilities. For example, another one is: what is the chance that you're disease-free if you test positive? This would be a false positive: it's saying you test positive but you don't actually have it, you are disease-free. So that's another thing I'd be really interested in knowing. Likewise, you can ask what is the probability that you have COVID-19 despite the fact that you test negative; that's a false negative. And then finally we can ask what's the probability that you're disease-free given that you test negative. All of these numbers are between 0 and 1, or 0% and 100%.

So how do you compute
conditional probabilities? Well, one of the most powerful tools that we have is called Bayes' theorem, and Bayes' theorem is a way to relate one kind of conditional probability with another. So on the left-hand side, this P of A given B (I'm just using a generic A and B right now) is the probability that A is true given that you know that B is true. Sometimes conditional probabilities are nice and easy to compute, and sometimes they're challenging. Now, the power of Bayes' theorem is that this conditional probability on the left, you can relate to the probability of B given A; that is, you swap the order around. Instead of being given B and asking the probability of A, we're given A and asking the probability of B. Sometimes one or the other of these might be easier, and Bayes' theorem gives this nice result. The other expressions, P of A and P of B, are just what the generic probability of either A or B would be if you don't know anything else. Now, I have a whole video introducing this theorem and proving it from some elementary theorems in conditional probability, and you can check the link in the description for that if you wish, but we're going to use it for this business of testing.

Okay, let me return the names to what we were talking about before, with COVID-19 or being disease-free, and testing positive or testing negative. This is just Bayes' theorem stated with those variables. So in this case I am asking: if you test positive, what is the chance that you actually have the disease? Now, there's a numerator and a denominator in this expression. If I focus just on the numerator, what's going on here? This is a product of two probabilities, and one of the facts of probabilities is that for independent events the multiplication of probabilities is basically saying what's the probability that all of these things are true at the same time. It's an "and". In other words, the numerator is saying: what is the probability that I both have COVID-19 and then I test positive for COVID-19?
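The formulas the speaker points to are on screen and are not captured in the captions. Assuming the standard statement of Bayes' theorem is what is being shown, the generic form and the version with the testing labels (COVID, +) would read as below. The last line restates the numerator as the joint probability of having COVID-19 and testing positive, which follows from the definition of conditional probability.

```latex
\[
P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}
\]
\[
P(\text{COVID} \mid +) = \frac{P(+ \mid \text{COVID})\,P(\text{COVID})}{P(+)}
\]
\[
P(+ \mid \text{COVID})\,P(\text{COVID}) = P(+ \text{ and } \text{COVID})
\]
```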
Now, if I then focus on the denominator: what's the probability that I test positive? Well, that's a bit complicated. The probability that you test positive actually depends on two different cases. There's one case, which is where you have the disease and then you test positive, and then there's another case, which is the false positive case, which is that you do not have the disease but you still test positive nevertheless. Bayes' theorem is sometimes called the two-bucket problem, and this denominator can basically be split into these two different cases. And so when you do that, what you get is something slightly more complicated. Here the expression I had has been broken up a bit. The numerator is actually the same, but in the denominator you break it up into these two different cases. The left of the two cases is the probability that you have COVID-19 and test positive, and the right one in the denominator is the probability that you do not, that you're disease-free, but then nevertheless you test positive. So the denominator is being broken up into these two different cases.

Now, I do want to be clear: I am a professor; I'm not an epidemiologist or a biologist or a medical doctor, so I'm not trying to make actual predictions about the prevalence of COVID-19 in society. But I am going to make up a toy example that hopefully is at least within the ballpark of being some real numbers. So how about this: let's imagine that I have a test that is 95 percent sensitive and 99 percent specific, and that the prevalence of the disease is 1% in the population. So again, no claims about whether those numbers represent an actual test for the actual disease, but nevertheless we have a good example.

Okay, so let's look at our formula. Well, the first thing I'm going to see is that I have the probability of having COVID-19 in two different places. And here, if there's a 1% prevalence among the population, what I'm saying is that if you just pull people at random, there's a 1% chance that they actually have it. Then the
P of COVID-19 is, well, 0.01, so I'm just gonna replace those with 0.01. I then see two places where I have the probability of testing positive given that you have COVID-19. Now, this is in my data as well: I said that we have a 95% sensitive test, and those statements are conditional probabilities. The statement that it is 95 percent sensitive means that 95 percent of the time, when you're given that you have coronavirus, you're gonna test positive. So both of these numbers are 0.95, and I can plug those in as well.

Okay, two more to go. I have the probability that you test positive given that you're actually disease-free. These are the false positives: you don't have the disease but you still test positive. The fact that our test is 99% specific means that this value is going to be, well, 0.01: the chance of you testing positive despite not having the disease is just 1% of the time, because it's 99% specific. Then finally, what's the probability that you don't have the disease? Well, if you didn't know anything else, we said 1% of the population has it, and 99% therefore does not, so you put in 0.99 here. Put in all those numbers and you get approximately 0.49, or approximately 50%; these decimals can be multiplied by 100 to convert to percents.

Now, that seems surprising, perhaps. Indeed, we start with a test that is 99% specific, in other words, it's got a very, very low rate of false positives, and yet you get this positive result and only half the time you actually have the disease. Well, why would that be? Well, the issue is the prevalence: the disease is so rare in society, since we assumed a prevalence of 1%, and the result of this is that the false positive rate comes to be almost as large as the actual rate of prevalence itself, and that's sort of a 50/50 chance.

To illustrate this a bit more cleanly, let me just imagine I have 100 random people and I'm gonna go and test them. Now, if I have a 1 percent prevalence, then, well, approximately 1 of these people is actually sick at any given moment. And so when
I test all 100, there's this one sick person, and with a 95 percent sensitivity, that means most of the time this one sick person will be detected. But the false positive rate of 1% also means that one person here, on average, is also going to be a false positive. So when you test 100 people, you get this one false positive person and this one actually sick person who tests positive. You have two people that test positive, and therefore approximately a 50% chance that any individual one of them actually has the disease.

So the point here is to illustrate that if you have a test, perhaps some of these antibody tests that are being tested for their own efficacy, and it does not have a sufficiently high level of specificity, then 99%, which seems like a big number, is actually not that great, because even with 99%, if you have 1% prevalence in the population, a positive result is still only accurate 50% of the time. I think at least that might initially seem a bit counterintuitive.

We can do the same basic computation for false negatives, but it's not so unintuitive this time. Indeed, if I just say, okay, exact same setup, except where there was a plus I put a minus for testing negative, it's the same formula other than that. Well, if I plug in all the numbers, then I now get approximately 0.0005, or 0.05 of a percent. Basically, the one percent prevalence wasn't large enough to change the probability away from what you would have expected it to be if you were just sort of naively guessing this particular result, whereas in the previous case the prevalence worked against the idea of false positives, so we got very different results.

If you have a question about this video, leave it down in the comments. If you enjoyed it, give it a like for the YouTube algorithm, and we'll do some more math in the next video.
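The two plug-in computations from the video can be checked with a short Python sketch. The function name `posterior_infected` and its parameter names are made up for this illustration, and the inputs are the toy values from the transcript (95% sensitivity, 99% specificity, 1% prevalence), not measurements of any real test.

```python
def posterior_infected(sensitivity, specificity, prevalence, tested_positive=True):
    """Return P(COVID | test result) using Bayes' theorem.

    The denominator is split into the two cases described in the video:
    having the disease and getting this result, or being disease-free (DF)
    and getting this result. All names here are illustrative assumptions.
    """
    p_covid = prevalence
    p_df = 1.0 - prevalence
    if tested_positive:
        p_result_given_covid = sensitivity        # P(+ | COVID)
        p_result_given_df = 1.0 - specificity     # P(+ | DF), false positive rate
    else:
        p_result_given_covid = 1.0 - sensitivity  # P(- | COVID), false negative rate
        p_result_given_df = specificity           # P(- | DF)
    numerator = p_result_given_covid * p_covid
    denominator = numerator + p_result_given_df * p_df
    return numerator / denominator


# Toy numbers from the video: 95% sensitive, 99% specific, 1% prevalence.
p_pos = posterior_infected(0.95, 0.99, 0.01, tested_positive=True)
p_neg = posterior_infected(0.95, 0.99, 0.01, tested_positive=False)
print(f"P(COVID | +) = {p_pos:.4f}")
print(f"P(COVID | -) = {p_neg:.6f}")
```

With these inputs the two values come out at about 0.49 and 0.0005, matching the figures quoted in the video. Raising `prevalence` to 0.10 with the same hypothetical test gives roughly 0.91 for P(COVID | +), which illustrates the point that the answer depends on prevalence as well as on the test.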
Info
Channel: Dr. Trefor Bazett
Views: 39,084
Rating: 4.9250002 out of 5
Keywords: Math, Example, Bayes, Bayes' Theorem, Conditional Probability, False Positive, False Negative, Bayesian Inference, Bayesian modelling, COVID, coronavirus, test, Abbott, genetic, antibody, serology, accuracy, specificity, sensitivity, specific, hypothesis, tests, Bayesian Trap, How accurate is COVID-19 testing
Id: VuskwsIW02M
Length: 13min 58sec (838 seconds)
Published: Thu May 21 2020