Does math belong in the courtroom?

Video Statistics and Information

Captions
This video was sponsored by Brilliant.

There's a famous problem known as the birthday problem, or the birthday paradox, which isn't really a paradox; it's just a counterintuitive problem. It says that if you have 23 people in a room, there's roughly a 50% chance that at least two of them have the same birthday, and we just care about the month and day, not the year. One way to think about this is that in a group of 23 people there will be 253 comparisons made, as there are 253 possible pairs that exist. Because of this, the probability grows much faster than you'd expect. I mean, already at 70 people you get to a 99.9 percent chance that two share the same birthday, due to the over 2,400 possible pairs that exist.

But what if you were asked this instead: how many people need to be in a room for there to be a 50% chance that one of them is born on January 1st? This is counterintuitive as well, but in the opposite direction. This answer is now higher than you would think, as it's 253. Yes, the same as the number of pairs we just saw. Now if you asked a random person who's never heard of the birthday paradox this question, they would likely say 182 or 183, half the number of days in a year rounded to an integer. The reason the actual answer is so much higher is that in a room of 183 people we're basically guaranteed that several will have the same birthday, as we just saw. So this might be 183 people, but it won't be 183 different birthdays; it'll be fewer due to overlap. It takes 253 people until we get to 183 expected birthdays with the overlap, and that would account for half the year, giving us the 50% figure.

So in a room of 23 people there's a decent chance, 50%, that two have the same birthday. However, in that same room there's a pretty small chance that any of them were born on a randomly chosen day like January 1st. Both of these are counterintuitive but in very different ways, and we'll see later how this concept came up in a very famous murder case. Now, in
a previous video I mentioned a few cases in which math, or really statistics and probability, were used in the courtroom to convict, or sometimes wrongly convict, a variety of people. To this day I still get comments like "how can people be this stupid?", to which I'd say: have you met people? But sometimes I'll get something like "stats should never be used in the courtroom", and this isn't reasonable either. We often have to use them; we just have to use them correctly.

The case this was in reference to, or at least one of them, was that of Sally Clark, the woman who lost both her children due to SIDS, or sudden infant death syndrome. At the time they weren't sure of this, so she was actually convicted of murdering them, because two kids from the same family dying of SIDS is extremely unlikely. What happened was they calculated the chance of this happening, two kids in the same family dying from SIDS, and found it to be about one in 73 million. So it seemed like something criminal may have been going on, and she was eventually convicted of murder and spent a few years in jail when in fact she was not guilty.

There are several reasons why these stats weren't used correctly, but one big thing that seems to be kind of common in court is to throw out these crazy probabilities, like "oh, there's a one in 100 million chance that this happened". If you ever hear that, it might not mean what you think. Even if the one in 73 million odds were correct, all that means is that if you grabbed a random parent with two kids, there's a one in 73 million chance both of them will die of SIDS. That's reasonable; it doesn't happen often. But you have to realize that events with really low probability happen all the time when you consider everyone in the world. People win lotteries, others are struck by lightning, and so on. These things happen. So one in 73 million, when there are whatever billions of parents in the world, means you're going to find these multiple deaths due to SIDS. But taking all of those cases and saying, oh, there's a
small chance of that happening, then accusing them of murder, is a gross misuse of statistics. It's like taking all the lottery winners and saying, hmm, the chance of winning is pretty low, so you probably cheated. It's just illogical.

Still, that doesn't mean we shouldn't use stats and probability. Take the forgery case that happened in the 1800s. What happened here was a woman named Sylvia Howland had died, leaving her two-million-dollar estate behind. Part of that estate was left to her niece, Hetty Robinson, as mentioned in the will, but that niece claimed that in fact all of it belonged to her, because she said she had an earlier version of a will that said she, the niece, would inherit everything and all other wills were invalid. However, that earlier will was rejected because it was assumed to be a forgery, and thus a lawsuit began that would be decided by mathematics.

Now, with forgery there's typically analysis done to show subtle differences in key parts of the signature, but in this case it was the opposite: the two signatures were so identical it was assumed that one was just a traced copy of the other. Here are the two actual signatures. Is this just a coincidence, or is this proof of a forgery? What they did was take 42 actual signatures made by Sylvia Howland and make comparisons between different pairs. Doing the math, that means there would be 861 pairs of signatures to compare. They looked at all the downstrokes that coincided, basically when there was a perfect match in that part of the signature. They found that, on average, about 20 percent of the time the downstrokes did coincide. As in, of two randomly selected signatures and a randomly selected downstroke of some letter, there's a 20% chance they look basically identical. But there were 30 downstrokes throughout the signature, meaning the probability of two signatures being a perfect match, with all downstrokes looking the same, is, well, one in something sextillion. Thus the court ruled against the niece and claimed it
must be a forgery. Now, would this example be accepted today? Maybe, but it is one of the earliest and most famous uses of mathematics in the courtroom.

Two more recent cases are those of Kristen Gilbert and Lucia de Berk. These are two totally different cases of two different, unrelated women, but both were nurses, and both were accused of secretly murdering their patients by providing overdoses of certain medications. In both cases statistics were used, and in both cases the nurses were sentenced to life in prison. However, only one of them was actually a murderer.

In both cases it took a while for suspicions to come up, but eventually analysis was done on the number of deaths or incidents that occurred in total while the respective nurses were present and while they were absent. In both cases you can see there's definitely a reason to be suspicious: more incidents happened when these nurses were present. But again, if you took every single nurse in the world and ran the same test, wouldn't you likely find some nurses who, just by chance, have similar numbers? And if so, does that mean we should just ignore the numbers?

See, it all came down to hypothesis testing. Well, hypothesis testing was one aspect. To draw a comparison: if you have reason to assume some coin is biased, you could flip it, let's say, 10 times. If you got all heads, does that mean it's biased? Not necessarily; it could have happened by chance with a fair coin. But the results seem to point to some kind of bias, and the more flips you do, the more confident you become of that bias. In the case of Kristen Gilbert, the numbers showed there was less than a 1 in 100 million chance of that many deaths occurring during her shift if everything was random. It'd be like flipping 27 heads in a row. That one in a hundred million doesn't mean what you think, though. Remember, it just means that if you pick a random nurse, there's a small chance of having such a high number of deaths during their shift. But there's not even a hundred million nurses
on the planet, so it's still a low number. Even so, the judge did not allow this to be used in court, simply because juries typically don't understand what these numbers actually mean. It took other evidence to prove Kristen Gilbert was guilty, and yes, she was the murderer, the one where the statistics weren't used.

Now for Lucia de Berk, statistics were used incorrectly. There's a lot to this, but the main thing to understand is that you are allowed to multiply probabilities in certain circumstances, like when events are totally independent. When that criterion is not met, or you're simply not being asked about the intersection of two events, you can get wildly wrong answers after multiplying probabilities. In this case they looked at those same numbers we saw, but for three different hospitals the nurse worked at, and they found the p-values, or the probability of at least that many deaths occurring assuming everything was random. But they multiplied the p-values found for each hospital together to get a final probability, and this was a mistake. It may seem valid: assuming we have three independent tests, then we can multiply. But with p-values, straight multiplication causes the final result to get too small too fast. Just imagine you have a genuinely fair coin. You flip it ten times and you get maybe six heads, which has a p-value of like 0.75, very likely to happen if it's fair. But since it's less than one, if you do this enough, through multiplication you could get the combined result to be as small as you want, making it appear like there is bias. Yes, the overall p-value can be reduced more and more with the more trials and tests you have, but you have to use more sophisticated methods than just multiplication. The one I saw in most sources was Fisher's method, which is used for combining p-values from different independent tests. With the straight multiplication, they found there was a one in 342 million chance of that many deaths occurring throughout Lucia's
shifts if everything was random. The number should have been more like one in a million if they had calculated things correctly, but the media ran with the one in 342 million figure, and 342 million is way more than the number of nurses on the planet. So that, combined with some other evidence, led to Lucia spending around seven years in jail. During that time researchers started to think a miscarriage of justice had occurred, and that a lot of the evidence wasn't entirely true or was calculated incorrectly. In 2010 she was finally released, deemed not guilty after all. She's appeared in media many times since, telling her story, and apparently she did receive compensation for her years spent in prison.

Please note there are more details to all of these cases. In that last one, they found an entry in Lucia's diary that said "today I gave in to my compulsion". When she was asked about this, she said it was in reference to reading tarot cards to patients, something hospitals do not love, but Lucia did enjoy it with friends and family. That story did check out, but at the time it was just laughed at. So it's not just the numbers; the numbers simply played a role, sometimes a big role, in these cases. And for a random jury, if you just start throwing numbers like one in a million at them, it'll be easy to make them think that it means one thing when it really means something else.

Okay, for the people saying we need to chill with the statistics, let's look at a story that involves murder and DNA profiling. Usually DNA analysis is a done deal for people. When they hear "oh, they found John's DNA on this murder weapon", they don't really question that. But how do we know it's John's DNA? Yes, everyone's DNA is unique, and if we knew the exact letter sequence of someone's DNA we could identify them with no question, but that isn't feasible to do right now. So for DNA profiling, while there are many methods, we'll discuss a graph known as an electropherogram, which I know so little about, but here's what the
literature says. These display genetic loci. Humans have millions of these pairs you see, which are all situated at different points along the horizontal axis. However, there are a certain thirteen that are of particular interest because they differ more between different people. For the animations here I just put 13 single spikes rather than 13 pairs because they're just one in B room. But if you picked two people and looked at those 13 pairs for each of them, it's basically a guarantee they wouldn't all be in the same spot on that horizontal axis. Some might, but not all. In fact, it turns out that on average, for any one locus pair, there's roughly a 7.5 percent chance that it will be found in the same horizontal location for two given people. So if you had a DNA profile and just knew the location of one pair of peaks, and you knew Bob has peaks at the same location, that does not mean Bob is your guy, because about 1 in every 13 people will also have peaks at this location. But these are independent, so the chance two people have two pairs in the same locations is about 0.56%. For all 13 locations it comes out to 1 in about a hundred trillion. As in, given a full set of these 13 pairs and a hundred trillion hypothetical people lined up, actually more, you'd probably find one match. Now if you find that Bob is a match, let's say, you basically know he's the only match and a prime suspect, since, well, 100 trillion is well beyond the number of people that have ever existed on this planet. But still, it is probability.

Now time for the murder story, which I'm going to keep real short. In December of 1972, a nurse named Diana Sylvester (that's the third nurse in this video) came home from work after an all-night shift. Soon after, her landlord heard oddly loud noises from upstairs. She went to investigate, and when she got to the room a man was standing there in the doorway who apparently said "go away, we're making love", and he said it in an aggressive manner. The landlord was still suspicious, so they went to call
the police, and when the police showed up they found Diana Sylvester dead beneath a lit Christmas tree. There was DNA evidence that was gathered, but at this time DNA profiling wasn't available, so all the police could do was go off of the landlord's description of the man, which would be these here. From this the police were able to convict no one, and the case went cold for 30 years, until 2003, when DNA profiling was available and law enforcement got the funding to use these techniques on old unsolved cases. With this they found a match, one match, named John Puckett. This guy had previously been convicted of rape years ago, and although he was old now, past photos showed that he matched all those descriptions given by the landlord 30 years earlier.

Seems like a guaranteed conviction, but there were two problems. One, the DNA sample they used had been degraded, so of those 13 pairs we need to analyze for a basically guaranteed match, we only had a few. And on top of that, there was research going on suggesting that the 7.5 percent figure was in fact very wrong. Yes, DNA profiling has been questioned a lot, and here's a reason why. If there's a 7.5 percent chance two people share a certain pair of those genetic spikes, then the chance two people share nine of them is one in thirteen billion. As in, given nine of these pairs, you can expect one in thirteen billion people to have exactly those in the same locations. The problem was there was a recent study done on a sample of 60,000 people, looking at nine of those locus pairs, and that's any nine of the 13 on file for each person, and they found 90 matches. 90 matches out of 60,000 people; shouldn't it be one in thirteen billion? There's got to be something wrong, which is why they thought the 7.5 percent was incorrect. Except it's not at all. If the one in thirteen billion figure is correct, which heavy research suggested, then the expected matches would be 98 in a sample of 60,000 people. The study was pretty on par with expectation. So how does one in 13
billion translate to 90 matches in a group of 60,000? Well, the same way that a 1 in 365 chance of having some similarity, like a birthday, translates to a very possible one match in a group of just 23. In both cases the question comes down to pairs of people and how many share any birthday or DNA profile, not a specific one. See, if you were given a single DNA profile with those 9 pairs, like one from a murder scene, and asked what the chance is that someone in this group is a match, that is like asking: if I were given some date, what is the chance a person has that birthday? Now the probability is much lower because you're looking for something specific. It's going to take many more people before you're likely to find that given birthday or DNA profile, so a match is more meaningful. Just finding any match among two or more people is more likely, because there are way more comparisons that have to be made: everyone to one another, rather than everyone to just one desired target.

So think about it like this. You may find a birthday match in a group of 23 people, and you may find several matches in a group of like 50, but that does not change how rare it is for someone to have any given birthday. It's 1 in 365. Using that reasoning, we may find 90 matches in this group of 60,000 from all the pairs that exist, but that does not change how rare it is for someone to have a specific set of 9 genetic loci, which, yes, comes out to one in about 13 billion people. So these researchers defending the accused murderer said, hey, our understanding of DNA profiling is actually wrong; how else could there be 90 matches? But really these numbers were all accurate. It was the birthday paradox and analyzing pairs that they overlooked.

Okay, so the numbers are correct, meaning John Puckett is looking more guilty. But unfortunately there weren't nine of those pairs in the actual DNA sample. There were only five, plus a few others whose magnitude was extremely low, so we couldn't make a super confident match on those. Now, with
basically five pairs, the probability comes out to about one in a million people having these same five. We've seen this before, though: one in a million isn't much when considering everyone in the world. If one in a million people have those five pairs, then even in a country of 300 million (yes, this was in the US) you'd find hundreds of matches. So instead of a one in a million chance of him being innocent, there's not even a 1% chance of him being guilty. There are hundreds of other people you could gather that would all have the same DNA profile. You guys see how chaotic this can get? We can't just ignore the stats, especially since DNA profiling is essentially statistics, but it's so easy for people to get lost in these numbers.

They did look beyond all this, though. They realized, yes, John Puckett was the right age, was a previously convicted rapist, fit all those descriptions, and matched at least five of the 13 site locations. So you watching this video right now, some, maybe many, of you are saying "oh my god, convict this guy, he's guilty", and what you're doing is basically just statistics. You're saying: what are the odds there happens to be another guy that aligns so perfectly with all that data? You might not have a number, but it's not 100% confidence, and there isn't some percentage that is required for a conviction to be made. But unless we know for sure it's this guy, which we don't, you're making a decision based on probability, whether you know the actual numbers or not. From what I read this man could be innocent. It doesn't seem likely to me, but I didn't read about a confession or whatever. But he was convicted and sent to jail.

Now, all of these cases aren't a matter of what is right and wrong to do. I don't think there is really a right answer regarding how to use numbers in courts besides "use them correctly", and there's a reason we have to prove people are guilty beyond a reasonable doubt. You'll find there's rarely a 100 percent chance that someone is guilty. Even with Ted Bundy, people had
their doubts, and unfortunately this means that occasionally innocent people will be sent to prison. We can't always be sure, but if we required 100% certainty then most murderers and rapists would be set free. The thing we can take away from this, though, is that it is important for people to know it's okay to question numbers and be cautious of taking them at face value, because often there is some other meaning very much hidden beneath the surface.

Now, being more exposed to these kinds of problems and getting hands-on practice with statistical reasoning and probability is definitely the best way to become more comfortable with all the numbers you come across every day, which is why I'm happy to have Brilliant as a sponsor of this video. Brilliant actually has a few courses that relate to what we saw here. One would be their Applied Probability course, which covers a variety of probability concepts and shows you how they're put to use to solve real-world problems. This even includes the case of Sally Clark, as we saw earlier in this video, as well as some of the mistakes that were made in the calculations that were carried out during this trial. Or you can explore Physics of the Everyday, which has some really interesting applications of math and physics, like bloodstain pattern analysis, car collisions, tire tracks, and more. What I like about Brilliant is that they include all these intuitive animations and visuals, plus they always test you on your knowledge, which is the best way to make sure you have a true understanding of whatever it is you're learning. On top of what we've seen, they have dozens of other courses in math, science, and engineering for you to choose from. Also, the first 200 people to sign up with the link below, or by going to brilliant.org/ZachStar, will get 20% off their annual premium subscription.

And with that I'm going to end that video there. Thanks as always to my supporters on Patreon. Social media links to follow me are down below, and I'll
see you guys in the next video.
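
The arithmetic quoted in the captions can be checked with a short Python sketch. This is only an illustration under stated assumptions, not the method used in the video or in any of the court cases: it assumes 365 equally likely birthdays, it assumes each of the 13 locus pairs matches independently with probability 0.075 (the 7.5 percent figure from the captions), and every function name here is made up for this example. The expected-match estimate also counts each set of nine loci separately, so a pair of people matching on ten or more loci is counted more than once; treat that number as a rough estimate.

```python
from math import comb


def p_shared_birthday(n, days=365):
    """Chance that at least two of n people share a birthday (any date)."""
    p_all_distinct = 1.0
    for k in range(n):
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct


def p_someone_has_date(n, days=365):
    """Chance that at least one of n people was born on one fixed date."""
    return 1.0 - ((days - 1) / days) ** n


def expected_partial_matches(people, loci_total=13, loci_matched=9, p_locus=0.075):
    """Rough expected number of pairs of people agreeing on some 9 of 13 loci.

    Assumption: loci match independently, each with probability p_locus.
    """
    pairs_of_people = comb(people, 2)
    ways_to_pick_loci = comb(loci_total, loci_matched)
    return pairs_of_people * ways_to_pick_loci * p_locus ** loci_matched


def two_sided_p_fair_coin(heads, flips):
    """Two-sided p-value for a fair coin: chance of a result at least as far
    from flips/2 as the observed number of heads."""
    distance = abs(heads - flips / 2)
    favourable = sum(
        comb(flips, k) for k in range(flips + 1) if abs(k - flips / 2) >= distance
    )
    return favourable / 2 ** flips


# Pair counts mentioned in the captions.
print(comb(23, 2), comb(70, 2), comb(42, 2))  # 253 2415 861

# Any shared birthday versus one specific birthday.
print(p_shared_birthday(23))    # roughly 0.507
print(p_someone_has_date(23))   # roughly 0.06
print(p_someone_has_date(253))  # just over 0.5

# One specific nine-locus profile versus any matching pair in a database.
print(1 / 0.075 ** 9)                    # roughly 1.3e10, "one in 13 billion"
print(expected_partial_matches(60_000))  # on the order of 100 pairs

# Why multiplying p-values misleads: twenty unremarkable results from a
# fair coin (six heads in ten flips each) multiply to a tiny number.
p = two_sided_p_fair_coin(6, 10)
print(p, p ** 20)
```

The same contrast shows up in both halves of the output: comparing everyone with everyone produces matches quickly, while comparing everyone with one fixed target (a date, or a crime-scene profile) does not. Combining p-values properly needs something like Fisher's method, which the captions mention, rather than the plain product shown in the last lines.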
Info
Channel: Zach Star
Views: 276,277
Rating: 4.9407439 out of 5
Id: wgWNtlz-2vM
Length: 22min 40sec (1360 seconds)
Published: Tue Mar 31 2020