(clicking sounds) - [Brandon] Hello and welcome. Brandon here, thanks
for choosing my video. If you like the video,
please give it a thumbs up. If you think someone you
know can also benefit by watching, please share. And, as always, please subscribe,
I appreciate it very much. So, let's go ahead and get started. So, here we are in video
two of logistic regression. Now, if you did not watch video one, I highly recommend going
back and watching that one, and then coming back to this one. So, the first thing that we're gonna do in this video is just a
basic review of probability. So, remember that probability
is the number of outcomes of interest divided by the number of
all possible outcomes. Let's look at a few examples. So, let's say we flip a fair coin. The probability of heads is one over two. The outcome of interest
is flipping heads up out of two possibilities, heads or tails, so the probability is point five. How about rolling a fair die? So, what's the probability
of rolling a one or a two? So, that's two outcomes of
interest divided by six possible outcomes: one, two, three, four, five, or six. That's one over three, or
a probability of .333, or 1/3. How about a deck of playing cards? Standard playing cards. So, what's the probability
of randomly pulling out a card that's a diamond? So, there are 13 of each suit. 13 diamonds, 13 hearts,
13 clubs, and 13 spades in a normal deck of cards. The probability of pulling out a diamond is 13 out of those 52, which
is obviously 1/4, or .25. So, again that's just basic
probability, and really that's all you need to
understand about probability to grasp the basics of
logistic regression. Okay, so that was probability. Now, what about odds? What are odds? You hear about them every day. So, the odds is the probability
of something occurring divided by the probability
of it not occurring. So, the probability of an event divided by the probability of a non-event, an event not occurring. So, we think of it as the
odds is p, the probability, divided by one minus p. Remember, probability can
only be as high as one. So, if p is the probability,
the probability of it not happening is one minus p. Let's go ahead and look at our earlier examples again, in this context. So, how about flipping a fair coin? So, the odds of getting
heads is point five, that's your probability of getting heads, of the event occurring,
divided by point five. Which, again, is the
probability of it not occurring. Which, in this case is getting a tails. So, for this fair coin
flip it's point five divided by point five, or one. So, the odds are one. Sometimes you'll see them
written one to one, or 1:1. So, this means that odds are even, and that makes sense in this case, cause we're flipping a fair coin. So, how about rolling
your fair die from before? So, what are the odds of a one or a two? Now, the probability of a one
or a two is .333 repeating. We found that out from the last slide. So, that means the probability of not getting a one or a two is .666 repeating. So, we divide that out. That's one divided by two, or point five. So, in this case the odds
of getting a one or a two is 1/2, or .5, or you can write it as 1:2. Now, how about our deck of playing cards? So, what are the odds of
pulling a diamond card out? Now, we found out the
probability is .25 or 1/4. So, that means the probability of not pulling out a diamond card is the remainder, or .75. So, .25 divided by .75, that's 1/3, or .333 repeating,
or the odds are 1:3. So, again the odds are
related to the probability, but they're expressed in a different way: as the probability of an event occurring divided by the probability
of that same event not occurring. So, we've talked about probability, we've talked about odds. Now, we're going to talk
about the odds ratio. Now, the odds ratio is
exactly what it says it is. It's a ratio of two odds. So, remember our fair coin
flip from the last slide. The probability of heads is point five, and therefore the odds of
getting heads is one, or 1 to 1. Now, let's say we have an
unfair coin, or a loaded coin. Now, in this coin the probability of getting heads is point
seven, not point five. That means the odds of getting heads is point seven divided by the probability of not getting heads. Which in this case is only point three. The probability of tails is point three. So, we divide those and we end up with odds of getting heads of 2.333 repeating for this loaded coin. So, for a fair coin the odds of heads are one to one, and for the loaded coin flip down here at the bottom, the odds are 2.333 to 1
in favor of getting heads. So, the odds ratio is
just a ratio of two odds. Now, if we wrote everything
out it would look like this. So, on the top we have the
odds for the first event. So, remember this is just how we figured out odds from the slide before. So, it's the probability
of event one divided by one minus the probability of event one. So, that is the odds for
that event there on the top. Now, on the bottom we have the same thing for the other event. So, it's the probability for
the event on the bottom divided by one minus the
probability of that event. So, we just have two odds
stacked on top of each other. Now, in this case we can just go ahead and plug everything in. Now, I usually put the
larger odds on the top, or the larger probability on top. You don't have to, it
doesn't affect it in any way. You just wanna make sure you interpret it correctly once you do the calculation. So, I'm gonna put the loaded coin on top. So, we have point seven
divided by point three, that's our loaded coin,
divided by our fair coin. Which is point five divided by point five. So, we go ahead and cross-multiply
those two fractions, point seven times point five over point three times point five, and we end up with point three five divided by point one five, and when we do that division
we have an odds ratio of two point three three three. Now, in this case that comes out very easy because the fair coin odds is one. So, that is why the loaded coin's odds are the same as the odds
ratio over on the right. It just happens to be
how it is in this example by sheer luck basically. So, what does this mean? It means the odds of getting heads on the loaded coin are two
point three three three times the odds on the fair coin. So, this is how the probability,
odds, and odds ratio work. And it is central to
understanding and interpreting the output from logistic regression. So, speaking of the odds ratio. Let's go ahead and talk about the role of the odds ratio in logistic regression. Now, in this slide we
will use a very brief, very simple example. It is not related to
our overarching problem on home mortgages, but
again just to give you a quick insight into how
to interpret the odds ratio from the output of a computer, if you need to do that very quickly. So, what is the odds ratio? Well, in logistic
regression the odds ratio for a variable, an independent variable, represents how the odds change
with a one unit increase in that variable holding all
other variables constant. Now, if you're new to logistic regression that may not make a
lot of sense right now. But, hopefully in the next
minute or two it will. Let's just look at a fictitious example. Let's say we were looking
at a study that involved a person's body weight, and
whether or not they have sleep apnea. If you don't know, sleep
apnea is a condition where people stop breathing momentarily and often repeatedly in their sleep. Now, of course that can cause
a lot of health problems. So, we're going to look at how body weight is related to whether or not a person is diagnosed with sleep apnea. So, we did this analysis in SPSS, or R, or Minitab, or whatever, and our weight variable had an odds ratio in the output of one point zero seven. Now, what does that mean? Well, this means that a one pound increase in body weight increases the
odds of having sleep apnea by a factor of one point zero seven. Now, that also means a seven percent increase. So, .07 is seven percent. If it were a factor of two, that would be a 100 percent increase. That's how we can kind of go back and forth between
odds ratios and percentages. So, this is not very high
because we're looking at only one pound
increments in body weight. Which is actually a relatively small unit to measure in. Now, using that information we can also find out some other things for other amounts of body weight. So, a ten pound increase in body weight multiplies the odds by 1.98, or increases the odds by 98 percent, or almost doubles a person's odds of having sleep apnea, and a 20 pound increase in body weight multiplies the odds by 3.87, or almost a factor of four, and we will learn how to do
these calculations later. The thing about logistic
regression is that this holds true at any point
in the weight spectrum. So, if I went from a weight
of 200 pounds to 201 pounds, the odds ratio would be one point zero seven. If I went from 150 pounds to 151 pounds, the odds ratio would still
be one point zero seven. If I looked at a 10 pound increase, let's say 200 to 210,
that would have the same odds ratio of one point
nine eight as going from 130 to 140; it would still be an odds ratio of one point nine eight. So, the odds ratio holds
true for any interval of that same size along
the weight spectrum. And again, we'll talk
about that in future videos as we go forward. The last slide is a warning. It is very important to
separate probability and odds. In the previous example a
person gaining 20 pounds increases their odds of
sleep apnea by almost a factor of four, regardless
of their starting weight. Cause remember, that 20
pound increase applies at any point in the weight spectrum. However, the probability
of having apnea is lower in people with lower body
weight to begin with. So, why is that important? So, while the odds are almost four times greater, the probability may still be low. We have to separate odds and probability. So, even though gaining 20 pounds increases the odds by almost a factor of four, the reality is that
people with lower body weight have a starting probability
that's low to begin with. So, basically what this
means is that the odds can have a large magnitude change even if the underlying probabilities are low. And here's the last example
just off the top of my head. Let's say we have two probabilities, the first probability is that
you are struck by lightning. The second probability is
that you are hit by a meteor falling out of the sky. Now, the probability of
either one of those happening is minuscule. Very, very, very low. However, the probability of
being struck by lightning is higher than being hit by a
meteor falling out of the sky. So, the odds of being hit by lightning are probably going to be much higher than the odds of being hit by a meteor, even though the
probabilities to begin with are very, very, very low. So, we have to keep in mind the difference between probability and
odds, as we go forward interpreting logistic regression problems. So, I'll see you in video number three. Thanks for watching. (clicking sounds)
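
For anyone who wants to check the arithmetic from this video, here is a minimal Python sketch. It is not from the video itself. The function names are made up for illustration, and the two starting probabilities near the bottom (0.01 and 0.30) are hypothetical values chosen only to illustrate the warning about separating odds and probability. It also assumes the usual rule that a k-pound increase multiplies the odds by the one-pound odds ratio raised to the power k, which the video defers to a later lesson.

```python
# Sketch of the video's arithmetic. Function names and the two starting
# probabilities at the bottom are illustrative assumptions, not from the video.

def odds(p):
    """Odds of an event: probability it occurs over probability it does not."""
    return p / (1 - p)


def probability_from_odds(o):
    """Convert odds back to a probability: o / (1 + o)."""
    return o / (1 + o)


def odds_ratio(p_top, p_bottom):
    """Ratio of two odds, with the first event's odds on top."""
    return odds(p_top) / odds(p_bottom)


# Odds for the three probability examples.
coin_odds = odds(1 / 2)     # fair coin, heads
die_odds = odds(2 / 6)      # fair die, a one or a two
card_odds = odds(13 / 52)   # standard deck, a diamond

# Loaded coin (p = 0.7) on top, fair coin (p = 0.5) on the bottom.
coin_or = odds_ratio(0.7, 0.5)

# Sleep apnea example: odds ratio of 1.07 per one-pound increase.
# Assumption: a k-pound increase multiplies the odds by 1.07 ** k.
or_per_pound = 1.07
or_10_pounds = or_per_pound ** 10
or_20_pounds = or_per_pound ** 20


def shifted_probability(p_start, multiplier):
    """Probability after multiplying the starting odds by `multiplier`."""
    return probability_from_odds(odds(p_start) * multiplier)


# The warning: the same odds multiplier applied to two hypothetical
# starting probabilities (0.01 and 0.30 are made up for illustration).
low_start = shifted_probability(0.01, or_20_pounds)
high_start = shifted_probability(0.30, or_20_pounds)

print(f"odds: coin={coin_odds:.3f}, die={die_odds:.3f}, card={card_odds:.3f}")
print(f"odds ratio, loaded vs fair coin: {coin_or:.3f}")
print(f"10 lb: {or_10_pounds:.2f}, 20 lb: {or_20_pounds:.2f}")
print(f"after 20 lb: low start {low_start:.3f}, high start {high_start:.3f}")
```

With an odds ratio of exactly 1.07, the 10-pound figure computes to about 1.97 rather than the 1.98 quoted, which may reflect rounding in the quoted odds ratio; the 20-pound figure matches at about 3.87. In the hypothetical comparison, the same multiplier moves a 1 percent starting probability to just under 4 percent, but moves a 30 percent starting probability to about 62 percent, which is the point of the warning.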