PROFESSOR: Let us start. So as always, we're going to have
a quick review of what we discussed last time. And then today we're going
to introduce just one new concept, the notion of
independence of two events. And we will play with
that concept. So what did we talk
about last time? The idea is that we have an
experiment, and the experiment has a sample space omega. And then somebody comes and
tells us, you know, the outcome of the experiment happens to
lie inside this particular event B. Given this information,
it kind of changes what we know about
the situation. It tells us that the outcome
is going to be somewhere inside here. So this is essentially
our new sample space. And now we need to reassign
probabilities to the various possible outcomes, because, for
example, these outcomes, even if they had positive
probability beforehand, now that we're told that B occurred,
those outcomes out there are going to have
zero probability. So we need to revise
our probabilities. The new probabilities are
called conditional probabilities, and they're
defined this way. The conditional probability that
A occurs given that we're told that B occurred is
calculated by this formula, which tells us the following-- out of the total probability
that was initially assigned to the event B, what fraction of
that probability is assigned to outcomes that also
make A happen? So out of the total probability
assigned to B, we see what fraction of that total
probability is assigned to those elements here that
will also make A happen. Conditional probabilities are
left undefined if the denominator here is zero. An easy consequence of the
definition is if we bring that term to the other side, then we
can find the probability of two things happening by taking
the probability that the first thing happens, and then, given
that the first thing happened, the conditional probability that
the second one happens. Then we saw last time that we
can divide and conquer in calculating probabilities of
mildly complicated events by breaking it down into
different scenarios. So event B can happen
in two ways. It can happen either together
with A, which is this probability, or it can happen
together with A complement, which is this probability. So basically what we're saying is
that the total probability of B is the probability of this,
which is A intersection B, plus the probability of that,
which is A complement intersection B. So these two facts here,
multiplication rule and the total probability theorem, are
basic tools that one uses to break down probability
calculations into simpler parts. So we find probabilities of
two things happening by looking at each one at a time. And this is what we do to break
up a situation with two different possible scenarios.
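(A minimal Python sketch of these tools on a toy finite model; this is not from the lecture, and the outcome labels, probability values, and helper names prob and cond_prob are illustrative assumptions.)

```python
# Toy finite model: outcomes mapped to probabilities (values are arbitrary).
model = {"aa": 0.2, "ab": 0.1, "ba": 0.3, "bb": 0.4}

def prob(event):
    """Total probability of an event, given as a set of outcomes."""
    return sum(p for outcome, p in model.items() if outcome in event)

def cond_prob(A, B):
    """P(A | B) = P(A and B) / P(B); left undefined (an error) if P(B) = 0."""
    pB = prob(B)
    if pB == 0:
        raise ValueError("conditioning on a zero-probability event")
    return prob(A & B) / pB

A = {"aa", "ab"}   # event: first symbol is 'a'
B = {"ab", "bb"}   # event: second symbol is 'b'

# Multiplication rule: P(A and B) = P(A) * P(B | A)
assert abs(prob(A & B) - prob(A) * cond_prob(B, A)) < 1e-12

# Total probability theorem: P(B) = P(A) P(B | A) + P(A complement) P(B | A complement)
A_c = set(model) - A
assert abs(prob(B) - (prob(A) * cond_prob(B, A) + prob(A_c) * cond_prob(B, A_c))) < 1e-12

print(cond_prob(A, B))   # P(A | B) = 0.1 / 0.5 = 0.2
```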
Then we also have the Bayes rule, which does the following. Given a model that has
conditional probabilities of this kind, the Bayes rule
allows us to calculate conditional probabilities in
which the events appear in different order. You can think of these
probabilities as describing a causal model of a certain
situation, whereas these are the probabilities that you get
after you do some inference based on the information that
you have available.
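(For reference, the half-line derivation mentioned next, written out in standard two-event notation; this is the generic form, not copied from the slides.)

```latex
P(A \mid B) = \frac{P(A \cap B)}{P(B)}
            = \frac{P(A)\,P(B \mid A)}{P(A)\,P(B \mid A) + P(A^{c})\,P(B \mid A^{c})}
```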
Now the Bayes rule, we derived it, and it's a trivial half-line calculation. But it underlies lots
and lots of useful things in the real world. We had the radar example
last time. You can think of more
complicated situations in which there's a bunch or lots of
different hypotheses about the environment. Given any particular setting in
the environment, you have a measuring device that
can produce many different outcomes. And you observe the final
outcome out of your measuring device, and you're trying
to guess which particular branch occurred. That is, you're trying to guess
the state of the world based on a particular
measurement. That's what inference
is all about. So real world problems only
differ from the simple example that we saw last time in that
this kind of tree is a little more complicated. You might have infinitely
many possible outcomes here and so on. So setting up the model may be
more elaborate, but the basic calculation that's done based
on the Bayes rule is essentially the same as
the one that we saw. Now something that we discussed
is that sometimes we use conditional probabilities to
describe models, and let's do this by looking at a
model where we toss a coin three times. And how do we use conditional
probabilities to describe the situation? So we have one experiment. But that one experiment consists
of three consecutive coin tosses. So the possible outcomes, our
sample space, consists of strings of length 3 that tell
us whether we had heads, tails, and in what sequence. So three heads in a row is
one particular outcome. So what is the meaning
of those labels in front of the branches? So this P here, of course,
stands for the probability that the first toss
resulted in heads. And let me use this notation
to denote that the first was heads. I put an H in toss one. How about the meaning of
this probability here? Well the meaning of this
probability is a conditional one. It's the conditional probability
that the second toss resulted in heads,
given that the first one resulted in heads. And similarly this label here
corresponds to the probability that the third toss resulted in
heads, given that the first one and the second one
resulted in heads. So in this particular model that
I wrote down here, those probabilities, P, of obtaining
heads remain the same no matter what happened in
the previous toss. For example, even if the first
toss was tails, we still have the same probability, P, that
the second one is heads, given that the first one was tails. So we're assuming that no matter
what happened in the first toss, the second toss will
still have a conditional probability equal to P. So that
conditional probability does not depend on what happened
in the first toss. And we will see that this is a
very special situation, and that's really the concept of
independence that we are going to introduce shortly. But before we get to
independence, let's practice once more the three skills that
we covered last time in this example. So first skill was
multiplication rule. How do you find the
probability of several things happening? That is the probability that
we have tails followed by heads followed by tails. So here we're talking about this
particular outcome here, tails followed by heads
followed by tails. And the way we calculate such
a probability is by multiplying conditional
probabilities along the path that takes us to this outcome. And so these conditional
probabilities are recorded here. So it's going to be (1 minus P)
times P times (1 minus P). So this is the multiplication
rule. Second question is how do we
find the probability of a mildly complicated event? So the event of interest here
that I wrote down is the probability that in the
three tosses, we had a total of one head. Exactly one head. This is an event that can
happen in multiple ways. It happens here. It happens here. And it also happens here. So we want to find the total
probability of the event consisting of these
three outcomes. What do we do? We just add the probabilities
of each individual outcome. How do we find the probability
of an individual outcome? Well, that's what we just did. Now notice that this outcome
has probability P times (1 minus P) squared. That one should not be there. So where is it? Ah. It's this one. OK, so the probability of this
outcome is (1 minus P) times P times (1 minus P), the
same probability. And finally, this one is again
(1 minus P) squared times P. So this event of one head can
happen in three ways. And each one of those three ways
has the same probability of occurring. And this is the answer. And finally, the last thing that
we learned how to do is to use the Bayes rule to make an inference. So somebody tells you that there
was exactly one head in your three tosses. What is the probability
that the first toss resulted in heads? OK, I guess you can guess the
answer here if I tell you that there were three tosses. One of them was heads. Where was that head
in the first, the second, or the third? Well, by symmetry, they should
all be equally likely. So there should be probably
just 1/3 that that head occurred in the first toss. Let's check our intuition
using the definitions. So the definition of conditional
probability tells us the conditional probability
is the probability of both things happening. First toss is heads, and we have
exactly one head divided by the probability
of one head. What is the probability that the
first toss is heads, and we have exactly one head? This is the same as the event
heads, tails, tails. If I tell you that the first is
heads, and there's only one head, it means that the
others are tails. So this is the probability of
heads, tails, tails divided by the probability of one head. And we know all of these
quantities: the probability of heads, tails, tails is P times
(1 minus P) squared. Probability of one
head is 3 times P times (1 minus P) squared. So the final answer is 1/3,
which is what you should have guessed on intuitive grounds. Very good.
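(A short Python check of the three calculations above for a generic bias; the value p = 0.6 is an arbitrary illustration, and the enumeration only loosely mirrors the tree on the slide.)

```python
from itertools import product

p = 0.6   # arbitrary bias, for illustration; the algebra below holds for any p

# Enumerate the 8 outcomes of three independent tosses with P(H) = p.
outcomes = {
    seq: (p if seq[0] == "H" else 1 - p)
       * (p if seq[1] == "H" else 1 - p)
       * (p if seq[2] == "H" else 1 - p)
    for seq in product("HT", repeat=3)
}

# Multiplication rule: P(T, H, T) = (1 - p) * p * (1 - p)
assert abs(outcomes[("T", "H", "T")] - (1 - p) * p * (1 - p)) < 1e-12

# Total probability of exactly one head: 3 * p * (1 - p)^2
one_head = sum(pr for seq, pr in outcomes.items() if seq.count("H") == 1)
assert abs(one_head - 3 * p * (1 - p) ** 2) < 1e-12

# Bayes: P(first toss is heads | exactly one head) = 1/3, whatever p is.
print(outcomes[("H", "T", "T")] / one_head)   # 0.3333...
```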
So we got our practice on the material that we covered last time. Again, there are three basic
skills that we are practicing and exercising here. In the problems, quizzes, and in
real life, you may have to apply those three skills in
somewhat more complicated settings, but in the
end that's what it boils down to usually. Now let's focus on this special
feature of this particular model that I
discussed a little earlier. Think of the event heads
in the second toss. Initially, the probability of
heads in the second toss is P, the
probability of success of your coin. If I tell you that the first
toss resulted in heads, what's the probability that the
second toss is heads? It's again P. If I tell you that
the first toss was tails, what's the probability that
the second toss is heads? It's again P. So whether I tell
you the result of the first toss, or I don't tell
you, it doesn't make any difference to you. You would always say the
probability of heads in the second toss is going to be P, no
matter what happened in the first toss. This is a special situation to
which we're going to give a name, and we're going to call
that property independence. Basically independence between
two things stands for the fact that the first thing, whether
it occurred or not, doesn't give you any information, does
not cause you to change your beliefs about the
second event. This is the intuition. Let's try to translate this
into mathematics. We have two events, and we're
going to say that they're independent if your initial
beliefs about B are not going to change if I tell you
that A occurred. So you believe something
about how likely B is. Then somebody comes and tells
you, you know, A has happened. Are you going to change
your beliefs? No, I'm not going
to change them. Whenever you are in such a
situation, then you say that the two events are
independent. Intuitively, the fact that A
occurred does not convey any information to you about the
likelihood of event B. The information that A provides
is not so useful, is not relevant. A has to do with
something else. It's not useful for
guessing whether B is going to occur or not. So we can take this as a first
attempt at a definition of independence. Now remember that we have this
property, the probability of two things happening is the
probability of the first times the conditional probability
of the second. If we have independence, this
conditional probability is the same as the unconditional
probability. So if we have independence
according to that definition, we get this property that you
can find the probability of two things happening by just
multiplying their individual probabilities. Probability of heads in
the first toss is 1/2. Probability of heads in the
second toss is 1/2. Probability of heads
heads is 1/4. That's what happens if your two
tosses are independent of each other. So this property here is
a consequence of this definition, but it's actually
nicer, better, simpler, cleaner, more beautiful to take
this as our definition instead of that one.
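(The two candidate definitions written side by side; when P(A) > 0 they say the same thing, and the product form is the one adopted below.)

```latex
P(B \mid A) = P(B) \quad (\text{requires } P(A) > 0)
\qquad \text{versus} \qquad
P(A \cap B) = P(A)\,P(B)
```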
Are the two definitions equivalent? Well, they're almost the
same, except for one thing. Conditional probabilities are
only defined if you condition on an event that has positive
probability. So this definition would be
limited to cases where event A has positive probability,
whereas this definition is something that you can
write down always. We will say that two events are
independent if and only if their probability of happening
simultaneously is equal to the product of their two individual
probabilities. And in particular, we can have
events of zero probability. There's nothing wrong
with that. If A has 0 probability, then A
intersection B will also have zero probability, because it's
an even smaller event. And so we're going to get
zero is equal to zero. A corollary of what I just said is that
if an event A has zero probability, it's actually
independent of any other event in our model, because
we're going to get zero is equal to zero. And the definition is going
to be satisfied. This is a little bit harder to
reconcile with the intuition we have about independence, but
then again, it's part of the mathematical definition. So what I want you to retain
is this notion that independence is something that
you can check formally using this definition, but also you
can check intuitively if, in some cases, you can reason
that whatever happens and determines whether A is going
to occur or not, has absolutely nothing to do with whatever
happens and determines whether B is going to occur or not. So if I'm doing a science
experiment in this room, and it gets hit by some noise that
causes randomness. And then five years later,
somebody somewhere else does the same science experiment
and it gets hit by other noise, you would usually
say that these experiments are independent. So what events happen in one
experiment are not going to change your beliefs about what
might be happening in the other, because the sources of
noise in these two experiments are completely unrelated. They have nothing to
do with each other. So if I flip a coin here today,
and I flip a coin in my office tomorrow, one shouldn't
affect the other. So the events that I get from
these should be independent. So that's usually how
independence arises: by having distinct physical phenomena that do not interact. Sometimes you also get
independence even though there is a physical interaction, but
you just happen to have a numerical accident. A and B might be physically
related very tightly, but a numerical accident happens and
you get equality here, that's another case where we
do get independence. Now suppose that we have
two events that are laid out like this. Are these two events
independent or not? The picture kind of tells
you that one is separate from the other. But separate has nothing
to do with independent. In fact, these two events are as
dependent as Siamese twins. Why is that? If I tell you that A occurred,
then you are certain that B did not occur. So information about the
occurrence of A definitely affects your beliefs about the
possible occurrence or non-occurrence of B. When the
picture is like that, knowing that A occurred will change
drastically my beliefs about B, because now I suddenly
become certain that B did not occur. So a picture like this is a
case actually of extreme dependence. So don't confuse independence
with disjointness. They're very different
types of properties. AUDIENCE: Question. PROFESSOR: Yes? AUDIENCE: So I understand
the explanation, but the probability of A intersect B
[INAUDIBLE] to zero, because they're disjoint. PROFESSOR: Yes. AUDIENCE: But then the product
of probability A and probability B, one of them
is going to be 1. [INAUDIBLE] PROFESSOR: No, suppose that
the probabilities are 1/3, 1/4, and the rest
is out there. You check the definition
of independence. Probability of A intersection
B is zero. Probability of A times the
probability of B is 1/12. The two are not equal. Therefore we do not
have independence. AUDIENCE: Right. So what's wrong with the
intuition of the probability of A being 1, and the
other one being 0? [INAUDIBLE]. PROFESSOR: No. The probability of A given
B is equal to 0. Probability of A is
equal to 1/3. So again, these two
are different. So we had some initial beliefs
about A, but as soon as we are told that B occurred, our
beliefs about A changed. And so since our beliefs
changed, that means that B conveys information about A. AUDIENCE: So can you not draw
independent [INAUDIBLE] on a Venn diagram? PROFESSOR: I can't hear you. AUDIENCE: Can you draw independence on a Venn diagram? PROFESSOR: No, the Venn diagram
is never enough to decide independence. So the typical picture in which
you're going to have independence would be one event
this way, and another event this way. You need to take the probability
of this times the probability of that, and check
that, numerically, it's equal to the probability of
this intersection. So it's more than
a Venn diagram. Numbers need to come
out right. Now we did say some time ago
that conditional probabilities are just like ordinary
probabilities, and whatever we do in probability theory
can also be done in conditional universes, talking about conditional
probabilities. So since we have a notion of
independence, then there should be also a notion of
conditional independence. So independence was defined
by the probability that A intersection B is equal to the
probability of A times the probability of B. What would be a reasonable
definition of conditional independence? Conditional independence would
mean that this same property could be true, but in a
conditional universe where we are told that a certain
event happens. So if we're told that the event
C has happened, then we're transported into a
conditional universe where the only things that matter are
conditional probabilities. And this is just the same plain,
previous definition of independence, but applied in
a conditional universe. So this is the definition of
conditional independence. So it's independence, but with
reference to the conditional probabilities.
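(Written out, with C a conditioning event of positive probability, the definition being described is:)

```latex
P(A \cap B \mid C) = P(A \mid C)\,P(B \mid C)
```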
And intuitively it has, again, the same meaning: that in the conditional world, if I tell you
that A occurred, then that doesn't change your
beliefs about B. So suppose you had a
picture like this. And somebody told you that
events A and B are independent unconditionally. Then somebody comes and tells
you that event C actually has occurred, so we now live
in this new universe. In this new universe, is the
independence of A and B going to be preserved or not? Are A and B independent
in this new universe? The answer is no, because in the
new universe, whatever is left of event A is this piece. Whatever is left of event
B is this piece. And these two pieces
are disjoint. So we are back in a situation
of this kind. So in the conditional universe, A and B are disjoint. And therefore, generically,
they're not going to be independent. What's the moral of
this example? Having independence in the
original model does not imply independence in a conditional
model. The opposite is also possible. And let's illustrate
by another example. So I have two coins, and both
of them are badly biased. One coin is heavily biased
in favor of heads. The other coin is heavily biased
in favor of tails. So the probabilities
are 90%. Let's consider independent flips
of coin A. This is the relevant model. This is a model of two
independent flips of the first coin. There's going to be two flips,
and each one has probability 0.9 of being heads. So that's a model that describes
coin A. You can think of this as a conditional
model which is a model of the coin flips conditioned on the
fact that they have chosen coin A. Alternatively we could be
dealing with coin B. In a conditional world where we
chose coin B and flipped it twice, this is the
relevant model. The probability of two heads,
for example, is the probability of heads the first
time, heads the second time, and each one is 0.1. Now I'm building this into a
bigger experiment in which I first start by choosing one of
the two coins at random. So I have these two coins. I blindly pick one of them. And then I start
flipping them. So the question now is, are the
coin flips, or the coin tosses, are they independent
of each other? If we just stay inside this
sub-model here, are the coin flips independent? They are independent, because
the probability of heads in the second toss is the same,
0.9, no matter what happened in the first toss. So the conditional probabilities
of what happens in the second toss are not
affected by the outcome of the first toss. So the second toss and the first
toss are independent. So here we're just dealing
with plain, independent coin flips. Similarly, the coin flips within
this sub-model are also independent. Now the question is, if we look
at the big model as just one probability model, instead
of looking at the conditional sub-models, are the coin flips
independent of each other? Does the outcome of a few coin
flips give you information about subsequent coin flips? Well if I observe ten
heads in a row-- So instead of two coin flips,
now let's think of doing more of them so that the tree
gets expanded. So let's start with this. I don't know which coin it is. What's the probability that
the 11th coin toss is going to be heads? There's complete symmetry here,
so the answer could not be anything other than 1/2. So let's justify it,
why is it 1/2? Well, the probability that the
11th toss is heads, how can that outcome happen? It can happen in two ways. You can choose coin A, which
happens with probability 1/2. And having chosen coin A,
there's probability 0.9 that you get heads in the 11th toss. Or you can choose coin B. And
if it's coin B, when you flip it, there's probability 0.1
that you have heads. So the final answer is 1/2. So each one of the coins is
biased, but they're biased in different ways. If I don't know which coin it
is, their two biases kind of cancel out, and the probability
of obtaining heads is just in the middle,
so it's 1/2. Now if someone tells you that
the first ten tosses were heads, is that going to
change your beliefs about the 11th toss? Here's how a reasonable person
would think about it. If it's coin B, the probability
of obtaining 10 heads in a row is negligible. It's going to be 0.1
to the 10th. If it's coin A, the probability
of 10 heads in a row is a more reasonable
number. It's 0.9 to the 10th. So this event is a lot more
likely to occur with coin A, rather than coin B. The plausible explanation of
having seen ten heads in a row is that I actually chose coin A.
When you see ten heads in a row, you are pretty certain that
it's coin A that we're dealing with. And once you're pretty certain
that it's coin A that we're dealing with, what's the
probability that the next toss is heads? It's going to be 0.9. So essentially here I'm doing
an inference calculation. Given this information, I'm
making an inference about which coin I'm dealing with. I become pretty certain that
it's coin A, and given that it's coin A, this probability
is going to be 0.9. And I'm putting an approximate
sign here, because the inference that I did
is approximate. I'm pretty certain it's coin A.
I'm not 100% certain that it's coin A. But in any case what happens
here is that the unconditional probability is different from
the conditional probability. This information here makes
me change my beliefs about the 11th toss. And this means that the 11th
toss is dependent on the previous tosses. So the coin tosses have
now become dependent.
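(A small Python sketch of the inference just described; the 1/2 prior on the coin choice and the 0.9 and 0.1 biases are the values assumed in this example.)

```python
# Pick one of two coins with probability 1/2 each: coin A has P(H) = 0.9, coin B has P(H) = 0.1.
prior_A, prior_B = 0.5, 0.5
pA, pB = 0.9, 0.1

# Unconditionally, any single toss is heads with probability 0.5 * 0.9 + 0.5 * 0.1 = 0.5.
print(prior_A * pA + prior_B * pB)                      # 0.5

# Bayes rule: posterior probability that we hold coin A, given ten heads in a row.
like_A, like_B = pA ** 10, pB ** 10
post_A = prior_A * like_A / (prior_A * like_A + prior_B * like_B)
print(post_A)                                           # very close to 1

# Predictive probability that the 11th toss is heads, given the first ten were heads.
print(post_A * pA + (1 - post_A) * pB)                  # roughly 0.9, not 0.5
```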
What is the physical link that causes this dependence? Well, the physical link is
the choice of the coin. By choosing a particular coin,
I'm introducing a pattern in the future coin tosses. And that pattern is what
causes dependence. OK, so I've been playing a
little bit too loose with the language here, because we
defined the concept of independence of two events. But here I have been referring
to independent coin tosses, where I'm thinking about
many coin tosses, like 10 or 11 of them. So to be proper, I should have
defined for you also the notion of independence of
multiple events, not just two. We don't want to just say coin
toss one is independent from coin toss two. We want to be able to say
something like, these 10 coin tosses are all independent
of each other. Intuitively what that means
should be the same thing-- that information about some of
the coin tosses doesn't change your beliefs about the remaining
coin tosses. How do we translate that into
a mathematical definition? Well, an ugly attempt
would be to impose requirements such as this. Think of A1 being the event that
the first flip was heads. A2 is the event that the
second flip was heads. A3, the third flip, was
heads, and so on. Here is an event whose
occurrence or not is determined by the first three coin flips. And here's an event whose
occurrence or not is determined by the fifth
and sixth coin flip. If we think physically that
all those coin flips have nothing to do with each other,
information about the fifth and sixth coin flip are not
going to change what we expect from the first three. So the probability of this
event, the conditional probability, should be the
same as the unconditional probability. And we would like a relation
of this kind to be true, no matter what kind of formula you
write down, as long as the events that show up here are
different from the events that show up there. OK. That's sort of an
ugly definition. The mathematical definition that
actually does the job, and leads to all the
formulas of this kind, is the following. We're going to say that the
collection of events is independent if we can find the
probability of their joint occurrence by just multiplying
probabilities. And that will be true even if
you look at sub-collections of these events. Let's make that more precise. If we have three events, the
definition tells us that the three events are independent
if the following are true. Probability A1 and A2 and A3,
you can calculate this probability by multiplying
individual probabilities. But the same is true even if
you take fewer events. Just a few indices out
of the indices that we have available. So we also require P(A1
intersection A2) is P(A1) times P(A2). And similarly for the other
possibilities of choosing the indices.
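(For three events, the full list of equalities being described is:)

```latex
P(A_1 \cap A_2) = P(A_1)\,P(A_2), \quad
P(A_1 \cap A_3) = P(A_1)\,P(A_3), \quad
P(A_2 \cap A_3) = P(A_2)\,P(A_3),
\qquad \text{and} \qquad
P(A_1 \cap A_2 \cap A_3) = P(A_1)\,P(A_2)\,P(A_3)
```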
OK, so independence, as a mathematical definition, requires that calculating
probabilities of any intersection of the events we
have in our hands, that calculation can be done by just
multiplying individual probabilities. And this has to apply to the
case where we consider all of the events in our hands or just sub-collections of those events. Now these relations just by
themselves are called pairwise independence. So this relation, for example,
tells us that A1 is independent from A2. This tells us that A2 is
independent from A3. This will tell us that A1
is independent from A3. But independence of all the
events together actually requires a little more. One more equality that has to do
with all three events being considered at the same time. And this extra equality
is not redundant. It actually does make
a difference. Independence and pairwise
independence are different things. So let's illustrate the
situation with an example. Suppose we have two
coin flips. The coin tosses are independent,
and the bias is 1/2, so all possible outcomes
have a probability of 1/2 times 1/2, which is 1/4. And let's consider now a bunch
of different events. One event is that the
first toss is heads. This is this blue set here. Another event is the second
toss is heads. And this is this black
event here. OK. Are these two events
independent? If you check it mathematically,
yes. Probability of A is probability
of B is 1/2. Probability of A times
probability of B is 1/4, which is the same as the probability
of A intersection B, which is this set. So we have just checked
mathematically that A and B are independent. Now let's consider a third event
which is that the first and second toss give
the same result. I'll use a different color. First and second toss to
give the same result. This is the event that
we obtain heads, heads or tails, tails. So this is the probability
of C. What's the probability of C? Well, C is made up of two
outcomes, each one of which has probability 1/4, so the
probability of C is 1/2. What is the probability
of C intersection A? C intersection A is just this
one outcome, and has probability 1/4. What's the probability of A
intersection B intersection C? The three events intersect just
this outcome, so this probability is also 1/4. OK. What's the probability
of C given A and B? If A has occurred, and B has
occurred, you are certain that this outcome here happened. If the first toss is H and the
second toss is H, then you're certain of the first
and second toss gave the same result. So the conditional probability
of C given A and B is equal to 1. So do we have independence
in this example? We don't. C, that we obtain the same
result in the first and the second toss, has probability
1/2. Half of the possible outcomes
give us two coin flips with the same result-- heads,
heads or tails, tails. So the probability
of C is 1/2. But if I tell you that the
events A and B both occurred, then you're certain
that C occurred. If I tell you that we had heads
and heads, then you're certain the outcomes
were the same. So the conditional probability
is different from the unconditional probability. So by combining these two
relations together, we get that the three events
are not independent. But are they pairwise
independent? Is A independent from B? Yes, because probability of A
times probability of B is 1/4, which is probability of
A intersection B. Is C independent from A? Well, the probability
of C and A is 1/4. The probability of C is 1/2. The probability of A is 1/2. So it checks. 1/4 is equal to 1/2 times 1/2,
so event C and event A are independent. Knowing that the first toss was
heads does not change your beliefs about whether the two
tosses are going to have the same outcome or not. Knowing that the first was
heads, well, the second is equally likely to be
heads or tails. So event C has just the
same probability, again, 1/2, to occur. To put it the opposite way,
if I tell you that the two results were the same-- so it's either heads, heads
or tails, tails-- what does that tell you
about the first toss? Is it heads, or is it tails? Well, it doesn't tell
you anything. It could be either of the
two, so the probability of heads in the first toss is equal
to 1/2, and telling you C occurred does not
change anything. So this is an example that
illustrates the case where we have three events in which
we check that pairwise independence holds for
any combination of two of these events. We have that the probability of their
intersection is equal to the product of their
probabilities. On the other hand, the three
events taken all together are not independent. A doesn't tell me anything
useful about whether C is going to occur or not. B doesn't tell me
anything useful. But if I tell you that both A
and B occurred, the two of them together tell me something
useful about C. Namely, they tell me that C
certainly has occurred. Very good.
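(A quick Python check of this example, with A the event that the first toss is heads, B that the second toss is heads, and C that the two tosses agree; encoding the events as predicates is an illustrative choice.)

```python
from itertools import product

# Two independent fair tosses: each of the four outcomes has probability 1/4.
outcomes = {seq: 0.25 for seq in product("HT", repeat=2)}

def prob(event):
    return sum(p for seq, p in outcomes.items() if event(seq))

A = lambda s: s[0] == "H"          # first toss is heads
B = lambda s: s[1] == "H"          # second toss is heads
C = lambda s: s[0] == s[1]         # the two tosses give the same result

both = lambda X, Y: prob(lambda s: X(s) and Y(s))

# Pairwise independence: every pair multiplies.
print(both(A, B), prob(A) * prob(B))    # 0.25 0.25
print(both(A, C), prob(A) * prob(C))    # 0.25 0.25
print(both(B, C), prob(B) * prob(C))    # 0.25 0.25

# But not independence of all three: the triple intersection is 1/4, not 1/8.
print(prob(lambda s: A(s) and B(s) and C(s)), prob(A) * prob(B) * prob(C))   # 0.25 0.125
```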
So independence is this somewhat subtle concept. Once you grasp the intuition of
what it really means, then things perhaps fall into place. But it's a concept where
it's easy to get some misunderstanding. So just take some
time to digest. So to lighten things up, I'm
going to spend the remaining four minutes talking about the
very nice, simple problem that involves conditional
probabilities and the like. So here's the problem,
formulated exactly as it shows up in various textbooks. And the formulation says
the following. Well, consider one of those
anachronistic places where they still have kings or queens,
and where actually boys take precedence
over girls. So if there is a boy-- if the royal family has a boy,
then he will become the king even if he has an older sister
who might be the queen. So we have one of those
royal families. That royal family had two
children, and we know that there is a king. There is a king, which means
that at least one of the two children was a boy. Otherwise we wouldn't
have a king. What is the probability that the
king's sibling is female? OK. I guess we need to make some
assumptions about genetics. Let's assume that every child
is a boy or a girl with probability 1/2, and that
different children, what they are is independent from what
the other children were. So every childbirth is basically
a coin flip. OK, so if you take that,
you say, well, the king is a child. His sibling is another child. Children are independent
of each other. So the probability that the
sibling is a girl is 1/2. That's the naive answer. Now let's try to
do it formally. Let's set up a model
of the experiment. The royal family had two
children, as we were told, so there are four outcomes--
boy, and girl girl. Now, we are told that there is
a king, which means what? This outcome here
did not happen. It is not possible. There are three outcomes
that remain possible. So this is our conditional
sample space given that there is a king. What are the probabilities
for the original model? Well with the model that we
assume that every child is a boy or a girl independently with
probability 1/2, then the four outcomes would be equally
likely, and they're like this. These are the original
probabilities. But once we are told that this
outcome did not happen, because we have a king, then
we are transported to the smaller sample space. In this sample space, what's
the probability that the sibling is a girl? Well the sibling is a girl in
two out of the three outcomes. So the probability that
the sibling is a girl is actually 2/3.
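(A sketch of the enumeration just described, under the stated assumptions: exactly two children, each independently a boy or a girl with probability 1/2.)

```python
from itertools import product

# Exactly two children, each independently a boy (B) or a girl (G) with probability 1/2.
outcomes = {pair: 0.25 for pair in product("BG", repeat=2)}

# Condition on "there is a king", i.e. at least one boy: this removes the GG outcome.
given_king = {pair: p for pair, p in outcomes.items() if "B" in pair}
total = sum(given_king.values())                              # 0.75

# The king's sibling is a girl exactly in the BG and GB outcomes.
sibling_girl = sum(p for pair, p in given_king.items() if "G" in pair)
print(sibling_girl / total)                                   # 0.666... = 2/3
```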
So that's supposed to be the right answer. Maybe a little
counter-intuitive. So you can play smart and say,
oh I understand such problems better than you, here is a trick
problem and here's why the answer is 2/3. But actually I'm not fully
justified in saying that the answer is 2/3. I made lots of hidden
assumptions when I put this model down, which I
didn't yet state. So to reverse engineer this
answer, let's actually think what's the probability model for
which this would have been the right answer. And here's the probability
model. The royal family-- the royal parents decided to
have exactly two children. They went and had them. It turned out that at
least one was a boy and became a king. Under this scenario-- that they decide to have
exactly two children-- then this is the big
sample space. It turned out that
one was a boy. That eliminates this outcome. And then this picture
is correct and this is the right answer. But there's hidden assumptions
being there. How about if the royal
family had followed the following strategy? We're going to have children
until we get a boy, so that we get a king, and then
we'll stop. OK, given they have two
children, what's the probability that the
sibling is a girl? It's 1. The reason that they had two
children was because the first was a girl, so they had
to have a second. So assumptions about
reproductive practices actually need to come in,
and they're going to affect the answer. Or, if it's one of those ancient
kingdoms where a king would always make sure to
strangle any of his brothers, then the probability that the
sibling is a girl is actually 1 again, and so on. So it means that one needs to be
careful when you start with loosely worded problems to
make sure exactly what they mean and what assumptions
you're making. All right, see you next week.