The following content is
provided under a Creative Commons license. Your support will help MIT
OpenCourseWare continue to offer high quality educational
resources for free. To make a donation or view
additional materials from hundreds of MIT courses, visit
MIT OpenCourseWare at ocw.mit.edu. JOHN TSITSIKLIS: Today we're
going to finish our discussion of the Poisson process. We're going to see a few of
its properties, do a few interesting problems, some more
interesting than others. So we'll go through a few examples and
then we're going to talk about some quite strange things
that happen with the Poisson process. So the first thing is to
remember what the Poisson process is. It's a model, let's say, of
arrivals of customers that are, in some sense, quote
unquote, completely random. That is, a customer can arrive
at any point in time. All points in time are
equally likely. And different points in time
are sort of independent of other points in time. So the fact that I got an
arrival now doesn't tell me anything about whether there's
going to be an arrival at some other time. In some sense, it's a continuous
time version of the Bernoulli process. So the best way to think about
the Poisson process is that we divide time into extremely
tiny slots. And in each time slot, there's
an independent possibility of having an arrival. Different time slots are
independent of each other. On the other hand, when the slot
is tiny, the probability for obtaining an arrival during
that tiny slot is itself going to be tiny. So we capture these properties
into a formal definition of what the Poisson process is. We have a probability mass
function for the number of arrivals, k, during an interval
of a given length. So this is the sort of basic
description of the distribution of the number
of arrivals. So tau is fixed. And k is the parameter. So when we add over all k's, the
sum of these probabilities has to be equal to 1. There's a time homogeneity
assumption, which is hidden in this, namely, the only thing
that matters is the duration of the time interval, not where
the time interval sits on the real axis. Then we have an independence
assumption. Intervals that are disjoint are
statistically independent from each other. So any information you give me
about arrivals during this time interval doesn't change my
beliefs about what's going to happen during another
time interval. So this is a generalization
of the idea that we had in Bernoulli processes that
different time slots are independent of each other. And then to specify this
function, the distribution of the number of arrivals, we
sort of go in stages. We first specify this function
for the case where the time interval is very small. And I'm telling you what those
probabilities will be. And based on these then, we do
some calculations to find the formula for the distribution
of the number of arrivals for intervals of
a general duration. So for a small duration, delta,
the probability of obtaining 1 arrival
is lambda delta. The remaining probability is
assigned to the event that we get no arrivals during
that interval. The probability of obtaining
more than 1 arrival in a tiny interval is essentially 0. And when we say essentially, we mean modulo terms that are of order delta squared. And when delta is very small,
anything which is delta squared can be ignored. So up to delta squared terms,
that's what happens during a little interval. Now if we know the probability distribution for the number of arrivals in a little interval, we can use this to get the distribution for the number of arrivals over a longer interval. How do we do that? The big interval is composed
of many little intervals. Each little interval is
independent from any other little interval, so it is
as if we have a sequence of Bernoulli trials. Each Bernoulli trial is
associated with a little interval and has a small
probability of obtaining a success or an arrival during
that mini-slot. On the other hand, when delta
is small, and you take a big interval and chop it
up, you get a large number of little intervals. So what we essentially have here
is a Bernoulli process, in which the number of
trials is huge but the probability of success during
any given trial is tiny. The average number of trials
ends up being proportional to the length of the interval. If you have twice as large an
interval, it's as if you're having twice as many of these
mini-trials, so the expected number of arrivals will
increase proportionately. There's also this parameter
lambda, which we interpret as the expected number of arrivals
per unit time. And it comes in those
probabilities here. When you double lambda, this
means that a little interval is twice as likely to
get an arrival. So you would expect
to get twice as many arrivals as well. That's why the expected number
of arrivals during an interval of length tau also scales
proportional to this parameter lambda. Somewhat unexpectedly, it turns
out that the variance of the number of arrivals is also
the same as the mean. This is a peculiarity of the Poisson process.
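To see both of these claims concretely, here is a small simulation sketch of my own, not something from the lecture slides; the rate, interval length, and slot size are arbitrary illustrative picks:

```python
import random

# Mini-slot picture: n = tau/delta independent Bernoulli trials, each with
# success probability lambda*delta. The numbers are illustrative choices.
lam, tau, delta = 2.0, 5.0, 0.005
n = int(tau / delta)       # 1000 mini-slots
p = lam * delta            # 0.01 arrival probability per mini-slot

counts = [sum(1 for _ in range(n) if random.random() < p) for _ in range(2000)]
mean = sum(counts) / len(counts)
var = sum((c - mean) ** 2 for c in counts) / len(counts)
print("lambda*tau =", lam * tau)   # 10.0
print("sample mean =", mean)       # close to 10
print("sample variance =", var)    # also close to 10
```

So this is one way of thinking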
about the Poisson process, in terms of little intervals, each
one of which has a tiny probability of success. And we think of the distribution
associated with that process as being
described by this particular PMF. So this is the PMF for the
number of arrivals during an interval of a fixed
duration, tau. It's a PMF that extends all
over the entire range of non-negative integers. So the number of arrivals you
can get during an interval of a certain length can
be anything. You can get as many arrivals
as you want. Of course the probability of
getting a zillion arrivals is going to be tiny. But in principle, this
is possible. And that's because an interval,
even if it's a fixed length, consists of an infinite
number of mini-slots in some sense. You can divide it, chop
it up, into as many mini-slots as you want. So in principle, it's
possible that every mini-slot gets an arrival. In principle, it's possible to
get an arbitrarily large number of arrivals. So this particular formula here
is not very intuitive when you look at it. But it's a legitimate PMF. And it's called the
Poisson PMF. It's the PMF that describes the number of arrivals.
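For reference, the formula on the slide is P(k arrivals in an interval of length tau) = e^(-lambda tau) (lambda tau)^k / k!. Here is a short sketch, with illustrative parameter values of my own choosing, that checks it really behaves like a PMF with mean and variance both equal to lambda tau:

```python
from math import exp, factorial

# Poisson PMF: P(k, tau) = e^{-lambda*tau} (lambda*tau)^k / k!
def poisson_pmf(k, lam, tau):
    mu = lam * tau
    return exp(-mu) * mu ** k / factorial(k)

lam, tau = 0.6, 2.0                  # illustrative values
probs = [poisson_pmf(k, lam, tau) for k in range(100)]
mean = sum(k * p for k, p in enumerate(probs))
var = sum(k ** 2 * p for k, p in enumerate(probs)) - mean ** 2
print("sum over k:", sum(probs))     # ~1.0: a legitimate PMF
print("mean:", mean)                 # ~lambda*tau
print("variance:", var)              # also ~lambda*tau
```

So that's one way of thinking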
about the Poisson process, where the basic object of
interest would be this PMF and you try to work with it. There's another way of thinking
about what happens in the Poisson process. And this has to do with letting
things evolve in time. You start at time 0. There's going to be a time at
which the first arrival occurs, and call that time T1. This time turns out to have an
exponential distribution with parameter lambda. Once you get an arrival,
it's as if the process starts fresh. The best way to understand why
this is the case is by thinking in terms of
the analogy with the Bernoulli process. If you believe that statement
for the Bernoulli process, since this is a limiting case,
it should also be true. So starting from this time,
we're going to wait a random amount of time until we get the second arrival. This random amount of time, let's
call it T2. This time, T2 is also going
to have an exponential distribution with the same
parameter, lambda. And these two are going to be
independent of each other. OK? So the Poisson process has all the same memorylessness properties that the Bernoulli process has.
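Here is a quick numerical illustration of that memorylessness, a sketch of my own with an arbitrary rate and elapsed time:

```python
import random

# Memorylessness check: conditioned on T > t, the residual time T - t has
# the same distribution as a fresh exponential. lam and t are my choices.
lam, t = 1.5, 2.0
samples = [random.expovariate(lam) for _ in range(200_000)]

fresh_mean = sum(samples) / len(samples)            # E[T] = 1/lambda
residual = [x - t for x in samples if x > t]        # leftover life given T > t
print("fresh mean   :", fresh_mean)
print("residual mean:", sum(residual) / len(residual))
# Both come out near 1/lambda: a used bulb is as good as a new one.
```

What's another way of thinking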
of this property? So think of a process where
you have a light bulb. The time at which the light bulb burns out can be modeled by an exponential random variable. And suppose that we are sitting at some time, t, and I tell you that the light bulb has not yet burned out. What does this tell you about
the future of the light bulb? Is the fact that it didn't burn out so far good news or bad news? Would you rather keep this light bulb that has worked for t time steps and is still OK? Or would you rather use a new
light bulb that starts new at that point in time? Because of the memorylessness
property, the past of that light bulb doesn't matter. So the future of this light bulb
is statistically the same as the future of a
new light bulb. For both of them, the time until
they burn out is going to be described by an exponential
distribution. So one way that people describe the situation is to say that used is exactly as good as new. So a used one is no worse
than a new one. A used one is no better
than a new one. So a used light bulb that
hasn't yet burnt out is exactly as good as
a new light bulb. So that's another way of
thinking about the memorylessness that we have
in the Poisson process. Back to this picture. The time until the second
arrival is the sum of two independent exponential
random variables. So, in principle, you can use
the convolution formula to find the distribution of T1
plus T2, and that would be what we call Y2, the time until
the second arrival. But there's also a direct
way of obtaining the distribution of Y2, and this is
the calculation that we did last time on the blackboard. And actually, we did
it more generally. We found the distribution of the time until the k-th arrival occurs. It has a closed form formula,
which is called the Erlang distribution with k degrees
of freedom. So let's see what's
going on here. It's a distribution of what kind? It's a continuous
distribution. It's a probability
density function. This is because the time is a
continuous random variable. Time is continuous. Arrivals can happen
at any time. So we're talking
about the PDF. This k is just the parameter
of the distribution. We're talking about the
k-th arrival, so k is a fixed number. Lambda is another parameter of
the distribution, which is the arrival rate. So it's a PDF over
the Y's, whereas lambda and k are parameters of
the distribution. OK.
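As a sketch, with parameters that are my own illustrative picks, we can check the Erlang formula f(y) = lambda^k y^(k-1) e^(-lambda y) / (k-1)! against a direct simulation of the k-th arrival time as a sum of k independent exponentials:

```python
import random
from math import exp, factorial

lam, k = 1.0, 3                      # illustrative rate and arrival index

def erlang_pdf(y):
    # Erlang density with k degrees of freedom and rate lambda
    return lam ** k * y ** (k - 1) * exp(-lam * y) / factorial(k - 1)

# Y_k = sum of k independent exponential(lambda) interarrival times
y_k = [sum(random.expovariate(lam) for _ in range(k)) for _ in range(200_000)]

# Empirical probability that Y_k falls in [2.0, 2.1] vs. pdf * width
lo, hi = 2.0, 2.1
emp = sum(1 for y in y_k if lo <= y < hi) / len(y_k)
print("empirical:", emp, " pdf estimate:", erlang_pdf(2.05) * (hi - lo))
```

So this was what we knew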
from last time. Just to get some practice, let
us do a problem that's not too difficult, but just to see how
we use the various formulas that we have. So Poisson was a mathematician,
but Poisson also means fish in French. So Poisson goes fishing. And let's assume that fish
are caught according to a Poisson process. That's not too bad
an assumption. At any given point in time, you
have a little probability that a fish would be caught. And whether you catch one now
is sort of independent of whether at some later time a
fish will be caught or not. So let's just make
this assumption. And suppose that the rules of
the game are that you-- Fish are being caught at a certain rate of 0.6 per hour. You fish for 2 hours,
no matter what. And then there are two
possibilities. If I have caught a fish,
I stop and go home. So if some fish have been
caught, so there's at least 1 arrival during this interval,
I go home. Or if nothing has been caught,
I continue fishing until I catch something. And then I go home. So that's the description of
what is going to happen. And now let's start asking
questions of all sorts. What is the probability that
I'm going to be fishing for more than 2 hours? I will be fishing for more than
2 hours, if and only if no fish were caught during those
2 hours, in which case, I will have to continue. Therefore, this is just
this quantity. The probability of catching
2 fish in-- of catching 0 fish in the next
2 hours, and according to the formula that we have, this is
going to be e to the minus lambda times how much
time we have. There's another way of
thinking about this. The probability that I fish for
more than 2 hours is the probability that the first catch
happens after time 2, which would be the integral
from 2 to infinity of the density of the first
arrival time. And that density is
an exponential. So you do the integral of an
exponential, and, of course, you would get the same answer. OK. That's easy. So what's the probability of
fishing for more than 2 but less than 5 hours? What does it take for
this to happen? For this to happen, we need to
catch 0 fish from time 0 to 2 and catch the first fish
sometime between 2 and 5. So if you-- one way of thinking about what's
happening here might be to say that there's a
Poisson process that keeps going on forever. But as soon as I catch the
first fish, instead of continuing fishing and obtaining
those other fish, I just go home right away. Now the fact that I go home
before time 5 means that, if I were to stay until time
5, I would have caught at least 1 fish. I might have caught
more than 1. So the event of interest here
is that the first catch happens between times 2 and 5. So one way of calculating
this quantity would be-- It's the probability that the
first catch happens between times 2 and 5. Another way to deal with it
is to say, this is the probability that I caught 0 fish
in the first 2 hours and then the probability that I
catch at least 1 fish during the next 3 hours. This. What is this? The probability of 0 fish in
the next 3 hours is the probability of 0 fish
during this time. 1 minus this is the probability
of catching at least 1 fish, of having
at least 1 arrival, between times 2 and 5. If there's at least 1 arrival
between times 2 and 5, then I would have gone home
by time 5. So both of these, if you plug-in
numbers and all that, of course, are going to give
you the same answer. Now next, what's the probability
that I catch at least 2 fish? In which scenario are we? Under this scenario, I go home
when I catch my first fish. So in order to catch
at least 2 fish, it must be in this case. So this is the same as the event
that I catch at least 2 fish during the first
2 hours. So it's going to be the sum from 2 to infinity of the probability that I catch 2 fish, or that I catch 3 fish, or I catch more than that. So it's this quantity. k is the number of fish
that I catch. At least 2, so k goes
from 2 to infinity. These are the probabilities of
catching a number k of fish during this interval. And if you want a simpler form
without an infinite sum, this would be 1 minus the probability
of catching 0 fish, minus the probability of
catching 1 fish, during a time interval of length 2. Another way to think of it. I'm going to catch 2 fish, at
least 2 fish, if and only if the second fish caught in this
process happens before time 2. So that's another way of
thinking about the same event. So it's going to be the
probability that the random variable Y2, the arrival time
over the second fish, is less than or equal to 2. OK. The next one is a
little trickier. Here we need to do a little
bit of divide and conquer. Overall, in this expedition,
what is the expected number of fish to be caught? One way to think about it is
to try to use the total expectations theorem. And think of expected number of
fish, given this scenario, or expected number of fish,
given this scenario. That's a little more complicated
than the way I'm going to do it. The way I'm going to do it is to think as follows-- Expected number of fish is the
expected number of fish caught between times 0 and 2 plus
expected number of fish caught after time 2. So what's the expected number
caught between time 0 and 2? This is lambda t, which is 0.6 times 2. This is the expected number of
fish that are caught between times 0 and 2. Now let's think about the
expected number of fish caught afterwards. How many fish are being
caught afterwards? Well it depends on
the scenario. If we're in this scenario,
we've gone home and we catch 0. If we're in this scenario, then
we continue fishing until we catch one. So the expected number of fish
to be caught after time 2 is going to be the probability
of this scenario times 1. And the probability of that scenario is the probability that I caught 0 fish during the first 2 hours, times 1, which is the number of
fish I'm going to catch if I continue. The expected total fishing time
we can calculate exactly the same way. I'm jumping to the last one. My total fishing time includes a period of 2 hours. I'm going to fish for 2 hours no matter what. And then if I caught 0 fish,
which happens with this probability, my expected time
is going to be the expected time from here onwards, which is
the expected value of this exponential random variable
with parameter lambda. So the expected time
is 1 over lambda. And in our case, this
is 1/0.6. Finally, if I tell you that I
have been fishing for 4 hours and nothing has been caught so
far, how much do you expect this quantity to be? Here the story is, again, that for the Poisson process, used is as good as new. The process does not
have any memory. What happened in the past doesn't matter for the future. It's as if the process starts
new at this point in time. So this one is going to be,
again, the same exponentially distributed random
variable with the same parameter lambda. So the time until an arrival comes has an exponential distribution with parameter lambda, no matter what has
happened in the past. Starting from now and looking
into the future, it's as if the process has just started. So it's going to be 1 over lambda, which is 1/0.6. OK.
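To put the whole example in one place, here is a short script that just evaluates the formulas we derived, with lambda = 0.6 as in the example:

```python
from math import exp

lam = 0.6
p0_2h = exp(-lam * 2)                      # P(0 fish in the first 2 hours)

print("P(fish > 2 hours)        :", p0_2h)
print("P(2 < time < 5 hours)    :", p0_2h * (1 - exp(-lam * 3)))
print("P(catch at least 2 fish) :", 1 - p0_2h - lam * 2 * exp(-lam * 2))
print("E[number of fish]        :", lam * 2 + p0_2h * 1)
print("E[total fishing time]    :", 2 + p0_2h * (1 / lam))
print("E[extra time | 4h, 0 fish]:", 1 / lam)
```

Now our next example is going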
to be a little more complicated or subtle. But before we get to the
example, let's refresh our memory about what we discussed
last time about merging independent Poisson processes. Instead of drawing the picture
that way, another way we could draw it could be this. We have a Poisson process with
rate lambda1, and a Poisson process with rate lambda2. Each one of these has its own arrivals. And then we form the
merged process. And the merged process records
an arrival whenever there's an arrival in either of
the two processes. This process and that process are assumed to be independent of each other. Now different times in this
process and that process are independent of each other. So what happens in these two
time intervals is independent from what happens in these
two time intervals. These two time intervals
determine what happens here. These two time intervals
determine what happens there. So because these are independent
from these, this means that this is also
independent from that. So the independence assumption
is satisfied for the merged process. And the merged process turns out
to be a Poisson process. And if you want to find the
arrival rate for that process, you argue as follows. During a little interval of
length delta, we have probability lambda1
delta of having an arrival in this process. We have probability lambda2
delta of an arrival in that process, plus second
order terms in delta, which we're ignoring. And then you do the calculation
and you find that in this process, you're going to have an arrival probability, which is (lambda1 plus lambda2) delta, again ignoring terms that are second
order in delta. So the merged process is a
Poisson process whose arrival rate is the sum of the
arrival rates of the individual processes. And the calculation we did at
the end of the last lecture-- If I tell you that a new arrival happened here, where did that arrival come from? Did it come from here
or from there? If lambda1 is equal to
lambda2, then by symmetry you would say that it's equally
likely to have come from here or to come from there. But if this lambda is much bigger than that lambda, an arrival that we see is more likely to have come from there. And the formula that captures
this is the following. This is the probability, lambda1 over (lambda1 plus lambda2), that my arrival has come from this particular stream rather than
that particular stream. So when an arrival comes and you
ask, what is the origin of that arrival? It's as if I'm flipping a
coin with these odds. And depending on the outcome of that coin, I'm going to tell you it came from here or it came from there. So the origin of an arrival
is either this stream or that stream. And this is the probability that
the origin of the arrival is that one. Now if we look at 2 different
arrivals, and we ask about their origins-- So let's think about the origin
of this arrival and compare it with the origin
of that arrival. The origin of this arrival
is random. It could be either this or that. And this is the relevant
probability. The origin of that arrival
is random. It could be either here or there, and again, with the same relevant probability. Question. The origin of this arrival, is
it dependent or independent of the origin of that arrival? And here's how the
argument goes. Separate times are
independent. Whatever has happened in the
process during this set of times is independent from
whatever happened in the process during that
set of times. Because different times have
nothing to do with each other, the origin of this, of an
arrival here, has nothing to do with the origin of
an arrival there. So the origins of different
arrivals are also independent random variables. OK. So each time that you have an arrival in the merged process, it's as if you're flipping a coin to determine where that arrival came from, and these coins are independent of each other. OK.
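Here is a simulation sketch of merging -- the two rates are my own illustrative choices -- that checks both facts: the merged gaps have mean 1/(lambda1 + lambda2), and each arrival's origin is stream 1 with probability lambda1/(lambda1 + lambda2):

```python
import random

lam1, lam2, horizon = 1.0, 3.0, 20_000.0   # illustrative rates and time span

def arrivals(lam):
    # arrival times of a Poisson process, built from exponential gaps
    t, out = 0.0, []
    while True:
        t += random.expovariate(lam)
        if t > horizon:
            return out
        out.append(t)

a1, a2 = arrivals(lam1), arrivals(lam2)
merged = sorted([(t, 1) for t in a1] + [(t, 2) for t in a2])

gaps = [t2 - t1 for (t1, _), (t2, _) in zip(merged, merged[1:])]
print("mean merged gap:", sum(gaps) / len(gaps),
      " vs 1/(lam1+lam2) =", 1 / (lam1 + lam2))
print("fraction from stream 1:",
      sum(1 for _, s in merged if s == 1) / len(merged),
      " vs lam1/(lam1+lam2) =", lam1 / (lam1 + lam2))
```

Now we're going to use this-- what we know about merged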
processes to solve a problem that would be harder to do, if
you were not using ideas from Poisson processes. So the formulation of the
problem has nothing to do with the Poisson process. The formulation is
the following. We have 3 light-bulbs. And each light bulb is
independent and is going to die out at a time that's
exponentially distributed. So 3 light bulbs. They start their lives and
then at some point they die or burn out. So let's think of this as X,
this as Y, and this as Z. And we're interested in the
time until the last light-bulb burns out. So we're interested in the
maximum of the 3 random variables, X, Y, and Z. And in
particular, we want to find the expected value
of this maximum. OK. So you can do derived
distribution, use the expected value rule, anything you want. You can get this answer using
the tools that you already have in your hands. But now let us see how we can
connect to this picture with a Poisson picture and come up
with the answer in a very simple way. What is an exponential
random variable? An exponential random variable
is the first act in the long play that involves a whole
Poisson process. So an exponential random
variable is the first act of a Poisson movie. Same thing here. You can think of this random
variable as being part of some Poisson process that
has been running. So it's part of this
bigger picture. We're still interested in
the maximum of the 3. The other arrivals are not going
to affect our answers. It's just, conceptually
speaking, we can think of the exponential random variable as
being embedded in a bigger Poisson picture. So we have 3 Poisson process
that are running in parallel. Let us split the expected time
until the last burnout into pieces, which is time until the
first burnout, time from the first until the second,
and time from the second until the third. And find the expected values of
each one of these pieces. What can we say about the
expected value of this? This is the first arrival
out of all of these 3 Poisson processes. It's the first event that
happens when you look at all of these processes
simultaneously. So 3 Poisson processes
running in parallel. We're interested in the time
until one of them, any one of them, gets an arrival. Rephrase. We merge the 3 Poisson
processes, and we ask for the time until we observe an arrival
in the merged process. When 1 of the 3 gets an arrival
for the first time, the merged process gets
its first arrival. So what's the expected
value of this time until the first burnout? It's going to be the expected value of an exponential random variable. So the first burnout is going
to have an expected value, which is-- OK. It's a Poisson process. The merged process of the 3 has
a collective arrival rate, which is 3 times lambda. So this is the parameter of
the exponential distribution that describes the time until
the first arrival in the merged process. And the expected value
of this random variable is 1 over that. When you have an exponential
random variable with parameter lambda, the expected value
of that random variable is 1 over lambda. Here we're talking about the
first arrival time in a process with rate 3 lambda. The expected time until
the first arrival is 1 over (3 lambda). Alright. So at this time, when this arrival happened, this bulb burned out. So we don't care about
that bulb anymore. We start at this time,
and we look forward. This bulb has been burned. So let's just look forward
from now on. What have we got? We have two bulbs that
are burning. We have a Poisson process that's
the bigger picture of what could happen to that light
bulb, if we were to keep replacing it. Another Poisson process. These two processes are,
again, independent. From this time until that time,
how long does it take? It's the time until either
this process records an arrival or that process
records an arrival. That's the same as the time
that the merged process of these two records an arrival. So we're talking about the
expected time until the first arrival in a merged process. The merged process is Poisson. It's Poisson with
rate 2 lambda. So that extra time is
going to take-- the expected value is going to
be 1 over the (rate of that Poisson process). So 1 over (2 lambda) is
the expected value of this random variable. So at this point, this bulb
now is also burned. So we start looking
from this time on. That part of the picture
disappears. Starting from this time, what's
the expected value until that remaining light-bulb
burns out? Well, as we said before, in
a Poisson process or with exponential random variables,
we have memorylessness. A used bulb is as good
as a new one. So it's as if we're starting
from scratch here. So this is going to be an
exponential random variable with parameter lambda. And the expected value of it is
going to be 1 over lambda. So the beauty of approaching
this problem in this particular way is, of course,
that we manage to do everything without any calculus
at all, without writing down an integral, without trying to calculate expectations in any form.
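For anyone who wants to double-check the no-calculus answer numerically, here is a Monte Carlo sketch; lambda = 1 is an arbitrary choice:

```python
import random

# E[max of 3 iid exponential(lambda) lifetimes] should equal
# 1/(3 lambda) + 1/(2 lambda) + 1/lambda, by the merging argument above.
lam, runs = 1.0, 200_000
total = 0.0
for _ in range(runs):
    total += max(random.expovariate(lam) for _ in range(3))

print("simulated E[max]:", total / runs)
print("1/(3 lam) + 1/(2 lam) + 1/lam =", 1/(3*lam) + 1/(2*lam) + 1/lam)
```

Most of the non-trivial problems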
that you encounter in the Poisson world basically
involve tricks of this kind. You have a question and you try
to rephrase it, trying to think in terms of what might
happen in the Poisson setting, use memorylessness, use merging,
et cetera, et cetera. Now we talked about merging. It turns out that the splitting
of Poisson processes also works in a nice way. The story here is exactly
the same as for the Bernoulli process. So I have a Poisson process with some rate lambda. Each time that an arrival comes, I'm going to send it to this stream and record an arrival here with some probability P, and I'm going to send it to the other stream with probability 1 minus P. So either this will happen or that will happen, depending on the outcome of the coin flip that I do. Each time that an arrival
occurs, I flip a coin and I decide whether to record
it here or there. This is called splitting
a Poisson process into two pieces. What kind of process
do we get here? If you look at the little
interval of length delta, what's the probability
that this little interval gets an arrival? It's the probability that this
one gets an arrival, which is lambda delta times the
probability that after I get an arrival my coin flip came out
to be that way, so that it sends me there. So this means that this little
interval is going to have probability lambda delta P. Or
maybe more suggestively, I should write it as lambda
P times delta. So every little interval has
a probability of an arrival proportional to delta. The proportionality factor is
lambda P. So lambda P is the rate of that process. And then you go through the
mental exercise that you went through for the Bernoulli
process to argue that different intervals here are
independent and so on. And that completes checking that
this process is going to be a Poisson process. So when you split a Poisson
process by doing independent coin flips each time that
something happens, the process that you get is again
a Poisson process, but of course with a reduced rate. So instead of the word
splitting, sometimes people also use the words
thinning-out. That is, out of the arrivals that came, you keep a few and throw away the rest. OK.
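A quick sketch of thinning, with a rate and coin bias that are my own illustrative values: keep each arrival with probability P and check that the kept stream's gaps average 1/(lambda P):

```python
import random

lam, p, horizon = 2.0, 0.3, 50_000.0   # illustrative rate, coin bias, span

t, kept = 0.0, []
while True:
    t += random.expovariate(lam)       # next arrival of the original process
    if t > horizon:
        break
    if random.random() < p:            # independent coin flip per arrival
        kept.append(t)

gaps = [b - a for a, b in zip(kept, kept[1:])]
print("mean gap in kept stream:", sum(gaps) / len(gaps),
      " vs 1/(lam*p) =", 1 / (lam * p))
```

So now the last topic of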
this lecture is a quite curious phenomenon that
goes under the name of random incidents. So here's the story. Buses have been running
on Mass Ave. from time immemorial. And the bus company that runs
the buses claims that they come as a Poisson process with
some rate, let's say, of 4 buses per hour. So that the expected time
between bus arrivals is going to be 15 minutes. OK. Alright. So people have been complaining
that they have been showing up there. They think the buses are
taking too long. So you are asked
to investigate. Is the company-- Does it operate according to its promises or not? So you send an undercover agent
to go and check the interarrival times
of the buses. Are they 15 minutes? Or are they longer? So you put on your dark glasses
and you show up at the bus stop at some random time. And you go and ask the guy in
the falafel truck, how long has it been since the
last arrival? So of course that guy works
for the FBI, right? So they tell you, well, it's
been, let's say, 12 minutes since the last bus arrival. And then you say,
"Oh, 12 minutes. Average time is 15. So a bus should be coming
any time now." Is that correct? No, you wouldn't
think that way. It's a Poisson process. It doesn't matter how long
it has been since the last bus arrival. So you don't go through
that fallacy. Instead of predicting how long
it's going to be, you just sit down there and wait and
measure the time. And you find that this is,
let's say, 11 minutes. And you go to your boss and
report, "Well, it took-- I went there and the time from
the previous bus to the next one was 23 minutes. It's more than the 15
that they said." So go and do that again. You go day after day. You keep these statistics of the
length of this interval. And you tell your boss it's
a lot more than 15. It tends to be more
like 30 or so. So the bus company
is cheating us. Does the bus company really run
Poisson buses at the rate that they have promised? Well let's analyze the situation
here and figure out what the length of
this interval should be, on the average. The naive argument is that
this interval is an interarrival time. And interarrival times, on the
average, are 15 minutes, if the company indeed runs Poisson
processes with these interarrival times. But actually the situation is
a little more subtle because this is not a typical
interarrival interval. This interarrival interval
consists of two pieces. Let's call them T1
and T1 prime. What can you tell me about those
two random variables? What kind of random
variable is T1? Starting from this time, with
the Poisson process, the past doesn't matter. It's the time until an
arrival happens. So T1 is going to be an
exponential random variable with parameter lambda. So in particular, the expected
value of T1 is going to be 15 by itself. How about the random
variable T1 prime? What kind of random
variable is it? This is like the first arrival
in a Poisson process that runs backwards in time. What kind of process is a
Poisson process running backwards in time? Let's think of coin flips. Suppose you have a movie
of coin flips. And by some accident, you happen to watch that fascinating movie backwards. Will it look any different
statistically? No. It's going to be just the
sequence of random coin flips. So a Bernoulli process that
runs in reverse time is statistically identical
to a Bernoulli process in forward time. The Poisson process is a
limit of the Bernoulli. So, same story with the
Poisson process. If you run it backwards in
time it looks the same. So looking backwards in time,
this is a Poisson process. And T1 prime is the time until
the first arrival in this backward process. So T1 prime is also going to
be an exponential random variable with the same
parameter, lambda. And the expected value
of T1 prime is 15. The conclusion is that the expected
length of this interval is going to
be 30 minutes. And the fact that this agent
found the average to be something like 30 does not
contradict the claims of the bus company that they're running
Poisson buses with a rate of lambda equal to 4. OK.
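Here is a simulation sketch of the agent's experiment, using the example's rate of 4 buses per hour; the horizon and the number of inspection times are arbitrary choices of mine:

```python
import bisect
import random

lam, horizon = 4.0, 10_000.0           # buses per hour, hours simulated

t, buses = 0.0, [0.0]
while t < horizon:
    t += random.expovariate(lam)
    buses.append(t)

lengths = []
for _ in range(100_000):
    u = random.uniform(0.0, buses[-2])  # a uniformly random inspection time
    i = bisect.bisect(buses, u)         # the interval that contains time u
    lengths.append(buses[i] - buses[i - 1])

print("average observed interval:", 60 * sum(lengths) / len(lengths), "minutes")
# Comes out near 30, not 15: long intervals are more likely to be hit.
```

So maybe the company can this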
way-- they can defend themselves in court. But there's something
puzzling here. How long is the interarrival
time? Is it 15? Or is it 30? On the average. The issue is what do we
mean by a typical interarrival time. When we say typical, we mean
some kind of average. But average over what? And here's two different ways
of thinking about averages. You number the buses. And you have bus number 100. You have bus number 101,
bus number 102, bus number 110, and so on. One way of thinking about
averages is that you pick a bus number at random. I pick, let's say, that bus,
all buses being sort of equally likely to be picked. And I measure this interarrival
time. So for a typical bus. Then, starting from here until
there, the expected time has to be 1 over lambda, for
the Poisson process. But what we did in
this experiment was something different. We didn't pick a
bus at random. We picked a time at random. And if the picture is, let's
say, this way, I'm much more likely to pick this interval
and therefore this interarrival time, rather
than that interval. Because, this interval
corresponds to very few times. So if I'm picking a time at
random and, in some sense, let's say, uniform, so that all
times are equally likely, I'm much more likely to fall
inside a big interval rather than a small interval. So a person who shows up at the bus stop at a random time is selecting an interval in a biased way, with a bias in favor of longer intervals. And that's why what they observe
is a random variable that has a larger expected
value than the ordinary expected value. So the subtlety here is to
realize that we're talking about two different kinds
of experiments. Picking a bus number at random
versus picking an interval at random with a bias in favor
of longer intervals. Lots of paradoxes that one
can cook up using Poisson processes and random processes
in general often have to do with stories of this kind. The phenomenon that we had in
this particular example also shows up in general, whenever
you have other kinds of arrival processes. So the Poisson process is the
simplest arrival process there is, where the interarrival
times are exponential random variables. There's a larger class
of models. They're called renewal
processes, in which, again, we have a sequence of successive
arrivals, interarrival times are identically distributed and
independent, but they may come from a general
distribution. So to make the same point as in the
previous example but in a much simpler setting, suppose
that bus interarrival times are either 5 or 10
minutes apart. So you get some intervals
that are of length 5. You get some that are
of length 10. And suppose that these
are equally likely. So we have -- not exactly -- In the long run, we have as many
5 minute intervals as we have 10 minute intervals. So the average interarrival
time is 7 and 1/2. But if a person shows up at a
random time, what are they going to see? We have as many 5s as 10s, but every 10 covers twice
as much space. So if I show up at a random
time, I have probability 2/3 falling inside an interval
of duration 10. And I have a 1/3 probability
of falling inside an interval of duration 5. That's because, out of the whole
real line, 2/3 of it is covered by intervals
of length 10, just because they're longer. 1/3 is covered by the
smaller intervals. Now if I fall inside an interval
of length 10 and I measure the length of the
interval that I fell into, that's going to be 10. But if I fall inside an interval
of length 5 and I measure how long it is,
I'm going to get a 5. And that, of course, is going
to be different from 7.5. OK.
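Spelled out as arithmetic:

```python
# Half the intervals are 5 minutes and half are 10, so a random *interval*
# averages 7.5 minutes, but a random *time* lands in a 10-minute interval
# with probability 10/15 = 2/3.
avg_per_interval = 0.5 * 5 + 0.5 * 10
avg_per_time = (2 / 3) * 10 + (1 / 3) * 5

print("average over intervals:", avg_per_interval)   # 7.5
print("average over time points:", avg_per_time)     # 25/3 ~ 8.33, bigger
```

And which number should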
be bigger? It's the second number that's
bigger because this one is biased in favor of the
longer intervals. So that's, again, another
illustration of the different results that you get when you
have this random incidence phenomenon. So the bottom line, again, is
that if you talk about a typical interarrival time, one
must be very precise in specifying what we
mean by typical. So typical means
sort of random. But to use the word random,
you must specify very precisely what is the random
experiment that you are using. And if you're not careful, you
can get into apparent puzzles, such as the following. Suppose somebody tells you the
average family size is 4, but the average person lives
in a family of size 6. Is that compatible? Family size is 4 on the average,
but typical people live, on the average, in
families of size 6. Well yes. There's no contradiction here. We're talking about two
different experiments. In one experiment, I pick a
family at random, and I tell you the average family is 4. In another experiment, I pick a
person at random and I tell you that this person, on the
average, will be in a family of size 6. And what is the catch here? That if I pick a person at
random, large families are more likely to be picked. So there's a bias in favor
of large families. Or if you want to survey, let's
say, are trains crowded in your city? Or are buses crowded? One choice is to pick a bus
at random and inspect how crowded it is. Another choice is to pick a
typical person and ask them, "Did you ride the bus today? Was it crowded?" Well suppose
that in this city there's one bus that's extremely
crowded and all the other buses are completely
empty. If you ask a person, "Was your
bus crowded?" They will tell you, "Yes, my bus was crowded."
There's no witness from the empty buses to testify
in their favor. So by sampling people instead
of sampling buses, you're going to get a different result. And in the process industry, if
your job is to inspect and check cookies, you will be
faced with a big dilemma. Do you want to find out how many
chocolate chips there are on a typical cookie? Are you going to interview
cookies or are you going to interview chocolate chips and
ask them how many other chips were there on your cookie? And you're going to
get different answers in these cases. So the moral is, one has to be
very precise on how you formulate the sampling procedure
that you have. And you'll get different
answers.