So, coming back to what we were doing with discrete distributions.
And I wanted to mention one common mistake, kind of the most common or most fundamental mistake in probability.
Because a couple of the TFs mentioned
that this is coming up on the homework,
which I'm not surprised about
because this is very fundamental.
It's about understanding the difference between the distribution and a random variable, right? That's what we've been talking about.
But it's subtle at first, so
you have to keep practicing that.
So this is what I call sympathetic magic.
Sympathetic magic is what I
call the mistake of confusing
a random variable with its distribution.
So an example of that, one that has come up on some of the homeworks, is when you have a sum of two random variables, and you kind of just blindly say the PMF of the sum is the sum of the PMFs, or something like that. That is, you're supposed to be adding random variables, and if you instead add PMFs it just makes no sense at all.
So don't confuse the two: adding random variables is not the same as adding PMFs. And in fact, for the PMF case, it really makes no sense at all. Let's say I have X and Y; we're gonna talk a lot more later in the course about how we deal with sums of random variables.
But we've already talked about that to some extent; remember, we spent a lot of time talking about sums of binomials. The sum of independent binomials is binomial if they have the same p; we proved that in about three different ways.
And we talked about dealing with X + Y by conditioning on X. So we have talked a little bit about that, and we're gonna do more later.
So we had different methods for dealing with that, but we never said, well, take P(X=x) + P(Y=y), right? That's the PMF of X plus the PMF of Y, and somehow this is supposed to relate to X + Y. Well, first of all, if you add up two probabilities, there's no reason the result should still be less than or equal to 1; that sum could easily exceed 1.
And secondly, this thing is a function of two variables, little x and little y. But if we want the PMF of X + Y, then we're thinking of X + Y as our random variable. So we should have a function of one variable: what's the probability that X + Y = t, something like that.
So you can't do things like that.
Similarly, later we're going to deal a lot with what happens when you transform random variables. If I have a random variable X and I want to cube it, that doesn't mean I somehow cube its distribution, or its PMF, or something like that; you can't do things like that.
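To make that concrete, here's a minimal sketch in Python (with made-up PMFs, purely for illustration) contrasting the wrong move, adding PMFs, with what actually gives the PMF of X + Y for independent X and Y, namely convolution:

```python
import numpy as np

# Hypothetical PMFs for X and Y on {0, 1, 2}, just for illustration.
px = np.array([0.2, 0.5, 0.3])  # P(X=0), P(X=1), P(X=2)
py = np.array([0.6, 0.3, 0.1])  # P(Y=0), P(Y=1), P(Y=2)

# Wrong: "add the PMFs". This isn't even a PMF -- it sums to 2, not 1.
wrong = px + py
print(wrong.sum())  # 2.0

# Right, assuming X and Y are independent: the PMF of T = X + Y is the
# convolution P(T=t) = sum over x of P(X=x) * P(Y=t-x).
pt = np.convolve(px, py)  # PMF of X + Y on {0, 1, ..., 4}
print(pt, pt.sum())       # a genuine PMF, sums to 1
```

The convolution formula here is really just the "condition on X" method mentioned above, written out for independent random variables.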
So, there's a famous saying in semantics,
that the word is not the thing,
the map is not the territory, and
that's exactly what this mistake is.
So just to push this analogy
a little bit further.
The map is not the territory, this is
like really an obvious piece of advice.
It's very rare that I've seen someone
confuse a map with the territory, right?
You don't usually like put
the map on the floor and
start walking around the map thinking
that you're exploring the territory.
That's the map, that's the territory,
we don't make that mistake.
Here though, when we're talking
about random variables and
distributions, it's more mathematical.
It's a more abstract thing, and people do make that mistake all the time.
So the mistake is completely analogous,
it's just in this context,
the human mind easily does that.
And in the map territory context,
it's not such a big deal.
Here's an analogy along these
lines that I like even more.
Think of it this way: the random variable corresponds to a house, and the distribution is the blueprint for the house, okay?
So mostly you probably don't try to
live inside the blueprint, right?
It's the same thing.
And I like this analogy even more
because now we can think of,
if you have one blueprint you could build
many houses from the same blueprint right?
Just use the blueprint and
build in different locations.
So you can have as many
random variables as you want,
all with the same distribution.
They could be i.i.d., which would mean they're independent with the same distribution, or they could be dependent.
But they could have all
the same distribution,
there's no problem with doing that.
So let's actually call
this a random house.
The distribution is a blueprint for
building a random house.
The random variable is one
of those random houses.
So the blueprint means, it's not
just saying put this door here and
put this wall here.
It's saying you randomly choose
whether you have a blue door or
a red door with certain probabilities.
This is specifying all of
the probabilities you need for
different choices for
how to build the house.
Random variable is the house.
Okay, so that's just a very,
very common mistake
that's sort of like a meta mistake or
a class of mistakes.
Because there are many, many individual
mistakes that I attribute to this.
So, that's why I'm suggesting
to be careful about that.
All right, so now coming back
to our discrete distributions.
We only have one more famous discrete
distribution that we need for
this entire semester, and
that's called the Poisson distribution.
So that's the main topic for today.
I want to show you first of all,
what is it?
Secondly, why is it important?
Arguably the Poisson is the most
important discrete distribution
in all of statistics.
Depends on how you define importance, but
I think you can make a pretty
good argument for that.
It's named after Poisson, a brilliant French mathematician who was one of the first people to start working with this distribution; he was doing this in the 1830s.
And so first let's write down the PMF and
then see what's so great about this.
Okay so here's the PMF.
We want to
know what's the probability that X = k,
k is a non-negative integer here.
So unlike the binomial which
is bounded between 0 and n,
Poisson could take any
nonnegative integer value and
the PMF is just e to the minus lambda,
lambda to the k over k factorial.
Where k, as I said,
is a non-negative integer, 0 otherwise.
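Written out as a formula, that's

$$P(X = k) \;=\; e^{-\lambda}\,\frac{\lambda^{k}}{k!}, \qquad k = 0, 1, 2, \ldots$$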
And lambda is the parameter; I call it a rate parameter for reasons that will become clear at some point. For now, just think of this as a one-parameter distribution, with the parameter traditionally called lambda, but you could call it whatever you want. Lambda is just the most common name for that particular parameter.
So lambda is a positive constant; it could be any positive real number. That's the parameter of the Poisson distribution.
All right, so
I mean I just wrote this thing down, but
that doesn't mean that it's useful for
anything.
So why do I say that this thing is so
important?
Well, before we get to that, first let's check that this is a valid PMF. That is, is this actually a PMF? Well, the terms are non-negative, so the only thing we need to check is that these numbers add up to 1.
And that's very easy: if you add up e to the minus lambda times lambda to the k over k!, the e to the minus lambda is a constant, so it just comes out, and we're left with the sum of lambda to the k over k!. Hopefully by now everyone recognizes that that's the Taylor series for e to the lambda. So this is e to the minus lambda times e to the lambda, which equals 1.
All right, so basically all we did was take the terms of the Taylor series for e to the lambda and put a constant in front so that they add up to 1.
So that is a PMF.
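Written out, that check is just

$$\sum_{k=0}^{\infty} e^{-\lambda}\,\frac{\lambda^{k}}{k!} \;=\; e^{-\lambda}\sum_{k=0}^{\infty}\frac{\lambda^{k}}{k!} \;=\; e^{-\lambda}\,e^{\lambda} \;=\; 1.$$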
And while we're doing
a calculation like this,
we may as well compute
the expected value also.
So let's find the mean, E(X).
And as notation, we would write X ∼ Poisson(lambda), and we'll usually just abbreviate that to Pois(lambda), okay?
So let's find the expected value.
Well, remember the expected
value is the sum
of the value times the probability
of the value, right?
So I'll take out the e to the minus
lambda, cuz that's just a constant.
So what we're doing is just adding up k,
that's the value,
times the probability of the value,
which is e to the minus lambda,
lambda to the k over k!,
from 0 to infinity.
But when k is 0, this is 0 anyway.
So we may as well start the sum at 1.
So let's go from 1 to infinity.
And notice we have k over k!. Well, k! is k times (k-1) times (k-2), and so on. So the k in the numerator cancels the k in k!, and we're left with (k-1)! in the denominator. So that's really just lambda to the k over (k-1)!.
Now it's not really a calculus problem; it's just a pattern-recognition problem at this point.
When we see this thing, it should remind us of what we just did, basically, right? That's the Taylor series for e to the lambda; this series looks a lot like that series.
The only difference is
that we have a (k-1)!,
not a k!, and
we're starting at 1 rather than 0.
So to make this match up better, let's just take out one of the lambdas, making this lambda to the k-1. Lambda's just a constant, so you can take out or put back lambdas all you want. I put it that way so that now the k-1 exponent matches the (k-1)!.
Now if we want, we can just let j = k-1 at this point, or just write out the first few terms and directly see that this is exactly the Taylor series for e to the lambda again. So therefore, this is just lambda times e to the minus lambda times e to the lambda, which equals lambda.
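Putting the whole computation on one line:

$$E(X) \;=\; \sum_{k=0}^{\infty} k\,e^{-\lambda}\frac{\lambda^{k}}{k!} \;=\; e^{-\lambda}\sum_{k=1}^{\infty}\frac{\lambda^{k}}{(k-1)!} \;=\; \lambda e^{-\lambda}\sum_{j=0}^{\infty}\frac{\lambda^{j}}{j!} \;=\; \lambda e^{-\lambda}e^{\lambda} \;=\; \lambda.$$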
So that's a very useful,
easy to remember result,
that a Poisson lambda has
expected value of lambda.
Okay, so memorizing formulas is
not that important in this class.
But this one is useful and
very easy to remember, the mean is lambda.
Okay, so
that's the expected value of a Poisson.
Now why do we care about the Poisson now?
So let me just mention a few
examples where the Poisson is used.
In practice, it's the single most widely used distribution as a model for discrete data in the real world.
So let me just say what the general application is: it's used for counting, since it takes non-negative integer values; we're counting the number of something, let's say the number of successes.
But again, just like in the binomial, we can define success and failure in a very general way. So I'll just put "successes" in quotes, cuz we could define that in many different ways; it's just a word.
We're counting the number of successes
where we have a large number of trials.
So there's a large number of things, each
of which could lead to success or failure.
But the probability of success for
each one is small, right?
So let's say we had 10,000 trials, but each one only has probability 1/10,000 of success.
Well, we immediately know that
the expected number of successes is one,
just using linearity and
indicator random variables.
But it's that kind of thing: a large number of trials, each one with a small probability of success, okay? That's the general setup.
So some examples would be like number of
emails that you get in an hour.
I'm not claiming this is exactly Poisson. To check whether the number of emails you get in an hour is Poisson or not, you'd have to go collect data and see. That's an empirical question, not a mathematical question, okay?
But the claim is that this would be
a reasonable starting point as a model for
that.
In some cases, it may be very good.
In other cases, it may be bad.
But this would be a reasonable
first approximation.
That's what I'm saying,
not as an exact distribution.
So, just intuitively, why would the number of emails be Poisson?
Well, imagine that there
are a large number of people
who potentially could email you in
that one hour block of time, okay?
But for any individual person, unless you
have someone who's constantly emailing you
like every few seconds all day,
in that particular block of time,
it's fairly unlikely that any
specific person will email you.
But if there are a lot of
people who could email you,
then that's balancing it out, right?
So we're defining success to be, for each person, whether or not they email you in that hour. There are a lot of people who could email you, but each one is fairly unlikely to; that's the setup.
Change email to phone calls,
or whatever you want here,
you can make up as many
examples as you want.
Another famous example is the number
of chocolate chips in a chocolate
chip cookie.
Number of chips in a chocolate chip cookie; now let's think about how we interpret that one. Well, a cookie is made up of bunches of, what do you call it, stuff, right? And most of the stuff is not chips, right? You start with some cookie dough and there's some number of chips you put in, and each little bit of dough is probably not a chocolate chip, right?
It's very sad, but
there are a lot of them, right?
In each little segment of the cookie, maybe there's a chip there, maybe a chip there, maybe a chip there; most of them don't have one, but there are a lot of possible locations on the cookie where there could be a chip. So usually you at least get a few, not enough, but you get a few, okay? So that's why that's gonna be approximately Poisson, and so on; you can go on with as many examples as you want.
A more serious example would be the number of earthquakes, let's say in a year in some region.
Again, I'm not saying
that's exactly Poisson, but
that would be a reasonable first
choice of a distribution because
on each particular day it's not
likely that there's an earthquake.
But there are a lot of days in a year, and
so there may be a few earthquakes, right?
So that kind of thing would
be approximately Poisson.
We haven't proven that yet.
We're gonna show,
not in complete generality, but
we're gonna show a result
along these lines that shows
where this distribution comes
from mathematically, okay?
But right now this is just intuition about how it's used.
These examples, again, I'm not saying they're gonna be exactly Poisson. In fact, if you look at a lot of cases where the Poisson is used, there's actually some obvious upper bound, whereas here we're going up to infinity; there's no upper limit. So in most cases it's clear that the example will not be exactly Poisson, but it's an extremely useful approximation in a lot of problems.
And that's called the Poisson paradigm, or more simply, the Poisson approximation. Let me just say that in words. Let's say we have events A1, A2, and so on, through An; suppose we have a lot of events, right? And P(Aj) = pj.
So n is large: we have a large number of things that could happen, but each one is unlikely, right? That's what I was talking about; I'm just trying to write it a little more mathematically now. So the pj's are all small, okay?
And now we have to say
something about independence.
The nicest case is where they're independent, but actually this result is even more general: the events could be weakly dependent.
We have a mathematical
definition of independence.
It's very difficult to mathematically
define weakly dependent.
The intuition is just that independence
comes in degrees to some extent.
And independence says that learning, for example, whether A1 through A3 happened or not gives us no information whatsoever about whether A4, A5, A6, and so on occur. That's what independence would mean.
Weak dependence would mean we could
get a little bit of information.
We have to quantify what does it mean
to have a little bit of information.
The easiest way to think of it would be this: suppose knowing that A1 happened tells us nothing about whether A3 happened.
But if we know that both A1 and
A2 happened,
maybe it makes it a tiny bit more
likely or less likely that A3 happened.
So slight deviations from independence; that's what we're talking about.
Then the claim is that the number of events that occur, the number of the Aj's that occur, is approximately Poisson, and we can figure out right now what lambda would have to be for that to make sense.
We just showed that lambda is the expected
value of the Poisson, so it makes sense
that lambda should be the expected number
of how many of these events occur.
But by linearity, even if they're dependent, the expected number of events that occur is the sum of the pj's. So lambda should be the sum of the pj's.
Okay, this is approximate.
So this is also called
the Poisson approximation.
That's the same thing
as a Poisson paradigm.
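Here's a quick simulation sketch of that claim (my own illustration, not something from the lecture): take a lot of independent events with small, unequal pj's, count how many occur, and compare the distribution of the count to Pois(lambda) with lambda = sum of the pj's.

```python
import numpy as np
from math import exp, factorial

rng = np.random.default_rng(0)

# Hypothetical small, unequal probabilities for n = 500 events.
n = 500
p = rng.uniform(0.0, 0.01, size=n)  # each event individually unlikely
lam = p.sum()                       # lambda = sum of the p_j's

# Simulate many repetitions: count how many of the n events occur each time.
trials = 20_000
counts = (rng.random((trials, n)) < p).sum(axis=1)

# Compare empirical frequencies of the count to the Pois(lam) PMF.
for k in range(8):
    empirical = np.mean(counts == k)
    poisson = exp(-lam) * lam**k / factorial(k)
    print(k, round(float(empirical), 4), round(poisson, 4))
```

With all the pj's equal and the events independent, this count would be exactly binomial, which is the special case we look at next.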
So one case that we've looked at a lot
is the case where all the events
are independent and all the Pj's are the
same, they're all equal to the same P.
In that case, we just have Bernoulli trials with the same p, and we know that the number of events that occur is exactly Bin(n, p). Okay, and in a minute we're gonna prove that, in fact, the Bin(n, p) does converge to a Poisson when we let n get large and p get small in a certain way, okay?
But this is much much more general
than just for the binomial.
Because this is saying that Poisson
is gonna work well even if the Ps
are different, right?
Remember for the binomial we had
independent trials with the same P.
Here it says the Ps can be different and
they can be at least a little bit
dependent and it's still gonna
be a good approximation, okay?
So all right, so now let's explore
the connection with the binomial more.
So what we wanna show is that
the binomial converges to the Poisson.
So let's say X ~ Bin(n, p). And I wanna show how this relates to the Poisson, cuz that formula for the Poisson, well, it reminds us of the Taylor series for e to the x. But what does that have to do with probability? Whereas the binomial, we know, is something very fundamental.
So we start with Bin(n, p), and we're gonna let n go to infinity. That's just the mathematical way of capturing the fact that we want a large number of events. In practice, n is still gonna be some finite number; mathematically we let n go to infinity, and as an approximation, what we're hoping is that n, even though it's finite, is large enough so that this limit kind of kicks in and works well.
So we're gonna let n go to infinity, and we're gonna let p go to 0. Now, we're taking two limits here, and to make that mathematically precise, in general they could happen at different rates that we'd have to deal with. But the case we're most interested in right now is where the product, let's call it lambda = np, is held constant. We can do a more general version than this, but this is a natural one to start with.
Because remember the expected
value of binomial (n,
p) is n times p, the expected value of
a Poisson lambda we just showed is lambda.
So it makes sense to set lambda equal
to np to try to explore the connection
between those.
So this is telling us that p is going to 0 at the same rate as n is going to infinity, cuz we're holding the product constant.
All right, so
now we wanna show that the PMF converges.
So let's find what happens to the PMF: what happens to P(X = k), which we keep seeing is (n choose k) p^k (1-p)^(n-k)? Well, we're treating k as fixed. That is, we're letting n get very large and p get very small, but we wanna look at one specific value of the PMF, so we let k stay constant. All right, so that's what we want to do.
And at this point, we just need to do some algebra and a little bit of calculus. Right now this is written in terms of p and n; it's gonna be easier to deal with if we get everything in terms of n, and conveniently we have p = lambda/n, so we can write everything in terms of n.
And let me also write n
choose k a different way.
n choose k counts the number of ways to choose k people out of n where order doesn't matter. So that's the same thing as n(n-1)···(n-k+1), which would be choosing a committee in an order where order matters, divided by k! to reflect the fact that order doesn't matter. So that's exactly the same thing as n choose k, according to the story, or you can check it easily using factorials.
Then I'll plug in p = lambda/n. So p^k is lambda^k over n^k, and I'll just stick the n^k over there with the other n's. Then we need to deal with (1-p)^(n-k), which is (1 - lambda/n)^(n-k); I'll separate out the minus k part so we can deal with it later, cuz k is fixed. It looks nicer to me this way.
All right, so
that's just a little bit of algebra.
And now let's take the limit of this.
k is fixed, so the k! here just stays as k!, and lambda to the k is also fixed. So we have a lambda to the k over k!, which is pretty encouraging, cuz we want that for our Poisson.
Now let's look at this stuff with the n's. On top we have a product of k terms, decreasing by one each time, and on the bottom we have n times n times n, k times.
So let's just match up terms.
Take the first n from here,
match that with that n.
The second one,
match it with the n minus 1,
the third one with the n minus 2 and
so on.
Each of those terms goes to one.
Just for example,
one of them might be n minus seven over n.
But I'm letting n go to infinity so
that the seven is completely negligible.
So this stuff with
the n's just goes to one.
Next, in this part, n goes to infinity, so 1 minus lambda over n goes to 1; 1 to the minus k is 1, and the exponent is fixed, so this part just goes to 1.
Lastly, we just need
to consider this part.
For that part, I just have to remind you of a generally useful limit.
(1 + x/n) to the n goes to e to the x.
Sometimes, this is even taken
as the definition of e to the x.
This is probably the most
important limit for e to the x.
I think of this as
the compound interest formula.
Because if you have like money in a bank
that's being compounded a certain
number of times per year, and if you
let the rate of compounding increase.
In the limit, you have continuous
compounding of exponential growth.
So if you haven't seen that limit before, I think it's on the math review, or, well, Bow did it in his math review. Or just take the log and use L'Hopital's rule; it's a famous, useful limit.
So that goes to e to the x, and therefore this thing goes to e to the minus lambda. Putting it all together, that's just the Poisson PMF evaluated at k, okay? So that shows the binomial converges to the Poisson when we take the limit in this way.
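A quick numerical sketch of that convergence (my own check, with arbitrary values of lambda and k): fix lambda, let n grow with p = lambda/n, and watch the Bin(n, p) PMF approach the Pois(lambda) PMF.

```python
from math import comb, exp, factorial

lam, k = 2.0, 3  # arbitrary choices: look at P(X = 3) with lambda = 2

for n in [10, 100, 1000, 100_000]:
    p = lam / n
    binomial = comb(n, k) * p**k * (1 - p)**(n - k)
    print(n, binomial)

print("Poisson:", exp(-lam) * lam**k / factorial(k))
```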
Another example that I like to think
of just as more intuition on this
is counting the number of raindrops
that fall in some region.
I don't know why I like this example, but
I just find it very, very intuitive for
understanding the connection
between binomial and Poisson.
So I'm just going to draw a quick picture.
So this is the raindrop example.
So imagine we have a piece of paper, which
I'll represent as just this rectangle.
And imagine that well,
the board is vertical.
But imagine we turn this horizontal.
And then imagine we're outside and
it's raining on this.
We wanna know how many raindrops hit this
piece of paper in say one minute, okay?
Well, one way to think about that problem
would be to take this piece of paper and
break it up into lots of little squares.
So draw as many as you want, but
I'm just gonna draw like a grid.
And imagine that we've broken
it up into like millions and
millions of tiny little squares, okay?
Now, I wanna count the number of raindrops in some time interval. For each individual square, if we break this up into tiny enough pieces, each individual square is unlikely to get a raindrop hitting exactly inside it, right?
So each one is kind of unlikely, but
we have a huge number of little squares,
okay?
So we are gonna get some raindrops
hitting this piece of paper probably.
All right?
And lambda is gonna be a measure of the
intensity of how hard the rain is coming.
And, so let's think, should we use
a binomial distribution for this?
Well, if we assumed that for every one of these squares, whether it gets a drop of rain or not is independent of all the others, and if we assume they all have the same probability p, then it would be exactly binomial, okay?
But, I mean, I don't know enough about rain to answer this, but I'm guessing it's not really exactly independent.
But it seems like a reasonable
approximation to treat them as
independent.
One other complication with the binomial: we would be assuming that each one of these squares can only get zero or one raindrop, and there's some tiny chance that two raindrops could fall into the same square. So it's not gonna be exactly binomial. But even if it were exactly binomial, we'd have a binomial with n of like a trillion and some tiny p, which is very, very hard to work with, certainly by hand. Even computers are gonna have a lot of trouble; with even a thousand factorial you're gonna run into computational difficulties.
The binomial has a lot of complications, whereas the Poisson is much simpler to deal with. So a Poisson seems reasonable here, because we have a huge number of little squares and each one is very unlikely to get a drop.
All right, so let's do one example with a birthday-problem type of setup; you'll have one on the next homework, homework five, related to this too. Let's do the problem of triple birthday matches. Earlier we worked out the exact answer for the basic birthday problem: what's the probability that at least two people have the same birthday when you have a group of people, okay?
But now the example is: let's say we have n people, and we wanna know the probability that there are 3 people with the same birthday.
And we only need to do it approximately.
In general, in this course,
assume that every answer you
give needs to be exact unless I say to give an approximation. In this case, I'm saying find the approximate probability that there are three people with the same birthday. That is, maybe there's more than three, maybe there's four people with the same birthday; we just want to be able to find some group of three people who all have the same birthday, okay?
That's the problem.
So if you try to do this the way we solved the original basic birthday problem, it's actually very difficult; you can try it, but it's not gonna be nice, okay? But with the Poisson approximation this should be pretty easy, so let's go through how to do that.
So we want three people to all have the same birthday; let's just think about what that means.
And first of all,
does this Poisson paradigm seem
appropriate for this problem?
Okay, so let's think about that. Well, first, I didn't say what n is; if n is 2, then the probability is 0, and if n is 3, then it's a very small number, so that's not going to work so well. So assume n is reasonably large. A key point here is that n does not have to be very, very large. Just like in the birthday problem, part of the intuition is: why do you only need 23 people when there are 365 days in the year?
Well, 23 is a pretty small number, but 23 choose 2 is 253, which is reasonably large, right? So here the more relevant quantity is n choose 3. n should at least be 10 or 20 or something like that, but n does not have to be in the hundreds for this to work well.
Because even with some double digit n
when you do n choose 3 it's
gonna be pretty large, right?
Okay, so there are n choose 3 triplets of people, all right? As usual, we assume the people are labeled 1 through n; take any subset of three people, and I'll call that a triplet. For each triplet, we can ask the question: does that triplet have the property that all three of them have the same birthday, or not, right?
So we create an indicator random variable for each one. Let's call it I sub ijk, for people i, j, k with i < j < k. So i, j, k are any numbers between 1 and n, in increasing order just to avoid counting the same triplet multiple times. Then we define this indicator to be 1 if they all have the same birthday, okay?
So actually, I didn't write it here, but at this point we know the expected value of the number of triple matches exactly. We get that immediately by indicator random variables, linearity, and symmetry.
That is, there are n choose 3 of these indicators, and for each one, now imagining I have the first 3 people just for concreteness: the first person can have whatever birthday, the second person has to match the first person, which has probability 1/365, and then the third person also has to match, so it's 1 over 365 squared, and that's exact. So that's the exact expected number of triple matches.
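Written out, that's

$$\lambda \;=\; E(X) \;=\; \binom{n}{3}\,\frac{1}{365^{2}}.$$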
Notice, though, that if there is a group of four people who all have the same birthday, we are counting that once for each set of three. There are four ways to choose three people out of the four, so we're counting that as four triple matches.
Okay, so that's the exact answer for
the expected value, but I said find
the approximate probability, I didn't
say find the exact expected value, okay?
So now we're gonna use the Poisson approximation, and let's let X equal the number of triple matches.
There's no possible way that X could be exactly Poisson, because the number of triple matches can't be more than the number of triplets, whereas the Poisson has no upper bound, okay?
But we're claiming that
it's approximately Poisson.
And lambda is what we just computed: the expected value. So let's talk a little more about why that's a reasonable approximation.
Again I'm assuming n is reasonably large,
it doesn't have to be very large.
So why is that valid? Well, even if n is like 15 or 20, n choose 3 is a pretty big number, okay? So the number of trials is fairly large.
The probability of success for each trial,
each trial is a group of three people,
is very small, right?
Very, very unlikely, okay?
So we have a large number of
possible triple birthday matches.
Each one's unlikely.
And then the other
question is independence.
Well, we don't exactly have independence here because, for example, I123 and I124 are not independent. If the first event occurs, that is, if that indicator equals 1, it means the first three people all have the same birthday. And now I want to know whether people 1, 2, 4 all have the same birthday; well, we just kinda got a head start there. If we know that indicator equals 1, then we're already kinda partway there. So they're not completely independent.
But this will be an example of
weak dependence in the sense that
the probabilities are small anyway,
first of all.
And secondly, that's a head start, but we still need person number four to match, and that's just for a specific case where I defined these with overlap. For example, I123 and I456 are independent. So there's not that much dependence there.
So this seems like
a reasonable approximation.
And then, to finish the calculation, we want to know the probability of at least one triple match. As is often the case, it's easier to do the complement: that's 1 minus the probability that X equals 0.
And as an approximation, that's gonna be 1 minus e to the minus lambda times lambda to the zero over zero factorial, which, of course, is just 1 minus e to the minus lambda, where lambda is the expected value we computed.
So notice that this is something
that you could calculate
easily using a calculator or a computer.
You don't have to do like
some complicated sum and
evaluate a lot of binomial
coefficients and all that stuff.
So this is very, very useful for
getting nice quick approximations.
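Here's the whole calculation as a short sketch in code (the group size n = 50 is an arbitrary choice of mine, just for illustration), along with a Monte Carlo check of the Poisson approximation:

```python
import numpy as np
from math import comb, exp

n = 50  # hypothetical group size, just for illustration
lam = comb(n, 3) / 365**2
approx = 1 - exp(-lam)  # Poisson approximation to P(at least one triple match)

# Monte Carlo check: simulate many groups of n people with uniform birthdays.
rng = np.random.default_rng(0)
trials = 50_000
hits = 0
for _ in range(trials):
    birthdays = rng.integers(0, 365, size=n)
    counts = np.bincount(birthdays, minlength=365)
    if (counts >= 3).any():  # some day is shared by at least 3 people
        hits += 1

print("Poisson approximation:", approx)
print("Simulated probability:", hits / trials)
```

The two numbers should agree reasonably well, which is the point: the approximation is a one-line formula, with no complicated sums or binomial coefficients.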
All right, so
that's the Poisson distribution,
that's our last discrete distribution.
So congratulations on that.
And I'll see you next time.
