The following content is
provided under a Creative
Commons license.
Your support will help MIT
OpenCourseWare continue to
offer high quality educational
resources for free.
To make a donation or view
additional materials from
hundreds of MIT courses, visit
MIT OpenCourseWare at
ocw.mit.edu.
PROFESSOR: So by now you have
seen pretty much every
possible trick there is in
basic probability theory,
about how to calculate
distributions, and so on.
You have the basic tools to
do pretty much anything.
So what's coming after this?
Well, probability is useful for
developing the science of
inference, and this is a subject
to which we're going
to come back at the end
of the semester.
Another chapter, which is what
we will be doing over the next
few weeks, is to deal with
phenomena that evolve in time.
So so-called random processes
or stochastic processes.
So what is this about?
So in the real world, you don't
just throw two random
variables and go home.
Rather the world goes on.
So you generate the random
variable, then you get more
random variables, and things
evolve in time.
And random processes are
supposed to be models that
capture the evolution of random
phenomena over time.
So that's what we
will be doing.
Now when we have evolution in
time, mathematically speaking,
you can use discrete time
or continuous time.
Of course, discrete
time is easier.
And that's where we're
going to start from.
And we're going to start from
the easiest, simplest random
process, which is the so-called
Bernoulli process,
which is nothing but just a
sequence of coin flips.
You keep flipping a coin
and keep going forever.
That's what the Bernoulli
process is.
So in some sense it's something
that you have
already seen.
But we're going to introduce a
few additional ideas here that
will be useful and relevant as
we go along and we move on to
continuous time processes.
So we're going to define the
Bernoulli process, talk about
some basic properties that the
process has, and derive a few
formulas, and exploit the
special structure that it has
to do a few quite interesting
things.
By the way, where does the
word Bernoulli come from?
Well the Bernoulli's were a
family of mathematicians,
Swiss mathematicians and
scientists around the 1700s.
There were so many of them-- and some of them had the same first name-- that historians even have difficulty figuring out who exactly did what.
But in any case, you can imagine
that at the dinner
table they were probably
flipping coins and doing
Bernoulli trials.
So maybe that was their pastime.
OK.
So what is the Bernoulli
process?
The Bernoulli process is nothing
but a sequence of
independent Bernoulli
trials that you can
think of as coin flips.
So you can think of the result of each trial as being heads or tails.
It's a little more convenient
maybe to talk about successes
and failures instead
of heads or tails.
Or if you wish numerical values,
to use a 1 for a
success and 0 for a failure.
So the model is that each one
of these trials has the same
probability of success, p.
And the other assumption is
that these trials are
statistically independent
of each other.
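As a quick illustration, here is a minimal Python sketch of this model (the function name, the seed, and the choice of p are my own, just for illustration): a stream of independent Bernoulli(p) trials, 1 for success and 0 for failure.

```python
import random

def bernoulli_process(p, n, seed=0):
    """Generate n independent Bernoulli(p) trials: 1 = success, 0 = failure."""
    rng = random.Random(seed)
    return [1 if rng.random() < p else 0 for _ in range(n)]

trials = bernoulli_process(p=0.3, n=10)
print(trials)  # a list of ten 0/1 outcomes
```

Over a long run, the fraction of 1's should be close to p, which is one quick sanity check on the model.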
So what could be some examples
of Bernoulli trials?
You buy a lottery ticket every
week and you win or lose.
Presumably, these are
independent of each other.
And if it's the same kind of
lottery, the probability of
winning should be the same
during every week.
Maybe you want to model
the financial markets.
And a crude model could be that
on any given day the Dow
Jones is going to go up
or down with a certain
probability.
Well that probability must be
somewhere around 0.5, or so.
This is a crude model of
financial markets.
You say, probably there
is more into them.
Life is not that simple.
But actually it's a pretty
reasonable model.
It takes quite a bit of work
to come up with more
sophisticated models that can
do better predictions than
just pure heads and tails.
Now, more relevant perhaps to the examples we will be dealing with in this class--
a Bernoulli process is a good
model for streams of arrivals
of any kind to a facility.
So it could be a bank, and
you are sitting at
the door of the bank.
And at every second, you check
whether a customer came in
during that second or not.
Or you can think about arrivals
of jobs to a server.
Or any other kind of requests
to a service system.
So requests, or jobs, arrive
at random times.
You split the time
into time slots.
And during each time slot
something comes or something
does not come.
And for many applications, it's
a reasonable assumption
to make that arrivals on any
given slot are independent of
arrivals in any other
time slot.
So each time slot can be viewed
as a trial, where
either something comes
or doesn't come.
And different trials are
independent of each other.
Now there are two assumptions that we're making here.
One is the independence
assumption.
The other is that this number,
p, probability
of success, is constant.
Now if you think about the bank
example, if you stand
outside the bank at 9:30 in
the morning, you'll see
arrivals happening at
a certain rate.
If you stand outside the bank
at 12:00 noon, probably
arrivals are more frequent.
Which means that a given time slot has a higher probability of seeing an arrival around noon time.
This means that the assumption
of a constant p is probably
not correct in that setting,
if you're talking
about the whole day.
So the probability of successes
or arrivals in the
morning is going to be smaller
than what it would be at noon.
But if you're talking about a
time period, let's say 10:00
to 10:15, probably all slots
have the same probability of
seeing an arrival and it's
a good approximation.
So we're going to stick with
the assumption that p is
constant, doesn't change
with time.
Now that we have our model
what do we do with it?
Well, we start talking about
the statistical properties
that it has.
And here there's two slightly
different perspectives of
thinking about what a
random process is.
The simplest version is to think
about the random process
as being just a sequence
of random variables.
We know what random
variables are.
We know what multiple random
variables are.
So it's just an experiment that
has associated with it a
bunch of random variables.
So once you have random
variables, what do you do
instinctively?
You talk about the
distribution of
these random variables.
We already specified for the
Bernoulli process that each Xi
is a Bernoulli random variable,
with probability of
success equal to p.
That specifies the distribution of the random variable Xt, for general time t.
Then you can calculate
expected values and
variances, and so on.
So the expected value is, with
probability p, you get a 1.
And with probability
1 - p, you get a 0.
So the expected value
is equal to p.
And then we have seen before a
formula for the variance of
the Bernoulli random variable,
which is p times 1-p.
So this way we basically now
have all the statistical
properties of the random
variable Xt, and we have those
properties for every t.
Is this enough of a
probabilistic description of a
random process?
Well, no.
You need to know how the
different random variables
relate to each other.
If you're talking about a
general random process, you
would like to know things.
For example, the joint
distribution of X2,
with X5, and X7.
For example, that might be
something that you're
interested in.
And the way you specify it is
by giving the joint PMF of
these random variables.
And you have to do that for
every collection, or any
subset, of the random
variables you
are interested in.
So to have a complete description of a random process, you need to specify all the possible joint distributions.
And once you have all the
possible joint distributions,
then you can answer, in
principle, any questions you
might be interested in.
How did we get around
this issue for
the Bernoulli process?
I didn't give you the joint
distributions explicitly.
But I gave them to
you implicitly.
And this is because I told you
that the different random
variables are independent
of each other.
So at least for the Bernoulli
process, where we make the
independence assumption, we know
that this is going to be
the product of the PMFs.
And since I have told you what
the individual PMFs are, this
means that you automatically
know all the joint PMFs.
And we can get down to business based on that.
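Because the trials are independent, the joint PMF of any collection of them, such as X2, X5, and X7 from before, is just the product of the marginals. A small sketch (function names are mine):

```python
def bernoulli_pmf(x, p):
    """Marginal PMF of a single Bernoulli(p) trial."""
    return p if x == 1 else 1 - p

def joint_pmf(values, p):
    """Joint PMF of independent trials: the product of the marginals."""
    result = 1.0
    for x in values:
        result *= bernoulli_pmf(x, p)
    return result

# P(X2 = 1, X5 = 0, X7 = 1) for p = 0.4:
print(joint_pmf([1, 0, 1], 0.4))  # equals 0.4 * 0.6 * 0.4 = 0.096, up to rounding
```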
All right.
So this is one view of what a
random process is, just a
collection of random
variables.
There's another view that's a
little more abstract, which is
the following.
The entire process is to be
thought of as one long
experiment.
So we go back to the
chapter one view of
probabilistic models.
So there must be a sample
space involved.
What is the sample space?
If I do my infinite, long
experiment of flipping an
infinite number of coins,
a typical outcome of the
experiment would be a sequence
of 0's and 1's.
So this could be one possible
outcome of the experiment,
just an infinite sequence
of 0's and 1's.
My sample space is the
set of all possible
outcomes of this kind.
Here's another possible
outcome, and so on.
And essentially we're dealing
with a sample space, which is
the space of all sequences
of 0's and 1's.
And we're making some sort of
probabilistic assumption about
what may happen in
that experiment.
So one particular sequence that
we may be interested in
is the sequence of obtaining
all 1's.
So this is the sequence that
gives you 1's forever.
Once you take the point of view that this is our sample space-- it's the space of all infinite sequences--
you can start asking questions
that have to do
with infinite sequences.
Such as the question, what's the
probability of obtaining
the infinite sequence that
consists of all 1's?
So what is this probability?
Let's see how we could
calculate it.
So the probability of obtaining
all 1's is certainly
less than or equal to the
probability of obtaining 1's,
just in the first 10 tosses.
OK.
Obtaining all 1's forever asks for more than obtaining 1's in just the first 10 tosses. If the first event occurs, then the second one also occurs: the first event is contained in the second. Therefore the probability of the first is less than or equal to the probability of the second. So we have this inequality.
Now what's the probability of
obtaining 1's in 10 trials?
This is just p to the 10th
because the trials are
independent.
Now of course there's no reason
why I chose 10 here.
The same argument goes
through if I use an
arbitrary number, k.
And this has to be
true for all k.
So this probability is less
than p to the k, no matter
what k I choose.
Therefore, this must be less
than or equal to the limit of
this, as k goes to infinity.
This is smaller than
that for all k's.
Let k go to infinity, take k
arbitrarily large, this number
is going to become arbitrarily
small.
It goes to 0.
And that proves that the
probability of an infinite
sequence of 1's is equal to 0.
So take limits of both sides: the probability is less than or equal to the limit of p to the k, as k goes to infinity, which is 0.
So this proves in a formal way
that the sequence of all 1's
has 0 probability.
If you have an infinite number
of coin flips, what's the
probability that all of the coin
flips result in heads?
The probability of this
happening is equal to zero.
So this particular sequence
has 0 probability.
Of course, I'm assuming here
that p is less than 1,
strictly less than 1.
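The bound we just used, that the probability of all 1's is at most p to the k for every k, collapses very quickly in practice. A small numerical illustration (the value of p here is my own choice; any p strictly less than 1 behaves the same way):

```python
p = 0.5  # an illustrative value; any p < 1 works
for k in [1, 10, 50, 100]:
    # P(all 1's forever) <= P(1's in the first k trials) = p**k
    print(k, p ** k)
```

Already at k = 100 the bound is around 10 to the minus 31, so the only consistent value for the probability of the infinite all-1's sequence is 0.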
Now the interesting thing is that if you look at any other infinite sequence, and you try to calculate the probability of that infinite sequence, you get a product of factors, each of which is either p or 1-p, depending on the entries of the sequence: for example, (1-p), times (1-p), times p, times p, times (1-p), and so on.
You keep multiplying numbers
that are less than 1.
Again, I'm making the
assumption that p is
between 0 and 1.
So 1-p is less than 1,
p is less than 1.
You keep multiplying numbers
less than 1.
If you multiply infinitely
many such numbers, the
infinite product becomes 0.
So any individual sequence in
this sample space actually has
0 probability.
And that is a little bit
counter-intuitive perhaps.
But the situation is more like
the situation where we deal
with continuous random
variables.
So if you could draw a
continuous random variable,
every possible outcome
has 0 probability.
And that's fine.
But all of the outcomes
collectively still have
positive probability.
So the situation here is
very much similar.
So the space of infinite
sequences of 0's and 1's, that
sample space is very much
like a continuous space.
If you want to push that analogy
further, you could
think of this as the expansion
of a real number.
Or the representation of a
real number in binary.
Take a real number, write it
down in binary, you are going
to get an infinite sequence
of 0's and 1's.
So you can think of each
possible outcome here
essentially as a real number.
So the experiment of doing an
infinite number of coin flips
is sort of similar to the
experiment of picking a real
number at random.
When you pick real numbers at
random, any particular real
number has 0 probability.
So similarly here, any
particular infinite sequence
has 0 probability.
So if we were to push that
analogy further, there would
be a few interesting
things we could do.
But we will not push
it further.
This is just to give you an
indication that things can get
pretty subtle and interesting
once you start talking about
random processes that involve
forever, over the infinite
time horizon.
So things get interesting even
in this context of the simple
Bernoulli process.
Just to give you a preview of
what's coming further, today
we're going to talk just about
the Bernoulli process.
And you should make sure before
the next lecture--
I guess between the exam
and the next lecture--
to understand everything
we do today.
Because next time we're going
to do everything once more,
but in continuous time.
And in continuous time, things
become more subtle and a
little more difficult.
But we are going to build on
what we understand for the
discrete time case.
Now both the Bernoulli process and its continuous-time analog have a property that we call memorylessness: whatever happened in the past does not affect the future.
Later on in this class we're going to talk about more general random processes, so-called Markov chains, in which there are certain dependencies across time.
That is, what has happened in
the past will have some
bearing on what may happen
in the future.
So it's like having coin flips
where the outcome of the next
coin flip has some dependence
on the previous coin flip.
And that gives us a richer
class of models.
And once we get there,
essentially we will have
covered all possible models.
So for random processes that
are practically useful and
which you can manipulate, Markov
chains are a pretty
general class of models.
And almost any real world
phenomenon that evolves in
time can be approximately
modeled using Markov chains.
So even though this is a first
class in probability, we will
get pretty far in
that direction.
All right.
So now let's start doing a few
calculations and answer some
questions about the
Bernoulli process.
So again, the best way to think
in terms of models that
correspond to the Bernoulli
process is in terms of
arrivals of jobs
to a facility.
And there are two types of questions that you can ask.
In a given amount of time,
how many jobs arrived?
Or conversely, for a given
number of jobs, how much time
did it take for them
to arrive?
So we're going to deal with
these two questions, starting
with the first.
For a given amount of time--
that is, for a given number
of time periods--
how many arrivals have we had?
How many of those Xi's
happen to be 1's?
We fix the number
of time slots--
let's say n time slots--
and you measure the number
of successes.
Well this is a very familiar
random variable.
The number of successes in n
independent coin flips--
or in n independent trials--
is a binomial random variable.
So we know its distribution is given by the binomial PMF: the probability of exactly k successes is (n choose k), times p to the k, times (1-p) to the n-k, for k going from 0 up to n.
And we know everything by now
about this random variable.
We know its expected
value is n times p.
And we know the variance, which
is n times p, times 1-p.
So there's nothing new here.
That's the easy part.
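The binomial PMF and its moments can be checked directly. A sketch in Python (the function name and the values of n and p are my own choices for illustration):

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials), each with success prob p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 20, 0.3
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]
mean = sum(k * pmf[k] for k in range(n + 1))
print(mean)  # matches n * p = 6.0 up to rounding
```

Summing k times the PMF recovers the mean n times p, and the same computation with k squared recovers the variance n times p times (1-p).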
So now let's look at the
opposite kind of question.
Instead of fixing the time and
asking how many arrivals, now
let us fix the number of
arrivals and ask how much time
did it take.
And let's start with the time
until the first arrival.
So the process starts.
We got our slots.
And we see, perhaps, a sequence
of 0's and then at
some point we get a 1.
The number of trials it took
until we get a 1, we're going
to call it T1.
And it's the time of
the first arrival.
OK.
What is the probability
distribution of T1?
What kind of random
variable is it?
We've gone through
this before.
The event that the first arrival
happens at time little
t is the event that the first
t-1 trials were failures, and
the trial number t happens
to be a success.
So for the first success to
happen at time slot number 5,
it means that the first 4 slots
had failures and the 5th
slot had a success.
So the probability of this happening is the probability of having failures in the first t-1 trials, and having a success at trial number t: (1-p) to the t-1, times p. And this is the formula for t equal to 1, 2, and so on.
So we know what this
distribution is.
It's the so-called geometric
distribution.
Let me go through this quickly.
In the past, we did calculate
the expected value of the
geometric distribution,
and it's 1/p.
Which means that if p is small,
you expect to take a
long time until the
first success.
And then there's also a formula for the variance of T1, which we never formally derived in class, but it's in your textbook: it's (1-p) divided by p squared.
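A quick numerical check of these geometric formulas, by summing the PMF directly (a sketch with my own choice of p; the infinite sums are truncated at a cutoff where the tail is negligible):

```python
def geometric_pmf(t, p):
    """P(first success at trial t): t-1 failures, then a success."""
    return (1 - p) ** (t - 1) * p

p = 0.25
# Truncate the infinite sums at a large cutoff; the tail is negligible.
ts = range(1, 2000)
mean = sum(t * geometric_pmf(t, p) for t in ts)
var = sum(t * t * geometric_pmf(t, p) for t in ts) - mean ** 2
print(mean, var)  # close to 1/p = 4 and (1-p)/p**2 = 12
```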
All right.
So nothing new until
this point.
Now, let's talk about
this property, the
memorylessness property.
We kind of touched on this property when we did the derivation in class of the expected value of T1.
Now what is the memoryless
property?
It's essentially a consequence
of independence.
If I tell you the results of my
coin flips up to a certain
time, this, because of
independence, doesn't give you
any information about the coin
flips after that time.
So knowing that we had lots of
0's here does not change what
I believe about the future coin
flips, because the future
coin flips are going to be just
independent coin flips
with the same probability, p, of success.
So this is a statement that I
made about a specific time.
That is, you do coin flips
until 12 o'clock.
And then at 12 o'clock,
you start watching.
No matter what happens before 12
o'clock, after 12:00, what
you're going to see is just
a sequence of independent
Bernoulli trials with the
same probability, p.
Whatever happened in the
past is irrelevant.
Now instead of talking about
the fixed time at which you
start watching, let's think
about a situation where your
sister sits in the next room,
flips the coins until she
observes the first success,
and then calls you inside.
And you start watching
after this time.
What are you going to see?
Well, you're going to see a coin
flip with probability p
of success.
You're going to see another
trial that has probability p
as a success, and these are all
independent of each other.
So what you're going to see
starting at that time is going
to be just a sequence of
independent Bernoulli trials,
as if the process was starting
at this time.
How long it took for the first
success to occur doesn't have
any bearing on what is going
to happen afterwards.
What happens afterwards
is still a sequence of
independent coin flips.
And this story is actually
even more general.
So your sister watches the coin
flips and at some point
tells you, oh, something
really interesting is
happening here.
I got this string of a
hundred 1's in a row.
Come and watch.
Now when you go in there and
you start watching, do you
expect to see something
unusual?
There were unusual things
that happened before
you were called in.
Does this mean that you're going to see unusual things afterwards?
No.
Afterwards, what you're going
to see is, again, just a
sequence of independent
coin flips.
The fact that some strange
things happened before doesn't
have any bearing as to what is
going to happen in the future.
So if the roulette wheels in the casino are properly made, the fact that there were 3 reds in a row doesn't affect the odds of whether the next spin is going to be a red or a black.
So whatever happens in
the past-- no matter
how unusual it is--
at the time when you're called
in, what's going to happen in
the future is going to be just
independent Bernoulli trials,
with the same probability, p.
The only case where this story
changes is if your sister has
a little bit of foresight.
So your sister can look ahead
into the future and knows that
the next 10 coin flips will be
heads, and calls you before
those 10 flips will happen.
If she calls you in, then what
are you going to see?
You're not going to see
independent Bernoulli trials,
since she has psychic powers
and she knows that the next
ones would be 1's.
She called you in and you will
see a sequence of 1's.
So it's no longer independent Bernoulli trials.
So what's the subtle
difference here?
The future is independent from
the past, provided that the
time that you are called and
asked to start watching is
determined by someone who
doesn't have any foresight,
who cannot see the future.
If you are called in, just on
the basis of what has happened
so far, then you don't have any
information about the future.
And one special case is
the picture here.
You have your coin flips.
Once you see a one that happens,
once you see a
success, you are called in.
You are called in on the basis
of what happened in the past,
but without any foresight.
OK.
And this subtle distinction is
what's going to make our next
example interesting
and subtle.
So here's the question.
You buy a lottery ticket every
day, so we have a Bernoulli
process that's running
in time.
And you're interested in the
length of the first string of
losing days.
What does that mean?
So suppose that a typical
sequence of events
could be this one.
So what are we discussing
here?
We're looking at the first
string of losing days, where
losing days means 0's.
So the string of losing days
is this string here.
Let's call the length of that
string, L. We're interested in
the random variable, which is
the length of this interval.
What kind of random
variable is it?
OK.
Here's one possible way you
might think about the problem.
OK.
Starting from this time, and
looking until this time here,
what are we looking at?
We're looking at the time,
starting from here, until the
first success.
So the past doesn't matter.
Starting from here we
have coin flips
until the first success.
The time until the
first success
in a Bernoulli process--
we just discussed that it's a
geometric random variable.
So your first conjecture would
be that this random variable
here, which is 1 longer than the
one we are interested in,
that perhaps is a geometric
random variable.
And if this were so, then you
could say that the random
variable, L, is a geometric,
minus 1.
Can that be the correct
answer?
A geometric random variable,
what values does it take?
It takes values 1,
2, 3, and so on.
A geometric minus 1 would take values 0, 1, 2, and so on.
Can the random variable
L be 0?
No.
The random variable L
is the length of a
string of losing days.
So the shortest that L could
be, would be just 1.
If you get just one losing day
and then you start winning, L
would be equal to 1.
So L cannot be 0 by definition,
which means that L
+ 1 cannot be 1,
by definition.
But if L +1 were geometric,
it could be equal to 1.
Therefore this random
variable, L
+ 1, is not a geometric.
OK.
Why is it not geometric?
I started watching
at this time.
From this time until the first
success, that should be a
geometric random variable.
Where's the catch?
If I'm asked to start watching
at this time, it's because my
sister knows that the next
one was a failure.
This is the time where the string of failures starts. To know that I should start watching here is the same as being told that the next one is a failure.
So to be asked to start watching
at this time requires
that someone looked
in the future.
And in that case, it's no longer
true that these will be
independent Bernoulli trials.
In fact, they're not.
If you start watching here,
you're certain that the next
one is a failure.
The next one is not an
independent Bernoulli trial.
That's why the argument that
would claim that this L + 1 is
geometric would be incorrect.
So if this is not the correct
answer, which
is the correct answer?
The correct answer
goes as follows.
Your sister is watching.
Your sister sees the first
failure, and then tells you,
OK, the failures--
or losing days--
have started.
Come in and watch.
So you start to watching
at this time.
And you start watching until
the first success comes.
This will be a geometric
random variable.
So from here to here, this
will be geometric.
So things happen.
You are asked to
start watching.
After you start watching, the
future is just a sequence of
independent Bernoulli trials.
And the time until the first success occurs is going to be a geometric random variable with parameter p.
And then you notice that the
interval of interest is
exactly the same as the length
of this interval.
This starts one time step
later, and ends
one time step later.
So conclusion is that L is
actually geometric, with
parameter p.
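This conclusion is easy to sanity-check by simulation. A Monte Carlo sketch (the function name, seed, sample size, and value of p are my own choices): generate the process, record the length of the first losing streak, and compare with what a geometric(p) would give.

```python
import random

def first_losing_streak(p, rng):
    """Length of the first run of 0's (losing days) in a Bernoulli(p) process,
    where a win (1) occurs with probability p on each day."""
    # Skip any initial winning days until the first loss...
    while rng.random() < p:
        pass
    # ...then count losing days until the first win.
    length = 1
    while rng.random() >= p:
        length += 1
    return length

rng = random.Random(42)
p = 0.4
samples = [first_losing_streak(p, rng) for _ in range(200000)]
# If L is geometric(p), then E[L] = 1/p = 2.5 and P(L = 1) = p = 0.4.
print(sum(samples) / len(samples))
```

The empirical mean comes out near 1/p, not near the 1/p - 1 that the incorrect "geometric minus 1" argument would predict, which is exactly the point of the example.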
OK, it looks like I'm
missing one slide.
Can I cheat a little
from here?
OK.
So now that we dealt with the
time until the first arrival,
we can start talking about
the time until the second
arrival, and so on.
How do we define these?
After the first arrival happens,
we're going to have a
sequence of time slots with no
arrivals, and then the next
arrival is going to happen.
So we call this time
that elapses--
or number of time slots after
the first arrival
until the next one--
we call it T2.
This is the second inter-arrival
time, that is,
time between arrivals.
Once this arrival has happened,
then we wait and see
how many more it takes until
the third arrival.
And we call this
time here, T3.
We're interested in the time of
the k-th arrival, which is
going to be just the
sum of the first k
inter-arrival times.
So for example, let's say Y3
is the time that the third
arrival comes.
Y3 is just the sum of T1,
plus T2, plus T3.
So we're interested in this
random variable, Y3, and it's
the sum of inter-arrival
times.
To understand what kind of
random variable it is, I guess
we should understand what kind
of random variables these are
going to be.
So what kind of random
variable is T2?
Your sister is doing her coin
flips until a success is
observed for the first time.
Based on that information about
what has happened so
far, you are called
into the room.
And you start watching until a
success is observed again.
So after you start watching,
what you have is just a
sequence of independent
Bernoulli trials.
So each one of these
has probability
p of being a success.
The time it's going to take
until the first success, this
number, T2, is going to be again
just another geometric
random variable.
It's as if the process
just started.
After you are called into the
room, you have no foresight,
you don't have any information
about the future, other than
the fact that these
are going to be
independent Bernoulli trials.
So T2 itself is going to be
geometric with the same
parameter p.
And then you can continue the
arguments and argue that T3 is
also geometric with the
same parameter p.
Furthermore, whatever happened,
how long it took
until you were called in, it
doesn't change the statistics
about what's going to happen
in the future.
So whatever happens
in the future is
independent from the past.
So T1, T2, and T3 are
independent random variables.
So conclusion is that the time
until the third arrival is the
sum of 3 independent geometric
random variables, with the
same parameter.
And this is true
more generally.
The time until the k-th arrival
is going to be the sum
of k independent random
variables.
So in general, Yk is going to be T1 plus T2, and so on, up to Tk, where the Ti's are geometric, with the same parameter p, and independent.
So now what's more natural
than trying to find the
distribution of the random
variable Yk?
How can we find it?
So I fixed k for you.
Let's say k is 100.
I'm interested in how
long it takes until
100 customers arrive.
How can we find the distribution
of Yk?
Well one way of doing
it is to use this
lovely convolution formula.
Take a geometric, convolve it
with another geometric, you
get something.
Take that something that you
got, convolve it with a
geometric once more, do this 99
times, and this gives you
the distribution of Yk.
So that's definitely doable, but it's extremely tedious.
Let's try to find
the distribution
of Yk using a shortcut.
So the probability that
Yk is equal to t.
So we're trying to find
the PMF of Yk.
k has been fixed for us.
And we want to calculate this
probability for the various
values of t, because this
is going to give
us the PMF of Yk.
OK.
What is this event?
What does it take for the k-th
arrival to be at time t?
For that to happen, we
need two things.
In the first t -1 slots,
how many arrivals
should we have gotten?
k - 1.
And then in the last slot, we
get one more arrival, and
that's the k-th one.
So this is the probability that
we have k - 1 arrivals in
the time interval
from 1 up to t.
And then, an arrival
at time t.
That's the only way that it
can happen, that the k-th
arrival happens at time t.
We need to have an arrival
at time t.
And before that time,
we need to have
exactly k - 1 arrivals.
Now this is an event that refers to the first t-1 time slots: in those previous time slots we had exactly k-1 arrivals.
And then at the last time slot
we get one more arrival.
Now the interesting thing is
that this event here has to do
with what happened from time
1 up to time t -1.
This event has to do with
what happened at time t.
Different time slots are
independent of each other.
So this event and that event
are independent.
So this means that we can
multiply their probabilities.
So take the probability
of this.
What is that?
Well probability of having a
certain number of arrivals in
a certain number of time slots,
these are just the
binomial probabilities.
So this is, out of t-1 slots, the probability of getting exactly k-1 arrivals: (t-1 choose k-1), times p to the k-1, times (1-p) to the (t-1)-(k-1), and that exponent gives us t-k.
And then we multiply with this
probability, the probability
of an arrival, at time
t is equal to p.
And so this is the formula for the PMF of the time it takes until the k-th arrival happens.
Does it agree with the formula
in your handout?
Or is it not there?
It's not there.
OK.
Yeah.
OK.
So that's the formula and it is
true for what values of t?
[INAUDIBLE].
It takes at least k time slots in order to get k arrivals, so this formula should be true for t larger than or equal to k.
All right.
So this gives us the PMF of
the random variable Yk.
Of course, we may also be
interested in the mean and
variance of Yk.
But this is a lot easier.
Since Yk is the sum of
independent random variables,
the expected value of Yk is
going to be just k times the
expected value of
your typical t.
So the expected value of Yk is
going to be just k times 1/p,
which is the mean of
the geometric.
And similarly for the variance,
it's going to be k
times the variance
of a geometric.
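The PMF we just derived (often called the Pascal PMF) and these moment formulas can be checked numerically. A sketch (the function name and the values of k and p are my own; the infinite sums are truncated at a cutoff where the tail is negligible):

```python
from math import comb

def pascal_pmf(t, k, p):
    """P(k-th arrival at time t): k-1 arrivals in the first t-1 slots,
    then an arrival in slot t."""
    return comb(t - 1, k - 1) * p ** (k - 1) * (1 - p) ** (t - k) * p

k, p = 3, 0.25
ts = range(k, 3000)
total = sum(pascal_pmf(t, k, p) for t in ts)
mean = sum(t * pascal_pmf(t, k, p) for t in ts)
print(total, mean)  # total close to 1, mean close to k/p = 12
```

The mean comes out as k times 1/p and the variance as k times (1-p)/p squared, consistent with Yk being a sum of k independent geometrics.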
So we have everything there is to know about the distribution of how long it takes until the k-th arrival comes.
OK.
Finally, let's do a few
more things about
the Bernoulli process.
It's interesting to talk about several processes at the same time.
So the situation here, of splitting a Bernoulli process, is one where you have arrivals that come to a server. And that's a picture of which slots get arrivals.
But actually maybe you
have two servers.
And whenever an arrival comes to
the system, you flip a coin
and with some probability, q,
you send it to one server.
And with probability 1-q, you
send it to another server.
So there is a single
arrival stream, but
two possible servers.
And whenever there's an arrival,
you either send it
here or you send it there.
And each time you decide where
you send it by flipping an
independent coin that
has its own bias q.
The coin flips that decide
where do you send it are
assumed to be independent from
the arrival process itself.
So there's two coin flips
that are happening.
At each time slot, there's a
coin flip that decides whether
you have an arrival in this
process here, and that coin
flip is with parameter p.
And if you have something that
arrives, you flip another coin
with probabilities q, and 1-q,
that decides whether you send
it up there or you send
it down there.
So what kind of arrival process
does this server see?
At any given time slot, there's
probability p that
there's an arrival here.
And there's a further
probability q that this
arrival gets sent up there.
So the probability that this
server sees an arrival at any
given time is p times q.
So this process here is going to
be a Bernoulli process, but
with a different parameter,
p times q.
And this one down here, with the
same argument, is going to
be Bernoulli with parameter
p times (1-q).
So by taking a Bernoulli
stream of arrivals and
splitting it into
two, you get two
separate Bernoulli processes.
This is going to be a Bernoulli
process, that's
going to be a Bernoulli
process.
Well actually, I'm running
a little too fast.
What does it take to verify that
it's a Bernoulli process?
At each time slot,
it's a 0 or 1.
And it's going to be a 1,
meaning you see an arrival,
with probability p times q.
What else do we need to verify,
to be able to tell--
to say that it's a Bernoulli
process?
We need to make sure that
whatever happens in this
process, in different time
slots, are statistically
independent from each other.
Is that property true?
For example, what happens in
this time slot whether you got
an arrival or not, is it
independent from what happened
at that time slot?
The answer is yes for the
following reason.
What happens in this time slot
has to do with the coin flip
associated with the original
process at this time, and the
coin flip that decides
where to send things.
What happens at that time slot
has to do with the coin flip
here, and the additional coin
flip that decides where to
send it if something came.
Now all these coin flips are
independent of each other.
The coin flips that determine
whether we have an arrival
here are independent from the
coin flips that determine
whether we have an
arrival there.
And you can generalize this
argument and conclude that,
indeed, every time slot here
is independent from
any other time slot.
And this does make it
a Bernoulli process.
And the reason is that, in the
original process, every time
slot is independent from
every other time slot.
And the additional assumption
that the coin flips that we're
using to decide where to send
things, these are also
independent of each other.
So we're using here the basic
property that functions of
independent things remain
independent.
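This splitting argument is easy to see in a simulation; the following sketch (not from the lecture, parameter values chosen arbitrarily) flips the two independent coins per slot and checks the arrival rates of the two split streams:

```python
import random

random.seed(0)
p, q, n = 0.5, 0.3, 200_000
up = down = 0
for _ in range(n):
    if random.random() < p:        # arrival in the original process?
        if random.random() < q:    # independent routing coin flip
            up += 1                # sent to the first server
        else:
            down += 1              # sent to the second server
print(round(up / n, 3))    # empirical rate, near p*q = 0.15
print(round(down / n, 3))  # empirical rate, near p*(1-q) = 0.35
```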
There's a converse
picture of this.
Instead of taking one stream
and splitting it into two
streams, you can do
the opposite.
You could start from two
streams of arrivals.
Let's say you have arrivals of
men and you have arrivals of
women, but you don't
care about gender.
And the only thing you record
is whether, in a given time
slot, you had an
arrival or not.
Notice that here we may have
an arrival of a man and the
arrival of a woman.
We just record it with a 1, by
saying there was an arrival.
So in the merged process, we're
not keeping track of how
many arrivals we had total.
We just record whether
there was an
arrival or not an arrival.
So an arrival gets recorded here
if, and only if, one or
both of these streams
had an arrival.
So that's what we call merging
of two arrival processes.
So let's make the assumption
that this arrival process is
independent from that
arrival process.
So what happens at the
typical slot here?
I'm going to see an arrival,
unless none of
these had an arrival.
So the probability of an arrival
in a typical time slot
is going to be 1 minus the
probability of no arrival.
And the event of no arrival
corresponds to the first
process having no arrival,
and the second
process having no arrival.
So there's no arrival in the
merged process if, and only
if, there's no arrival in the
first process and no arrival
in the second process.
We're assuming that the two
processes are independent and
that's why we can multiply
probabilities here.
And then you can take this
formula and it simplifies to p
+ q, minus p times q.
So each time slot of the merged
process has a certain
probability of seeing
an arrival.
Is the merged process
a Bernoulli process?
Yes, it is after you verify the
additional property that
different slots are independent
of each other.
Why are they independent?
What happens in this slot has to
do with that slot, and that
slot down here.
These two slots--
so what happens here,
has to do with what
happens here and there.
What happens in this slot has
to do with whatever happened
here and there.
Now, whatever happens here and
there is independent from
whatever happens
here and there.
Therefore, what happens here
is independent from what
happens there.
So the independence property
is preserved.
The different slots of this
merged process are independent
of each other.
So the merged process is itself
a Bernoulli process.
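The merging calculation can be checked the same way; here is a small simulation sketch (again not from the lecture, with arbitrary parameter values) confirming the arrival rate p + q - pq of the merged process:

```python
import random

random.seed(1)
p, q, n = 0.4, 0.25, 200_000
merged = 0
for _ in range(n):
    a = random.random() < p  # arrival in the first stream
    b = random.random() < q  # arrival in the second (independent) stream
    if a or b:               # merged slot records a 1 if either arrived
        merged += 1
# Empirical rate, near 1 - (1-p)(1-q) = p + q - p*q = 0.55
print(round(merged / n, 3))
```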
So please digest these two
pictures of merging and
splitting, because we're going
to revisit them in continuous
time where things are a
little subtler than that.
OK.
Good luck on the exam and
see you in a week.
