Hello and welcome back.
This is the outline of my lecture notes.
In Section 1, I’ll introduce probability
and distributions associated with random variables.
Section 2 discusses multivariate distributions,
and Section 3 covers some special distributions.
In Section 4, I’ll introduce elementary statistical
inference, which connects this course with
mathematical statistics (2).
Last, in Section 5, we discuss consistency
and limiting distributions, from which
the central limit theorem is derived.
In this video, I introduce the basic probability
theory.
To begin with, we will go over several definitions
and notations that are used to define a probability.
The sample space of an experiment is a collection
of every possible outcome of the experiment.
For example, consider an experiment of tossing
a coin.
In this case, the sample space consists of
head and tail.
Then, a random experiment is defined in the
following way.
First, the sample space of a random experiment
can be described before performing the experiment.
Second, the experiment can be repeated under
the same conditions.
With these two properties, a random experiment
is defined.
Can you think of any example that is not a
random experiment?
An example of such an experiment is receiving
a letter grade in a mathematical statistics
course at the university.
Possible outcomes will be A+, A0, A-, ... F.
So, all possible outcomes of the experiment
are known in advance, and the first condition
is satisfied.
By taking my course, you are doing this experiment
this semester and I know some of you are doing
this experiment twice or even three times.
In this case, because you gain more knowledge
each time you take the course, you cannot
repeat this experiment under the same conditions.
So, the second condition is violated.
Then, what is an example of a random experiment?
The first example is tossing a coin.
The sample space of a coin toss consists of
head and tail.
Plus the coin can be repeatedly tossed under
the same conditions.
So, tossing a coin is a random experiment.
The second example is casting two dice, where
one is red and the other is white.
So, if we express each outcome as an ordered pair,
where the first element is from the red die
and the second element is from the white die,
then the sample space is (1,1), (1,2), (1,3),
(1,4), (1,5), (1,6), ..., (6,6).
And the two dice can be repeatedly cast under
the same conditions.
So, casting two dice is also a random experiment.
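As a quick check, the 36-element sample space above can be enumerated in a few lines of Python (a minimal sketch; the variable name `sample_space` is my own):

```python
from itertools import product

# Enumerate all ordered pairs (red, white) for two six-sided dice.
sample_space = list(product(range(1, 7), range(1, 7)))

print(len(sample_space))  # 36 outcomes in total
print(sample_space[:3])   # [(1, 1), (1, 2), (1, 3)]
```

Each pair is ordered, so (1, 6) and (6, 1) are counted as distinct outcomes, matching the listing above.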
Given a random experiment, let me define
a probability using relative frequency.
From now on, we use Omega to denote the
sample space of an experiment, s to denote
an element of Omega, and C to represent
a collection of elements of Omega.
The collection of elements is called an event.
For example, when we toss a coin, the sample
space is Omega = {H, T}, where each element
is either a head or a tail.
A simple event can be to get a head, and then
we write C={H}.
Suppose this random experiment is repeated
N times.
We count the number f of times that the event
C has occurred throughout the N repetitions.
Then we can calculate the ratio, f/N, where
this N is the total number of repetitions
and this f is the total number of times of
having the event C.
The ratio f/N is called the relative frequency
of the event C in the N experiments.
As N increases (so N goes to infinity), the
relative frequency seems to converge to a
certain number p.
This p is called the probability of C, the
probability that the outcome of the random
experiment is in C, or the probability measure
of C.
When every outcome in the sample space is
equally likely, the probability of the event
C can be computed as the number of elements
in C divided by the number of elements in Omega.
So in a coin toss example, the number of elements
in Omega is 2 because we have a head and tail.
And C contains only the head, so it has
one element.
So that’s gonna be 1/2.
If you do this experiment over and over again,
the relative frequency of having a head will
converge to 1/2.
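This convergence can be sketched with a short simulation, assuming a fair coin (the function name here is illustrative):

```python
import random

def relative_frequency_of_heads(n, seed=0):
    """Toss a fair coin n times and return the relative frequency of heads."""
    rng = random.Random(seed)
    heads = sum(rng.random() < 0.5 for _ in range(n))
    return heads / n

# As n grows, the relative frequency f/n settles near 1/2.
for n in (100, 10_000, 1_000_000):
    print(n, relative_frequency_of_heads(n))
```

For small n the ratio can wander noticeably, but by a million tosses it sits very close to 0.5.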
Consider the example of casting two dice,
one for red and the other for white.
The event C is the collection of every ordered
pair of the sample space where the sum of
the pair is equal to seven.
So, we can express each outcome as an ordered
pair again, where the first element is for the
red die and the second element is for the
white die.
So, the event C is the collection (1,6), (2,5),
(3,4), (4,3), (5,2), (6,1).
Here (1,6) and (6,1) are distinct because
the first element is for the red die and the
second element is for the white die.
Suppose the two dice are cast 400 times and
the frequency of C is 60.
Then the relative frequency of the event C
is 60/400, which is equal to 0.15.
This would be close to the actual probability
of the event C, which is 6 out of 36.
Here if you just look at this event C, we
have 6 elements.
And then the total number of elements in the
sample space will be 6 times 6.
So, the probability is 1/6, or approximately 0.167.
If you do this experiment over and over again,
this relative frequency will converge to this
0.167.
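A minimal simulation of this dice experiment, assuming fair dice (the function name is my own):

```python
import random

def freq_sum_seven(n, seed=0):
    """Cast two fair dice n times; return the relative
    frequency of the event that the sum equals 7."""
    rng = random.Random(seed)
    hits = sum(rng.randint(1, 6) + rng.randint(1, 6) == 7 for _ in range(n))
    return hits / n

print(freq_sum_seven(400))        # a small run, like the 400 casts above
print(freq_sum_seven(1_000_000))  # close to 1/6
```

A run of only 400 casts can easily land at 0.15 or so, just as in the example, while a much longer run gets close to 1/6.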
Do you know the Korea lotto 6/45?
This is a lottery game where 6 winning numbers
are selected out of numbers ranging from 1
to 45.
If you guess the correct combination of 6
numbers, you win the lotto jackpot.
In this case, what is the probability that
a given number belongs to the winning combination
of 6 numbers?
Each number has an equal probability of being
selected as one of the 6 winning numbers,
so the probability of being a winning number
in each round is 6/45, which is approximately 0.133.
So, if you randomly sample 6 out of the 45
numbers repeatedly, we can easily show that
the relative frequency converges to 0.133.
Later we are going to do some computer simulations
for this lotto.
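As a preview of that simulation, here is a minimal sketch; tracking the number 7 is an arbitrary choice of mine, since by symmetry every number behaves the same:

```python
import random

def freq_number_wins(number, rounds, seed=0):
    """Relative frequency that `number` is among the 6 winning
    numbers drawn without replacement from 1..45, over many rounds."""
    rng = random.Random(seed)
    hits = sum(number in rng.sample(range(1, 46), 6) for _ in range(rounds))
    return hits / rounds

print(freq_number_wins(7, 100_000))  # close to 6/45
```

Drawing without replacement with `random.sample` mirrors how the 6 winning numbers are actually selected.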
So far, I have discussed a probability based
on relative frequency.
Now consider some further examples.
What is the probability that it will rain
tomorrow?
When you buy a share of stock today, what
is the probability that the stock price will
go up tomorrow?
In these examples, we know all possible outcomes
of the experiment in advance.
In the first example, the sample space is
rain and no rain.
In the second example, the sample space is
to go up and go down.
In both examples, however, the experiment
cannot be repeated under the same conditions
because there is only one specific date tomorrow,
so we cannot repeat the specific date many
times.
So, sometimes it is difficult to use a strict
frequency interpretation of probability,
and subjectivity can be used to define a probability instead.
Here is an example of a probability based
on subjectivity.
Every year we have the Yonsei-Korea rivalry.
Suppose that you receive $1 from your friend
if Yonsei wins in the rivalry and give $X
to your friend if Yonsei loses.
John will join the bet when X=1 and Jane will
join the bet when X=9.
Between John and Jane, whose subjective probability
that Yonsei wins the next rivalry is higher?
In the case of John, the stakes are symmetric:
he receives $1 if Yonsei wins and gives $1
if Yonsei loses.
So, John’s probability that Yonsei wins
is 50-50.
In the case of Jane, she receives only $1
when Yonsei wins, but she is willing to pay
much more when Yonsei loses.
This means she thinks the probability that
Yonsei wins is much higher than the probability
of losing.
Specifically, the threshold value of X at which
you are willing to join the bet determines your
subjective probability that Yonsei wins the next rivalry.
So, John’s probability of Yonsei’s
winning is 1/2, which is 0.5.
And Jane’s probability of Yonsei’s
winning is 9/10, which is equal to 0.9.
So, by doing this, we can determine the subjective
probability.
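The reasoning above can be written as a one-line formula: a $1-versus-$X bet is fair to you exactly when your subjective probability p of winning satisfies p * 1 = (1 - p) * X, which gives p = X / (X + 1). A tiny sketch of this (the function name is illustrative):

```python
def subjective_prob(x):
    """Subjective probability of winning implied by joining a
    $1-vs-$x bet: fair when p * 1 = (1 - p) * x, so p = x / (x + 1)."""
    return x / (x + 1)

print(subjective_prob(1))  # John: 0.5
print(subjective_prob(9))  # Jane: 0.9
```

Plugging in John's threshold X = 1 recovers 0.5, and Jane's X = 9 recovers 0.9, matching the values above.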
In the next video, we are going to talk about
set theory.
