- [Brandon] So in this video,
we will jump into Excel
and calculate Fisher's LSD test statistic
and Fisher's LSD threshold.
Now this video is sort of the companion
to the previous video in this playlist,
where we talked about ANOVA post hoc
or ANOVA multiple comparison procedures
at a conceptual level.
So if you haven't watched that,
you might wanna go ahead
and watch that first,
and then come back here.
Also, if you look in the
description for this video,
there will be a link where you can go
and download this exact spreadsheet
and follow along, if you like.
So let's go ahead and get started.
Now here in columns A, B, and C
we have our roller coaster
data from the previous video.
And these are 25 wait times
for each roller coaster
at Kings Island amusement
park and it is in minutes.
So the first thing we will do,
like I always encourage you
to do, is create a graphic
where we can visualize our data.
And for an ANOVA, a one
way ANOVA in this case,
a box plot is a good visual.
Let's go ahead and select our data here.
We're gonna go to insert,
we're going to charts, all charts,
then select box and whisker, and click OK.
So I'm gonna go ahead and make it bigger
and we will add our legend.
Let's take a look at what we got here.
So we can say that the Beast has the
lowest mean wait time, here in the blue.
The Banshee has the
middle average wait time
here in the orange,
and Diamondback has the
highest average wait
time, here in the gray.
So what we're trying to find
out here is which of these
differences is statistically significant.
So we can obviously see there
is a very big difference
between the Beast mean wait time
and the Diamondback mean wait time,
that's the blue and the gray.
However, we have further refinement.
So is the mean wait time
between the Beast and Banshee,
here in blue and orange,
is that significant?
How about Banshee and Diamondback here
in the orange and the gray?
We don't know that for sure.
So what we can do is use
ANOVA post hoc analysis to find out.
So in this video, we're
going over Fisher's LSD test
but like I said in the previous video,
there are others like Tukey, Bonferroni,
and the Scheffé's test,
they essentially do the same thing.
But to teach this topic
and to learn about it,
we're gonna use Fisher's
LSD 'cause it's probably
the easiest to do and understand.
Just for now I'm gonna take this chart,
make it a bit smaller,
and move it over here to the
far right so we can see it,
but we don't need it right now.
So, there is no Fisher's LSD
procedure or test in Excel.
So we're going to use Excel to get us
part of the way there and
then do the rest manually.
But luckily, it's pretty easy.
So the first thing we want
to do is calculate a one way
or a single factor ANOVA
using Excel's built in
data analysis tool pack.
So we'll go ahead and go to
data up here in the ribbon,
data analysis, ANOVA
single factor and click OK.
Now I already had the data selected,
but we'll go ahead and do it again.
Then we'll select labels in first row,
that's important if you
have the label names up here
in the first row, gotta
have that selected.
Alpha we'll keep at .05,
output range we will put in
J1 here and go ahead and click OK.
So here is our single factor ANOVA table
comparing the three means
over here in columns
A, B, and C for the roller coasters.
So again from the previous
video, we know that this model
is significant, our P
value is very, very small,
the F value is very high.
Then we have our mean
square, which is down here
in M13 where I have my mouse.
I'm gonna go ahead and
make that an orange color.
There we go.
And for the degrees of freedom here
I'm gonna make that a different color,
let's pick green.
'Cause we're gonna need those
numbers here in a second.
The great thing about Excel is it gets us
part of the way there and
then we can use the numbers
it creates to fill out
what we're gonna do here
in the middle of the sheet.
So the first thing I wanna do is calculate
the differences for each
pair of roller coasters.
So let's go ahead and do
Banshee and Diamondback first.
So this is simply equals,
we'll go over here and find
the average of the mean
for Banshee, which is here.
Minus the mean for the diamond back,
which is down here, and click Enter.
And that's the difference.
Do the same thing for the
Beast and Diamondback,
so equals the Beast average wait time,
minus the Diamondback mean
wait time, click Enter.
Same thing here for Banshee.
So equals, let's go ahead
and do the Beast first,
keep it consistent, minus the Banshee
and there are differences.
So it's simple subtraction for each pair
of our roller coasters.
Now let's move down here
to the middle of the sheet.
If you look, you can
see that the denominator
of our T statistic here is
present in both formulas.
So the square root of the MSE
times one divided by N sub I,
plus one divided by N sub J
that exists in the T value formula here,
it also exists in the
LSD formula below that.
So what we'll do is we'll
calculate that separately,
then we can use it in both.
So let's go ahead and
calculate the standard error
of the difference using a combination
of what Excel gives us and
some manual calculations.
So we'll go equals square root, knew that.
Now we have the MSE that comes
from our ANOVA chart here,
multiplied by, do a
parentheses, one divided by 25
because remember, each roller
coaster has 25 observations.
Plus one divided by 25,
close that parentheses,
and then close the final
paren and we have 4.05716.
Remember, that's the same
standard error of the difference
we found in the previous video.
So that number represents
the denominator here
in the T statistic and then the right half
of the LSD calculation here below.
So the next thing we will
need is our critical T value.
And again, that is based on
the level of significance
we have chosen, which is .05
and our degrees of freedom,
which is 72, over here in the green.
So to calculate this, we have
equals T.INV.2T for two-tail.
Our probability is .05, our
degrees of freedom are 72,
over here on the right, close the paren,
and we have 1.9934 and so on.
Okay, so now let's go down here to our
test statistic matrix and
calculate our T values
for each pair of roller coasters.
So we're gonna use this formula
under Fisher's LSD test statistic.
And because we've already
calculated the denominator
and the numerator up here
in our difference matrix,
this is very easy.
So for Banshee and
Diamondback, we have equals.
The numerator is just a
difference, we already have that,
it's up here, so a negative 16.466
divided by our standard
error of the difference
that we have here, and we hit Enter,
and it's negative 4.05 or negative 4.06.
Same thing for the Beast and Diamondback.
So equals the difference up here,
divided by the standard
error of the difference here,
we have negative 6.467.
Same thing for Banshee and Beast.
Equals difference
divided by standard error
of the difference, you
have negative 2.40853.
Now what do we do with those numbers?
Well, we compare them
to our T critical here.
And again, ignore the negative signs,
it's a two-tailed test,
we don't have to worry
about the negative signs or anything.
So, is
4.058
larger than our critical T of 1.99?
Yes.
Therefore, the difference
between Banshee and Diamondback
is statistically significant.
Same thing for the Beast and Diamondback,
and the same thing for
Banshee and the Beast.
So all three pairs of our coaster data
are statistically significant,
and therefore, different from each other.
Okay to make this easier, let's go ahead
and just type this in here,
I didn't do this to begin with.
Let's go ahead and put an LSD field here.
So we're going to calculate
Fisher's LSD threshold.
So again, this is easy.
First thing we need is T alpha
over two, well we have that,
it's right about it, multiplied by
the standard error of the
difference, which is right here,
'cause remember, this is the
right half of this equation.
Hit that.
So it's 8.0878 and so forth.
Now what we're going to do
is compare this LSD number
to the absolute value of the difference.
And if the absolute
value of the difference
is greater than or equal to the LSD,
then we can conclude that that
difference is significant.
So what we can do is say,
if
the absolute value
of the difference
is greater than or equal to
the LSD,
then we'll say significant
otherwise not significant,
and therefore, it is significant.
Now we do the same thing
for the other calculations.
Equals if absolute value
of the difference
is greater than or equal to
the LSD
significant, otherwise not significant.
Significant.
Alright, and the last
one, same pattern here.
Absolute value
of the difference,
greater than or equal to LSD
significant,
not significant.
Significant.
So I went ahead and did the
conditional formulas in there,
you could have easily looked
at it and been able to tell
that all three are significant,
but I like kinda working
some Excel work in as
we, as we learn this.
So, all three pairwise
differences are significant
using both the test statistic method here,
and the LSD threshold method here.
So if we go back to our chart,
make it a little bit bigger,
then we can see obviously
that Beast and Banshee
are different, but also Beast and Banshee
and Banshee and Diamondback.
So all three pairwise
comparisons are statistically
significant and therefore, we can conclude
that the means are not
equal to each other.
So I hope this was helpful and you can see
kind of where these numbers come from,
and then create some Excel
work to go ahead and take
Excel's ANOVA a bit further to do your
multiple comparison
test using Fisher's LSD.
Thank you for watching,
look forward to seeing you
again in the next video.
Bye bye.
