Hi all Dr. Clark here again today
laboratory number three or lab number
three and it's a you know fairly easy
lab at first I'm going through the
scientific method most of you are fairly
familiar with the scientific method
where you might have an observation and
then based on that observation you have
a question based on that question you
have an explanation for that question
which we call a hypothesis and remember a
hypothesis is not an educated guess a
hypothesis is an explanation for a
natural phenomenon then based on that
hypothesis you're gonna have some
predictions based on those predictions
you would design an experiment you would
collect data you analyze the data
present your results and complete the
scientific method so we're gonna look at
two main pieces of the scientific method
today in lab the first one is creating a
hypothesis okay so procedure one and two
is really going to just get you in the
mindset of reading an observation making
an observation per se and then creating
a hypothesis around that observation
now remember hypotheses don't have to be
correct actually more often than not
hypotheses are wrong and normally an
individual creates many hypotheses for a
given observation and only one of them
will be somewhat supported okay and this
is where students get into a lot of
trouble and even where you know I would
say advanced students even grad students
get into trouble sometimes with
hypothesis testing and that is it's
human nature to always want to be right
we always want to make something correct
and I've seen individuals do research
where their hypothesis in their mind
should have been supported
but the data didn't support it so they
fudged the data so it did support it and
that is not good science that is
really poor science and that's why
testing only to confirm your own hypothesis shouldn't really
occur all right instead you should
create a hypothesis and then be testing
against the null hypothesis which we'll
talk about in a second okay but no
matter what okay you're going to fall in
this trap of designing hypotheses and
then trying to support one that you
think is correct and I urge you to fight
that temptation again science
should be about the truth not what we
perceive to be the truth which is why in
science we should never use the term
proof or disproof and in fact in my
courses you'll miss points if you use
the term proof or disproof because
proving something is left to mathematics
and left to mathematical equations that
were created with one given outcome and
that's the purpose of a proof in science
because science is always changing
biology is the study of life and life is
always evolving we're never going to
prove anything now that doesn't mean it
can't be supported and it's not the
likely explanation for something it just
means that down the road there could be
a study that doesn't support your data
or doesn't support this idea whether
that be considered a theory or a rule
or a law or anything like that there is
a chance that down the road there will
be data that's collected or an experiment
that's designed that doesn't support that
explanation and that's why we can't
prove things
and for that matter can't disprove
things so the terminology that we often
use when we talk about hypotheses is
this is the hypothesis that is supported
by data or supported by the most data or
this is the tentative explanation for
that natural phenomenon and that way
we cover our bases if five years down
the road ten years 15 20 years down the
road someone creates you know a
hypothesis and a full experimental design
that shows that hey our explanation is
wrong then you don't look like a goat
you don't look like someone who is
making things up at the time it probably
was the best explanation for the natural
phenomenon and when we go through this
course you'll see when I talk about
Mendelian genetics when I talk about
Darwinian evolution there are a lot of
situations where there were hypotheses
that had been supported by data or that we
thought were supported by data at the
time only to find out that later on when
technology caught up and with
technological advances we found out that
these hypotheses or these pieces of the
hypothesis were wrong and not supported
at all and so again hypotheses are
tentatively accepted and they're not
proven and they're not correct
they're just tentatively accepted or
supported by the data collected all
right so that's the first piece
hypothesis testing procedure 1 and
procedure 2 the second piece that
you're going to venture into is this
piece analyzing data this is the piece
that so many students and even advanced
students have an issue with and it's
mainly because as biologists or you know
beginning scientists people are not
taught how to analyze data correctly
they're not taught when to analyze
data they're not taught you know what
method they should use these kinds of
things and even more than that when they
do analyze data they rarely know what's
going into the data
why are you analyzing the data in that
form so today you're going to calculate
or in this lab you're going to calculate
a t-test sometimes called the student
t-test by hand and then I'm going to
show you through Excel how to do it
quicker and more efficiently but
nonetheless using both techniques you
should be able to see what's going
into the statistical analysis and what's
coming out and it should make more
sense to you okay so the first couple
procedures you can see here procedure 1
designing hypotheses procedure 2
designing more hypotheses remember
hypotheses are explanations for natural
phenomena that are testable you have to
be able to test the hypothesis okay
next um after hypothesis you would
design an experiment I designed the
experiment for you in this lab
hypothetically and I collected the data
for you so the hypothesis here is plants
that receive fertilizer will grow faster
or will grow taller than those that do
not receive fertilizer that's an
OK hypothesis it is an explanation for a
natural phenomenon you provide fertilizer
to a plant you'd expect it to grow
taller than a plant that doesn't receive
fertilizer however growing taller is
kind of a very loose term and it's hard
to test so instead of testing that what
we design is a null hypothesis this
is the one that's going to be tested
there is no difference in growth between
plants that receive fertilizer and
plants that do not receive fertilizer by
saying this we can treat that
difference as a statistical
difference so now we have
a means by which we can test this
experiment and this hypothesis by saying
that there's no difference then if
there is a difference then we know that
the null hypothesis is rejected if there
is no difference then we know that we
have to tentatively accept the null
hypothesis and you'll see what I'm
talking about as we progress and as you
collect data so you can have an
experimental group which is going to
receive fertilizer and a control group
which does not receive fertilizer and
here you can see the data when I flip
over to the data table I collected the data
for you so I have each
individual plant numbered and then I
have their initial heights so you
know what their initial size was when
the experiment started and then how much
they grew each week for four
weeks now you can see here plant number
nine this plant died in week three okay
so instead of you know trying to
calculate an adjusted growth rate of
this plant like it was still alive you
don't want to do that you actually just
want to remove it from the data set and
this is why you want to have sample
sizes that are large enough and normally
we want sample sizes in like
ecological studies and things like that
and studies that might get published you
want a sample size for each group of
about 30 or a little more okay because
occasionally you'll have weird things
like a plant dying or maybe
in this case you know one plant that
started really small maybe half the size
of all the rest of the plants and so you
can get some very interesting pieces
that pop up in your data set so by
having a much larger sample size you can
remove some of those outliers now
there are good ways to do this and bad
ways to do this remember not all
outliers are bad data right sometimes
outliers are what allow you to see you
know the big picture you can see that
samples might be cycling so you might
have outliers over here or it might
be bimodal and you have outliers over
here and you have outliers over here and
nothing in the middle
where you have two separate kind of
populations we'll look at some of this
data as we progress but here you can see
the data that's been collected for you
okay and then in this table I've already
done things like the average height the
average height squared the sum of all
the averages the average of the averages
okay and the sum of the squared
averages and what's down here you're
gonna calculate that yourself now I will
show you in just a second how to do this
in Excel fairly quickly but nonetheless
if you're taking this course for credit
you need to calculate it by hand you
need to calculate all these equations by
hand so starting off calculating the
mean and the sum and then you're going
to calculate how much variance there is
so the sample variance by using this
formula then using the sample variance
to calculate your standard error of the mean
then you can use the standard error of the mean
for both the control and experimental groups to
get the standard error of the difference
so this is going to take into account
sample size differences between the two
samples so control we already know has
only nine plants because we're going to
remove the dead one and the experimental
group has ten and then that will give you
the capability of calculating a t
statistic the t is just going to be a
calculated value that we can compare
to values on a t table which is
provided for you in Appendix C or you
can just Google
t table and you can see a table of
critical values for judging the
difference between two groups and when
the difference is big enough at a given
alpha level
you know whatever the alpha level is 5
percent or 10 percent or 2 percent or
whatever it is that's the value that your
calculated t has to be higher than
it's normally
called the t critical value or the t table
value okay and to look that up you also
need the degrees of freedom and degrees
of freedom is just the number of control plants
plus the number of experimental plants minus 2 so
if we jump to this table I can explain
the table a little bit
better
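The hand calculation described above can be sketched in Python. This is only an illustration under stated assumptions: the growth values are made up (they are not the lab's data table), the sample variance uses the usual computational formula built from the sum and the sum of squares, and the standard error of the difference is assumed to be the square root of the two squared standard errors of the mean added together. If the lab manual's formulas differ, the manual takes precedence.

```python
import math


def summarize(growth):
    """Return n, mean, sample variance and standard error of the mean."""
    n = len(growth)
    total = sum(growth)                     # sum of the per-plant averages
    total_sq = sum(x * x for x in growth)   # sum of the squared averages
    mean = total / n
    # Computational form of the sample variance (n - 1 in the denominator).
    variance = (total_sq - total ** 2 / n) / (n - 1)
    sem = math.sqrt(variance / n)
    return n, mean, variance, sem


# Hypothetical average weekly growth per plant; NOT the lab's data.
# The control group has 9 plants because the dead plant was removed.
control = [1.0, 1.2, 0.9, 1.1, 1.0, 1.3, 0.8, 1.1, 1.0]
experimental = [2.0, 2.2, 1.9, 2.1, 2.0, 2.3, 1.8, 2.1, 2.0, 2.2]

n_c, mean_c, var_c, sem_c = summarize(control)
n_e, mean_e, var_e, sem_e = summarize(experimental)

# Assumed formula: combine the two standard errors of the mean.
se_diff = math.sqrt(sem_c ** 2 + sem_e ** 2)

# Absolute value, so the hand-calculated t is always positive.
t_stat = abs(mean_c - mean_e) / se_diff

# Degrees of freedom: controls plus experimentals minus 2.
df = n_c + n_e - 2

print(f"t = {t_stat:.2f}, df = {df}")
```

With nine control and ten experimental plants the degrees of freedom come out to 17, so the calculated t would be compared against the 17 row of the t table.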
Hey oh sorry I didn't get that number
down right ok but here you go you'll
have to excuse the different page things
I don't know why the PDF did this but in
your lab manual it won't be this way
but here's Appendix C t distribution
table degrees of freedom so again in our
experiment we have 19 plants nine from
the control 10 from the experimental group so we
have 19 plants but it's minus 2 so we
should be looking at degrees of
freedom of 17 in biology we typically use a
95% confidence level which means that
we accept 5 percent error or we have an
alpha level of 0.05 this means that if
there were really no difference between
the control plants and the
experimental plants and you did this
experiment a hundred times you would
expect about five of those hundred to
show a significant difference just by
chance and that's not to say that if you
actually did this a hundred times you
would get exactly five out of the
hundred coming up with the wrong
answer it's just you know it's a
statistical threshold that says
that we accept five percent error now
someone that's in the field of maybe
physics or chemistry they might be only
okay with accepting one percent error so
they might be out here someone in the
field of maybe psychology or sociology
would be you know accepting you know a
lot more error because when you're
dealing with humans humans tend to mess
with experiments I guess you could
say so they tend to alter experiments
maybe they don't know that they're
altering the experiment but they do so
more error is often associated with some
of those experiments and so if we go and
we take that 17 over we're looking
at 2.110 so our
calculated t value has to be
bigger than that in order to say that
control plants and experimental plants
growth rates are statistically different
from each other so flipping over we can
flip over to excel I'm going to show you
how to do this in Excel through using
functions in Excel and how you can
calculate things by typing your own
equations or using a built-in process
within Excel okay so you can see here
I've already calculated the averages
for growth rate of week 1 2 3 4
of all these plants this is how you do
it so if you type in
equals average and then select what you
want it'll give you the average so I can
do that down here so if you sorry about
that so if you type equals average
then open parentheses select the
numbers of interest
close it and there's the average now
another neat trick with Excel is you
don't have to retype that so you can
just grab this little black box in the
corner here and drag it down and get the
averages of all those and now we want to
square those averages so I can just
go equals select the number that I want
to square okay use the caret put a two
there and I just squared that number now
I can do the same I can drag it down and
there are the squares of all those numbers
remember this little sigma-looking thing is
the sum so we can add these all together
pretty quick we can just go sum type the
word sum select the numbers that we're
interested in summing all right and get
that sum now maybe we're interested in
the average so we summed it so now
we're interested in the average of all
these averages which basically means
we're looking for the average growth
rate for just experimental plants
there's the average growth rate across
the entire experiment for the control
plants here's going to be the average
growth rate so we're going to select
that number we're going to divide it by
10 you should be able to do that in your
head but there's the average growth rate
for experimental plants if we want to get
these numbers we can take this and
square it and then we can take this and
divide it by the number of plants and
there's your numbers okay that's all the
numbers you need to go through and
calculate standard error you know
standard error of the mean standard
deviation
all these other things that you need to
ultimately calculate
a t-test now there's an easier way to do
this okay but knowing what goes into a
t-test is very important but most people
would just say let's do this as fast as
possible
to see if there's a difference between
the control growth rate and the
experimental growth rate so you can do
that by going into Excel I have a
program called PopTools if you want it
it's free you just go you
know Google PopTools Excel download it
and it'll go into your add-ins another
way you can do this is you can go to data
and you can go to data analysis this is
another tool package ok most people will
have this one so I'll go to data
analysis so
we're going to do a t-test two-sample
assuming equal variances so
we're assuming that the variance in
control would be the same as the
variance in experimental okay that
there's nothing that is going to cause
this to have more variance than
this so we're going to go ahead and
select that now we need to select the
variables that we're interested in
remember we're interested in finding out
is the average growth rate of control
different than the average growth rate
of experiment so to do that we can just
select the average growth rate of the
control plants now yes there are only
nine control plants but that's okay we
don't have to have equal numbers and
then there are ten experimental plants we
don't really need to tell it what you
know the hypothesized mean difference would be
because maybe we don't know we do need
to tell it the alpha level it's
already here most
general statistical packages will start
with an alpha level of
0.05 and then we need to tell it
where we want it to show up so we're
going to say give me an output range I
want the output range to be right here
okay and then just select it hit OK now
you can see what's going on now we can
compare these numbers with what you
calculated by hand so here we can
compare you know the mean so it
calculated the mean for us
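For anyone working outside Excel, a rough Python counterpart of an equal-variance two-sample t-test is sketched here. It assumes the standard pooled-variance formula, uses made-up growth values rather than the lab's data, and hard-codes 2.110 as the two-tailed critical value for 17 degrees of freedom at an alpha of 0.05, read from a t table. The function name `pooled_t_test` is hypothetical, and the sketch does not compute a p value.

```python
import math
import statistics


def pooled_t_test(group_a, group_b):
    """Equal-variance two-sample t statistic and its degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pool the two sample variances, weighting each by its own n - 1.
    pooled_var = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / df
    se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / se_diff
    return t, df


# Two-tailed critical value for alpha = 0.05 and df = 17, from a t table.
T_CRITICAL_DF17 = 2.110

# Hypothetical average weekly growth per plant; NOT the lab's data.
control = [1.0, 1.2, 0.9, 1.1, 1.0, 1.3, 0.8, 1.1, 1.0]
experimental = [2.0, 2.2, 1.9, 2.1, 2.0, 2.3, 1.8, 2.1, 2.0, 2.2]

t_stat, df = pooled_t_test(control, experimental)

# Two-tailed test: only the size of t matters, not its sign.
reject_null = abs(t_stat) > T_CRITICAL_DF17

print(f"t = {t_stat:.2f}, df = {df}, reject null: {reject_null}")
```

Because the control group is passed first and its mean is smaller, the t statistic comes out negative here, which is why the comparison against the table uses the absolute value.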
look at that mean same so this is our
control variable 1 is our control okay
variable 2 okay look at that mean same
now your means might not be the same
because I didn't round at all the
computer's not rounding at all so when
you're rounding you're gonna have a
little bit different numbers we didn't
calculate variance but we could and in
later videos I'll show you how to
calculate variance okay I talked to
you about degrees of freedom and I told you it
was 17 okay that's exactly what it
calculated also okay so you have nine
plants from your control
ten plants from your experiment minus 2 is 17
the t statistic is -20.72 if you
calculated it by hand and you rounded a
lot of you will get somewhere around 18
or 19 it just depends on how you
round if you didn't round at all you
will get 20.72 but the crucial
part of this is really looking at well
how does that 20 compare now remember if
we go back to the table what's our value
got to be bigger than 2.110
so the size of that value minus 20 is much
larger than 2.110 okay
and remember this is a two
sample t-test and a two-tailed t-test so it
doesn't matter if it's positive or negative
it really makes no difference just as
long as its absolute value is bigger than the
value on the table when you guys calculate by
hand my equation gives you the absolute value
okay so this will be positive if you
calculated by hand okay it doesn't
matter for the computer it doesn't
matter if it makes it negative really
makes no difference because it's telling
you that it could go either direction
but then what we're interested in here is
this p value this p value is telling
you that yes it is less than 0.05 for a
one-tailed test and it's also less than
0.05 for a two-tailed test so no matter
which direction you go the two variables
that we are looking at are statistically
different from each other and in this
case given our means given our growth
rate differences we can reject the null hypothesis and
tentatively accept that plants receiving
fertilizer
grow more than plants that don't receive
fertilizer so that's how you do it you
know if you're not getting the Excel
thing and you need more help there will
be more Excel examples that I give you I
like everyone to use Excel in as
many classes as you can because it's so
much easier than doing it on paper it's
a digital world we should be doing this
by you know computers and it just makes
it a lot easier so I'll show you a lot
more tips and tricks when it comes to
you know entering data into Excel
analyzing data in Excel making graphs
making you know different kinds of tables
and things like that in future labs yeah
so with that hopefully this helped and
hopefully you can complete lab 3 if you
have to for my class otherwise if you're
just watching this for pointers in
biology I hope you like
