KOHITIJ KAR: I'm going to talk
about psychophysics and data
analysis.
I didn't put that
here because I'm not
going to show much about
data analysis today.
So it's kind of ironic,
maybe, that the last tutorial
is about psychophysics.
Because typically, when we think
of problems in neuroscience,
behavior almost motivates
every other problem.
Because if the brain weren't able to produce the behaviors that we see on display, we wouldn't have studied the brain.
We don't study our toenail as
often as we study the brain.
So obviously,
behavior is important.
And it motivates
a lot of problems.
If you kind of remember, this
was one of my slides last time,
kind of give a
picture of how people
think in system
neuroscience, like there's
a sensory stimulus.
And then this stimulus
goes into the brain
and evokes some
perception, sensation,
whatever you may call it.
The example I gave
was a glass of water.
You can see a glass of water.
And I can ask you, was
there water in the glass?
And you needed to have a certain sort of perception evoked by this image to answer me correctly.
So if I show you this glass,
the answer will be no.
If I show you this
glass, the answer is yes.
And those are based on
some sort of perception
that is generated by this.
I spoke about studying
encoding models,
studying decoding models.
And I also showed
you this other link
that goes directly from
the sensory stimulus
to the perception.
So today, I'm going to
talk about this link.
I think last time [INAUDIBLE]
spoke about encoding models.
I spoke about decoding models.
So today, I'm going to just let the brain [INAUDIBLE].
So psychophysics is the quantitative study of the relationship between a physical stimulus-- it could be visual, auditory, touch-- and the perception that it basically evokes.
In this case, I will be
mostly talking about humans.
So the whole tutorial is kind
of divided into three parts.
The first part, I
will talk mostly
about three methods of
measuring perception.
In the second part, I'll talk about a specific task-- two-alternative forced choice experiments-- and give a little introduction to signal detection theory. In the third part, I will give you a brief introduction to Amazon Mechanical Turk.
None of them are comprehensive.
So if you're interested, I
can post more stuff on Slack.
Or you can pick up a book and
read about any of these items.
So first, you should treat this tutorial as, like, a global overview.
And maybe some
take home messages
from each one of them,
what is good, what is bad.
This is not a
comprehensive review
of any of these
particular techniques.
So three methods for
measuring perception--
so the three primary methods
that I'm going to talk about
are magnitude estimation,
matching, and then detection
and discrimination.
So let's see an example
of magnitude estimation.
OK.
All right.
So I'm going to start
doing an experiment here.
So you can maybe tell me
what you think of this.
So here is a line that--
I'm calling this 50.
So you can think of this
whole screen as 100,
and this is the center.
So the task would go
something like this.
You would fix it at the center.
I would tell you that
the top line is 50.
What do you think this line is?
So what do you think
this line might be?
AUDIENCE: 35.
KOHITIJ KAR: 30, 30, 35?
OK.
So I'm not going to do the
whole experiment here, though.
But you will get a lot
of lines like this.
I can put arbitrary numbers
here and then keep going.
You will see, like, there are
many lines that will be shown.
And at the end of the day, I can plot, given the true length of this line, what the perceived magnitude of it was.
So this is one way of
basically knowing our idea
of how long things are.
So if we want to probe how long we think things are, we can do it like this, basically with reference to some ground truth.
Similarly, I can ask you what
is the brightness of this dot.
So I don't think
you even saw that.
So what's going to
happen is, if you fixate,
there will be a dot that
comes up around here
that will be of intensity 50.
And then you will see
another dot come up.
And the task is what
is the intensity
of that dot compared to 50.
So I just put some
random number here.
I just put a 0.
So you can see it right now.
Does anybody see that dot?
It's really, really-- I
think the light is not
helping at all.
I don't think you
will see the dot.
It was very light.
Anyways, what
happens, generally,
is that I think it's
intuitive enough
that at the end of the day,
you will get a graph like this.
You have basically changed
stimulus intensity.
And then you can plot what
was the magnitude estimate
that you get.
So you'll get a nonlinear
curve for brightness.
And you will get a straight
line for apparent length,
which was the first task.
So this is an interesting
point here, though.
So this was a behavior that we observed, and then some people went to model this behavior. So Stevens came up with this power law that encapsulated a full range of possible data sets-- a lot of variations in dot intensity, brightness, length, color.
So he came up with this formula, which is basically a model with only two free parameters: one is a constant that controls the size of the response, and the other is an exponent that controls whether the curve is linear or nonlinear. So if the exponent is one, it's a straight line. And if it's not one, it's kind of curved.
So here is an example
where many, many behaviors
were explained by a
very simple model.
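As a sketch of how that two-parameter model works in practice, here is a minimal Python example (the function names and data are mine, for illustration): the perceived magnitude is modeled as k times intensity to the power b, and both free parameters can be recovered by a straight-line fit in log-log space.

```python
import numpy as np

def stevens_power_law(intensity, k, b):
    """Perceived magnitude = k * intensity**b (Stevens' power law)."""
    return k * np.asarray(intensity, dtype=float) ** b

def fit_power_law(intensity, magnitude):
    """Recover k and b: taking logs turns the power law into a line,
    log(magnitude) = b * log(intensity) + log(k)."""
    slope, intercept = np.polyfit(np.log(intensity), np.log(magnitude), 1)
    return np.exp(intercept), slope
```

So b = 1 gives the straight line seen for apparent length, and b < 1 (around 0.33 for brightness) gives the compressive curve.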
And depending on who you are
listening to in this course,
you'll often hear things
like we need understanding.
CNNs are not good.
We need understanding.
We need some sort
of, like, really kind
of very simple principles.
And I think what they mean by simple principles are models that have few parameters. Apparently, a CNN has many parameters. This model has few parameters.
So it's an elegant
solution, according to them.
It is not necessary
that the brain
has such an elegant or
low parameter solution.
So it's just a side note.
These are many sorts of continua that were tested. And you can see people have measured-- basically estimated-- the b parameters for them.
So for brightness, it's 0.33.
This is what we were checking.
So the next technique that
I'm going to talk about
is matching.
So the first one was
magnitude estimation.
You show something, and you
then ask, like, how long is it,
or how bright is it.
The next one is matching.
So you have two
things side by side,
and you have to
basically match them up.
So I have ground
truth on one side.
And then I don't
have ground truth--
I have a test of basically
[INAUDIBLE] on the other side.
So let's see an
example of such a case.
So I can put any color
as the target color here.
So let's say I make use
of the color spectrum
and I put some
color here, a color
that is sort of a mixture
of red, green, and blue.
So this is a mixture of
this much amount of red,
this much amount of green,
this much amount of blue.
So the experimenter just gives
you this sort of palette.
And you have to change the
knobs in these three tubes
to then match the color of
this color, this patch to that.
Right?
So what happens at
the end of the day
is that the subjects
are going to keep
turning some of these
knobs and they will try
to match these two things up.
So for example, the way to
match this would be you go here.
I mean, I can just directly
click on the answer.
So this is the
distribution of points.
To save time, this
is the distribution
of points that will produce
the exact same color patch.
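The knob-turning can also be sketched as a small computation (a hypothetical setup, under the assumption that the mixture is a simple weighted sum of three primary lights): find the three knob settings whose mixture best matches the target patch, here by least squares.

```python
import numpy as np

def match_color(target, primaries):
    """Find knob settings w so that the mixture w[0]*P0 + w[1]*P1 + w[2]*P2
    best matches the target color (least-squares solution).
    `primaries` is a 3x3 array whose rows are the three primary lights."""
    A = np.asarray(primaries, dtype=float).T  # columns = primary lights
    weights, *_ = np.linalg.lstsq(A, np.asarray(target, dtype=float), rcond=None)
    return weights
```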
So you can imagine
one experiment here.
So let's say I just assume that
color is a three dimensional
space.
And I just put anything
that I see here,
any color that I find
from any natural image,
and I keep putting
these things here.
And you start pulling
these things around.
So for every image, you would
put a dot, three dots somewhere
in this space.
So ideally, what
should happen is,
if I give you enough
images, you should be
able to trace out these curves.
And these curves, if they
have physical correlates
within the brain, you might
be able to use these graphs
as sort of an hypothesis space.
So these three, in other
terms, could become
basic sets of color perception.
So you can look at any
region that you think,
oh, this region encodes color.
So let me see if the neurons
or something in that region
has sort of responses based
on the color spectrum,
has these kind of,
like, profiles.
So this is kind
of what happened.
You read this paper.
It's an old paper.
They found these things in the [INAUDIBLE], as well.
So it's kind of a very
similar technique.
It's not as clean.
The data isn't as clean.
But they do kind
of show that you
can use psychophysics data
to then go into experiments
to figure out stuff.
So this worked.
And this was great.
Because these are
only three dimensions.
And then, you can
basically project any color
into these three dimensions.
It might not be the
same for objects.
So that's another question
that we had, is, like,
how many dimensions
do the objects span?
That's a separate question.
We don't know the psychophysics
experiment, maybe,
to do for that.
But this is a good example
of where psychophysics really
works.
OK.
So the third one,
and this is probably
the most used one nowadays, is
detection or discrimination.
So [INAUDIBLE] subject's
task is to detect very small
differences in the stimulus.
So typically, this is done by using one of three methods.
Some of them are not so good.
Some of them are kind of OK-ish.
So method of adjustment--
so you can think of method
of adjustment as this example--
that you get fitted for a
new eyeglasses prescription.
Typically, the doctor drops in a different lens and asks you if this lens is better than the other one. So the method of adjustment is well illustrated by this experiment.
Again, maybe it's better
to turn off the lights.
I don't know.
But do you see a dot?
Who sees a dot in here?
Oh, good.
So basically, the
task is you'll see
a dot that is barely visible.
And then you have to--
oh.
You still see the dot?
OK.
[INTERPOSING VOICES]
KOHITIJ KAR: Yeah.
I think that's probably OK.
I think the first one was fine.
Yeah.
The first light levels were OK.
So here's the dot.
And the task would be, like,
can you reduce the intensity?
You have to keep reducing
it until the dot goes away.
Now it kind of
looks very stupid.
But this was probably
the first idea
that came to people's head.
So don't get excited by easy ideas. That's one lesson to be had.
So again, so you keep
doing it multiple times.
A dot, again, shows up.
And then you have to move it
until you don't see it anymore.
And then, every time, I
record what level you went.
And then, I can
basically give a number
for your brightness
perception, like,
at what point, at what
threshold you stopped
seeing this particular dot.
So this is one way of, again, quantifying this behavior of luminance perception.
I can change the
size of the dot and I
can ask this question again.
I can change it from
a dot to a square.
I can ask the same
question again.
So I can ask, how does
that particular value
depend on the shape of
the image or the object
that I'm showing you.
So the problem with this is
that it is a terrible method.
And the reason is that it
depends on the subject a lot.
It's a very subjective measure.
So if I was a
person who thought,
oh, I'm going to
just fool the doctor,
because I know the answers, I
can really play around with it.
And there is no
way for the doctor
or for the experimenter to
test how many times I'm just
reporting something because I
think that is the right answer
and I'm kind of
motivated somehow
to just give the
right answer and not
give a good estimate of my
own psychophysical state.
So that's why I don't think
people use this method that
much unless they have to
because the subjects cannot do
a certain kind of
task or something.
So typically, this
is not used at all,
the method of adjustment.
The yes/no method,
on the other hand,
is still used in some studies.
In the yes/no method, again, you change the intensity of the dots.
In this case, it will be dots
because I was showing dots.
You keep changing
the intensity of dots
from a very low value to a high
value, and you keep asking,
do you see the dot?
Do you see the dot?
So you can do staircase methods
of making the dot disappear
or stuff like that.
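A staircase like that can be sketched in a few lines (a minimal 1-up/1-down version; `see_dot` stands in for the subject's yes/no report and is an assumption of this sketch): lower the intensity after every "yes", raise it after every "no", and average the reversal points to estimate the threshold.

```python
def staircase(see_dot, start=50, step=2, n_reversals=6):
    """1-up/1-down staircase. `see_dot(intensity)` returns True if the
    subject reports seeing the dot. The intensity steps down after a
    'yes' and up after a 'no'; the threshold estimate is the mean of
    the intensities at which the direction reversed."""
    intensity, direction = start, -1
    reversals = []
    while len(reversals) < n_reversals:
        new_direction = -1 if see_dot(intensity) else +1
        if new_direction != direction:       # direction flipped: a reversal
            reversals.append(intensity)
            direction = new_direction
        intensity += step * new_direction
    return sum(reversals) / len(reversals)
```

For example, a simulated observer who sees the dot only above intensity 20 will converge near that value.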
So again, there was this
example-- you can go --
I will give you the
link to this website.
And you can go and do
a lot of tasks here.
And I'll tell you in a second
why that might be useful.
This is a task where
the dot already came.
I didn't cue you guys, so you
have to fixate on the center
and tell me if you
see the dot or not.
I don't think-- oh, yeah.
You see it?
No?
Do you see it?
There you go.
No?
That was there.
It's just a small dot
that's showing up here.
You see it now?
You see it now?
Anyways, these
experiments-- so if you
have done psychophysics
experiment,
you'll be in a confined
room in a lot of darkness.
So whoever said they saw the dots would be part of this curve, having detected the dot at a lower dot intensity. And if you didn't see it, you would probably be part of this graph.
But now, I could have known this answer, because there was a dot present even at a very low intensity level. And I could have just said, yeah, of course there is a dot.
And I could have a graph that maybe goes up from here.
So the problem with this kind of study is that there is no way to measure a false positive, because all the trials are signal trials. There is signal in every trial.
So there is no
noise in any trial.
So that's why I think it
took a while for maybe
psychophysicists to realize
that it's also important to give
catch trials in experiments
where you know that the ground
truth is that there is no dot.
But the subject has some
prior bias and brings that in.
So as I said, all the trials
here were signal trials.
And there are no catch trials.
So you only get hits and misses.
We don't get any estimate
of the false alarm.
So that was fixed in the forced-choice experiment. In this experiment, you have to say yes, or you have to also say no, it wasn't present. So your correct answer matters.
I'm going to start showing you this specific forced-choice experiment now, which is a two-alternative forced choice. So there will be two alternatives, and you have to say whether it is alternative A or alternative B.
For example, I can
show the dot on--
sorry, go ahead.
AUDIENCE: [INAUDIBLE]
KOHITIJ KAR: There
is no way to get
to a catch trial in those
previous experiments
without having a condition
that has no signal.
So if every condition that is tested has a signal, then there is no way to test whether a subject just knows that it's going to be this answer and keeps saying yes.
AUDIENCE: [INAUDIBLE]
KOHITIJ KAR: I think
there are examples
in the history of
psychophysics where--
yeah.
AUDIENCE: [INAUDIBLE]
KOHITIJ KAR: That
generally doesn't happen.
I think we don't hear
about them because we
have gone past those stages.
But I think sometimes
there are still
studies you'll find that
where they haven't corrected
for the false alarm, or haven't
gotten [INAUDIBLE] measures.
AUDIENCE: [INAUDIBLE]
KOHITIJ KAR: Yeah, exactly.
So it's useful to remember
how we got into this position,
and what are the pitfalls
of one versus the other.
OK, so this is a two
AFC task, and I was
explaining this task where--
instead of the dot
showing up always
on top of the
fixation point, it can
show up either on their
left or on the right
or not show up at all.
And by doing it this
way, the answer--
you would basically get rid
of the previous problem of not
having any false positives.
I showed you all these
different techniques
to give you a brief--
maybe, a kind of history on
some of these techniques that
were used before.
But I want to get to
some real problems.
What people really use in
different kind of studies for--
and what kind of
psychophysics is used.
So I think I told you that
in this lecture before,
I was using ventral
stream for decoding.
I was going to use
some studies that are
related to the dorsal stream.
I also mentioned
this paper last time.
So I'm going to motivate
the rest of the tutorial
based on a task that is
typically thought to be linked
with the dorsal stream.
And I might have criticized this task before. But motion is very relevant for us-- we probably look at objects moving around. And there have been more than three decades of work done on dot motion.
So I cannot ignore it.
So I'm going to give you
an explanation of how
we think about this
kind of a task where--
let's say, you have 10
dots here, and all of them
are moving upward.
It's called random
dot motion stimulus.
So that's why I'm calling
the coherence 100%.
If I show subjects this
moving dot and I ask them,
do you see the dots
moving up or down?
And I'm basically
plotting their responses
as a function of the coherence.
So in this case, the
coherence is 100,
and a stimulus looks like this.
I show the stimulus for
100 millisecond or 300
millisecond, what do you think
the proportion of upward choice
will be?
1, right?
Everybody will see up.
If I have no signal in
this dot motion stimulus,
if it's at 0 coherence,
which means every dot is just
moving randomly, you're
going to be in the center.
That's if you're an ideal observer; if you have some biases, you might not be at the center.
Then again, the same thing.
If the dots are moving down,
you will have no trials
where you actually say up.
So you are perfectly
correct again.
So you can test everybody
in the center as well
and then draw the entire graph.
And typically, if
you don't know that,
this is known as the
psychometric function.
People quantify motion perception based on some properties of this psychometric function.
So the two properties that are mostly used are: one is the PSE, the point of subjective equality. You can also think of this as some kind of threshold. It's called the point of subjective equality because at 0% coherence, an ideal observer has an equal chance of saying up or down, since there is no signal. The other one is the slope, or sensitivity. So you can think of this as how much motion energy I need to insert before the performance of the subject goes up by a certain amount.
So if instead of a subject that
had a psychometric function
like this, if the
subject had this kind
of psychometric
function, it would
mean that the subject is
more sensitive to the task.
So for the same increase in performance, that subject needed less motion to be introduced into the stimulus, right?
So here is a way to think about
a good subject, bad subject,
or something I'm doing
to the subject that
is making him bad or good.
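Those two properties fall out of a simple parametric form. As a minimal sketch (a logistic curve is one common choice; the parameter names here are mine), the proportion of "up" choices rises with coherence, the PSE is the coherence where the curve crosses 0.5, and the slope captures sensitivity.

```python
import numpy as np

def psychometric(coherence, pse, slope):
    """Logistic psychometric function: proportion of 'up' choices.
    Larger slope = steeper curve = more sensitive subject."""
    c = np.asarray(coherence, dtype=float)
    return 1.0 / (1.0 + np.exp(-slope * (c - pse)))

def estimate_pse(coherence, p_up):
    """Point of subjective equality: the coherence where p_up crosses 0.5
    (linear interpolation; assumes p_up is increasing)."""
    return float(np.interp(0.5, p_up, coherence))
```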
Now, these sorts of graphs have been used to study certain perceptual phenomena that we observe a lot. So I'm going to show you one such phenomenon-- you might have already seen this many times.
How many guys are--
you know about a
waterfall illusion?
You guys know about this?
Everyone knows about this.
OK, so the waterfall illusion-- although it looks very different when you look at an actual waterfall, I'm just going to describe it anyway.
So if you look at a
waterfall for a long time
and then you move your gaze
and look at a stationary car,
the car seems to
be moving upward.
It's because of [INAUDIBLE]
caused in the neurons
and adaptation and all that.
So I'll just show a
demonstration of it here.
So if you look at the Center
for a long, long time.
So here, you have to
focus on the center for 40
milliseconds--
or sorry, seconds.
The second was already over.
So yeah, look at the center.
Don't move your eyes
if you want to see
what's going to happen next.
When this counter goes to 0,
either keep looking directly
at the screen or look at
the back of your hands, OK?
So keep looking at it.
Feel at one with the stimulus.
Things are going in,
things are coming out,
things are happening,
and it's going to change.
So the moment the
stimulus changes,
either look at the back of your
hand or look at the screen.
OK.
All right.
Oh, OK.
So this is--
I guess, for whoever saw the back of their hand-- if you try it later, the back-of-the-hand trick is not going to work, because you have probably already passed the adaptation stage.
But you might have
seen the clouds move.
And if you see the
back of your hand,
you see the back
of your hand move.
And that's because of what's called the motion aftereffect.
And we have these rich
phenomena in the world
and we see this all the time.
We are probably
influenced by them.
But as psychophysicists, we
have decided to ignore that.
We will bring this
into dots, OK?
So the motion aftereffect, the way it's studied in the lab--
so we have this nice
little psychometric curve
from the experiment that
I just told you before.
So now, I'm going to
repeat this experiment.
But the only difference will
be that before I do the test,
I will show you a stimulus
that is always moving upward.
So I'm going to adapt you with
stimulus that moves upward,
and then I will test you with
all the different stimuli.
So the way this motion aftereffect manifests in this type of task is
that if you show
this for a long time
and then show a
random moving dot,
you're going to
expect your report
to be somewhere down here.
So you're going to see more
down than up because you have
been adapted to one direction.
So that was the motion aftereffect that I showed before, captured like this.
So now, you can do the entire
motion strength dimension,
and you'll get a
graph like this.
And typically, you can quantify
how much motion aftereffect
you have had by looking at the
difference between these two
points.
So that quantifies
this perception.
AUDIENCE: [INAUDIBLE]
KOHITIJ KAR: In
this one, there is
no reward for correct
or incorrect answer.
The subject doesn't
know what is the correct
or what is incorrect.
AUDIENCE: [INAUDIBLE]
KOHITIJ KAR: They're
getting money
to sit there and do the task.
But it's mostly-- I think that kind of motivation, if you compare it to the monkeys, is not there in this kind of task. Most likely they have signed up-- they have volunteered to work for science, and they're giving their honest answer.
And it's up to the experimenter
to kind of figure out
how to keep things
in check so that they
don't get biased responses.
OK, so that is the example of the two-alternative forced choice experiment that I was talking about.
And I will try to motivate
the signal detection theory
problem with this previous
task of motion direction
discrimination.
So signal detection theory means exactly what it says. It's a theory for detecting signals from various sensory inputs or sensory stimuli.
So there are three main messages that you can take home from this theory.
So one is that your ability
to perform a detection
or discrimination task is
limited by internal noise.
So if you were noiseless, you would have been really good. But you have some noise, and that is what basically limits your performance.
The other thing is that signal strength and criterion-- and I'm going to describe both of these soon-- are the two components that affect your decision: where you are putting your criterion, and so on and so forth. And they each have different kinds of effects on the decision.
The third one is that you have to measure both hits and false alarms. By measuring hits and false alarms, you can get an estimate of d prime-- and I'll explain what d prime is-- which is a measure of discriminability that is independent of the criterion.
So let's go a
little bit deeper--
dig a little bit deeper into
the psychometric function, OK?
So as I showed you, you have a stimulus where the dots are always going all the way up: 100% coherence. You get a proportion of upward choices of one. Same thing here: 0 coherence, you're at chance. Going down, you're here.
So now, let's try to explain
this in terms of a model
from signal detection theory.
The other thing is that if you were an ideal observer with 0 noise, your graph should not look like this, slanted. Your graph would basically look like a step function.
Because anything that
is greater than 0,
the ground truth is
it's moving upward.
Anything that is less
than 0, the ground truth
is that it is moving downward.
So if you had absolutely no noise inside your brain-- in the detectors inside your brain-- and you can think of this as one detector that is basically encoding the stimulus, then applying some criterion and producing a decision-- so if the detector had no noise, this is how it should look.
Now, let's look at the--
how you might want to
think about it in terms
of signal detection theory.
So in signal detection theory, you can think of it this way: you have a neuron that fires whenever some stimulus comes up. There's a detector that responds, but there is a distribution that it basically obeys. So the detector is going to fire with some amount of noise. This is the distribution of the firing rate of the detector, let's say.
And on top of that, you
have some criterion.
And based on both of these,
you're making the decision.
So let's say I'm giving you a stimulus at minus-- a very low value-- so the dots are all moving down. Are you guys with me? I'm still here.
So this is a
demonstration of how
you can think about this problem
in terms of signal detection
theory.
So there is some
noisy responses,
but they're all less
than the criterion.
So if the decision
making system has
decided to put the
criterion right here,
all of these responses
will be classified as down.
And that's what you are seeing
in the subject's responses.
So you move it a little bit--
still everything is
below this criterion.
You move it a little bit more.
Again, below the criterion.
So if you keep doing this,
you will basically carve out
the entire [INAUDIBLE].
So at some point, as it gets a little bit of signal, it crosses the criterion, and then at some point it goes right up, because everything will be higher than the criterion.
So that's one way
of looking at how
you can model internal noise.
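That story can be simulated directly. A minimal sketch (the Gaussian noise model and the numbers are my assumptions, not the exact model from the lecture): each trial's internal response is the coherence plus Gaussian detector noise, and the subject reports "up" whenever the response exceeds the criterion; sweeping coherence traces out a sigmoid, and wider noise gives a shallower one.

```python
import numpy as np

def simulate_choices(coherences, noise_sd, criterion=0.0, n_trials=10000, seed=0):
    """Noisy-detector model: internal response = coherence + Gaussian noise.
    The subject reports 'up' whenever the response exceeds the criterion.
    Returns the proportion of 'up' choices at each coherence."""
    rng = np.random.default_rng(seed)
    p_up = []
    for c in coherences:
        responses = c + rng.normal(0.0, noise_sd, n_trials)
        p_up.append(np.mean(responses > criterion))
    return np.array(p_up)
```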
So what happens is
that, if you think
of another human being
or another subject that
has detectors with
much bigger noise,
so this one had a tighter
noise distribution,
this one has much more--
much wider noise distribution.
For a subject who has an
internal model like this,
you'll see that the slope of
this graph is much shallower.
So that speaks to the first point of signal detection theory: your performance is limited by the internal noise model that you have.
Now, there are a lot of other theories that talk about not only internal noise models, but also external noise models and stuff like that.
But I won't go into them today.
This is just one way of thinking about this kind of signal detection theory-based explanation of what is going on.
There is no guarantee that
this is actually happening.
But this is a simple model
to explain this change
in sensitivity of the subject.
All right, so I haven't
said this before,
but just to tell you
in more concrete terms,
that these are the four
main things you're looking
for during these kind of tasks.
So you want to know whether the subject said yes, and also what kind of trial it was. So basically, if you're doing a dot detection task, if the dot is present and the subject says yes, it's a hit.
If the dot is present and the
subject says no, it's a miss.
If the dot is not present
and the subject says,
yes that's called a false alarm.
And if the dot is not present
and the subject also says,
no, he's still correct--
she is still correct.
That's a correct rejection.
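The four outcomes follow mechanically from the two binary facts of each trial, which is easy to see in code (a trivial sketch):

```python
def classify_trial(signal_present, said_yes):
    """Map one trial to hit / miss / false alarm / correct rejection,
    based on whether the signal (e.g. the dot) was present and whether
    the subject said yes."""
    if signal_present:
        return "hit" if said_yes else "miss"
    return "false alarm" if said_yes else "correct rejection"
```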
And you can think of each of these in terms of a model of the internal noise of the detectors.
So here is a graphical
representation
of that scenario where you
have your noise distribution,
you have your
signal distribution.
And you can put a criterion on these two distributions and see what's going on.
So once you get the data, then, depending on some assumption of how wide these distributions are and where the criterion is placed, you can basically model that behavior.
So that's pretty much the idea
that, based on these factors
and where you put
the criterion, you
want to quantify the behavior.
I spoke about d primes before.
So if you have your noise distribution somewhere here, and you have your signal-plus-noise distribution a little bit forward-- you can increase the signal by increasing the sensory intensity.
And the separation of
these two distributions
divided by the spread,
which is quantified
by the standard deviation of the
distributions, that's typically
the way you quantify a d prime.
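With both a hit rate and a false-alarm rate in hand, d prime can be computed from data under the usual equal-variance Gaussian assumption: convert each rate to a z-score and take the difference. A minimal sketch:

```python
from statistics import NormalDist

def d_prime(hit_rate, false_alarm_rate):
    """d' = z(hit rate) - z(false alarm rate): the separation of the signal
    and noise distributions in units of their (shared) standard deviation,
    under the equal-variance Gaussian assumption."""
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)
```

So equal hit and false-alarm rates give d' = 0 (no discriminability), regardless of where the criterion sits.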
Here is a little demonstration
of how we make these curves--
receiver operating
characteristic
curves based on
these kind of data.
So you can think of these
as, again, one is noise,
one is signal.
And you're plotting hit rates versus false positives. If you increase this mean, this is going to go slightly up. So the area under this curve is typically what is of interest. As the signal separates from the noise, this area under the curve becomes larger and larger.
When the signal is right on top of the noise, there is a line at unity.
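The ROC construction can be sketched numerically (assuming unit-variance Gaussians for noise and signal, which also gives a closed form for the area under the curve):

```python
import math
from statistics import NormalDist

def roc_point(d, criterion):
    """False-alarm rate and hit rate at one criterion, for noise ~ N(0,1)
    and signal ~ N(d,1): each rate is the mass above the criterion."""
    cdf = NormalDist().cdf
    return 1.0 - cdf(criterion), 1.0 - cdf(criterion - d)

def auc(d):
    """Area under the ROC curve for this model: Phi(d / sqrt(2)).
    d = 0 gives 0.5 (the unity line); larger d gives a larger area."""
    return NormalDist().cdf(d / math.sqrt(2))
```

Sweeping the criterion through `roc_point` traces the full curve from (1, 1) down to (0, 0).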
I think the way we
should think about them
is that these are different ways
of quantifying the behavior.
And once you quantify
the behavior in some way,
and you want to test
some other model
of that particular
behavior, you're
going to recreate
this kind of analysis
and check whether the other model basically has the same parameters for these kinds of characteristic curves.
So I think that--
at least that's the way I
think about these things.
Because these don't tell
you about the internals
of the system at all.
This is more like a very
abstract modeling of behavior,
so to say.
OK, so the last thing-- I wanted to start this with an introduction not to Amazon Mechanical Turk, but to the original Mechanical Turk.
So how many of you know the
history of Mechanical Turk?
Oh, OK.
This one, not everyone knows.
That's good.
So before Amazon Mechanical Turk, there was the real Mechanical Turk, which was built to impress the empress of Austria.
Maybe you can follow, yes.
[VIDEO PLAYBACK]
- Built In 1770 for
an Austrian empress--
[INTERPOSING VOICES]
- --traveled through Europe
playing chess and defeating
[INTERPOSING VOICES]
[END PLAYBACK]
KOHITIJ KAR: So this was a
fake chess player, basically.
So it looked like this--
OK.
[VIDEO PLAYBACK]
- Built In 1770 for
an Austrian empress,
this life size automaton
traveled through Europe
playing chess and defeating
commoners and kings alike.
After a century
long career, he's
recently been restored
to full working order.
With lifelike movements,
an error in judgment,
and his expressionless face--
[END PLAYBACK]
KOHITIJ KAR: So he
would play chess.
And so the idea was that people claimed this was some kind of automaton that had figured out how to play chess. But this was a fake thing.
There was a guy who sneaked in, a chess master, who would play the chess. And then it went around Europe fooling [INAUDIBLE].
I think, like, a lot of famous people got tricked by this. At some point, they--
[END PLAYBACK]
KOHITIJ KAR: Called him out.
Anyways, so that was the
history of Mechanical Turk.
But Amazon, probably following the same mode of operation, decided that they would have Amazon Mechanical Turk.
And I'm going to show you
a little bit about how
to maybe run a simple task.
So you have to create an account
on Amazon Mechanical Turk.
And so there are two ways you can operate: you can operate as a worker, or you can operate as a requester (developer).
So if you're a worker,
then you sign up
with all of your details
and then you do the tasks.
And it doesn't pay
a lot, but there's
a lot of interesting tasks
that you can do in here.
As a requester or a developer,
you can upload your own tasks
and run them here.
So the way it works is that you have to design an HTML file, or something that runs online. And you don't have to follow their project guidelines or anything. You just have to have a way where you're serving up HTML files, and the HTML files have links to some sort of server that's online.
So in our lab, we typically use Amazon S3 to store our images or videos or whatever we're going to show to the subjects. And online, we basically make a call to those particular URLs and put them on the screen.
You can do it in different ways.
We have used JavaScript a
lot to basically program
these HTML files.
For example, you can go to the Create tab, and you can use one of their default projects.
You can see they have
many projects here.
One would be: choose image A versus B. So pick the image that you like more, this or that.
So you can have your
own task, basically.
Or if your task is aligned
with what they already have,
you can use their
template to launch them.
Once you launch them, they're called HITs, Human Intelligence Tasks. And then you'll have workers working on those HITs.
I came from a lab where we typically used to work in psychophysics rigs.
And it was a big deal to
get a lot of subjects.
So we would get maybe
30 subjects a month
or so, if you were lucky.
Here, I got-- the
first day I ran this,
I got around 500 subjects,
and it was amazing.
The problem that people
would bring forward
is that, well,
these are subjects
who are sitting in their house
and they're taking breaks
and they have different
distance from the screen.
You don't know where
they are looking.
So if your experiment is really
dependent on eye position
and eye tracking, I agree
that this is a difficult setup
to justify.
But if your task is independent of those, and you are just generally worried about the attention and arousal levels of the subjects and stuff, you can think of it this way: you will get so much data that the mean of the effect that you're computing is going to be very, very reliable. And it will be difficult to get the same level of reliability for a mean acquired in the lab.
So we did a lot of analysis where, initially, we basically tried to see at what number of repetitions the split-half reliability of the data goes to 1. So we used that as a metric.
You collect a lot of data, and then you split all of your trials into two halves. And then, you correlate whatever you're measuring across those two halves.
You'll see that as
you keep increasing
the number of repetitions, that
correlation value approaches 1.
And so we would like to operate in a regime where those correlation values are close to 1.
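That split-half check is easy to sketch in code. The example below simulates trial data (the per-image effects and noise level are invented for illustration, standing in for real MTurk responses) and shows the half-to-half correlation climbing toward 1 as repetitions increase:

```python
import numpy as np

rng = np.random.default_rng(0)

n_images = 200
true_effect = rng.normal(0, 1, n_images)   # per-image "ground truth" effect
noise_sd = 2.0                             # trial-to-trial variability (invented)

def split_half_reliability(n_reps):
    """Correlate image-wise means computed from two halves of the trials."""
    trials = true_effect[:, None] + rng.normal(0, noise_sd, (n_images, n_reps))
    half_a = trials[:, 0::2].mean(axis=1)  # mean over even-numbered trials
    half_b = trials[:, 1::2].mean(axis=1)  # mean over odd-numbered trials
    return np.corrcoef(half_a, half_b)[0, 1]

for n in (2, 10, 50, 200):
    print(f"{n:4d} reps: split-half r = {split_half_reliability(n):.2f}")
```

With enough repetitions the correlation saturates near 1, which is the regime described above.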
So that's maybe one thing that I learned while using Amazon Mechanical Turk.
You can control for a lot of aspects of these subjects, like where they are from, what computer they are using, whether they're using a keyboard or a touchpad.
I mean, the age is
voluntarily reported,
but you can ask for reports
of age and stuff like that.
So you can do a lot of post
hoc pruning of the data,
or even pre-emptive discarding
of subjects based on that.
You can ask the same subject
to come over and over again.
And I've done that
in some of the papers
that we have published
in the lab where
I would ask the same subject
to do the task more often.
So once the subject does the task, you can go to the Manage tab. And here, we are not running too many tasks right now, so you don't see it.
But there will be a
lot of assignments here
that will be filled up.
So then, you can download the data. It comes in the form of a .csv file or a JSON file.
So you can unpack that and
then use it for your analysis.
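Unpacking a downloaded batch results file takes only the standard library. In this sketch, a toy inline CSV stands in for the real download; real files do contain a WorkerId column, but the "Input.correct" and "Answer.choice" fields here are hypothetical names from an imagined HIT template:

```python
import csv
import io
from collections import defaultdict

# Toy stand-in for a downloaded batch results file. Real files have many
# more columns (AssignmentId, WorkTimeInSeconds, ...).
raw = """WorkerId,Input.image_url,Input.correct,Answer.choice
W1,img1.png,dog,dog
W1,img2.png,cat,cat
W2,img1.png,dog,cat
W2,img2.png,cat,cat
"""

# Group each worker's trials and score them against the ground truth.
correct = defaultdict(list)
for row in csv.DictReader(io.StringIO(raw)):
    correct[row["WorkerId"]].append(row["Answer.choice"] == row["Input.correct"])

for worker, trials in correct.items():
    print(worker, sum(trials) / len(trials))
```

This kind of per-worker bookkeeping is also what makes the post hoc pruning mentioned below possible, e.g. dropping workers whose accuracy sits at chance.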
I really find this
very exciting,
and you can do a
lot of stuff here.
So the one thing I was thinking
while I was making the tutorial
is that we have had a lot
of talk about like CNNs
and how they can be
models of the brain.
And where is the brain
reading stuff from,
where is the decoding
happening, where
is the encoding happening.
One simple exercise that we
can probably do at some point
is that-- so I have shown
you a lot of tasks before.
So all of you can
go do those tasks.
I'm not asking you to do it.
But somebody can go and do those tasks on that website. So I'll forward that website. And so once you do the task on those websites, the results are downloadable, right?
So you can also think of designing some task and running similar tasks on Amazon. So you can get a lot of human data on tasks that are very low level, like dot intensity thresholds and stuff like that.
You can do these tasks.
And people have
theories about it.
And neuroscientists
have recorded neurons
claiming that these neurons
are responsible for this task,
this is how the decoder works.
But proper causal perturbations have not been done, and so on and so forth.
So what you can do, you
can take any of the CNNs,
especially if it's a task
that people have claimed
is done in the
ventral stream, you
can do this task on Amazon Turk.
And then you can ask: do I need to go to the last layer to decode this, or am I better off using an intermediate layer to decode? Stuff like that you can do. And then, those will become hypothesis-based, basically.
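A toy version of that layer-comparison exercise is sketched below. Synthetic "activations" stand in for real CNN features; the noise levels are invented so that the "last" layer carries a cleaner class signal than the "intermediate" one, and each is decoded with a simple nearest-centroid readout:

```python
import numpy as np

rng = np.random.default_rng(1)

n_per_class = 100
labels = np.repeat([0, 1], n_per_class)
signal = np.where(labels == 0, -1.0, 1.0)

def make_layer(noise_sd):
    """Fake 20-dimensional 'activations': class signal plus Gaussian noise.
    In a real analysis these would be features extracted from a CNN layer."""
    return signal[:, None] + rng.normal(0, noise_sd, (2 * n_per_class, 20))

def nearest_centroid_accuracy(feats):
    """Fit a nearest-centroid decoder on half the trials, test on the rest."""
    train = np.arange(0, 2 * n_per_class, 2)
    test = np.arange(1, 2 * n_per_class, 2)
    c0 = feats[train][labels[train] == 0].mean(axis=0)
    c1 = feats[train][labels[train] == 1].mean(axis=0)
    d0 = np.linalg.norm(feats[test] - c0, axis=1)
    d1 = np.linalg.norm(feats[test] - c1, axis=1)
    pred = (d1 < d0).astype(int)
    return (pred == labels[test]).mean()

acc_mid = nearest_centroid_accuracy(make_layer(noise_sd=6.0))   # noisier stage
acc_last = nearest_centroid_accuracy(make_layer(noise_sd=2.0))  # cleaner stage
print(f"intermediate layer: {acc_mid:.2f}, last layer: {acc_last:.2f}")
```

With real features, running the same decoder on each layer and comparing the accuracies against human performance is one way to turn the "which layer does the readout use" question into a testable hypothesis.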
So for example, I was told this
once by a professor [INAUDIBLE]
that he doesn't think
that IT is doing
all of object recognition.
If I have to do object discrimination between two objects that are a tiny bit different from each other, then we might need V4, or we might need V1. The decoder is not just at the end of a chain of transformation, transformation, transformation, [INAUDIBLE]. It could be that the decoder actually needs to read from V1 at the same time.
So that's a testable hypothesis.
And you need some
sort of data to know
what the humans are doing.
And then, you can
go back in the CNN
and test the CNNs
in the same way.
So another message is
that psychophysics is not
limited to human subjects.
Because these models
are available where
you can simulate
the same conditions,
you can do almost all of
the psychophysics stuff
that I've mentioned
on these models.
So I think that generates
a huge space of hypotheses
for future experiments.
Because, I mean, without these models, there were no hypotheses that people were following. So you might often find that there is a lot of confusion in explaining the very same kind of data: what's going on?
There is also, I think, a website from Michael Bach, B-A-C-H. He has this repertoire of illusions, and you can make tasks out of them.
So you can basically take an illusion and ask, based on these methods that I mentioned, what is the best way to quantify the percept. So once you quantify the percept, you can ask: can a standard deep net, or a recurrent deep net, or something like that, solve these kinds of problems?
There are also very interesting models of motion perception, labeled-line models and stuff like that, attractor models. You can also try those out for motion perception.
I think, also, the last slide that I had was a quality check of Mechanical Turk versus lab. This is some CSD measure from an old study from a lab. This is just showing that, on average, these metrics seem to be correlated across lab versus MTurk subjects. And as I said, at high numbers of repetitions, MTurk data is consistent with in-lab data.
Also from some of our
studies we saw that.
Thanks.
If you have questions--
this is more like a tutorial
where I think you might want
to think a little bit more
about psychophysics.
Because it often seems like people think it's a solved thing, that we have all the metrics, all the behavioral measures that we can think of.
But oftentimes,
reformatting those behaviors
to match the new
models is the way
to generate new hypotheses
for experiments.
So I would encourage
you to think like that.
Thanks.
[APPLAUSE]
