(upbeat music)
- Yeah
(upbeat music)
(upbeat music)
- Welcome to After Dark Online.
Thanks for joining us as
we continue to explore
current topics through
science, art, and conversation.
My name is Sam and I'm a program developer
as part of the team that
produces After Dark.
And though this program is virtual,
the Exploratorium is located
on Pier 15 in San Francisco,
on unceded territory
traditionally belonging to
the Ramaytush Ohlone people.
As an institution,
we recognize that we
are guests on this land
and we honor the stewardship
that the Ohlone people
have offered for the ecology we inhabit,
both past and present.
To talk about tonight's program,
we're actually kicking
off a month-long series
for After Dark Online
called Agree to a Degree,
where these programs are
gonna be looking at choice:
personal decision making,
collective decision making,
what influences our preferences,
and how that contributes
to the democratic political process.
So tonight we're gonna hear
from two different researchers
about uncertainty and how
to better understand it.
The first will explain election
polls and what they tell us
and what they don't tell us.
Our second guest will
be looking at the ways
election forecasts have been visualized,
both in the past and
in this current moment,
and suggest ways to best
read the data as a user.
But first we'll kick it
off with a short animation
about the astounding number of decisions
that one's faced with on any given day.
Enjoy "Mr. Wurfel" by Rafael Sommerhalder.
(clock ticking)
(snoring)
(clock ticking)
(alarm ringing)
(upbeat music)
- [Narrator] This is Mr. Wurfel.
And Mr. Wurfel loves making decisions.
Mr. Wurfel starts his day at
four o'clock every morning,
because before he goes to work,
there is so much to do.
Decision after decision.
The blue, the gray, or the beige trousers?
The red or the green tie?
Parted on the left or on the right?
Tea, coffee?
Raspberry, strawberry, or blackberry jam?
Or rather the brown shoes?
And the umbrella?
(piano music)
But today, today will be
different from all of the others.
Because on the way to work,
Mr. Wurfel will get into
this unhappy situation.
(siren ringing)
And this situation requires, no demands,
a very clear quick decision.
(gasping)
(gulping)
Mr. Wurfel has only a
little more than one second
to make the decision of his life.
1,277 milliseconds.
(dramatic music)
(birds chirping)
Once when Mr. Wurfel was a child,
something terrible happened.
It was Sunday afternoon.
The young Mr. Wurfel was playing
in front of his parents' house.
Suddenly, he saw this football
in the neighbor's garden.
Young Mr. Wurfel just could not decide.
(piano music)
And then his gut spoke up.
His gut said,
"get it!"
So little Mr. Wurfel took the ball
and kicked it into the air.
(animated music)
And then,
then everything happened very fast.
(window shattering)
(ball bouncing)
(water pouring)
(explosion)
(crow cawing)
And then the young
Mr. Wurfel was an orphan.
Ever since then,
Mr. Wurfel hasn't listened to his gut.
Ever since then,
Mr. Wurfel has made
decisions with his head.
Ever since then,
decisions have become a
passion for Mr. Wurfel.
For Mr. Wurfel there is no wait and see.
For Mr. Wurfel, there is only yes or no.
For him only the best
choice is good enough.
That's why Mr. Wurfel checks
everything very carefully.
Every detail.
Just like now.
A step forward or backwards?
To the left, to the right?
Or simply stand still?
(alarming buzzing)
(dramatic music)
It took three years
for Mr. Wurfel to choose his new bicycle.
(bell ringing)
And four months to decide if
he should repaint his living room.
And this morning, over an
hour to get his bath water
to a perfect temperature.
(dramatic music)
But now,
(rope snapping)
in this very moment, a 523-kilogram
concert piano is racing
at 46 kilometers an hour,
straight towards Mr. Wurfel.
(alarm buzzing)
(snoring)
(clock ticking)
Actually, Mr. Wurfel could have guessed
that today would be different
from all the other days.
(upbeat music)
Because after he'd made all
the usual morning decisions,
(upbeat piano music)
and had just set out on his way to work,
his gut cried out, full of desperation,
"Don't go!"
(alarm buzzing)
But for once, Mr. Wurfel's
head reacted promptly.
(dropping)
(piano music)
And that's why Mr. Wurfel
is now in this unhappy situation.
(piano music)
(alarm buzzing)
(piano music)
(soft piano music)
(wind blowing)
Good morning, Mr. Wurfel.
Please decide what you'd
like to be in your next life.
And take your time.
(laughing)
(soft piano music)
(upbeat piano music)
- Welcome back.
We hope you enjoyed that short.
Now we're gonna turn to a conversation
between my colleague, Kathleen Maguire,
and Dr. Courtney Kennedy,
director of survey research
at the Pew Research Center.
They're gonna talk about election polls,
how they're conducted,
what we can learn from them,
and how best to approach
the data as a news consumer.
Courtney serves as the
chief survey methodologist
for the center,
providing guidance on all of its research
and leading its methodology work.
She has worked as a statistical consultant
on the U.S. Census Bureau's decennial census
and on multiple reports
appearing in Newsweek.
She earned a doctorate from
the University of Michigan,
and a master's degree from
the University of Maryland.
Both in survey methodology.
Courtney has served as standards
chair and conference chair
of the American Association
for Public Opinion Research.
I look forward to hearing her expertise.
Take it away Kathleen.
- Thank you so much for
joining us, Courtney.
To get started,
could you just tell us
a little bit more about
what the Pew Research Center
does and also your role there?
- Sure.
Pew Research Center is a
nonpartisan, non-advocacy
research organization
based in Washington, DC.
DC's got a lot of what people
know as think tanks
that work on policy and whatnot.
We're a little different.
We like to think of
ourselves as a fact tank.
We like to do research
that generates facts
that we hope will make a contribution
to the national dialogue.
Whether it's about
American public opinion,
public opinion in other
countries around the world,
or just broader trends in society.
And my job there is
director of survey research.
What that means day to
day is that I help design
the surveys that we do
at Pew Research Center:
where we recruit people,
logistically how we interview them,
and so forth.
- Great.
And today we're gonna dig a
little bit into election polling
and help our audience
understand some of the ways
to better understand polls.
But maybe to start out,
could you tell us a little
bit about why election polls
are important and, looking historically,
how they have influenced decision
making in past elections?
- Sure.
Well I think they're important for reasons
that might be a little
different from what you might guess.
So undeniably the public can
have a voracious appetite
with regard to election polls, to know
who's gonna win the
election or who's ahead.
And that's perfectly understandable.
But frankly that sort of horse race data
is not really the strength of polling.
Especially if you've
got a competitive race
where someone can win
by a razor thin margin.
Polls frankly are not precise enough
to get a razor-thin margin correct.
But polls are really
good for other purposes.
So I'm thinking about what
issues are motivating voters?
How are voters reacting to the pandemic?
How do voters feel about the candidates?
What issues are really on their minds,
as they think about going to the polls?
How do they feel about
access to voting this year?
So I think a lot of those
more high level policy issues,
and reactions to the candidates,
polls are perfectly up to the challenge
of giving us useful
information about that.
But they're not so good at
telling us who's gonna win,
a close election months from now.
And you had a second
part to that question.
Can you cue me on what that was again?
- Yeah and so historically,
are there examples of
polls being influential
in past elections in the
public's decision making?
I know last election,
they were a little messy as well.
- So when I got this question before 2016,
I really brushed it off.
And I think a lot of my
colleagues in the polling field
did as well.
And we did that frankly because
there was never good, strong
science suggesting that
the mere presence of polls
being done and talked about
really had a meaningful impact on voters,
on their behavior and their thoughts,
and on the election itself.
We like to think of
ourselves as just people
who measure something
and report what we found.
We don't like to think
of ourselves as people
who actively affect the election.
And prior to 2016,
I think that that was a pretty reasonable,
way to think about it with
maybe a caveat for primaries.
Because in primaries where you've got
candidates who are less
familiar to a lot of voters,
polls can, I think, admittedly
play a bigger role.
If a poll comes out and
a candidate in a primary
does particularly well,
I think that there can
be a boost to a campaign
in terms of maybe
getting some more donors,
maybe getting some more
volunteers and that kind of thing.
But in a general election prior to 2016,
we didn't think that
polls really played a role
in themselves, but after 2016,
I no longer brush that off.
I take that very seriously.
In particular,
if you think back,
there were two things going on.
There were a lot of polls, yes.
But there were also a
lot of these forecasts,
and they were really
probabilistic forecasts.
And that's a fancy word that just means,
people put a percentage on the likelihood
that the candidates were gonna win.
And at least
two high-profile forecasters
said that Hillary Clinton
had a 99% chance of winning.
And that got picked up.
And that was definitely a
really prominent narrative
in the campaign.
And one thing that
happened in 2016 was that in
several key metro areas,
especially in the upper Midwest,
turnout was a big factor.
Turnout was lower than it had been
during the Obama elections.
And I don't think it's crazy to question
whether that narrative played a role
in some people staying home.
And one thing that's happened since 2016
is that researchers are
trying to look at that
with data and testing,
as to whether that
narrative can affect people.
And there's some emerging
experimental evidence that,
if people are told a candidate
is extremely likely to win,
some people, not all of course,
but some people, may be
less likely to vote.
- And so election polls
are something that we see
pretty frequently, but we
might not fully think about
their methodologies if
we're not sort of ensconced
in a career like yours
or coming from a math background.
And you already brought up
the sort of horse race polls
that maybe we don't reflect on.
So could you talk a little bit
about some of the different
methodologies that are behind polls,
as well as how consistent
those methodologies are
if we're looking between
different polling bodies?
- Yeah.
That's I think one of the
most fascinating questions
this year, because polling,
a lot of people don't know this,
is in a very transitional era.
And by that I mean: how
Fox News and CNN do their polling differs
from how CBS and Politico
do their polling.
And that differs again from
how Pew Research Center
and the Associated Press
do their polling.
So there's a lot of
different methodologies.
So in particular,
we've got some polling
organizations still doing
live telephone, which actually,
even at low response rates,
works pretty well in an election year.
We've got a lot of other organizations
that have moved online.
And online you've got sort
of two different buckets.
Most online polls
that you'll see are opt-in,
which means they're done
with convenience samples,
frankly, of people who are internet users
who could have been
recruited from a pop-up ad
or an email from some membership listserv
that you might be on.
But they're convenience samples.
But then there are online
pollsters who are actually
still doing offline, scientific,
probability-based recruitment.
So for example, the way we do it
at Pew Research Center is
we take the postal service's
master list of all
addresses in the U.S.,
draw a random national sample,
contact people through the mail,
and recruit them to our online panel.
So you have all of those methodologies.
We still have other
pollsters doing robo-polling,
sort of automated polling
into a lot of landline phones.
So there's quite a mix of methodologies.
And you're right for the average person,
that's all sort of behind the curtain
because news outlets don't
tend to make a big deal
about those details.
Even though to me,
they're very interesting of course.
So to the second part of your question,
there's some research about
whether different approaches
have worked better than others.
And I would say on the whole,
the differences maybe are not as dramatic
as you might think.
In 2016, there were not major
differences in accuracy,
by those different types.
Other researchers have
looked at several of
the recent elections, and if anything,
they found that live phone
polling still performs the best
in terms of election work.
But there are other good
methodologies out there.
Obviously our approach,
where we interview online
but still draw a random national sample,
works pretty well in addition.
- And could you talk a little bit,
you have this fantastic blog
post that we'll point people
to on the Pew Research Center's website,
about what to look out for
when you look at a poll,
or what's going on
behind the numbers.
Could you talk a little
bit about weighting?
- Sure.
So weighting is a really
critical step in polling.
It's something that happens
after the interview is over.
The pollster has their
data with all the responses
that people gave.
And one thing: I've been
in the field for 20 years,
and every public opinion survey I've done
has needed to be weighted.
The reason is some groups in
our country are more likely
to take surveys than others.
It's just a fact and it's pretty reliable.
I mean maybe it won't be shocking,
but older folks,
Caucasians, college grads,
people with higher
levels of formal education.
For whatever reason,
they're more likely to
participate in a survey,
than younger folks,
people with lower levels of
formal education and so on.
And so to deal with that,
what a pollster has to do
is make those groups
representative: basically,
adjust the survey data
with this process called weighting
to make it look like the
portrait of the country
that we have from the Census Bureau
or some other high-quality data source.
So what we do is when
we're done interviewing,
we might have too many college graduates.
We weight them down proportional
to their share of the population.
And then we weight up
people with lower levels
of formal education,
so that they're proportional
to where they should be
in the population.
And we have to do this on
a few different dimensions.
Gender, age, race, ethnicity,
education, geography.
As time goes by, that list gets
a little bit longer,
because we have to do more
to compensate for low response rates.
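To make the adjustment Dr. Kennedy describes concrete,
here is a minimal sketch of weighting on a single dimension.
The population shares, sample counts, and opinion rates below
are invented for illustration; they are not Pew's actual figures.

```python
# Minimal sketch of survey weighting on one dimension (education).
# All numbers are invented for illustration.

population_share = {"college_grad": 0.35, "no_degree": 0.65}

# Suppose college graduates are overrepresented among 1,000 respondents.
sample_counts = {"college_grad": 500, "no_degree": 500}
n = sum(sample_counts.values())

# Weight = population share / sample share, so each group counts
# in proportion to its share of the country, not of the sample.
weights = {
    group: population_share[group] / (count / n)
    for group, count in sample_counts.items()
}
print(weights)  # college grads weighted down (0.7), others up (1.3)

# Weighted estimate of an opinion question: suppose 60% of grads
# and 40% of non-grads say "yes" (made-up numbers).
yes_rate = {"college_grad": 0.60, "no_degree": 0.40}
weighted_yes = sum(
    weights[g] * sample_counts[g] * yes_rate[g] for g in weights
) / n
print(f"weighted % yes: {weighted_yes:.1%}")  # 47.0%, vs. 50% unweighted
```

In practice pollsters adjust on several dimensions at once,
often with iterative procedures such as raking,
but the principle is the same.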
- And so as we're looking at polls,
we know that they can't
be perfectly accurate
and there is this level
of uncertainty to them.
How do you think people
can best use polls
to inform their own decision making?
And how much should they
sort of weigh their knowledge
that some of this is uncertain
as they use it to inform themselves?
- Sure.
I think the best way to
think about polls is:
they're gonna give you
a high-level read of public opinion.
But you really have to keep in mind
that it's plus or minus,
I would say plus or
minus around five points,
maybe six points,
regardless of whether
the press release
might say plus or minus two
or plus or minus three points.
One thing that's just
a truth about polling
is that there's actually more error
than those margins of error indicate.
Because those margins of error statements
just talk about one source
of error in polling.
But there's actually four.
The margin of error only
speaks to sampling error,
which is the fact that we
interviewed, say, a thousand people
instead of everybody in the state
or everybody in the country.
But for the other error sources:
we've got non-response,
not everybody takes our surveys.
Mismeasurement, not
everybody might understand
the question exactly
as we tried to ask it.
And then there can be non-coverage.
In some surveys,
some people didn't have a
chance of being sampled.
So if you factor in all
those error sources,
you really wanna think of polls as useful,
but more like plus or minus five or six.
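For readers curious where a published "plus or minus three points"
comes from, here is a quick sketch of the conventional formula,
which, as Dr. Kennedy notes, covers only the sampling piece.
The poll size is illustrative.

```python
import math

def sampling_moe(p: float, n: int, z: float = 1.96) -> float:
    """95% margin of error from sampling alone, for a proportion p
    estimated from a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

# A 1,000-person poll where 50% support a candidate:
print(f"+/- {sampling_moe(0.5, 1000):.1%}")  # about +/- 3.1 points

# This covers only sampling error; non-response, mismeasurement,
# and non-coverage are why she suggests mentally widening it
# to roughly plus or minus five or six points.
```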
And so I wouldn't encourage
people to think that
they should base their own opinions
or their own actions off
what they see in polls.
Polls I think are useful
'cause they give us a window
into our colleagues,
our American brothers and
sisters across the country.
What are their experiences?
What are their thoughts?
Just to be better informed
about what other people
in the country or maybe around
the world are going through.
- Well thank you so much Courtney,
for taking the time to chat with us today.
- It was my pleasure.
Thank you.
- Well thank you Dr. Kennedy
for your time and insights.
Up next we're gonna hear
from Dr. Jessica Hullman.
Jessica is an associate
professor at Northwestern,
with a joint appointment in
computer science and journalism.
The goal of her research is to
develop user interface tools
and methods to help more people make sense
of complex information.
And in particular to
reason about uncertainty
as they use data.
She is the co-director of the
Midwest Uncertainty Collective
at Northwestern,
which is a cross-institutional
research lab
working at the intersection
of information visualization
and uncertainty communication.
Their mission is to
combat misinterpretations
and overconfidence in data
by developing visual representations
and human-in-the-loop tools
that express uncertainty
and align with how people actually think.
Tonight, she presents a talk titled,
"How to Communicate
Uncertainty in Forecasts."
In this talk she compares visualizations
of past election forecasts
to election forecasts
for the 2020 election cycle.
And shares how uncertainty can
be successfully communicated.
We are certain that you'll
learn something from this talk
by Dr. Hullman.
Enjoy.
- Well thanks so much
for the introduction.
I'm really excited to be here.
My talk's gonna be called,
"How to Communicate
Uncertainty in Forecasts."
So the election forecast wars,
as at least one journalist
has called it, are upon us.
The Economist just
released their first-ever
statistical forecast of the U.S. election.
And FiveThirtyEight did too,
just this summer.
And then a few other forecasts
from political scientists
are sort of trickling
in and you'll see those
around the internet as well.
So today I wanna talk about
how we communicate forecasts
and about why our choices
of how to communicate
uncertainty in forecasts matter a lot.
First a little bit about me.
My background is in visualization,
which is kind of a subfield
of computer science,
where we deal with interactive graphics.
And I like thinking about
how we can sort of optimize
the ways that we show data
often to public audiences.
But I have a particular interest
in some of the statistical
and even philosophical
questions that come up
in communicating about uncertain data.
And my own background, as a result,
has sort of threaded
between the humanities,
reflecting my interest
in how and why we communicate,
and computer science and statistics,
reflecting that I like
to think about solutions.
And when it comes to
determining the best solution
for communicating uncertainty
and model predictions,
I think it's actually pretty tricky.
So first off why visualizations?
Why do they matter?
Well I think as they're used
to kind of inform the public
and support decision making,
they play this important role
in sort of helping people
make decisions about
what they should believe
about the world around them.
So we often care about something like
some true state of the world.
Like what's the state of climate change?
Who's gonna win the election?
And data is a proxy for helping
us answer such questions.
So this is a visualization
from FiveThirtyEight's
forecast this year, showing
predictions of vote share
for both candidates, Biden and Trump.
These are based on poll results
and other types of assumptions,
but the visualization
is intended to capture,
sort of, if the election happened today,
what would we see in
terms of popular vote?
And we might look to this
data then as the best possible
estimate of the true vote share today.
But of course it's still just an estimate.
And most of the data we visualize,
or see visualized in the press,
in government, et cetera,
is subject to uncertainty.
That can be quantified uncertainty,
things like sampling error.
So when we're looking
at data based on polls,
we can't poll the entire
population of US voters.
So instead we have to take samples.
There's some uncertainty from that.
But we also have forms of
unquantified uncertainty
in various types of election
forecasts that we see,
or any types of model predictions.
So for instance,
is Nate Silver right about
how much economic uncertainty
he thinks COVID has contributed,
or is he right about
knowing how to weight prior
election historical sort
of forecast information
or voting outcomes in making
his prediction for today?
And so if we think about
sort of how visualizations
can convey these kinds
of uncertainties to us,
there's a variety of
styles we can use here.
I think we could say this
visualization is doing an okay job
just in that we are actually seeing these
sort of confidence intervals here.
So we know that while Biden
has a 53% predicted vote share,
there's some uncertainty around that.
This could be better,
I'll get into sort of how
it can be better later,
but at least we're seeing uncertainty.
And I wanna just point
out before I jump into
sort of ways to express uncertainty,
that election forecasting,
I think, is a little bit
unique in that it's a place where we see
uncertainty often reported
by default to public audiences.
So readers of news publications
like FiveThirtyEight
or The Economist.
But if you think about
how you see visualizations
many other times, when
you're sort of casually
surfing the web,
looking at government reports, et cetera,
it's often quite striking
how comfortable we seem to be
as a society with not
presenting uncertainty at all.
So we have extremes like the
Congressional Budget Office,
which will put out reports
but not say anything
about the uncertainty
in their estimates,
leaving us with visualizations
that might imply that
they're precise even to the
dollar when they're dealing with
estimates in the trillions.
And so this kind of thing
we see a lot online;
it's what I think is sort of
a grand societal challenge:
how do we get better at
communicating uncertainty,
and, as readers, expecting uncertainty?
But there are a lot of reasons,
I think, why this happens,
why communicating uncertainty is rare.
And many of them deal
with these real challenges
in expressing it in effective ways.
So a lot of people who
are developing models,
making estimates, analysts, et cetera,
and releasing those to the
public are often worried
about things like, well
if I express uncertainty,
it'll burden my readers.
They might not understand it 'cause
they don't have a
background in statistics.
It's maybe not important to their task.
And I think there's also
this sort of perceived norm
on the part of the people
releasing estimates that,
it's kind of not to their
advantage to express uncertainty
because nobody else does.
So they think I'll look less
certain or less sure of myself.
And then there are also
various reasons related to
how it's hard to communicate uncertainty,
or how it's hard to calculate it.
It's also hard to convey,
both visually and by other means.
So a lot of times,
I think authors won't feel comfortable
that they know the best
way to communicate it.
And so by default they leave it off.
So this is problematic.
And it's part of the premise
I want you to keep in mind
when we talk about
election forecasts today.
So one of the reasons I think that people,
model developers, et cetera,
have trouble expressing uncertainty
is that a lot of the visualization approaches
that we typically think of
for conveying uncertainty
are just not so great.
They have issues
that we can talk about.
So I would argue that
we lack generalizable,
widely understandable
visualizations for uncertainty.
And when we look at
sort of the current ways
that we tend to visualize
uncertainty that are not so great,
I would say we could break them down into,
on the one hand things
like confidence intervals,
which we show using error
bars or error envelopes,
like we see here,
which are kind of summarizing
aspects of uncertainty,
or are summarizing really properties of
a probability distribution
using things like
graphical annotations.
Box plots would also fall into this,
if you're familiar with those.
On the other hand,
another way that we often
see use to convey uncertainty
in estimates visually is
using basically a mapping
between something like a probability
and some attribute of a
mark in a visualization.
So here I'm just showing the
legend from an Economist chart
from this year.
But they basically want to show by state,
the probability that one
of the two candidates
will get elected.
And so the darker the color,
the more probable for instance,
that one of these red
states will elect Trump,
or that the blue States will elect Biden.
And so this is sort of the other category
of technique we see.
So I wanna talk first
about some of the issues
with these sort of standard approaches.
And then we'll talk about what's better.
So in visualization
research at a high level,
we have criteria that we
use to talk about when
sort of a visualization is
useful or when it's good.
And one of these criteria
we call expressiveness.
And expressiveness basically means that
when we visualize data,
the sort of spontaneous
interpretation that the viewer
or the reader comes to
about what the data means,
should be correct.
So we don't wanna use
visualizations that make us
think things that aren't
true about the data.
And to give an example here,
I have a bar chart that's
showing us car models,
and the nationality of the maker.
These are two categorical variables,
and yet we're seeing them
mapped as length of bars.
And so this leads us to think things like,
Oh Sweden is somehow more than Germany.
Or the car models from
Sweden are somehow more
than those from other
countries, Germany, et cetera.
And that's not true.
That's not supported in the data.
So this is what a violation
to expressiveness looks like.
And going back to our ways that we
commonly visualize uncertainty,
if we think of summary marks,
things like confidence
envelopes, confidence intervals
expressed as error bars.
They have issues in terms of the fact that
what people think when they see them,
is often not correct about the data.
So for instance when
people look at error bars,
they often think they're
seeing some sort of uniform
probability range where any
value along that error range
is equally likely.
The problem is that error bars
are often used to communicate
different types of uncertainty intervals.
So a standard error interval,
standard deviation interval,
a confidence interval,
and these have different definitions.
And only sometimes is it true,
that it's a uniform probability range.
But of course it looks that way.
So we can't really blame people.
It's a problem really
with the expressiveness
of the visualization.
Another issue we see in a lot of people,
I think the estimates are
around 30% of lay populations,
is this belief that the values
below the top of the bar,
when you see a bar chart with error bars,
are more likely values than
those above the top of the bar.
And again it's sort of a
violation of expressiveness.
It looks like that could be the case.
People think of the bar as
actually showing the data,
but that's not true.
It's not actually supported by the data.
So these are some issues
with the sort of use
of summary marks like intervals.
So we have to be careful
when we interpret them.
On the other hand, when we
use these visual variables,
when we map probability
or confidence to things
like how dark dots are,
or the area of a shape
or how wide a shape is,
or how blurry a mark looks,
we can also run into problems,
but this time with a criterion
that we call effectiveness,
when we're doing visualization research.
Effectiveness basically
means that we want people
to be able to look at the visualization
and read the values accurately.
So I want to for instance,
be able to look at a
visualization and know
like quantitatively how much
more probable is a darker dot
than a lighter dot.
But that's difficult.
So we know from visualization research,
which studies, through
graphical perception experiments,
how well people can read data from
different visual encodings,
that some visual encodings,
or some visual variables as we call them,
are much better than others.
So to give you a sense of this quickly,
imagine I asked you,
how much less blurry is B than A?
That's a pretty hard question to answer.
I could ask you how much
darker is B than A here?
Still pretty difficult.
How much bigger is B than A?
Or how much longer is B than A?
And what you should notice is that
some of these, in particular this length,
are really position encodings.
'Cause we're just comparing positions,
given that the bars are on a common
scale, the judgment is much easier:
it's one out of two here.
Whereas something like
how much blurrier B is, how much bigger,
or how much darker, is much harder.
And so what we wanna do in visualization,
is use the most effective encodings.
But often the problem going back to
a chart like some of these here is that
we've already used position
to show whatever other
data variables we care about.
So in a scatterplot, we
have an X and a Y variable.
By the time we get around
to expressing uncertainty,
we're left with these sort
of harder to read encodings.
Things like capacity, things
like area, things like color.
And so we see this here in the full chart,
I showed you the legend earlier.
This is from the Economist's forecast,
where darker colors mean more probable.
So there's not necessarily
something wrong with this.
It's not illegal to do this.
But the point is that
it's hard, at a glance,
to kind of make an estimate of
how much more probable, say,
New York voting for
Biden is than Pennsylvania.
So given a chart like this,
we really have to rely on
the legend up here carefully
in order to make any
quantitative estimates.
So it's visually not,
we're not doing the work visually.
We're sort of really
relying on the legend.
I think a better way
to use some of these encodings
that look like uncertainty,
like darker is more certain
or less blurry is more certain,
is to use them more for what
we would call ordinal data
or rank data, not necessarily
precise quantitative information.
So here this is a chart
from the Bank of England,
which I think is nice in that,
darker here means newer data,
and newer data is
potentially more relevant
to whatever judgment we're making.
This is about Brexit,
but they're not trying to show precise
quantitative information
by basically plotting
what month it is
to how dark the bar is.
The reader knows that
months are not quantitative.
And so it's sort of a more
appropriate use of darkness.
But yeah, the challenge
is that often the ways
of expressing uncertainty that
most look like uncertainty
are the hardest to read.
So I wanna talk about
techniques that we've found
from research to be better,
and that we'll see in some of
this year's election forecasts.
But before I do that,
I wanna point to another
sort of deeper problem
than just visualization
with how we present
uncertainty and forecast
in model estimates.
And that's the nature
of probability itself.
What does it mean?
So to quote a famous
actuary and statistician,
Bruno de Finetti,
probability does not exist,
and that's the problem with it.
And I think what he meant
here is that probability
is very hard to sort
of objectively define.
And even statisticians argue about
how should we estimate probability?
What does it really mean in the world?
So say I have some
event that's gonna occur
with 30% probability
like Trump being elected.
The problem with understanding
probability is that
we know that in the world,
Trump will either be elected or he won't.
So it's not clear what
to make of this 30%,
what that refers to.
And so facing a probability,
people are often motivated to
simplify their decision making
in various ways.
One way is to round the probability.
'Cause we don't know what to do with it.
And so looking for instance,
at the 2016 sort of top
level forecast presentation
from FiveThirtyEight,
they really relied on win probabilities.
And so I think it's very easy to
think that someone who came
to this page might see,
oh, 71.4% for Hillary Clinton,
and not know what to do with that
other than to perhaps round up,
because we know that
that number's fairly
far above 50%, and similarly
round down for Trump.
And so given probabilities,
we might not actually
incorporate the uncertainty
into our decisions 'cause we're
trying to sort of figure out
what to do with it.
And one trick that can
help based on research,
going back to cognitive
psychology in the 90s,
is to take a probability,
something like 30% and frame
it instead as a frequency.
So rather than 30%, saying
three out of 10 times
Trump might win the election, for instance.
And first I should say, the
research in cognitive psychology
originally showed that when you do this
simple framing trick,
people can do better on
these kinds of classic
Bayesian reasoning tasks.
So, tasks that involve reasoning about
conditional probabilities, which
are often hard for people.
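To illustrate the kind of task she means, here is the classic
textbook example in both framings; the numbers are the standard
teaching illustration, not from the talk.

```python
# A classic Bayesian reasoning task, in two framings.
# Numbers are the standard textbook illustration.

# Probability framing: 1% of people have a condition; the test
# catches 80% of true cases and false-alarms on 9.6% of the rest.
p_cond, p_pos_if_cond, p_pos_if_healthy = 0.01, 0.80, 0.096

# Bayes' rule: P(condition | positive test)
p_pos = p_cond * p_pos_if_cond + (1 - p_cond) * p_pos_if_healthy
print(f"{p_cond * p_pos_if_cond / p_pos:.1%}")  # about 7.8%

# Frequency framing of the same facts, which people find far easier:
# "Of 1,000 people, 10 have the condition and 8 of them test positive.
#  Of the 990 without it, about 95 also test positive."
print(f"{8 / (8 + 95):.1%}")  # 8 of ~103 positives, again about 7.8%
```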
And so I think one of the
reasons we might speculate
that a frequency framing
helps people do better
when reasoning about uncertainty is that,
if you think about sort of
how we experience uncertainty
in our everyday life,
we're thinking about sort of
the probability of some event
or how uncertain that is.
It's often easy to think about
it in terms of frequency.
So say I wanna estimate
my probability of missing the bus,
if I get to the bus stop
at the same time every day.
I might intuitively think,
well, yeah, about four
out of five times a week
I catch the bus, and one out
of five times I miss it.
And so frequency is kind
of this natural sort of way
of thinking about uncertainty.
And interestingly, just this year,
we see some changes in
how the election forecasts
are being presented, and in particular
the use of a frequency framing,
where we didn't really see
this in the 2016 cycle.
So both The Economist and FiveThirtyEight,
Economist on the top,
FiveThirtyEight on the bottom here
are using this frequency framing.
So FiveThirtyEight even
shows a visualization.
So here we're seeing a
hundred hypothetical outcomes,
a hundred hypothetical elections
between Biden and Trump.
And that's used to express
this 30 in 100 or this 30% probability,
as 30 in 100 Trump wins.
And this ball swarm plot was
actually one of the sort of
innovative parts of this
year's election forecast
kind of presentation,
that FiveThirtyEight
really talked about a lot
when they talked about their process.
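Under the hood, a display like this is straightforward to produce:
draw (or subsample) 100 simulated elections from the forecast and
plot each one as a dot. A rough sketch, with an invented forecast
distribution standing in for the real model:

```python
import random

random.seed(0)

# Stand-in forecast: 100 hypothetical elections in which Biden's
# electoral-vote total is roughly normal. The mean and spread are
# invented here, not FiveThirtyEight's actual model.
outcomes = [random.gauss(290, 40) for _ in range(100)]

trump_wins = sum(ev < 270 for ev in outcomes)  # Biden falls short
print(f"{trump_wins} in 100 outcomes: Trump wins")
print(f"{100 - trump_wins} in 100 outcomes: Biden wins")
```

Each simulated outcome becomes one ball in the swarm,
which is what turns an abstract percentage
into a countable set of possibilities.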
And so this kind of framing
trick could be useful.
If you're a skeptic of
course you might say,
well is this really so different
from expressing a probability?
How can it really change things?
And in the case of a
visualization like this ball swarm,
how different is this
from if we hadn't used
this frequency framing?
That's the kind of thing that my research
along with some collaborators
has been looking at
over the past year.
So we wanna know,
how much does it help someone
incorporate uncertainty
into their judgments or their decisions
when we use, say, a frequency
framing visualization?
How do we design frequency
framing visualizations that work?
And so we have to grapple
with some other questions,
like how do we evaluate if
an uncertainty visualization
works at all?
It's not always as sort of
obvious as you might think.
So one problem, when we're trying to know
how well a visualization technique
for uncertainty works,
is that often it's hard to
put our finger on sort of
what the right outcome
or decision should be
from an uncertainty visualization.
So if somebody is looking
at an election forecast,
like FiveThirtyEight's,
what should they ideally do with this
30% chance of Trump winning?
That can be hard to define,
making these experiments
on sort of figuring out the
most effective chart difficult.
There's also this problem where often,
the way we study visualizations,
and which visualization is better,
depends on asking people to read the data.
So extract some probability,
make an estimate.
But one thing we know
from a lot of research
in people's use of uncertainty,
is that often there's no
guarantee people will use
the uncertainty information
just 'cause they can read it.
Often we wanna suppress uncertainty.
So even if I ask you to read the chart,
you might do it correctly,
but then not actually make
a decision any differently.
And finally what's even worse
is that people are often
comfortable with visualizations
that don't show them
uncertainty very well.
And that's because we do
ignore it or we suppress it.
And so by suppressing it, we
make our decisions easier.
And sometimes the visualizations
that make it easier
to ignore are actually
liked more by people,
even though they're
not good for decisions.
So these are all sort of challenges
that we have to grapple with
when we're asking questions like:
how effective is a frequency
framing visualization
like FiveThirtyEight's, really,
in terms of helping people?
But in our lab,
we've done some research
looking at sort of
different ways of expressing uncertainty,
including frequency framing.
So here on the right
I'm showing what we call
a quantile dot plot,
where we're taking a
probability distribution,
in this case showing uncertainty
in predicted bus arrival times.
So imagine I have a bus or
a transit app that gives me
predictions of bus arrival time.
I could think of that as a distribution;
this quantile dot plot is
taking that distribution
and expressing it in terms
of dots representing
different possible bus arrival times.
Similar to the ball swarm that we saw
in FiveThirtyEight's election forecast.
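For the curious, here is a rough sketch of how a quantile dot plot
can be constructed, assuming a simple normal model of arrival time;
the numbers are invented:

```python
from statistics import NormalDist

# Stand-in predictive distribution: the bus arrives N(12, 3) minutes
# from now. A real transit app would supply its own forecast.
pred = NormalDist(mu=12, sigma=3)

# A 20-dot quantile dot plot: one dot per 5% of probability.
dots = [pred.inv_cdf((i + 0.5) / 20) for i in range(20)]

# Each dot is one "possible arrival time"; stacking the dots into
# bins gives the plot its shape, and counting dots answers questions
# like "what's my chance of missing the bus if I arrive at minute 10?"
missed = sum(t < 10 for t in dots)
print(f"{missed} of 20 dots fall before minute 10 -> about {missed / 20:.0%}")
```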
In studies we've tested these a bunch
against a number of other sort of
more common representations,
like a density plot,
different types of intervals or error bars
that we could use.
Even text expressions of uncertainty.
These are all ways that we see
uncertainty being communicated.
And so in trying to answer these questions
about how effective these approaches are,
we often will use multiple methods,
because it's challenging
to say with a single
sort of method
whether an uncertainty
visualization is effective.
We'll do things like first ask people
to do probability estimates.
So in this case we did a big study
with a bunch of bus
riders and we asked them,
what's your chance of missing the bus
if you get to the bus stop in
some certain amount of time?
But we also wanna connect that, again,
to decision making,
because there's no guarantee
that if you can read it,
you can also incorporate it in a decision.
So we also do studies that look at
incentivized decision making.
So in this particular case
with these quantile dot plots,
we did as well a study
where we're having subjects
basically make an
incentivized decision about
when to leave for the bus,
where they get penalized
if they wait at the bus stop too long,
but they get a reward if they
actually catch the bus.
So there's various sort of
approaches that we can combine
in order to get a sense of
how well these things work.
And what we've seen in
this kind of research
with these frequency
framing visualizations,
similar to the ball swarm
that FiveThirtyEight used,
is that, one, they help
people make more consistent
probability estimates.
So people can actually judge
the probabilities better.
And secondly, they lead
to better decisions.
And so what we've found is that,
when you show someone
something like a density plot
here in the middle,
there's more error in any
individual's judgment.
It's almost as though they're
either not sure of what they're seeing,
and so they're not sure
how to interpret it,
or making these sort of area judgments
to estimate how much probability
you have of missing the bus,
et cetera, is difficult.
It's error prone.
For the decisions, we see things like,
people can use or learn how
to use things like intervals,
but they're pretty bad when they start out
and they don't improve all that much.
So we can learn a lot about
sort of effective visualizations
through these studies.
Okay so we can use frequency framing.
That's a good approach to
communicating forecasts.
But there are a few other questions
I think we should talk
about when it comes to
visually communicating forecasts.
And one of those is actually again,
not a visualization question.
So for election forecasts,
we often see one of two things visualized.
So we either see vote share of some type,
which gives us either
the popular vote margin
or the electoral college vote.
The prediction of that proportion
that goes to one candidate
versus the other.
On the other hand,
we also see win probability,
where we're seeing basically a translation
of this vote share into
a probability of winning.
These both have their own pros and cons.
So vote share is easier
to understand often.
Often we'll see the popular vote
rather than electoral votes.
And popular vote is what we
create when we vote ourselves.
So there's kind of a
direct interpretation.
But the problem with
vote share is that it's
a little bit removed
from what we care about.
What we wanna know really is
kind of what is the probability
that our candidate wins?
On the other hand, we can
show people win probability,
which answers that question,
but it's more likely to be misunderstood.
And one of the reasons is people don't
grasp probability well.
Like we've talked about, they round up.
Another reason, as you can
see from these two charts,
is that when we're mapping
something like a vote margin
to a probability, there's
sort of an exaggeration
of the difference.
And a lot of people not surprisingly,
don't understand how this works.
So you can have a big
difference in probability
from a relatively small vote margin.
And so this is something that
researchers have speculated
may have led to potentially
less voter turnout
in the 2016 election, because
a lot of the forecasts,
the creators of the forecasts were leading
with probabilities of winning
and not really emphasizing the vote share.
But this is just a single
study that's looked at this.
There's still a lot to be determined,
but as we can see, at least
from looking at the graphics,
there is a difference here
that we should be aware of.
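The translation between the two can be made concrete with a small
calculation. This is a toy sketch that assumes the forecast error
around the expected vote share is roughly normal; the shares and the
1.5-point spread are invented:

```python
from statistics import NormalDist

def win_probability(mean_share: float, sd: float) -> float:
    """Chance the candidate's vote share ends up above 50%, assuming
    a roughly normal forecast error (an illustrative simplification)."""
    return 1 - NormalDist(mu=mean_share, sigma=sd).cdf(0.50)

# A small vote margin turns into a lopsided-looking probability:
for share in (0.50, 0.51, 0.52, 0.53):
    print(f"{share:.0%} expected share -> "
          f"{win_probability(share, 0.015):.0%} win probability")
```

A one-point shift in expected vote share can move
the stated win probability by fifteen points or more,
which is the exaggeration being described here.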
Excuse me.
So it matters what we visualize,
and how we visualize it.
I think one intriguing possibility
that I wanna talk about
for a few more minutes is to find a way
to visualize something like vote share,
but also get across something
like probability of winning
in the same visualization.
That would give us kind of
the best of both worlds.
And so let's imagine we
have data on vote share,
essentially if I show you,
if I go back to this visualization,
whenever we have something like this,
we have an estimate and an interval
for both Biden and Trump.
We're basically talking about
comparing probability distributions.
We're trying to compare the
estimates with uncertainty
from these two candidates.
(coughing)
And so you could imagine,
here I've just retranslated
this into a bar chart
with error bars:
two distributions.
Often what we wanna do is look
at something like vote share,
or some other measurement,
but answer questions or
intuitively arrive at an estimate
of sort of how reliable
is this difference?
What's the probability
that blue, say Biden here,
is gonna win the election?
And so if we just show probability,
people cannot realize
how close the vote share
can be, et cetera.
But if we show visualizations like this,
that give us both distributions at once,
what we find is that people
will often use heuristics.
So I don't know how to
answer this question
using the graphic I'm given.
And so what I'll do is look at how big
the difference is
between the two averages.
And so given a visualization like this,
I might say, Oh this is a
highly reliable difference.
Anything else with a big
difference is highly reliable.
Whereas anything with a small difference
is not very reliable.
So we call these heuristics,
they're kind of like mental
shortcuts people use.
The problem is that sometimes they fail.
So when we have a small
but reliable difference,
or a large but unreliable difference,
we shouldn't make this kind of estimate.
(coughing)
And one of the problems
with even things like
quantile dot plots or ball swarms,
is that we're seeing two
distributions at once.
And so we can always look to
sort of estimate the average
and ignore the uncertainty
information to some degree.
So one way we can get
around this is actually
showing people outcomes,
but now showing them outcomes over time.
So these are hypothetical outcomes.
We call these hypothetical outcome plots.
What you'll notice is that now
you can intuitively estimate
something like what's the
probability blue will win,
but at the same time,
you're getting the
underlying measurement data,
say vote share in this case.
And so this is a way of
sort of showing uncertainty
that's a lot harder to ignore.
I can't just focus on the average,
because I have to actually
intuitively estimate the average
using the uncertainty.
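In code, a hypothetical outcome plot boils down to an animation loop
over random draws from the forecast distribution. A minimal
text-mode sketch, with made-up numbers:

```python
import random
import time

random.seed(1)

# Invented forecast: blue's vote share is roughly N(52%, 2%).
# Each frame shows one hypothetical outcome; watching frames go by,
# the viewer builds an intuitive sense of how often blue comes out
# ahead, which is an estimate of the win probability.
blue_wins = 0
for frame in range(20):
    b = random.gauss(0.52, 0.02)   # draw one hypothetical outcome
    blue_wins += b > 0.5
    print(f"frame {frame:2d}: blue {b:.1%}  red {1 - b:.1%}")
    time.sleep(0.2)                # in a real HOP, redraw the chart here

print(f"blue ahead in {blue_wins} of 20 frames")
```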
So we don't see these in the
election forecasts this year,
but we've seen them quite
a bit in the past, actually,
used among sort of the
top journalism outlets,
like FiveThirtyEight.
And one of the reasons I think these
animated representations of
uncertainty can be useful,
is that often you have
a complex visualization
or you have some measure
that's hard to simply add
an uncertainty encoding to.
So you can't just put an
error bar on something.
So for instance,
FiveThirtyEight has used
these to show uncertainty
in a ranking.
So in this case who is likely
to be in the GOP debates,
was the question these different outcomes.
And it gives you a sense
of uncertainty in a rank.
Another reason these are appropriate,
in particular for election forecast data,
is that often when we
have a forecast model,
it's a complex model and
it might be capturing
a lot of dependencies or correlations
between things like different states
that it's making predictions for.
Excuse me.
So for the Economist's forecast this year,
one of the things that they highlight
through an interactive visualization
is that a lot of states'
voting behavior is correlated.
So depending on how Idaho votes,
we can expect a similar
vote in some of its
neighboring states, and Idaho
tends to not be very related
to how California votes.
So in a complex model,
these dependencies often matter a lot.
And so it's very
difficult to capture them
in a single static graphic.
This is a problem we face all
the time in visualizations.
On the other hand, if
we can animate outcomes,
we can naturally show
things like correlation.
So these are predicted voting results
from the 2008 election,
and it sort of gives you the idea that
states can move together.
And we can show this in a visualization,
without having to do something like this
interactive thing where you
have to hover over each state
to see each state's information.
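To see why those dependencies matter, here is a toy simulation,
with invented numbers rather than the Economist's model, comparing
two states whose polling errors are independent versus driven by a
shared national error:

```python
import random

random.seed(2)

def double_upset(correlated: bool, trials: int = 100_000) -> float:
    """Toy model: two states each lean 52% toward one candidate.
    Estimate the chance the OTHER candidate wins both, with and
    without a shared national polling error. Invented numbers."""
    upsets = 0
    for _ in range(trials):
        shared = random.gauss(0, 0.02) if correlated else 0.0
        a = 0.52 + shared + random.gauss(0, 0.01)  # state A's vote share
        b = 0.52 + shared + random.gauss(0, 0.01)  # state B's vote share
        upsets += (a < 0.5) and (b < 0.5)
    return upsets / trials

print(f"independent errors: {double_upset(False):.2%} double upset")
print(f"shared error:       {double_upset(True):.2%} double upset")
# The double upset is far more likely with a shared error, which is
# why a model that ignores correlation will look overconfident.
```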
Okay, finally, animating uncertainty
makes it a lot more visceral.
It's harder to ignore.
And you might've noticed that, just as
I've shown these visualizations,
that can be a little bit overwhelming.
Given our innate tendencies
to try to ignore uncertainty,
to see a visualization that forces us
to contend with it is difficult.
And so you might recall,
as an example of this,
the New York Times needle
from the 2016 election.
This was rolled out on election night,
and it's combining a static
display of uncertainty,
so you can see kind of
in the background here,
you have the static
shading-based visualization
of uncertainty in vote margin.
But then it's also animating outcomes.
And new data was coming
in during election night.
And so the needle was moving with some
randomized jitter
within a confidence interval,
but also new data was coming
in, which was updating it.
So people really railed
against this visualization.
I think, personally,
one of the issues was not that
this was not an effective visualization:
it showed uncertainty,
and it made it very visceral,
and it was very hard
for people to ignore it.
I think the real problem
was that this wasn't introduced
until election night.
And so if we'd seen something like this
leading up to the election,
I think perhaps we would have
been more used to uncertainty
and less surprised
when the visualization
showed us something that
we hadn't expected to see.
So it's an example, though, of how
making uncertainty visceral
and hard to ignore
can lead to some reader discomfort.
Maybe visualizations should
induce discomfort
proportional to uncertainty.
That's kind of my view,
but at the same time we
don't wanna alienate readers.
And so the note to forecasters would be:
if you're gonna do something like this,
take into account how
you might give the reader
some control back,
make it interactive,
so they can pause it, et cetera.
Okay finally,
I want to just close by commenting briefly
on forms of uncertainty
that I haven't discussed,
but which are really critical
to communicating forecasts.
And these are the uncertainties
stemming from our inability
as model developers to
quantify how good our model is
in various ways.
So all models make assumptions.
As a famous statistician said,
all models are wrong, but some are useful.
And a model is only as
good as its assumptions.
And so this doesn't mean that forecasts
can't help us make better decisions.
But we have to be aware of
this dependency on assumptions.
And we have to be aware
that there's uncertainty,
that we can't always see visually
because it's not quantifiable.
And so how forecasters
should express this
is really an open question.
We don't know a lot about how
to express uncertainty well in text.
When it comes to election models
like FiveThirtyEight's or The Economist's,
we wanna express uncertainty
related to things like:
how much should we trust the poll data?
How much should we trust
our assumptions about
uncertainty from COVID?
It's not clear how to do that.
Something that
FiveThirtyEight did this year,
which I think is really
interesting and innovative,
is introduce this Fivey
Fox cartoon character,
who, basically, yeah,
you're gonna see him around
all the forecast graphics
that they're showing.
And he's giving you advice,
often about how to keep in mind
the uncertainty that's sort of lurking,
that maybe is based on the things
that we can't quantify.
So sometimes he gives advice
on how to read the chart,
but often it's emphasizing things like,
upset wins are still possible,
even though the trend appears to be
in the opposite direction.
So I think this is
a really nice attempt to communicate
some of this other uncertainty.
We still maybe have a long way to go.
What I'd really like to
see is forecasters really
talking about model assumptions more,
in a sort of accessible way.
I think as readers,
what we need is to develop
literacy around these things.
But steps like Fivey Fox,
steps like frequency
framing of uncertainty,
are really showing us progress I think.
And so as readers,
we need to keep expecting
to see this kind of thing,
and trying to get better
at reading it ourselves.
So I'm gonna conclude there.
It's been great to share
these thoughts with you.
You can find me online if
you're interested in more.
Thank you.
- Well thank you Dr. Hullman,
for your time and your research.
That wraps up tonight's program.
So thank you for watching and
continuing to learn with us.
Your support is invaluable.
We hope you'll join us next week,
as we take a closer look at
how we exercise our choice
through voting.
We'll see you at the
next After Dark Online.
(upbeat music)
