- Hello, I'm Jennifer Chayes.
I'm the associate provost
of the Berkeley Division
of Computing, Data Science, and Society
and the dean of the School of Information.
On behalf of the division
and of the Berkeley
Institute for Data Science
at UC Berkeley of course,
we welcome you to this
conversation on COVID-19.
This is the second in a series of seminars
in which we look at how Berkeley researchers
are using computing and data science
to inform pandemic response and recovery.
The focus of today's conversation
is how can we understand and seek equity
as we face COVID-19.
Emerging data show that the pandemic is
amplifying socioeconomic disparities and inequities.
Here, we'll look at how choices about data sources,
research methods and technologies,
and also how approaches to building relationships
with marginalized communities,
shape our understanding of and approach
to addressing these problems.
We'll hear from three
fantastic researchers.
First, Ziad Obermeyer from
the School of Public Health.
Second, Niloufar Salehi,
from the School of
Information and the Department
of Electrical Engineering
and Computer Science.
And third, Sarah Vaughn from
the Department of Anthropology.
So we're gonna get
started first with Ziad.
- Thank you so much, Jennifer.
So I'm on faculty at the
School of Public Health,
I actually trained as a doctor.
But most of my work these
days is building algorithms.
And the goal of that work is basically
to help people working in
health make tough decisions.
So that's doctors working in the clinic,
it's policymakers who
are allocating resources.
And I'm actually very
optimistic about that work
and about the role that algorithms
are gonna play in health.
But that's one reason
that I also spend a lot
of time worrying about
what can go wrong when
we apply algorithms to health.
So I wanna start by telling you a story.
And that story is about an algorithm
that was made by people
with really good intentions
who were trying to do
something really important
but that algorithm ended
up hurting a lot of people,
most of them black and poor.
So we're gonna post in
the live chat a link
to the paper if you wanna read more.
I'll summarize it but also try to summarize the lessons
that I took from writing that paper
about how bias gets into data and then into algorithms.
And that actually really helped me to make sense
of COVID, of where data and algorithms could help
and where that might go very wrong.
So that's where I'll end up.
But first, I'll tell you the story.
And that story starts 10 years ago,
when health systems
first started to realize
that they couldn't just keep
doing things the old way.
So the old way was sitting around, waiting
for people to get sick,
and then treating them,
when they walked into the hospital.
And that game, which you can think of as
basically waiting around for people
to have heart attacks and other health events,
is not good for patients.
And it's also very expensive
for our health care system.
So all these hospitals started
to see the writing on the wall
and realized that that old way had to go.
So the new way was to get much more proactive,
to get ahead of illness, catch it before it happened.
And to do so they started rolling out a bunch of programs
to get help early to the patients who needed it the most.
So that's like a dedicated phone line
that patients could call
whenever they needed help,
a team of nurses, extra
primary care appointments,
home visits, medication refills, whatever people need.
And that works.
So patients love it, the outcomes get better,
the costs get lower.
But there's a catch,
all of that extra help itself costs money
so we can't do it for everyone.
And that's where algorithms come in.
So algorithms are doing the
job of finding the people
who today look like they're gonna be okay
but tomorrow are gonna get really sick.
And then we can zoom in on those people
and give them all the
extra help that they need.
So that algorithmic approach to targeting
was very popular, and almost every healthcare
system in the country is using algorithms
to do exactly this job.
And so as one statistic, the algorithm
that we studied in this paper,
that one alone was being used
to screen about 70 million people a year.
And the family of algorithms like it
screens about 150 to 200 million people every year.
And I think it's important to
call out this is a good thing.
This is exactly the kind of thing
we want the healthcare system to be doing.
To be looking ahead,
not waiting around for illness to happen,
predicting things that humans can't
and helping those who need it the most.
So I hope that sounds good to you too
because I think that's the vision.
But there was one
problem with this vision.
So I told you that the
algorithm was supposed
to be finding people whose
health is gonna get worse.
And when I said that, you
probably know what I mean
but an algorithm does not.
An algorithm needs to be told about
a specific variable in a specific data set.
And so a really important question to ask is:
what is that variable?
And how did the people who made the algorithm
translate this messy concept of health and deterioration
into a very specific variable?
Now, I can't see you because we're on Zoom.
But I've taught enough classes to know
that this is about the
time of the lecture,
when people start getting a little bored.
This feels like kind of a technicality,
it's a little tedious,
I understand I got it.
But here's why it's not.
So it turns out that in
all of the health datasets
that we use there's no
variable called health.
There is a variable called health care costs.
And when they trained the algorithm, that's what they used.
The algorithm was finding people who were okay today
but who were going to generate a lot of costs next year,
as a proxy for who was gonna need health care
over that next year.
But there is a world of difference
between needing health care
and getting health care.
And that's especially true
if you're black in America.
So take two people with the same number
of chronic illnesses, the
same illness severity,
however you wanna measure illness.
One patient's black, the other is white
that black patient is on
average gonna cost far less.
In our data set it was
about $1,000 a year.
And part of that is because she is poor
and she has less access
to the healthcare system.
Part of that is because
our health care system,
treats her differently because
of the color of her skin.
Now think about what
that difference means for the algorithm.
We thought we told the algorithm
to find people who are gonna get sick.
What we actually told the algorithm was
to find people who are
gonna cost a lot of money.
So you can imagine what that
looks like to the algorithm.
So everyone is waiting in line to get into these programs.
Just picture them: they're ranked, they're lined up in order
of how much they're gonna cost next year.
And the handful of patients at the front of that line,
they're gonna get the extra help,
and everyone else is not.
Now they're ranked from
most to least expensive,
not from sickest to healthiest.
And that means that a
bunch of healthier patients
who happened to be more
likely to be white,
get to cut in line ahead
of sicker patients who happen to be black.
So a 60 year-old tennis player
who's waiting for his knee replacement is going
to cut in line ahead of someone
who has to skip doses of his insulin.
When we use cost as a proxy for health,
we're confusing healthcare
costs with healthcare needs.
And that disproportionately
hurts black people.
And this was not subtle.
When we estimated
what an unbiased
algorithm would look like,
it would have more than
doubled the fraction
of black patients at the front
of the line getting extra help.
So when you scale that
up by 10s of millions
of patients per year,
that's a lot of pain caused
by choosing the wrong variable to predict.
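To make that label-choice point concrete, here's a minimal illustrative sketch on synthetic data with made-up variables, not the actual algorithm or dataset from the paper: two models see identical features, one is trained on cost and one on a direct measure of need, and they put very different people at the front of the line.

```python
# Illustrative sketch only: synthetic data and made-up variables,
# not the algorithm or dataset from the paper. It shows how swapping
# the label (cost vs. a direct measure of need) changes who a
# "high-risk" model puts at the front of the line.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical features: illness burden and access to care (0 = low, 1 = high).
chronic = rng.poisson(2.0, n)
access = rng.binomial(1, 0.5, n)

# Assumed data-generating story: need tracks illness burden,
# but realized cost also depends on access to the healthcare system.
need = chronic + rng.normal(0, 0.5, n)                # what we wish we could target
cost = 1000 * chronic * (0.5 + 0.5 * access) + rng.normal(0, 500, n)

X = np.column_stack([chronic, access])
cost_model = LinearRegression().fit(X, cost)          # proxy label: dollars spent
need_model = LinearRegression().fit(X, need)          # direct label: health need

# Who lands in the top 5% under each label?
top = int(0.05 * n)
by_cost = np.argsort(-cost_model.predict(X))[:top]
by_need = np.argsort(-need_model.predict(X))[:top]

print("low-access share flagged, cost label:", (access[by_cost] == 0).mean())
print("low-access share flagged, need label:", (access[by_need] == 0).mean())
```

On this toy data, the cost-labeled model flags far fewer low-access people even though their underlying need is the same, which is the mechanism described above.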
So there are a couple of lessons here
that I think are relevant to
how at least I thought about COVID.
The first lesson is to be careful
what you ask an algorithm to do; it's going to do it.
If you ask it to predict healthcare costs,
that's what it'll do.
It won't pause to ask if cost is biased
because of historical disparities,
it won't point out that black patients are underrepresented
in the highest-risk groups. That's on you.
The second lesson is, I guess
technically about algorithms.
But you can also find this lesson,
in a lot of self-help books.
Most of the things we
really care about in health,
in society, in life, we don't measure.
Instead we have proxy measures.
And proxy measures can be very dangerous.
How do I measure how my career is going?
Is it my salary, my citation count?
Is my fulfillment in life about
how many Instagram followers I have?
Hopefully not 'cause I think
I literally only have one.
But all that is to say that
algorithms are not the only ones
that can be distorted by bad proxies.
So here's what that might mean for COVID.
There's a lot of interest in getting ahead
of the epidemic, predicting which geographies,
which communities are gonna be hardest hit.
And that's good, we do need to target our resources
to where they matter the most.
But what exactly are we predicting?
Usually it's the number
of COVID cases or deaths.
But a COVID case means testing positive for COVID.
And testing means access to health care.
And if you never get tested,
we'll never know that you had COVID.
We often think that we're predicting
a biological variable, COVID; it's a virus, we measure it
with PCR, it sounds like science.
But what we're actually predicting is a proxy measure
that's just as much a social variable as a biological one.
And that's why that can be so misleading,
especially when you
forget that it's a proxy.
Here's another one.
We dread rationing health care
and yet there might be a
shortage of ventilators.
So who gets the ventilator?
We often hear that
sicker patients do worse.
So the received wisdom is,
give the ventilator to the healthier patient.
But that optimizes the wrong quantity.
We wanna give ventilators where they do the most good,
not to the patients who do the best.
Healthy patients do better
with the ventilator,
they'll also do better
without the ventilator.
So the key quantity isn't the outcome.
It's the difference in outcome
between the treated state
of the world with the ventilator
and the untreated state
of the world without it.
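As a tiny worked example of that point, with made-up survival numbers rather than real clinical data: the healthier patient has the better outcome either way, but the sicker patient gains more from the ventilator.

```python
# Made-up numbers purely for illustration; the point is the arithmetic.
patients = {
    # name: (P(survive with ventilator), P(survive without ventilator))
    "healthier patient": (0.95, 0.90),
    "sicker patient": (0.60, 0.30),
}

for name, (with_vent, without_vent) in patients.items():
    benefit = with_vent - without_vent  # the key quantity: difference in outcomes
    print(f"{name}: with {with_vent:.2f}, without {without_vent:.2f}, benefit {benefit:.2f}")

# Healthier patient: benefit 0.05. Sicker patient: benefit 0.30.
# Ranking by outcome favors the healthier patient; ranking by benefit favors the sicker one.
```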
Giving the ventilator to healthier patients
is just gonna punish sick patients,
who are disproportionately poor, black and disadvantaged,
even though it might do them the most good.
So I'll wrap up by just saying that all
of these pitfalls are not unexpected.
This is a new field, we're all just figuring this out,
and none of this has really been mapped out.
But they can be mapped out and
these pitfalls can be fixed
and that's where the
story I told you ends.
When we first saw the extent of bias in
that algorithm I told you about,
we reached out to the company
that made the algorithm
and we told them and they
were incredibly responsive.
They worked with us to
retrain their algorithm,
same data set, same everything else,
just predicting health instead of cost.
And together we made a revised
version of the algorithm
that was over 80% less
biased than the original.
We'll post an article that describes that work as well in the live chat.
But we took advantage of this to turn it
into a larger pro bono consulting project
where we're working with health systems, with insurers,
with state and federal regulators
to understand and remove bias from health algorithms.
And I think all that just
illustrates both the danger
and the promise in this area.
On the one hand, these
small technical choices,
can dramatically scale up
existing bias in our data
and harm lots and lots of people.
On the other hand, correcting
these small technical choices,
can overnight make a huge difference.
- Thank you so much Ziad
that was really, really fascinating.
Let me ask you a couple of questions.
You were quoted in the LA
Times at about the time
that that article came out.
And the quote of yours was
we shouldn't be blaming the algorithm.
You said we should be blaming ourselves
because the algorithm is just learning,
from the data we give it.
And, as you've also just said,
from the proxies we give it.
Does that statement hold for any
of the COVID-19 predictive models that you've seen
and the algorithms of which you're aware?
- Yeah, I think some of the factors that I mentioned
will definitely raise some red flags for me.
So often, we're predicting confirmed cases.
We are putting a lot of stock in that
as our barometer of how the epidemic is going.
And we know at the same time
that a lot of people just
lack access to health care
that's gonna mean that
precisely those communities
that are likely to suffer
the most from COVID,
are the least likely to be tested.
And so I think that anytime
that we're measuring or predicting one of these variables
that's a hybrid of a social construct,
like access to health care, with a biological one,
do you have this virus in your bloodstream,
those kinds of errors are gonna happen.
I will say that this is what makes the work of a lot
of people, including some of my colleagues in public health,
so important: to do population-based surveys.
So I think that getting an actively collected set of data
about the prevalence of COVID
in different places in
different communities,
is really, really important.
And it is a really vital
complement to the data
that we're getting from
the healthcare system
that builds in all of these biases.
- So a second version of this question.
As I started to hear about
the huge disparities in mortality,
I mean, just so different from the population as a whole.
And this is presumably being caused
by comorbidities, by other conditions,
and for people who have those conditions,
those may in fact be exacerbated
by the kinds of effects that you measure.
How do you think your insights might help
to prevent these socio-economic disparities,
which become even more manifest
in a health crisis like COVID-19?
- Yeah, when we see these statistics,
like in black communities the mortality rate is twice as high,
I find those statistics so hard to interpret
because on the one hand, I
think it's pretty obvious
that black communities
are gonna suffer more
because of all the factors you mentioned.
Because chronic illness
is so much more widespread
because access to health care is less
because there are fewer ventilators,
serving those communities effectively.
On the other hand, that 2x number could be much, much worse.
Because we're only measuring the tip
of the iceberg of the actual COVID cases.
And we don't have the data infrastructure
that lets us track what's
actually happening.
So, in a lot of places, for example, you can see
that since the COVID epidemic started,
there are these spikes in mortality
in confirmed COVID cases.
But there are also these huge,
and in many cases even bigger, spikes in mortality
in other people who are not confirmed COVID cases.
And so that could be undiagnosed COVID,
it could be that surgeries
and other essential
procedures are being canceled
so that health systems can
prepare for an influx of COVID.
And so I think that those numbers,
are just incredibly hard to interpret.
But all of the signs are pointing us
to evidence that these
communities are hit much harder.
So I think that when we
can use those numbers
to target resources, that's very useful.
I think that it does bring
up some other questions
and I think a lot of the
discussions we've had,
as I mentioned are about
rationing healthcare
and so who gets extra
care when care is limited.
And I think that the current heuristics that we're using,
where healthier people should get care ahead
of sick people, actually risk
reinforcing all of these disparities,
even when those sick people actually need the care more
and would benefit from it more.
It would be socially more efficient
to deploy that care to sicker people.
- Thank you so much Ziad
this has been fascinating.
- Thank you.
- We now turn to Niloufar Salehi.
And I think you have a
presentation for us as well, yes?
- Yes, thank you Jennifer so much.
And thank you Ziad that was fascinating.
So hello.
I wanted to start today by
just acknowledging my privilege
to be able to be here and to speak
at this very difficult
time of a deadly pandemic.
And also to use it as
an opportunity for us
to think more deeply about
who gets to talk about data.
Similar to Ziad, I'm
gonna address four points
and I do have some resources
to share as well at the end.
So I wanted to start
by talking about some of my own past work.
A lot of my work has been
around how marginalized
or disadvantaged groups
make use of social systems.
And as a computer scientist
and as a designer,
I'm thinking about how technological
systems can be designed
to better meet human needs.
So for instance, some of my work has looked
at and studied how Muslim Americans gather
and share information online related to elections.
So I found that activists,
journalists, civil rights groups,
use social media to
craft counter-narratives
to the dominant narratives
that exist that target and demonize them.
In other work my group has studied
how young women have reconfigured
the Instagram platform
to create more intimate
social spaces online.
So these are spaces
that they call finstas,
which is fake plus Instagram
and they are private accounts
that people only share
with their close friends.
They are spaces of mutual support
and they push back on some expectations
of perfection for young
women and they even poke fun
at those expectations
with hashtag ugly selfies.
And finally, I've worked on
how online social systems,
can be designed to support
collective action and mutual aid.
This is work that seems
particularly timely at the moment.
I did this work with Amazon
Mechanical Turk workers
who are people who are
dispersed around the world
and they complete short tasks online.
So this is work that looks like content moderation work
or data labeling work.
And unfortunately the people
who do this very important
work, receive very low wages
and work under very difficult conditions.
So I've done work together with Turkers,
as they call themselves.
I did over a year of
ethnographic fieldwork,
in a process of co-design
with these people
to create a system that we call Dynamo.
And this was a system that
was specifically designed
for people to raise ideas
about collective action,
issues that they share together,
discuss, agree on them and take action.
So as one example,
Turkers used our platform
to co-write a set of ethical guidelines
for researchers who use Mechanical Turk.
They published those guidelines online
and it's been currently used
by a number of researchers and IRBs,
which are our university's
research ethics review boards
for research that uses Mechanical Turk.
So that was some of my previous work
and I wanted to use
that to talk more about
how my group has accounted for health
and socio-economic
disparities in this research.
So I wanna talk about a
couple of different questions.
One question is, where do we look for data?
And sometimes the question is even
what data do we omit and not look at on purpose.
And working with marginalized
or at-risk people requires a lot of care,
as Ziad talked about,
in choosing what those research methods and data sources are.
For instance, in our work on finstas, the fake Instagrams,
we purposely didn't ask for access to the actual content
of the finstas, the actual pictures that people posted.
And the main reason for doing this was that finsta photos
might include compromising photos of minors.
So even though it would have
been great for the research
we instead opted to
interview finsta users.
So for some of our other work,
including our research
with Muslim Americans,
we did a lot of work
to build relationships
and trust with community leaders,
before we even started to gather data.
And this is to ensure
that the research serves their interests
and they are partners in the work,
not just research subjects.
And finally and particularly
relevant to COVID-19 is,
thinking about how we work with people
who are of lower socioeconomic status
and making sure that our research doesn't
actually add a burden
to what they're already dealing with.
And I had some experience
doing that with my work
with Amazon Mechanical Turk workers.
where I was working with people for whom
time that they were sitting at a computer
was time that they needed
to be making money to pay their bills.
So I factored that into my research approach.
I tried to go where they were,
I spent a lot of mornings waking up,
logging into Turker chat rooms, announcing
that I was there and then sitting there
and reading their
discussions and jumping in
if something relevant
to my research happened
and asking questions.
And second, that factored into
how we actually designed
the system Dynamo.
So we made it such that people could
use the system in really
short bursts of time
that they had between the tasks
that they were already completing.
So in a way we made the barrier to entry
and action low for that reason.
So those were some of the ways that I've accounted
for different kinds of disparities
in my own research approach.
And I wanted to end by talking about
how we can best leverage computing
and data science in this kind of research.
So computing and data science,
I think can help in two ways.
They can help us identify the problem
and they can help us identify
and design a solution.
And as Ziad mentioned, unfortunately
but predictably, COVID-19 has had drastically harsher impacts
on marginalized and low-income communities.
And my past work, and being in the School of Information,
has led us to study how marginalized groups
are accessing information about COVID-19.
And the impact that
potentially wrong information
or misinformation has on them
and how those effects can be mitigated.
So we're looking at
COVID-19 misinformation
and we're particularly looking
as it circulates in private groups,
like family chat groups
where trust is very high.
And we also have very limited
means of accessing the data.
So these are often private encrypted groups,
somewhat similar to finstas,
and we have to get very creative in how we gather the data.
Some approaches that we are considering are,
for instance, a bot that people could voluntarily add
to their messaging groups that automatically detects
and shares information if it detects misinformation.
Although we have to be
very sensitive in how
that is designed and how
it actually does that.
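As a way to picture the kind of bot being described, here is a highly simplified, purely hypothetical sketch, not a system the group has built: it checks an incoming group message against a small list of already debunked claims and, if one matches, returns a correction to share.

```python
# Purely hypothetical sketch of the bot idea described above, not an actual system.
# It assumes the messaging platform allows an invited bot to read messages and reply.
from dataclasses import dataclass
from typing import Optional

@dataclass
class FactCheck:
    keywords: tuple        # phrases associated with a debunked claim (all must appear)
    correction: str        # the correction or fact-check text the bot would share

# Example entries only; a real deployment would draw on vetted fact-checking sources.
FACT_CHECKS = [
    FactCheck(("hot water", "kills the virus"),
              "Drinking hot water does not cure or prevent COVID-19."),
    FactCheck(("5g", "causes covid"),
              "There is no link between 5G networks and COVID-19."),
]

def check_message(text: str) -> Optional[str]:
    """Return a correction if the message matches a known debunked claim, else None."""
    lowered = text.lower()
    for fc in FACT_CHECKS:
        if all(keyword in lowered for keyword in fc.keywords):
            return fc.correction
    return None

if __name__ == "__main__":
    print(check_message("I heard drinking hot water kills the virus!"))
```

Even a sketch this simple surfaces the design sensitivities mentioned above: what counts as misinformation, who curates the list, and how the bot intervenes without eroding the trust that makes these private groups valuable.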
And data can also be used
to identify solutions.
So some of our preliminary work,
around misinformation suggests
that interventions by community leaders,
can be effective at
countering misinformation.
So we can test that hypothesis
by analyzing past data
to measure the effects
that these different kinds
of interventions have.
I wanted to wrap up by
sharing some resources
that we're using and building
on in our own research
that I think could be useful
for other people working
on these problems as well
so I'm just gonna share my screen.
So the first resource is the
Feminist Data Manifest-No.
This was created by the feminist data workshop,
and I was a member of that.
We co-authored a set of basically refusals and commitments
that refuse harmful data regimes
and at the same time commit to new data futures.
I pulled out three sections
from the manifest-no
that I thought were particularly
relevant to COVID-19 technologies.
I'm just going to read over them
but there's a lot more
in the actual manifest-no
that I suggest you look
at if you're interested.
So the first one is, we
refuse the assumption
that risk and harm associated
with data practices,
can be bounded to mean the same thing
for everyone everywhere at every time.
And instead we commit to acknowledging
how historical and systemic
patterns of violence
and exploitation produce
differential vulnerabilities.
Second, we refuse the use of data,
about people in perpetuity.
We commit to embracing agency
and working with intentionality,
preparing bodies of data to rest,
when they are not used in service
of the people about
whom they were created.
And third, we refuse work
about minoritized people.
We commit to mobilizing data
so that we are working with
and for minoritized people in ways
that are consensual, reciprocal
and that understand data
is always co-constituted.
The second resource is an amazing playbook,
the Digital Defense Playbook, made by Our Data Bodies,
which is an organization that works
with black communities around the country.
A lot of their work has been around,
surveillance and community safety.
I suggest this for anyone thinking about
surveillance and privacy around COVID-19.
These people have a lot of experience
actually working with that in practice.
And the third resource is,
from the Bay Area Transformative
Justice Collective.
It's the Pods and Pod Mapping Worksheet.
So transformative justice is a view of justice
that pushes us to think beyond
just a single instance of harm or single instances of harm,
to think about the underlying societal
and social conditions that enable harm,
and how communities might be transformed
to change those underlying conditions.
Pods and Pod Mapping
is a specific practice
that they do in these
transformative justice collectives
to think about who are the people
who you trust and can rely on.
And even though this was developed,
in the context of child sexual abuse,
there's been a lot of talk about
how practices such as pod mapping,
can actually be used
for COVID-19 mutual aid.
And so these are three resources
that I think are very
useful for my own thinking.
And I think for other people who are thinking about
technology in the context of COVID-19,
and particularly how it impacts people differently.
- Great, thank you so
much Niloufar Salehi,
it has been fascinating.
I have a couple of questions for you.
First, what are some of the dominant narratives
about COVID-19 of which you've become aware
that we need to counter, especially those
about marginalized populations?
- So I think one that is very dangerous is
that black and Hispanic people
are dying at record numbers
because of their own risky behavior.
And there's actually a long history
of attributing systemic or societal issues,
what Ruha Benjamin calls pre-existing social conditions,
to the individual person's shortcomings
or even to their moral character.
And as researchers, as a researcher myself,
wanting to strive to actually do good work
and do work that is in the service of equity,
I think that we have to be very self-aware
about the role that research has actually played
in creating those narratives.
Research on black people,
especially research conducted
by researchers who are not members of those communities
or engaged with them in a meaningful way,
has frequently mischaracterized,
for instance, tightly-knit black communities
as cliques or gangs,
or has misunderstood actions that people are taking
for their survival as risky behavior.
So that's something to be very careful about.
(electronic noise drowns out speaker)
- So I'm wondering what
else in your research
and for other people doing research
what are some of the data
and methods of analysis
that you think can be employed
to help counter these theories?
- That's a really good question.
I (mumbles) fall back to another dominant narrative
that I think is very harmful at this time.
And I think that's the false dichotomy
where we're stuck between surveillance or freedom,
security or liberty, national security or human rights.
And we're stuck as if we have to choose one of these
and we can only have one.
And feminism actually teaches us to question
and rethink these binaries: do they actually exist?
And this is one of the places
where I think we can look
we have a lot to learn from
people who have been dealing
with surveillance much more harshly
and for a much longer time.
And so I'm gonna refer actually back
to the digital defense playbook
that I briefly mentioned,
one of the things I've learned from that is this mental shift
that we need, from thinking about
what our communities need as security
to thinking about what we actually need, which is safety.
And the playbook does a really good job
of explaining why these two are different.
So security is when we do two things:
we secure our possessions, we secure our communications.
And usually when we're
thinking about security,
we're not really centering human factors.
So an alarm system doesn't
by itself ensure that
we are actually protected
from actual harm.
Or for instance, a police body camera doesn't
by itself prevent police brutality.
So even though it might secure things,
it doesn't necessarily mean
that we are more safe because of them.
So thinking about and
centering safety instead,
makes us ask what actually makes us safe.
And thinking about that in the context of COVID-19,
tracking and contact tracing,
the things that actually make us safe are having knowledge,
having reliable information,
having relationships of trust with other people,
online and offline.
So these are some of
the places where I think
we have a lot to learn
from marginalize people.
These are very hard times
and marginalized people have been dealing
with difficult conditions
for most of their lives.
So people of color have been very resourceful
in maintaining tightly-knit communities of mutual aid.
People of color have been dealing with
and pushing back against surveillance,
while at the same time having to learn
how to make their communities safe.
And LGBTQ people and disabled people,
basically invented what
building online community is.
And so that is something that
we need more and more today.
So thinking about and centering
safety rather than security,
not only moves us beyond
this false dichotomy
and getting stuck between
having to choose whether
we want our health or our privacy
but it also helps us open
up our imaginations for
what is possible and what kind
of futures we actually want to build.
And I think that learning
from these lessons,
actually totally shifts our
model and how we ask questions
and what technologies we
actually end up designing.
- Thank you. Your perspective and Ziad's perspective,
as data science and public health researchers,
I think really add to this conversation.
Finally, now I want to
turn to Sarah Vaughn
who will give us yet another perspective,
which is that of anthropology
and qualitative data for COVID-19.
So Sarah,
please tell us about your work.
- All right, thank you.
Thank you Ziad and Niloufar for those fascinating presentations.
And especially this last point that Niloufar brought up,
about this idea that one should refuse or try
to think more critically about the ways in which
at-risk communities understand their historical
and social relationships to data, is quite important.
And one way in which I want to sort
of come into this broader conversation about COVID-19,
at least in relationship to broader issues,
is particularly through my research on climate adaptation.
So I'm an anthropologist here;
I'm in the Department of Anthropology at Berkeley.
And most of my research has been on issues related
to thinking about civil engineering
and the adaptation of critical infrastructures,
specifically dams, sea defences
and other water works in the Caribbean.
And so part of those issues,
when thinking about climate adaptation,
is to take seriously that those experts and other communities
living next to these critical infrastructures
recognize that those systems need to be improved,
repaired in many cases,
but at the same time that disaster itself will
still be a potential future
which they'll have to live with.
So this idea that we
live in a kind of context
where we recognize that disasters
are a potential reality,
even as we care for the very environments,
infrastructures that we live with is part
of the issue raised around
questions of climate adaptation.
And so part of why I think climate adaptation
is such an interesting case by which
to think, in a comparative way,
about questions of COVID-19 is precisely
because of this issue around living with uncertainty
and the ways in which one can even move past,
as we've already discussed, questions
of security to questions of vigilance.
And so what's really interesting about
doing ethnographic
research on such issues,
is that ethnography as a method allows one
to think in broad sort
of comparative cases,
across historical and cultural context.
And so working in the Caribbean,
which for many people, at least from a Euro-American
perspective, is one of beaches, sun, tourism, vacation,
is also a site that, not surprisingly,
because of many islands' and coastal nations' geographies,
is at a kind of pivotal point,
a kind of canary in the coal mine, for thinking about
how climate change affects everyday life.
And so, specifically, my research has looked at Guyana
and its historical relationship to engineering sciences,
as well as how ordinary citizens themselves live
with the adaptation projects happening on the ground,
through the work of NGOs, state engineers
as well as international consultants working
to rehabilitate a dam system
that has caused tremendous flooding
in recent years in Guyana.
And so, one of the issues that has come out
of rehabilitating this dam system
is broader questions about health care,
and specifically how it is that engineers themselves think
through managing and adapting the system
to more frequent rain events and sea level rise,
as people on the ground also try
to understand their newfound vulnerability
to waterborne and other kinds of bacterial infections.
And so while I did my research in Guyana,
a lot of my work wasn't just with these engineers doing
their modeling practices as well as dam designing
but also with local humanitarian organizations,
specifically the Guyana Red Cross,
which would go into local communities
and ask people about how they understood
the different kinds of improvements that were happening,
these being to the dam and canal system,
as well as how they live with the effects of flooding
in their everyday lives.
And one of those effects was
this threat of leptospirosis,
which is a bacterial infection
that people usually get
through contaminated waters
from rats as well as livestock.
And so the idea for these Red Cross projects was not only
to implement vulnerability assessments,
which are essentially qualitative descriptions
of people's understandings of, say,
how high their houses are built,
the quality of the roads that they drive and walk on,
where and how their livestock are managed,
as well as sort of everyday activities
of work as well as healthcare.
So they would make an assessment of
to what extent people are vulnerable to flooding,
based on those everyday experiences,
as well as to understand the different ways
in which people could make their water sources better.
And so one of the outcomes of these vulnerability
assessments was then to implement water filtration kits
within these different communities
where the Red Cross was working across Guyana.
And so one of the interesting things
that came out of these vulnerability assessments
and work with the Red Cross
was that people were implementing sort
of novel ways in which to think about
how they could use their basic materials at home in order
to build water filtration kits.
And so one of the effects of these water filtration kits was
that people were understanding that, on the one hand,
even though they recognized they couldn't necessarily
rely on the state or even these NGOs to come in
to help them when floods would happen,
they would already have these built-in social networks
by which they could begin to understand
how to build up their own, what I would call
ordinary technologies, in order to better manage
floods and their health care.
So part of the effect of these water filtration kits was
to better understand, from the Red Cross perspective,
how everyday kind of ordinary technologies could be a way
in which people could not only recognize the kinds
of vulnerabilities to flooding and health risk
but also how they were creating communities
around these very technologies, as opposed to
or in conversation with those from the state.
So there's this broader sort of problem
and conversation around how it is one can start
to think about a kind of social contract, if you will,
through the build-up of ordinary technologies
and healthcare that come about from living
with climate-related risks and flooding.
And so what I find quite interesting about this case
of Guyanese climate adaptation,
as well as the health risks that come about from thinking
about climate change, is
that researchers can really begin
to think about what is technology?
Not just technology in a kind of cost-benefit assessment
but also technology in terms of how people live with it,
how people make it.
And the ways in which they understand it,
forming their relationships
to one another,
their built environments,
different ecologies,
different vectors, different state
and non state institutions,
as well as their expectations about
what the future holds for them.
So the important thing I learned
from thinking about climate adaptation,
and one which I'll circle back to,
I think is an issue related to thinking about pandemics,
and specifically thinking about
how healthcare can be understood as part
of a public kind of idea of the social contract.
It is that climate adaptation is about understanding
how past, present and future technologies are built on
and rely on one another.
So again, for this specific example of Guyana
and their dam system, there are different ideas
of what it means to build up a water filtration kit.
It's not just based on the specific risks at hand
because of intense flooding
but also this longer kind of historical context
and understanding of the different ways
in which people expect the state
to help them or to not help them,
and Guyana's specific context and understanding
of racialized politics between the majority Afro-Guyanese
and Indo-Guyanese populations.
As well as the different
ways in which people,
understand public and private space.
And so, one of the issues I think is quite important,
then, when thinking about health-related risks
and this idea of what it means to adapt to them,
from the broader perspective of climate change
as well as local experiences of emergencies and crisis,
is to take seriously that data comes from somewhere.
It's not just this attempt to say
I'm identifying different populations and their risks,
but also the ways in which people understand
their relationships to the use of technology
as well as its absence
in the communities in which they live.
And so one of the things I'm really hopeful about,
when thinking about how researchers respond,
both quantitatively and qualitatively, to emergencies
and risks in whatever form they might be,
is that we're recognizing
that simply collecting demographic data is not enough,
and that in fact trying to understand the broader
relationship, not only in terms
of how that demographic data is produced
but also related to broader ways in which people use
or don't use technologies, is quite helpful
for understanding how people live with vulnerability
in the everyday and become more vigilant
to the kinds of disasters
that are on the horizon for their futures.
And so I think then, especially in relationship to
Niloufar's last point, about this idea of trying to recognize
how the different identities of the different communities
that we're studying also get shaped
by the very technologies that we use to collect data.
So there's a kind of, to use a bit of jargon,
co-production between the kinds of technologies
we use in the everyday and the kind of data that we collect
and also understand to be important to research projects.
And so trying to draw in a kind
of a broader perspective on the ways
in which not only people understand
how we use different technologies in the everyday
but also the ways in which our very environments,
our very institutions, our very political imaginaries
and our expectations about what the state can
and cannot do for us
shape the way in which we expect technologies to work for us.
And I think the most concrete examples
that we've had so far, I mean in our conversation today
and more generally when thinking about media coverage
of COVID-19, have been the question
of ventilators, for example, as well as
how we think about the distribution of tests
or when we don't need to distribute tests,
as well as the different ways in which different kinds
of media are used to, as Niloufar pointed out,
disseminate information about COVID-19.
- Hey, thank you very much
Sarah, that was fascinating.
So, very often when people think of data,
they think of numbers.
They think of big data,
from social media platforms from census.
But sometimes you and others rely on,
this qualitative data from observations,
interviews, oral histories.
So I wondered if you could
say a little bit more,
about this qualitative data
and why it is so important when
we consider the question of equity.
- Sure, I very much think that the question
of equity is not just about imagining what's possible
but also what's actually happening on the ground.
And ethnographic fieldwork allows one to get at,
as I've been phrasing it, the kind of ordinary
and everyday experiences of how that data is produced
as well as consumed and disseminated.
So without a kind of ethnographic method,
one can't really understand the ways
in which someone understands the benefits,
the slippages, the absences that are found
in what could be generally understood as big data.
And so qualitative research,
particularly ethnographic research, allows
what one could call a kind of snapshot
of how that data is understood
in the everyday kind of context,
as opposed to, or at least in relation to,
how that data is networked across different kinds
of institutions, as well as understood within a framework
of a kind of future or past understanding of
how those institutions use that data.
- You seem to say that some technologies have the potential
to undermine racial stereotypes.
Could you say a little more about
what these technologies are
and how they might be able to help us?
- Sure, so this example of the water filtration kit,
I think, is quite interesting,
in the sense that most people in Guyana since 2009,
when these major climate adaptation projects
have been happening around its dam and sea defense network,
have recognized that the state
has actually implemented these projects.
Now, to what extent people would say
that they've lived up to the design measures
that everyone wants is, of course, up for debate.
But no one can deny that these projects
have actually happened.
At the same time, people are still living
with different kinds of risks related to flooding.
And so it's in people's very response
to understanding that everyone is vulnerable,
albeit in differential ways,
that people then rely on these ordinary kinds
of technologies, as I'm identifying here
with the water filtration kit,
to respond to the fact that they're vulnerable
but at the same time recognize that their vulnerability
is related to others'.
And so in that sense, I think
this broader perspective of how individual
experiences are literally related
to collective experiences, and how technologies allow people
to see those connections, is what gets people
to think more broadly
and to start to deconstruct kinds
of narratives about racial marginalization
that frankly might not always be there in the specific cases
of the crises that people are living with.
But oftentimes, people
rely on those narratives
as a kind of way to build
kind of political mobilization
as well as community in order
to deal with the problem at hand.
Now, of course that kind
of political capital is quite important.
But again, there's a kind of nuance to the ways in which
that political capital, I'm arguing,
gets mobilized across the use
of technologies in the everyday.
- Okay well, thank you Sarah.
Now I'm seeing that a number
of questions have come in for our three guests.
Let me start with a question for Ziad.
So, if, as Ziad tells us, a big part
of discovering the bias of algorithms occurs by chance,
is there a structured approach,
a more structured approach, to discover the bias?
- Yeah, I think that it's a sad fact
that even our paper was mostly luck.
We happened to have access to this data
because of relationships we had with a particular hospital
that had bought this algorithm.
And in fact, we had gotten the data
for a completely different project and only understood
that the real story was this bias story
about two years into that project.
So it was very ad hoc.
And that actually should,
I think, be a scary fact.
So I think maybe there are two answers.
On the one hand,
I think that our work with
regulators at the state
and federal level has made me optimistic
that there will be some kind of response
that will be nuanced and
actually quite appropriate
for detecting and
preventing this kind of bias
that we discovered in
these kinds of algorithms.
And so that's good.
And that's the way regulation kind of has
to work, one step at a time.
But that approach I think,
will only really work,
after the kinds of biases are
discovered and knocked out.
And so I think for that,
it's really important that people have access to data
that they can use for research,
not just to do more with the kinds of biases we know about
but to be figuring out the next kind,
the kinds that we don't know about yet.
And so I think that research
is critically important,
the access to data is really important.
And I'd say that exactly
the kinds of things
that Niloufar and Sarah
were talking about.
It is that ethnographic approach that really is rooted
in an awareness of the particular challenges
that certain communities face,
that is a really important part of that.
So having people who both speak that language of
social science, research history,
understanding structural inequalities, together with the people
who can translate those insights into data,
I think is really important.
The big roadblock right now
is getting those people access to data.
- Thank you so much (background
noise drowns out speaker).
A really insightful
answer to that question,
which of course welcomes
our other two speakers
to also provide some answers about
how we might be able to
structure our approaches
for discovering this bias in algorithms
and in the way we collect data.
So Niloufar or Sarah,
if you'd like to say anything about this.
- Yeah, I'd like to.
So there are a couple things.
One is, what is the algorithm even trying to measure?
And Ziad touched on this,
where most of the time it's extremely hard to know;
most of the time, we're only looking at proxies.
And so especially when it's something very vague
that is not measurable,
then measuring it becomes really hard.
And we're doing some work around,
automated interview technology
that takes in a video of
you performing an interview
and then does machine learning on it
to get at things like
your communication skills
and we're faced with the same problem,
where there is no measure of someone's communication skills
that we could compare with and
see whether the algorithm is doing the right thing or not.
And so we have to fall back again
on measuring bias or discrimination.
And I think that something
that's really important here is working
with people who are in policy and law.
So this particular project that I'm talking about
is in collaboration with Professor Catherine
from Berkeley's law school.
And we're sort of doing this work
where we're moving the
empirical work forward,
at the same time as we're
moving the policy work forward,
taking one step at a time to think about
what empirical work can best justify
and support the policy arguments
that we're trying to make here.
And so here again, we're falling back on
existing strides that have been made in employment law
around discrimination or around ableism,
things that we can really build on
to get there, and yeah, like Ziad said,
it's one step at a time.
And then I'm also reminded of something Bill said in a talk
once, where he was saying,
you can't just put some ingredients
together in your kitchen and sell it as an aspirin.
You're not allowed to do that.
But it's very strange that you can put together
whatever algorithm you want
and use it in such high-stakes situations like health,
employment, even whether or not someone goes to prison.
So I think this is a
place where we really have
to be thinking about policy
and working very closely
with policymakers to think about
how we may actually be able
to mitigate these effects.
- Right, and particularly again,
in the climate adaptation context,
one of the ways I see researchers
and policymakers trying to address this question
of what is the algorithm that we should use to
identify these biases,
as well as what is the kind of data
that goes into those algorithms,
is literally around a kind of old practice
of long-range forecasting.
And how is it that long-range forecasts themselves
can be used towards different kinds
of health care and public health policies, in order
to alleviate as well as help predict the ways
in which different communities would be affected
by health-related diseases as their ecologies
and local environments change.
And again, part of the work
of the ethnography then is
to ask what are the different
kinds of connections,
the different relations that
are being made between law,
say, economic modeling
as well as meteorology in order
to understand the different
ways in which health itself,
becomes a different kind
of variable under different
kinds of circumstances
as well as not just the different kinds
of social demographics but
expectations people have about
what an algorithm can or can't do for them
as well as different kinds of data sets.
- We only have a couple of minutes left for questions.
One other question that has come up is,
how is the pandemic playing out in
less economically stable countries?
There is an economic disparity among nations,
and maybe one of you can give a one-minute answer about
how this is playing out in terms
of the effects of COVID-19.
- I'll just give one thought, which is to say that
if you think that we have a bad data situation
in terms of real intelligence about the epidemic in the US,
those disparities are magnified by orders
of magnitude in the developing world.
We have so little idea of what's actually happening
because all of our data are, again, coming out
of healthcare systems, where the same kinds of challenges
that black patients face in the US
are effectively faced by entire countries.
And where the ability for people to get tested is
so incredibly low that,
if we don't see any COVID cases in a country,
does that mean things are going great,
or does that mean that we haven't even begun
to test people and understand how big the epidemic is?
So I think again, that just highlights the need
for not just relying on these biased systems
to collect data for us but to actually
tackle those data collection challenges head on
and go get the data, given the enormous economic value
and the case for investing in data collection.
- I will say, as a specific--
- Unfortunately, I think
it's time to wrap up.
But I wanna thank all three of you
for your different approaches,
Ziad Obermeyer, Niloufar
Salehi and Sarah Vaughn.
I would also like to thank the audience for joining us here.
And I would like to announce
two upcoming Berkeley Conversations
which follow closely on this.
The first is this Friday, April 24,
from 12 to 1 p.m. Pacific time:
Straight Talk, a conversation about racism,
health inequities and COVID-19.
So that will follow up on
this one very, very nicely.
And the second is next Monday, April 27,
12 to 1 p.m. Pacific time.
And that also follows very nicely, in particular, on
what Sarah's been talking about:
climate change and COVID-19,
can this crisis shift the paradigm of climate change?
So again, I wanna thank our three guests.
And I wanna thank everyone who joined us
for this conversation from the Division
of Computing, Data Science, and Society
and from the Berkeley
Institute for Data Science
and of course from the Berkeley community.
Thank you so much for joining us.
