[MUSIC PLAYING]
ELISSE ROCHE: Hi, and
welcome to Cloud OnAir,
live webinars from Google
Cloud hosted every Tuesday.
My name is Elisse Roche.
And I work around data for
social and environmental
impact.
JAKE PORWAY: I'm Jake
Porway, the founder
and executive
director of DataKind.
ELISSE ROCHE: And
we're here today
to talk to you about leveraging
data science for social change.
As a quick note, you
can ask questions
throughout the webinar
on the platform,
as we have Googlers on
standby ready to answer them.
And we'll also be
hosting a live Q&A
at the end of the presentation.
So with that, let's get started.
JAKE PORWAY: That's good.
ELISSE ROCHE: So
another way to express
the idea of data science
for social change
is data for good, which is an
interdisciplinary movement
where private
companies are working
with nonprofit organizations,
NGOs, and also IGOs
in order to apply
big data solutions
to real-world challenges.
And this has been going
on for quite some time.
A couple of highlights
include, back in 2008,
the CDC collaborated with Google
to launch Google Flu Trends
under the idea that if you could
actually track online searches,
you would potentially be able to
predict the prevalence of a flu
outbreak.
Later, in 2011, the United
Nations Global Pulse Initiative
advocated for a data
philanthropy movement
where private companies could
actually donate or share data
to advance humanitarian causes.
And in 2013, the UN also
utilized satellite imagery
in order to provide
targeted disaster relief
by comparing pre-
and post-disaster
imagery to determine which
communities needed emergency
relief first.
And in 2017, just
last year, the UN
hosted the world's
first World Data
Forum, with the
belief that it would
help to foster collaboration
between different organizations
and sectors and
also help to achieve
the 17 Sustainable Development
Goals, which comprise the 2030
Agenda.
And in 2018, today, there are
interdisciplinary programs
and collaboration.
So companies like
Google Cloud are
working with organizations
like DataKind
in order to bring the data
for good movement to life.
JAKE PORWAY: So
I think what's so
fascinating is that
this is still very
young in a lot of ways, too.
There hasn't yet been a
killer app for social impact,
and there probably won't be.
Because though groups have
started to find solutions
like these, really
what I think people
are learning in these
first five or six years
that this has been up is
that collaboration is really
the key.
Technologists don't always
know exactly what to build.
Social organizations don't
have the tech skills.
And so I think this
last point here
is so critical-- that for all
the cool, whiz-bang solutions
we're seeing, collaboration
and interdisciplinary work
is really key to this
movement growing.
ELISSE ROCHE: Absolutely.
So what is Google Cloud doing
in the data for good movement?
Well, since 2008, we have
sought to empower nonprofits
by offering essentially
integrated solutions for them
to access, starting with
Google Earth Outreach,
allowing them to
essentially bring
their stories to
life using premium
Maps APIs and Geo Tools.
With G Suite for
Nonprofits also, nonprofits
are able to work and collaborate
together with productivity
tools using Gmail
and Docs at no charge
and also GCP public data
sets, which actually
helps to democratize access
to planetary-scale data
sets provided by organizations
like the World Bank, the EPA,
and NOAA, just to name a few.
And now anyone can
essentially query that data up
to one terabyte
per month for free.
And last but not least, there
is Kaggle and their data science
for good events, which
allows nonprofits
to host an online event and to
tap into the community of over
1.5 million data
scientists in order
to apply big data solutions
to real-world problems.
And in addition to
that, we're exploring
working with nonprofit
organizations
in a hands-on capacity,
like with DataKind.
So back in April, we
actually worked with them
around their ecosystem event
at the Skoll World Forum
for Social
Entrepreneurship where
we discussed the potential
of applying big data
solutions to
real-world challenges
in the humanitarian sector and
also worked with nonprofits
to help scope potential
data analytics projects
for future engagements.
And, in addition to that,
just a couple of weeks
ago, we hosted one of
their DataDives, which
is a weekend-long
problem-solving event
at the Google New York City
office, which is actually where
we're streaming from today.
And we'll get into that
in a bit more detail
later on in the presentation.
But before we dive in there, I
wanted to rewind and actually
talk about DataKind, who they
are, and what they stand for.
JAKE PORWAY: Awesome.
Thanks, Elisse.
So a little background
on DataKind
is that we're a nonprofit
dedicated to harnessing
the power of data science and
AI in the service of humanity.
And the organization
really arose
from experiencing lots of data
scientists, machine learning
folks, AI engineers who
clearly wanted to give back
but didn't exactly know
how, and at the same time,
realizing that social
organizations are awash
in data.
I mean, we all are, right?
There's digital
information popping off
of our cell phones
and our laptops.
And so we have these
new opportunities
to learn more about our world
and ourselves than ever before.
And so really, we
thought the best way
to do that was not just to have
industry building what it's
building with data
science and machine
learning but finding ways to get
those same tools into the hands
of people working
in social impact,
whether those are governments
or nonprofits or anyone
else dedicated to
seeing a better world.
And as a data scientist
myself, it started just
by talking to a few
folks and saying, hey,
does anyone else want to
work with some nonprofits,
and being very surprised
to find that not only
did people in my own
friend group want to do it.
But we had thousands of people
around the world writing
and saying, hey, I
want to do this, too,
nonprofits and governments
raising their hands
and saying, we want to be part
of the data for good movement.
So that's how the
organization started.
And we run it a lot like Doctors
Without Borders for data geeks.
We just get folks who
want to volunteer and give
their time back to work
and build solutions
alongside nonprofits.
In the last six years, we've
done about 250 projects.
We've got about
18,000 folks signed up
as volunteers and a bunch
of communities running
DataKind versions in their own
cities as volunteer chapters.
So I'm very excited to see this
data for good movement growing
immensely over the
last couple of years.
I'm going to give you just
a few quick examples of what
some of the projects look like.
So one was a project we did
with the American Red Cross.
They're a fantastic
organization.
And one of the challenges
they were trying to solve
was how to stop
preventable fires.
So there's a problem
where if a fire goes off
and people don't have a
smoke alarm in their home
or in their buildings,
they can get hurt.
Or worse, they can die
because they can't get out.
So the Red Cross
decided we're going
to put millions of smoke alarms
for free around the country.
The question was,
where do they start?
Where should you put these?
So we teamed them up with
some data scientists who
combined their own fire
data with publicly
available data, such as
community housing data
or open data that
others had made
available, to build a
predictive model that
said these are the places
where if you put smoke alarms
they're going to be most
effective for saving lives.
You can see that
little map on the slide
shows the areas
where they thought
there would be the most
fires and the most folks that
could be benefited from this.
So now, the Red Cross
has intelligence to say,
here's where we can go to
focus our limited resources.
And they're now saving
hundreds of lives per month
with this algorithm.
So that's one great example.
Another example
about a group trying
to figure out where to use
their limited resources
is with the Moulton
Niguel Water District.
Now, they give water to folks
all across Southern California.
And their job depends
on being able to predict
how much demand there is.
And the stakes are high
because if they're wrong,
the only way they
have to solve this
is to go take a dump truck,
fill it up full of water
across town, and ship
it to the people that
need the water
wherever it's missing.
And that's a huge cost.
It's environmentally
destructive.
It's a bad thing.
So some data scientists
teamed up with them
and saw if they could do better.
And basically,
they built the tool
that you're seeing in that
picture on the slide which
is able to predict demand
to the block-by-block level.
And now they're not only able
to get more water to millions
of people in California,
they've saved over $25 million
just from shipping costs
and from being more efficient.
So I think these are
great examples of what
can happen when you use
data science and machine
learning and prediction to help
social organizations target
their efforts more
cleanly and better.
So that is largely the
work that was done there.
But now you may ask, well,
how does this happen?
We've got a couple
of different models
for how we get data scientists
together with nonprofits.
But one is our DataDive.
And DataDives are like
social impact hackathons.
We love that hackathons are
high-energy to get a lot done.
But we try to modify them
so that we really made sure
that by the end of these
events, social organizations
and nonprofits had something
that they could use.
So for our DataDives, basically,
we've got three phases.
The first is exploration.
So we worked for a while with
these nonprofits figuring out
what their problems
are, where the data is,
what can and can't be done.
A lot of organizations
don't even
know what's possible at first.
So we build that
out so that we can
have really clear
scopes and clear data
so that, in the second
phase during this big event,
hundreds of volunteers could
show up and just start working.
They could hack away at
a problem they wanted.
We had the nonprofits there that
they could sit and talk with.
And so for 48 hours, they
worked on these problems
so that in stage three, we
can hand off those solutions
to partners and find
that groups didn't just
come up with interesting
ideas or a few creative starts
like you often see in
hackathons but actually made
real progress on their goals so
that the nonprofits could all
walk away with something
that they could use.
So that was basically the
setup for how we work and what we
did with the DataDive.
And now, we'll talk a little
bit about the event itself.
ELISSE ROCHE: Absolutely.
So from the Google
Cloud side, we actually
realized the potential
of this kind of model
to drive real-world
impact, which
is why we hosted
one of the DataDives
in our Google New
York City office,
drawing over 130 attendees
from data scientists
to analysts to designers all
passionate about defending
human rights with data and also
building resilient communities.
The nonprofit
organizations that we
worked with over the
course of that weekend
all worked around international
development projects
or in defending
different communities
from the harmful effects of
international development
projects.
And these three organizations
were Inclusive Development
International, Accountability
Council, and also International
Accountability Project.
And over the course
of the weekend,
they were actually able to
make significant progress
on the challenges
that they posed
and walk away with
tangible solutions
that they would be able to
then integrate immediately
into their processes,
which is truly inspiring.
Because you actually
see that you
can sit down in a
room full of computers
with a roomful of very talented
and technical data scientists
and then take that
and immediately
throw it into the real
world and have that impact.
And in addition to all
of that, we actually
had the opportunity to host
a Google Cloud breakout
session where we discussed
the currently available tools
and resources for
nonprofit organizations
and how they can access them
through Google for Nonprofits,
Kaggle, and also GCP
public data sets.
So on the projects themselves
and the nonprofits--
JAKE PORWAY: Yeah.
So let's talk about these folks.
As Elisse mentioned,
they're all in
the international
development space, which
means that all these
organizations are focused
on defending communities
who need help representing
their rights or
getting representation
when international development
is happening in their area.
But they each tackle a very
different piece of this.
I thought that was kind
of interesting as you
listen to the three projects.
So the first was the
International Accountability
Project.
And their whole purpose is to
advocate, make policy change,
boost local advocacy efforts
to make sure that communities
know about every international
development project happening
around them and
could act on them
or defend themselves
when needed.
Now, they're a pretty
visionary organization.
I love these folks
because they're
already data-savvy in the sense
that they have worked really
hard to collect a whole
database of every international
development project they
know of that's going on.
So they could tell
you right now what's
going on in Peru or in Ghana
and who's responsible for it.
So that's really great, and
they've already done that.
The challenge that they
had for this weekend,
however, is that
they were trying
to understand how they could
get more updated information
about what's going on.
So there are news
articles or press releases
that happen around
these projects,
but they just go
off into the ether.
So IAP thought to themselves,
if we have this information,
maybe we could have an
early warning system
when we know there's
going to be a new project
or when it's going to change.
But what that would require is
scraping huge amounts of news
information and matching it.
So that's what the
volunteers set about doing.
They basically scraped tons
of news articles and press
releases.
They built some semantic
matching algorithms,
some classifiers to
try to understand
which projects these articles
reacted to or related to.
And then they pulled
that all together
into one workflow that
actually now pulls that data,
tags it, and puts
it in this database.
And so now IAP is walking
away with a whole API
on top of this new data
model that basically gets
current events and news
media into their system
so they can act
more effectively.
Now this allows them to
get ahead of changes,
get communities ready before
they ever could before.
So it's kind of like having
early sight, which I think
is awesome.
So that was IAP.
That was one group.
Fantastic work, and they're
going to be continuing that on.
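To make the matching step concrete, here is a minimal sketch in Python. It is not the volunteers' code: it swaps in plain word-overlap (Jaccard) scoring for the semantic matching and classifiers mentioned above, and the project records, the `match_article` name, and the threshold value are all assumptions invented for the example.

```python
import re

def tokens(text):
    """Lowercase word tokens, ignoring words of one or two letters."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if len(w) > 2}

def jaccard(a, b):
    """Jaccard similarity between two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def match_article(article, projects, threshold=0.1):
    """Return the id of the best-matching project, or None below the threshold."""
    article_tokens = tokens(article)
    best_id, best_score = None, 0.0
    for project_id, description in projects.items():
        score = jaccard(article_tokens, tokens(description))
        if score > best_score:
            best_id, best_score = project_id, score
    return best_id if best_score >= threshold else None

# Hypothetical records, invented for illustration only.
projects = {
    "P1": "gold mine expansion in western Ghana",
    "P2": "hydropower dam construction in northern Peru",
}
article = "Lender approves new financing for hydropower dam in Peru"
print(match_article(article, projects))
```

A real pipeline would likely need something stronger than word overlap, but the shape is the same: score each article against each known project, keep the best match, and leave the article untagged when nothing scores high enough.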
The second group we
want to talk about
is Inclusive Development
International.
Now, Inclusive Development
International, or IDI,
they're basically focused
on following the money.
So whenever there's an
international development
project-- say you're
building a mine in Ghana--
they want to know
all of the funders.
Who are the banks and
investors in these projects?
So then if something
does go wrong,
if there's an
unethical activity,
they can understand who those
funders are and go to them
and say, hey, did
you know about this?
Maybe we should
take some action.
Now, that's all well and good.
But if you think about what it's
like to be IDI, when they want
to know who all
these funders are,
they have to basically go out
and find all that information
each time.
So you have to go to each
development bank's website.
You got to type in the
development project's name.
You got to get that information,
copy it, paste it, et cetera.
It's this huge,
laborious human project.
So what they were
focusing on this weekend
was trying to figure out if they
could automate that process.
So could they basically
automatically scrape this data
and put it all into one
workflow so that they could
have this at a moment's notice?
So that's what
they set out to do.
And miraculously, in just the
48 hours of this weekend,
they set out to
build 13 scrapers.
The team built 11 scrapers in
that time that were all working
collecting that information.
So they can now scrape
that information.
There's a front end
that they're building
that now allows IDI
to go in and just type
that project name, get all
that information in one click,
basically go about
doing the work
that they really need to do.
So it's saving their
time from this human join
that they're basically doing
and instead automating it
so that they can focus
on the work ahead.
So they're going to
be building that out.
And they were really
excited to say
that they could just
immediately start using
this right after the weekend.
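The idea of replacing the manual "human join" with one lookup can be sketched as follows. This is only an illustration under stated assumptions: the real scrapers fetch bank websites, whereas these stand-ins read from in-memory dictionaries, and the names `make_stub_scraper`, `find_funders`, "Bank A", "Bank B", and "Example Mine" are all hypothetical.

```python
def make_stub_scraper(source, listings):
    """Build a stand-in scraper that looks a project up in an in-memory listing."""
    def scrape(project_name):
        funders = listings.get(project_name.lower(), [])
        return [{"source": source, "funder": f} for f in funders]
    return scrape

def broken_scraper(project_name):
    """Simulates a source whose page layout changed and can no longer be parsed."""
    raise RuntimeError("page layout changed")

def find_funders(project_name, scrapers):
    """Query every source; a failing source is recorded instead of stopping the run."""
    results, failed = [], []
    for name, scrape in scrapers.items():
        try:
            results.extend(scrape(project_name))
        except Exception:
            failed.append(name)
    return results, failed

# Hypothetical sources and records, invented for illustration only.
scrapers = {
    "bank_a": make_stub_scraper("Bank A", {"example mine": ["Investor X"]}),
    "bank_b": make_stub_scraper("Bank B", {"example mine": ["Investor Y", "Investor Z"]}),
    "bank_c": broken_scraper,
}
results, failed = find_funders("Example Mine", scrapers)
print(results)
print(failed)
```

Keeping each source behind the same small interface is one way a team could add scrapers independently over a weekend, and recording failures separately means one changed website does not block the rest of the search.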
ELISSE ROCHE: That's great.
And Accountability
Council-- so they actually
work around amplifying
the voices of communities
around the world to
help protect them
from the harmful effects of
international development
projects.
And an example of this
that's worth noting
is that they actually
have been working
with a series of
villages in Ukraine
who are facing the harmful
effects of industrial farming
where they're living with
17 million chickens housed
in a series of structures that
extend as far as the eye can
see.
And there's actually a picture
of this on the next slide.
And so the
Accountability Council
is actually working
with them to fight back
against the company and
the development banks that
are funding them
in order to address
the pollution, the water
scarcity, and the pesticide use
that all result from
intensive animal farming.
And the challenge that they
posed to the DataKind community
during the DataDive was
analyze online complaints
and identify which aspects
of these complaints
actually determine success.
Is it something as simple as
the time in which they're filed?
Is it the number of valid issues
associated with the complaint?
So they dug a little bit deeper.
And they analyzed a series
of projects and data
across a variety of
different websites
and then consolidated that
in one website in order
to analyze that.
And from there, they
actually came up
with a series of high-impact
insights, one of which
is actually pictured on
the slide, where they noted
that there is more
than a 50% chance
that a complaint would become
eligible to be addressed
if it has three issues or more.
So that's something that
they can immediately
start integrating
into their processes
and consider when
filing these complaints.
If they just have two but
there's an opportunity
to add a third one,
that would then
help to ensure the success
of that particular complaint.
And so it's worth noting
that at the end of the day,
we're talking about real
people with real problems.
And this data and
this kind of work
that we're doing over
a DataDive weekend
can have immediate impact.
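The kind of question posed here, whether the number of issues in a complaint relates to its eligibility, can be sketched in a few lines of Python. The records below are toy values invented purely for illustration, not Accountability Council data, and `eligibility_rate` is a hypothetical helper name.

```python
def eligibility_rate(complaints, min_issues):
    """Share of complaints with at least `min_issues` issues that were found eligible.

    Returns None when no complaint meets the issue count.
    """
    subset = [c for c in complaints if c["issues"] >= min_issues]
    if not subset:
        return None
    return sum(c["eligible"] for c in subset) / len(subset)

# Toy records invented for illustration; not real complaint data.
complaints = [
    {"issues": 1, "eligible": False},
    {"issues": 2, "eligible": False},
    {"issues": 2, "eligible": True},
    {"issues": 3, "eligible": True},
    {"issues": 4, "eligible": True},
    {"issues": 5, "eligible": False},
]
print(eligibility_rate(complaints, 3))
print(eligibility_rate(complaints, 1))
```

A comparison like this shows an association only. It does not by itself show that adding an issue to a complaint causes it to become eligible.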
JAKE PORWAY: Yeah.
And I think that just adding
to that, sometimes when
people hear these examples,
they're like, yeah,
but that's not scale.
That's not millions of people
affected or terabytes of data
analyzed.
Sometimes it's not.
But really, like
Elisse was saying,
these are people's lives
that are affected by this.
And so if Accountability Council
can get three more complaints
through, when you're that
community or that family that
is looking for help, getting
that through versus not
is a huge issue.
That could be life or death.
So I think that's what
I hear in all of these--
is just that people's lives
are being saved and helped
so tangibly from this work.
ELISSE ROCHE: Absolutely.
And I think one of the
villages in Ukraine filing
that complaint with
Accountability Council
described them as one
of their last hopes.
JAKE PORWAY: Yeah.
And so I think that's why
we're so excited by what
we're seeing in these events--
is that all these
organizations are now
boosting their missions
with data science
and machine learning.
So one of the takeaways we see
is just the shocking amount
that gets done in 48 hours.
You may think that if you're a
nonprofit that a data science
project has to take
years upon years.
You got to build a team first.
But we see organizations that
over a few months and then
a quick weekend
make huge progress
just by getting folks together.
And similarly, data
scientists may think, well,
what could I do?
Well, there's a lot.
You can make a big difference
in a little bit of time.
And so that's, I think,
really exciting--
is that when the right people
come together in the right way,
you get a lot done.
The other thing, though,
is that of course it's
not just a weekend.
We're not going to solve
every problem in a weekend.
It would be foolish.
ELISSE ROCHE: As much
as we would like to.
JAKE PORWAY: Yeah, as much
as we'd all love that,
these are systemic,
endemic challenges.
They're going to
take a long time.
But what I see in this is that
this is a catalytic moment.
These are organizations
that are not just
catching up with data science.
They're visionary organizations
on the front lines
of social change
looking for all the ways
that they can do
their jobs better.
And from something
like a weekend event
or doing some data
science projects, here
they're now starting this
journey into something bigger.
We've seen organizations
often come back
from this process able to go
to their funders to say, hey,
we need funding
for data science
because look at this
complaint registry.
Or look at this.
Look what we can do.
That's huge.
We see organizations
going and actually
now training some of their
staff to do data science
or coming back and articulating
their data science and machine
learning problems.
We've already heard that
from some of these groups.
So I think that's what's the
second takeaway for me-- is not
just how much great
work gets done
but how it really catalyzes
these data journeys
for these organizations.
And it's so critical that
they get this resource
because, again, they're on the
front lines of making a better
world for all of us.
We want them to have
everything they can.
ELISSE ROCHE: Absolutely.
Just think about it
as a proof point.
With the nonprofit
model, it can be
difficult to secure
resources to support
a variety of different projects,
especially when they're
trying out something new.
It's great to be able
to point to something
that they found successful.
So that's great.
And another thing just to add
onto that and a key takeaway
on impact is the power
of collaboration.
So early on in the presentation,
I discussed data for good
as a movement between private
companies and nonprofits.
And that's something
that we can see
take place in events
like the DataDive
and realizing that individuals
and companies and organizations
all have different
resources and abilities.
And so if you're actually
able to take all that together
and combine those
resources and abilities
and essentially deploy them
against a strategic project
that is driven by data, you
can really make a change,
really make a difference.
So that's something
that's really inspiring.
And being there present
at my first DataDive
as well, while we were here
hosting it at Google New York,
I saw all that come to life.
And it really was an
impactful experience for me.
And I believe it was that
way for the volunteers
as well and especially the
nonprofit organizations that
benefited from it.
JAKE PORWAY: I
couldn't agree more.
Yeah.
Just adding on that, for
anyone who's listening,
there's a Burundian
quote I love, which
is what you do for me
without me, you do to me.
And I think that's
really important
when thinking about
designing these things
and in collaboration.
There's not just going
to be some tech solution
that I could come up
with in my room that's
going to solve these problems.
And the nonprofits
couldn't necessarily
solve them on their own.
So if you're out there
taking your own data journey,
think about that
first and foremost--
not what can I build, but
who can I build it with.
Because that's really the
way this stuff gets done.
And I couldn't agree more.
I was so heartened
and enlightened
seeing everyone work
together and seeing
that support across
corporations, nonprofits,
and everyone working together
to make some good stuff happen.
ELISSE ROCHE: Absolutely.
So what can you do?
Come together and join the
data for good movement.
JAKE PORWAY: Yeah.
I'd be remiss not to mention
that, of course, at DataKind,
we're always looking for
folks to come and be involved.
So if you're a data scientist
who wants to help out,
come to datakind.org
and come get involved.
We have volunteer opportunities.
We've got projects.
There's jobs, all sorts
of stuff going on.
We also have meet-up
communities around the world
that you could join,
so please do that.
And we're also starting to look
for ways to go into issue areas
more deeply.
So keep an eye out.
We will be having some very
exciting projects coming up
very soon around that.
ELISSE ROCHE: Awesome.
And also note from
Google Cloud--
so with Kaggle, if
you're a nonprofit,
you can apply to actually host
one of those online events
to apply big data
solutions to your mission
or real-world challenge.
And if you are a data
scientist, analyst,
or whomever just
looking to learn more
about data analytics,
you can actually
sign up to join the
Kaggle community
and get started from there.
So with that, thank you.
That's a wrap.
And stay tuned for the live Q&A.
Welcome back.
We have the questions.
So the first one--
how can we make data science
more accessible to the public?
JAKE PORWAY: That
is a great question.
Let's see if we
have the answers.
So I think when I hear
this question, what
I think about is things I've
heard people say about the fact
that as technologists,
we may understand
what an algorithm can do or
what statistical bias is.
But for most people--
I think about my
aunt and my grandma.
They're walking around just
clicking apps and having
decisions made for them.
So I think the question
a lot of people
wonder about is how do
we help people understand
what this stuff does.
Because if we don't, we
run the risk of people
not knowing if the data
is being used unethically
or not being able to engage
in it, just being out there.
So I think a couple of
things come to mind for me.
Basically, one is
webinars like this.
We've got to get out
there and talk about it.
When you say stats
or math or data,
most people's eyes glaze over.
But when you say,
hey, here's a way
that folks are using data
to empower communities
or to learn about when
a development project is
going to happen sooner,
they go, oh, OK.
I never really thought
about data that way.
I thought about
spreadsheets or math.
But that's actual.
That's cool.
So I think we have to
talk about it more.
So if you're someone out there
who knows about this stuff,
shout it to the world.
I do also think we probably need
a little more public literacy
around data and
around algorithms.
One thing I think a lot of
people are frustrated by
is the fact that
so many people seem
to be confused by what
data does or using data
to prove a political point that
maybe they shouldn't be making.
And I think we haven't yet
built a public critical thinking
around data.
When you look at a newspaper
article or a picture,
some people have built
this critical lens
about how someone
took that picture.
We know when we see a
provocative image, we go, OK.
Well, there was a photographer.
They chose to take this
over something else.
They probably have an agenda.
What's the agenda?
Hopefully, people are
thinking that way.
But I don't think we're thinking
that way about data yet.
You see a chart.
And someone says,
45% of people are
likely to die from
smoking or whatever.
That doesn't make sense.
But I think people
go, oh, that's fact.
No one thinks, well, what data
did we collect or not collect?
What's that person's angle?
And so that's a kind of--
ELISSE ROCHE: Or even just
what's on the x- versus y-axis
and where do the numbers
start there to actually
then demonstrate a particular
point versus zooming
out and looking at
the whole picture.
JAKE PORWAY: Exactly.
There's a huge visual
literacy and total tricks
you can play with that.
And that just takes training.
We just have to get
people used to that.
So I think the
more, again, we can
talk about it, the more
that we can educate people,
maybe even-- this will
be controversial--
but replace calc classes
in public education
with statistics classes.
I think that might be
more relevant these days.
Things like that I think
are going to be helpful.
But the conversation
has to start
with the folks who know it
and getting that out there.
So get out there and
talk more about it.
I think that's going
to be really helpful.
ELISSE ROCHE: Yeah.
I think it's also
important to think
about data science
versus science,
because they're not too
different when you really
think about them.
So like a scientist,
a data scientist
will take observations,
will actually
come up with a hypothesis and
then test that hypothesis using
their own scientific
method against the data
and against the observations
that they've collected,
and then seek to derive a
meaningful insight from there.
And I think it's
also rhetorically,
looking at the term
data, thinking about what
actually comprises data.
It's a historical record.
It's temperature.
It's the amount of sunlight
or rain in a particular area.
It's a variety of things.
It's anything that
can be recorded.
If you keep your
own journal, you
could consider that
to be data and then
use your own kind
of semantic analysis
to actually see what are the
most prevalent words that
come up in this particular
year or something
within your journal.
So really, data can be
anything and everything.
And it doesn't have to just
be this idea of binary.
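The journal example can be tried with nothing more than the Python standard library. This is a minimal sketch that counts word frequencies rather than doing any deeper semantic analysis; the journal entries, the stopword list, and the `top_words` name are all made up for illustration.

```python
import re
from collections import Counter

# A small, hand-picked stopword list; an assumption for this example.
STOPWORDS = frozenset({"the", "a", "and", "i", "to", "of", "in", "it", "was"})

def top_words(entries, n=3):
    """Return the n most frequent non-stopword words across all entries."""
    words = []
    for entry in entries:
        words += [w for w in re.findall(r"[a-z']+", entry.lower())
                  if w not in STOPWORDS]
    return Counter(words).most_common(n)

# Hypothetical journal entries, invented for illustration only.
journal_2018 = [
    "Rain again today, so I stayed in and read.",
    "More rain. Read the whole afternoon.",
    "Sunny at last, walked to the market.",
]
print(top_words(journal_2018, 2))
```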
JAKE PORWAY: Totally.
And I'm so sorry.
Before I move on to
the next question,
there's just one last
thing I want to add here,
which is that as
data scientists,
one of the things that
we can do is just share
out how we're doing what we do.
At DataKind, one
of our values we
believe very strongly in
that we train everyone on
is transparency.
The idea is that if you are
clear about what assumptions
you make, how you're using
that data, what that data is,
what you're trying
to do, that's going
to go a long way to people
understanding what goes on.
Well, let's face it.
There's some not-so-great things
going on in the image of tech
these days.
One of the best
ways to combat that
is to just be open
about what we're doing.
And I think that's going to help
demystify a lot of this stuff.
ELISSE ROCHE: I think
demystify is a really good term
to think about.
And so on the next
question, what
is the role of data
science, in your opinion,
in the humanitarian
sector, especially given
your work with DataKind and
you founding the organization?
I'd love personally to
hear about your intentions and
thoughts around this in
founding your organization.
JAKE PORWAY: Oh,
well, thank you.
I would love to hear
your thoughts on this
as well because I know
you've seen a lot of data
for humanitarian work.
Look, I think it
actually comes back
to what is the role of data
science, first and foremost.
Like you said,
it's like science.
You're getting information.
You're able to not
just look at the past
now, but sometimes
predict the future.
That's a pretty
new human ability.
At scale, there's
so much more data
that allows us to do
that and, of course,
so many more things
that we can automate.
Like we said in the webinar,
how many man hours--
or person hours, excuse me--
are spent just typing
things into a computer
and collecting that
information when you could be
out actually solving problems?
So that's the kind
of stuff that I
think data science and
machine learning is great for.
So why is that different
in the humanitarian sector?
I think it's actually
not that much different.
I think anywhere someone has
a clear thing that they're
working on to get more people
into homeless shelters,
to speed up transportation, to
prevent global warming-- all
that stuff is opportunities to
learn from the past from data,
predict what might happen
in the future with data,
and automate those processes.
So I think that
opportunity is everywhere.
And that really ties back
to your question about how
we started the organization.
At the risk of sounding
a little grandiose,
it almost felt immoral,
for lack of a better word,
that we could have this
wonderful technology that
could teach us so much
more about our futures
and automate what we do and
not apply it for social good.
If there wasn't a world
where we were seeing this
make the world a
little bit better,
then what are we really doing?
So I think to me, what's
so exciting about this
is that the whole big
data revolution has
put this opportunity in the
hands of many more people
across the globe
than ever before.
So I think the
opportunities are endless.
I think whether it's the
big lighthouse projects
that you see up there that
seem really cool, like using
satellite imagery to track
poverty estimates, which you
see the World Bank
and others doing,
to even just the
seemingly mundane stuff,
like optimizing someone's
workflow so they don't have
to fill something out by hand--
all those things are resulting
in the world being better
thanks to data science.
That's what I see.
I'm curious.
Are you seeing similar things?
Or where do you see the role?
ELISSE ROCHE: Yeah.
I'd say that it's
important to think
about data analytics and
data science as a tool
and as a method.
So I was actually just
thinking about surveys,
like qualitative surveys.
And that's something
that is also a method
that nonprofit organizations,
companies, whomever
can also employ.
But what are you going to
do with a qualitative survey
if you don't have essentially
the underlying foundation
for what you're trying to
learn from that survey?
So with that being
said, you can think
about data analytics and machine
learning in the same way.
And this is going back to
this idea of data science
similar to science.
What is the fundamental question
that you're trying to solve?
What is the potential
of data analytics here
to help you solve that?
And how can you aggregate,
organize, synthesize data
in order to help
you achieve that?
So something I actually heard
from the nonprofit participants
in the DataDive as well is
really the power of data
but also understanding
that it is something that
needs a firm and guiding hand.
You need to have that
intention behind it.
Because also to
your point from what
you were saying before
around essentially data
science, the reality is
that you have
to have that
underlying intention,
that underlying
insight into something
in order to actually
bring something to life.
JAKE PORWAY: Start with
the question, not the data.
I can't remember who we
ripped that off from,
but it's on our
walls at the company.
You got to know what
you're trying to do.
And then this is
in service of it.
I think that's a great point.
ELISSE ROCHE: Yeah, absolutely.
So again, having that guidance
and also understanding,
developing the understanding,
and the nuances
of using these kinds of tools.
JAKE PORWAY:
Couldn't agree more.
ELISSE ROCHE: Next question--
so what skills are
needed to take part
in the data for good movement?
What do you look for
in your volunteers?
JAKE PORWAY: Oh, such
a great question.
I think when people
hear this they think,
oh, so I've got to be some
AI expert, machine learning
engineer.
And yes, there's
a place for that.
We definitely need that.
But when I actually
think about what
the hardest parts
of this work are,
it's really around
communication and collaboration.
So imagine this.
If you're a data scientist, you
probably do work for a company.
And they have data that comes in
through infrastructure and QA.
And you've got
business objectives.
And you can just get in there
and tweak things, optimize,
build some models on
some nice, clean data.
Now imagine that you're thrust
into a nonprofit organization
that does not have that data
infrastructure where they
cannot necessarily articulate
their data science problem.
All of a sudden, you've got
ethical factors weighing in.
You've got people who have
tons of different constraints
on them.
The skills you start
to need are the tools
like being able to
hear someone's request
and really understand what
the underlying need is.
The Red Cross project that we
spoke about before is actually
a good example.
So when we first came to
them, the question was,
how do we upgrade our databases?
That's our data problem.
And sure, those databases
probably needed to be upgraded.
Could have done that.
But it took some
data scientists with
a smart, analytical mindset to
step back and, to your point,
ask the question.
Start with the
question, not the data.
And they said, well, what
are you trying to do?
And that's how they
eventually got to this idea
that we're trying
to prevent fires.
We need to put the
smoke alarms somewhere.
Well, how do you do that?
We're not sure.
That's how they found
the data science problem.
So I would say that's
actually really
some of the most critical
skills, whether with DataKind
or anyone else in the data
for good movement, that
are needed--
being able to hear a
problem and understand
what the solution could be, to
come at a problem with humility,
to realize that we may know
technology, but that's it.
These other people know the
space, their issue area,
the people involved.
To come in and not think,
I've got the solution,
but instead saying, hey, maybe
I could be some support--
I think that humility
is a big role.
And then being able
to communicate across
all the technologists
and non-technologists.
I think those are huge.
So I would say as
much as the work we do
is around tech-- and we need
expert technologists--
those other kind of
critical thinking
and collaborative communication
skills are almost worth more.
So I'd say bring those.
And also, if you're
feeling like,
yeah, but what if
I'm not in that boat,
either, hey, don't worry.
This whole thing is a
big pro bono operation.
We need people who are project
managers, communications
experts, UX designers.
There's a role for almost
anyone in this movement
because, again, it
touches all of us.
We're shaping society together.
So really, there's
room for everyone.
ELISSE ROCHE: Yeah, absolutely.
So you said something I
was thinking about myself--
critical thinking, being
a critical thinker.
Even though we are
talking about technology,
we are talking about data
analytics and science,
you don't necessarily
need to have
a technical background of
really any kind to participate.
Because you were even
mentioning before consulting--
this idea of having consulting
experience or something
could absolutely be useful.
JAKE PORWAY: Totally.
All those management
consultant frameworks
that everyone laughs at--
they're really useful
for getting people
to think through a problem.
That's where you have to
start to get this stuff right.
ELISSE ROCHE: And also, I'd
say research experience.
So if you have conducted
any qualitative
or quantitative research
outside of data science,
that's something
you can also employ.
The ability to synthesize
large amounts of information,
especially if they're
provided by said
nonprofit organizations,
so you can then parse out
those individual questions
that we were talking about
before that are critical
for data science--
I'd say that's
also something that
is a very useful skill in
the data for good movement.
And also, design--
so increasingly,
data visualizations are very
important to expressing data.
Because if you think about
it, as an average person
or consumer, if a
scientific report comes out,
that's not necessarily
reaching you.
And if it is reaching
you, it's often
distilled in a series of ways.
And similarly, data
analysis can then
be achieved or published in
a certain way within a niche
group.
But it's not necessarily
reaching the average person.
So with that being
said, data visualization
actually comes into play there.
That's incredibly useful
to synthesize information.
That's something
you can actually
do with Data Studio on the
Google Cloud side, which
connects to Google BigQuery
where you could, for example,
analyze the GCP public data
sets that are available,
as I mentioned before, from the
World Bank, EPA, NOAA, et cetera
and then visualize
that information.
Explore and see what you
can find in Data Studio
because that helps
to drive action.
You have a dashboard.
You have these clean charts
and maps and whatever you'd
like to customize from there.
And you can share
it with others.
JAKE PORWAY: Very cool.
We have maybe time for one more.
Or do you want to
do both lightning?
ELISSE ROCHE: Let's see.
Do you want to
try both lightning?
JAKE PORWAY: All right.
Let's go for it.
Lightning round.
ELISSE ROCHE: We've
been waxing a lot.
So next question--
where can data science
make the most impact?
JAKE PORWAY: This
is a great question.
People always ask, what's
the issue area to go into?
ELISSE ROCHE:
That's a tough one.
JAKE PORWAY: I would say
there's almost no limit.
Because as we
mentioned, data science
can help with seeing the
future and automating it.
And you can use that everywhere.
I will say I think we see
great opportunities wherever
there is a really quantifiable
outcome that everyone's
agreed on.
So your health outcomes,
like eliminating a disease,
food transport, water--
those are a little
easier than some
of the more human rights
advocacy-level kind
of projects.
But there's still
opportunity there, too.
But I think maybe
one of the ways
I'm going to spin this,
even in the lightning round,
is what considerations you have
to think of in different issue
areas.
Because, remember, places that
are working for human rights--
data has been used as
a weapon against them.
When you get data about
vulnerable communities,
governments use
that against folks.
So there's still great room
for data science to be used.
But you've got to go about it
in a totally different way.
There has to be a huge
lens on privacy, ethics,
and consideration.
You really have to work
through those groups
to understand how to
protect these communities.
So maybe that's
what I would say.
There's room everywhere for it.
There's so many opportunities.
But you might have to take a
slightly different tack in each
of the different areas.
ELISSE ROCHE: Yeah.
I would say quickly I think
that a huge opportunity for data
science and analytics
is the environment.
That's something
that you can see.
You mentioned before World
Bank and satellite imagery.
Well, satellite imagery can
also be used to, for example,
track deforestation
in the Amazon
or in other highly
biodiverse regions.
That's something that
Google Earth Outreach
has been working on with
different indigenous
organizations and protecting
their ancestral lands.
And that's something that
data analytics, analysis,
GIS-- that all can
be supremely helpful.
And last question--
so how did DataKind
select their engagement models?
And what's next?
JAKE PORWAY: OK.
Let's think of it this way.
The whole purpose
of this organization
is to see data science applied
to any social challenge
where it could be useful.
And so we picked
engagement models
that could tap into existing
talent and just kind of sew it
together as easy as possible.
We laugh.
We often say we're
the weaver of worlds.
We don't have a lot of
technologists on staff
because most of our job is
just coordinating experts
in the social sector
and technology
to come together
to do great things.
So we've picked
them that way to try
to make sure that
people have the least
overhead to helping from
either side as they want to.
So when you ask what's
next, all I could think of
is we're working
to scale that out.
So we want to know.
If you're out there and you want
to build a DataKind community
on your own, let us know.
There's some exciting
news coming out.
We're working with a
foundation to be thinking
in the next few months about how
we can grow out and replicate
this model.
So I would basically just leave
this as an open call to people
to say if you're a corporation,
a university, nonprofit,
anyone else who wants to see
more data for good in the world
and wants to think about how
to scale it, come talk to us.
Come to datakind.org
and write us an email.
Because we've got
some big plans there
and we'd love you all
to be a part of it.
ELISSE ROCHE: And you also
have several chapters, right?
JAKE PORWAY: We do.
We have volunteer
chapters around the world.
We've been learning
from the five we have.
They're in places that are rich
in both nonprofits and tech.
But we want to take
it beyond that.
How do we get this in every
city, every organization that
wants it?
So, again, if you have ideas
on that, come talk with us.
ELISSE ROCHE: That's great.
Well, thank you so
much for your time
and for joining me here
in the New York office.
JAKE PORWAY: Thank you.
It's been great.
ELISSE ROCHE: It
was really great
just talking about the
DataDive and the work that we
were able to do together.
And so we're actually at
time for this webinar.
So with that being
said, we'd like
to thank you all
out there as well
and say stay tuned
for the next session.
Because what's coming up
next is accelerating insights
with external data sets on GCP.
Thanks again.
JAKE PORWAY: Thanks.
[MUSIC PLAYING]
