Good evening.
I want to welcome all
of you who are here
tonight here in Boston on
our campus and those of you
who are watching from around
the world on our live stream.
I'm happy to share with you that
for the first two seminars of 2019,
we had more than 20,000
people from around the world
join our Longwood Seminar
classroom from Boston
and from as far away as the
United Kingdom, South Korea,
Pakistan, Egypt, Italy,
Brazil, and Australia.
So to all of you, welcome.
And I hope you're
joining us again tonight.
Tonight, our
Mini-Med School will
feature artificial intelligence
and the tremendous potential
it holds to revolutionize
health care.
There is one remaining
seminar this year.
Please join us on Tuesday,
April 30, for Why Sleep Matters.
And we always have a great
attendance for our sleep
program, so do come early.
So now for a few
brief announcements.
If there is a business or science
leader watching tonight
or here with us, we want
you to be aware of a
four-day executive education
course called Inside the
Health Care Ecosystem.
Zak Kohane, one of
tonight's speakers,
will be among the faculty
teaching this course.
Details can be found on
the web link on the screen.
Now on the screen
you'll see information
related to obtaining
certificates
of completion and professional
development points.
So those of you who joined
us for the first two seminars
and who are here
with us tonight,
you're entitled to
a certificate that
says you completed
the Longwood Seminars.
Our speakers will
be taking questions
at the end of their
talk, so I ask you--
if you're in the audience,
you have a little card.
Please pass it to a
member of our staff
who will be circulating
up and down the aisle.
If you're watching
on the live stream,
we want your questions as well.
So please write your questions
in the comments section
of Facebook and YouTube.
And when you post
your question, we'd
love to know where
you are viewing from.
So please write the
country or the city
from which you're watching.
And now please, silence
all electronic devices,
but do not turn
them off because we
want you to join our
Twitter conversation
by using #HMSMiniMed.
So please write your
comments and thoughts
as you're watching our program.
It's difficult, isn't
it, to remember a time
when technology and computers
did not exist or play
a major role in our lives.
My children never
lived in a world
without personal computers.
Technology has defined
their lives and ours.
The impact of machine
learning and technology
is dramatically transforming
our lives across many spheres,
but importantly, never more than
in the practice of medicine.
So how reliable are
computers in making decisions
about our health?
Looking into the future, what
are the many possibilities?
How can our ability to rapidly
analyze vast amounts of data
offer clinical tools to
diagnose disease, identify
best treatment options, and
predict outcomes for patients?
It has been said that
our intelligence is
what makes us human, and
AI extends our humanity.
We're going to find out
more about that tonight.
Tonight we'll learn
more about the symbiosis
of human and
machine intelligence
from our expert Harvard faculty.
Tonight we have with us
Brett Beaulieu-Jones,
a research fellow in biomedical
informatics at Harvard Medical
School.
Katherine Liao is an
associate professor
of medicine and assistant
professor of bioinformatics
at Harvard Medical School,
associate physician, Division
of Rheumatology, Immunology,
and Allergy at Brigham
and Women's Hospital,
and director of the applied
bioinformatics core
at the VA Boston Health Care
System.
But we'll begin with
our moderator and one
of the world's foremost experts
on all things AI, Zak Kohane,
who is the Marion V.
Nelson Professor and Chair
of the Department of Biomedical
Informatics at Harvard Medical
School.
Please join me in welcoming
our expert faculty.
Thank you.
[APPLAUSE]
Thank you, Gina.
And I'm very excited to see
how many of you showed up
to hear us talk about this.
So we are privileged
to be living in an era
where something
transformational, something
genuinely new has
happened, and it's happened
in the span of my life.
So when I was an MD-PhD student
getting my PhD in computer
science, artificial
intelligence then
meant we were going to
hand-code, using programming,
the style of diagnosis
and treatment selection
that we saw doctors perform.
What's happened since,
in the last 10 years,
is that we've learned how to
use various computer
science techniques
to let the data itself
directly inform us of
the patterns that are important.
And so just as you
can now automatically
search for cat
pictures on Facebook,
you can automatically classify
pathology images of tumors
and actually say whether it
looks like this kind of cancer
or that kind of cancer
with performance
that is as good and often better
than pathologists in the best
academic health centers.
So that's a very exciting time.
But the topic of my 20 minutes--
and I will try to get it done
before 20 minutes because I'm
looking forward to having this
moderated discussion
with all of you--
what I'm going to
be talking about
is the opportunity for new
medicines, for new treatments.
Because I think in the end,
as patients, what we really
are hoping for
are new treatments
to help us suffer less and to
have the lives we want to have.
So the most obvious
thing to ask would be,
is artificial intelligence
going to transform
the way we develop drugs?
And the answer is it may well.
And so shown here on the
slide is one of my colleagues
formerly from Stanford,
Daphne Koller,
who is a professor
of computer science.
And those of you
who are teachers
should know that
when she was still
a professor of computer
science at Stanford,
she started the
Coursera online course
behemoth that's been very
successful and disruptive
in its own way.
But she's now had several
other careers after that,
and she's now
leading a new startup
called Insitro, which asks the
question-- using a lot of data
out of our health care system
and a lot of data out of animal
studies and chemical
studies, can we actually
come up with new drugs?
And we'll see.
We don't know the
answer to it yet.
And actually, that's not going
to be the point of my talk
because maybe this
process will succeed,
but I can tell you that our
experience as a community
is that drug development
is really, really hard,
and often things that
make a lot of sense
end up not working
in the clinic.
But this may in fact
work, and we'll see.
But that's not what I'm
here to talk to you about.
I'm here to talk to you about
something quite different.
And as always, in 2019, it's
better to start with a story
than with a bunch of numbers.
Here's a story.
It's a six-year-old
child who was doing fine.
And then he was no longer
walking and no longer talking.
He had been walking and
talking, and then he stops.
He saw many doctors.
No answer.
And so he was
referred to a network
that I have the privilege of being
part of, the Undiagnosed
Disease Network, where we take
patients who are undiagnosed,
we do whole genome
sequencing on them.
We look at every single one
of the three billion letters
in their genome,
figure out what's
different from
reference human beings,
and then refer this
patient to the right expert
throughout the United States.
Shown here are only
seven academic centers;
the network currently includes
12 academic health centers.
And through this network,
we referred this patient,
we did the analysis,
and we found
that this patient had a mutation
in a gene that has an almost
unpronounceable name--
GTP cyclohydrolase 1.
I had never heard of it
until I saw this case.
But what does this gene do?
It takes a bunch of
chemicals and turns them
into neurotransmitters.
The chemicals allow your
neurons to talk to one another
and make your brain work.
And because this enzyme is
deficient and is not
making enough neurotransmitters
from the pre-existing chemicals
in the brain, this child
was really losing milestones.
Not only not
progressing-- losing.
And what's amazing is once
we knew what the cause was,
we could just give
this child a bunch
of compounds that get
easily transformed
into these neurotransmitters
like L-DOPA, folinic acid,
and 5-hydroxytryptophan.
And what's so amazing
is that within months
of starting this therapy,
which is just things to eat,
this child started
walking and talking again.
That's amazing to me.
And let's think about
what really happened here.
We combed through billions
of bases, went through thousands--
what am I talking about?
Millions of records of what
diseases are associated
with which mutation, something
that no matter how ambitious
you are in medical school, you
will never be able to learn.
Sometimes it's hard to get us doctors
to be appropriately humble.
But the point is,
this allowed us
to zoom in onto that mutation
and treat this child.
There's a couple of other
interesting things that I
found, which is that we
published an article in the New
England Journal of Medicine
about our network, Undiagnosed
Disease Network,
and it turns out
that a third of the
patients already
came in having had their
genome sequenced.
So it's not the data.
It's what you do with it.
And having the right
programs to analyze them
is the augmented intelligence,
the artificial intelligence
that will help us
be better doctors.
So that's one view of how
artificial intelligence will
allow us to create
new treatments simply
by identifying what's wrong
by sifting through millions
of facts and saying, that's
what's wrong with this patient,
and that will make clear
what the treatment should be.
But there are other things that
can be done for new treatments.
It's important to say
for those of you who
are with me in Boston, as
the sun is finally coming out
after this long winter,
we're going to be out
and showing a lot of
skin, which we probably
shouldn't be doing because
it actually allows the sun
to damage our skin and cause
what's becoming a growing
problem of melanoma, a skin
cancer that can be deadly
if you don't catch it.
But it turns out the same
artificial intelligence
techniques that I described
before, that allow
you to find the cat in a
huge pile of images, can also
be used to look at moles or
spots on your skin and say,
that's not a mole, that's a
melanoma-- that's not a birth
spot, that's a melanoma.
And why is that important?
Because a scientist at
Stanford showed that,
using just your smartphone,
whether it's an
Android or an iPhone,
you can take a
picture of these spots
and then immediately
have a diagnosis
of whether this is something
that you need to get taken out.
And guess what?
First, if you take it
out while it's still
superficial, the clinical course
is much different
than if you let it stay.
And on average, people who have
been diagnosed with melanoma
have known about this
spot at least a year.
But it takes time to
be seen by a doctor,
even those of us
who are doctors ourselves
have a tough time getting seen
by doctors in a timely way.
So think about the difference
it makes for so-called secondary
prevention, which is--
primary prevention
would be sunblock
to prevent the cancer from
happening in the first place.
Secondary prevention
is identifying the mole
as being malignant and therefore
should be removed early
before it becomes metastatic.
So there again,
just by using this,
we're jump-starting the way
that AI can not only augment
doctors--
and I want to point out
to you a theme that
will be familiar to those
of you who have smartphones:
it makes you, the patient,
part of the solution.
Because waiting for
doctors to diagnose
us is probably the wrong move.
Doctors are overtaxed
in time and bureaucracy,
and they're thinking about
many, many things.
But you are thinking about
yourself, hopefully, more
than they are.
And so if we give you the
tools so that you can actually
decide in a much
more acute way-- I've
got to see a doctor now
because this thing says
I potentially have cancer--
then we're actually
making a new treatment.
I'm going to start wrapping
up by telling you a story.
It's a lot of words here.
Don't forget--
don't feel like you
have to read the words because
I'll tell you the story.
This is a story of a
friend of mine who--
well, the son of
a friend of mine,
who's actually a professor
here at Harvard Medical School.
His child was diagnosed
at age 3 and 10 months,
almost four years of age,
with something called colitis.
This is inflammation
of your gut.
And you determine that by
putting a tube up the rectum,
looking around, and seeing
inflamed tissue.
You take a piece of the
tissue lining your colon,
you look at it
under a microscope,
and say, wow, that looks
like inflammation.
That is inflammatory
bowel disease.
And there's two types of
inflammatory bowel disease,
Crohn's disease and
ulcerative colitis.
And I will spare you the
details out of interest of time,
but I can tell you
that this child did
great on very mild
anti-inflammatory agents for 10
years until puberty.
And then in puberty,
as often happens
with these kids, the
disease flared up.
And this child, who was
doing fine until that point,
started pooping every hour.
And when you poop every
hour, you're not sleeping.
Therefore, you're
not going to school.
And so my friend's kid was
just no longer going to school,
lying in bed, no energy,
pooping every hour, in pain.
And we tried every
medication-- and here
we are in the middle
of the best academic health
center.
Forgive me for those of you who
are at other academic health
centers.
But potentially the best
academic health center,
and nothing worked.
Not steroids.
Not the antibiotics.
Not the first-generation
monoclonal antibodies.
Not the second-generation
monoclonal antibodies.
No expense spared.
Nothing worked.
And everybody was
pushing him and his wife
to go for something
which was reasonable,
which is to have their son's
colon removed, a so-called colectomy.
Now, for those of you
who are as old as I am,
you might not remember how
bad it was to be a teenager,
but let me remind you.
It's tough to be a teenager.
And to be 14 years old and
then have surgery and then
have a bag with
stool in it at least
even for a few months is really,
really not a great thing.
And even after you
remove the colon,
sometimes there's a little
bit of inflammation left,
so you still need
to be on the drugs.
So it's not an ideal situation.
So they kept pushing it off.
But eventually,
everybody convinced them
that the surgery had to be done.
So they're five weeks
away from surgery.
And so my friend asked me-- Zak--
so my name is Isaac Kohane,
but my nickname is Zak.
He said, Zak, what
about a crazy analysis
that your graduate students
showed me the other day?
And what it was--
and these are-- I'm showing
the pictures of the students
and postdocs who did it,
none of which have an MD.
And that's very important.
All have PhDs in
computer science.
These individuals--
we had taken a bunch of
samples from patients,
and we'd measured
which genes were
up or down in these patients who
presented with bowel problems.
And what we found was that
there was one subgroup that
ended up being healthy.
And we show them here in red.
And then there was
another subgroup
that had ended up having
inflammatory bowel
disease, shown here by
the blue and green dots.
So the point is, just by looking
at which genes were up or down,
we could tell that they
had inflammatory bowel
disease without looking
under the microscope
as regular doctors had to do.
That's not the interesting part.
Here's the interesting
and somewhat crazy
thing we did that my
friend had asked me about.
We said, what if we divide
this patient population in two
and ask ourselves,
which drugs can
push the genes to make them
much more like the healthy kids?
In other words,
the genes that are
high in the gut of
these unhealthy kids,
can we make them go down?
And the genes that are
down, can we push them up?
And so we went through
a large database
of drugs that are
known to affect genes,
and we were able to show, sure
enough, that the drugs that are
known to work for inflammatory
bowel disease--
like azathioprine--
do seem to push these kids
who are sick towards healthy.
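The kind of analysis the students ran can be sketched in miniature. This is a toy illustration only: the gene names, drug profiles, and scoring rule below are all invented, and real pipelines query large drug-gene perturbation databases rather than hand-written dictionaries.

```python
# Toy sketch of signature reversal: score each drug by how well it
# pushes a disease gene signature back toward healthy.
# All gene names, drug profiles, and numbers here are invented.

# Disease signature: +1 means the gene is abnormally high in sick kids,
# -1 means abnormally low (relative to the healthy group).
disease_signature = {"TNF": +1, "IL6": +1, "MUC2": -1, "TFF3": -1}

# Hypothetical drug profiles: the direction each drug pushes each gene.
drug_profiles = {
    "drug_A": {"TNF": -1, "IL6": -1, "MUC2": +1, "TFF3": 0},
    "drug_B": {"TNF": +1, "IL6": 0, "MUC2": -1, "TFF3": -1},
}

def reversal_score(signature, profile):
    """Higher score = drug pushes more genes opposite to the disease."""
    return sum(-signature[g] * profile.get(g, 0) for g in signature)

scores = {d: reversal_score(disease_signature, p)
          for d, p in drug_profiles.items()}
best = max(scores, key=scores.get)
print(best, scores)  # drug_A reverses 3 of the 4 signature genes
```

In this made-up example, the drug whose effects most consistently oppose the disease signature rises to the top, which is the same intuition behind ranking indirubin first.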
But that was just an
experiment, a talk that we gave.
But he, my friend, asked
me to do this for his kid.
So we had a biopsy from when
his gut flared up,
and we did this analysis.
And then these
postdocs and students
did the analysis I described,
and they came to me
and they said, Zak, the top drug
that works best for this kid
is indirubin.
I said, indirubin?
What the heck is that?
I never learned about
that in medical school.
So I did what you should do
and what I tell students to do,
which is use Google.
And so I looked it
up, and it turns out
indirubin is part of a purple
thing called indigo, which
is made by bacteria that,
when they chew through things
in your gut--
food, for example-- make
this purple byproduct
that's available as a
supplement in stores.
And forgive me,
those of you who are
Chinese speakers,
because I'm going
to massacre the pronunciation.
It's also known in
Chinese as Qing Dai.
And so then I did
the next thing that I
tell medical
students to do, which
is to look up whether there have
been any studies using this drug,
Qing Dai or indigo, for
ulcerative colitis.
But I warned them that you can
always find in some journal
some good effect for
some supplement, so not
to put a lot of weight on it.
So sure enough, we found
a journal that's in China.
And this is--
forgive me if you've
published in this journal.
It's a third-tier journal.
And they had found that there
was a good response to therapy
in these kids, in these
individuals with Qing Dai.
So I called my friend, and I
thought that when I said indigo,
he was going to say
the same thing as I
did-- what the heck is indigo?
Instead, he said Zak, that's
really interesting, because he
had been asking around the world
about what to do with his kid,
and there was a group
in Israel that, in addition
to the standard
Western medicine,
was giving indigo
as a supplement
to every single patient.
But he had dismissed it.
Why was he going to give
a supplement to his kid?
He's a Harvard-trained doctor.
He's not going to
believe in supplements.
But he said, maybe
we should actually
try it now that your
analysis suggests that.
And so I said, OK, let's do it.
He says, how do we
get good indigo?
Because if you
don't know already,
any supplement, depending
on where you get it,
can be anywhere from 100% that
compound to 0% that compound.
So I said, just get the Israeli
clinic to FedEx it to you.
So he did it.
And the amazing
thing that happened
is that within two weeks,
this child who
had been pooping
every hour went down
to pooping three or
four times a day.
And that was three years ago.
Still no colectomy.
He's doing great.
If we had not done this,
he would be minus a colon
and God knows what else.
And I want to point out,
this is not a party trick
that any doc could do.
It was three graduate students
using these AI techniques,
combing through these
large databases of drugs
affecting genes that actually
came up with this result.
And so when I tell-- this is
part of a longer story, which
I won't bore you with, where
I talk about whether or not
people need an MD degree
to advance medical science.
But the punchline is-- no.
[LAUGHTER]
Speaking about
treatments, I just
want to say that, just
in case you're a surgeon,
you should not feel too
self-assured that you're not
going to be dealt out of the
game as well, or at least not
have a useful assistant.
There are now already
some studies--
this is, again,
just in pigs-- showing that
suturing done on the
gut of these pigs,
using artificial intelligence
to identify where the gap is
in the gut and sew it,
can, as you'd expect,
be much more even
in the spacing
between the stitches
and also have much
tighter seals.
This is basically
pushing water through
and seeing how much it leaks.
It does much, much better.
And you know what?
We've only started.
This is only going
to get better.
And so even without
developing new drugs,
with AI, we're going
to be able to find
the right diagnosis for you.
We're going to be able to find
which of our existing drugs
is the right drug for you.
We're going to be
able to improve
the performance of
doctors, like surgeons--
and for many other tasks
that doctors do,
we can make them better.
We can make them be the
best doctor they can be.
And with that,
thank you very much.
We go on to our next poll.
[APPLAUSE]
Good evening.
I'm Brett Beaulieu-Jones.
I'm actually a postdoc
in Zak's group,
so it's a little bit
strange to have your boss
and your mentor open for you.
[LAUGHTER]
Totally appropriate.
So I get to play a little
bit of the bad cop.
But first, I want to
start out by saying
I truly believe in the
potential for AI for medicine.
I want to echo all the
sentiments that Zak laid forth.
We will be able to
figure out what's
working in medicine,
what's not working,
find things where we're
missing treatments
and need better treatments.
And there are patients who
are being poorly treated now.
As well as areas where
we're wasting resources,
we're spending money on
ineffective treatments,
among a huge number
of other things.
And then identifying
patients who
are the best fit for specific
drugs and many other questions.
In some of my work, we did some
deep learning on ALS patients.
And so this was across 23
different clinical trials done
all over the world, so with a
wide variety of different data
sets, different data
elements collected.
And in this, we were able
to consistently identify
a cluster at the top, where
the darkest red indicates
the people who had
the shortest survival.
This cluster was clinically
interesting to some
of our collaborators,
and they're now
continuing to look for
patients among this cluster.
So I do want to start by
saying I truly believe
in AI and in some of
the things that it
can do before diving into one
of the key issues with it.
So there's all of
this promise, but we
do have to remember that it
is driven by historical data.
It's driven by the
current practices.
Machine learning learns from
the actions of people today.
It's the things that
have happened over years.
And so if we are learning
from people who are biased
or systems that are biased,
the machine learning model
is not going to be
able to magically
get rid of those biases.
It may even have the ability
to exacerbate these biases,
because if we are now taking
something that currently
exists, predicting
it into the future,
and making decisions
based off of it,
we may just continue to deviate
further and further from what
is right.
So as an example of
this, to lay this out,
we have two groups
of people here.
There are green people
and there are blue people.
And they happen to smoke a lot.
For whatever reason,
they're still smoking.
Because of this, they develop
lung cancer, and many of them
develop lung cancer.
Unfortunately for
the green people,
money is the same
color as they are, and they
have trouble seeing it and
they drop it on the ground.
Blue people are able to
hold onto their money,
and because of this are
much richer on average.
So because of this,
they're able to afford
a new treatment that works well
and can actually treat them.
And when we do this, and if we
train a model on this scenario,
the question is, what
is the model learning?
And one thing that
it might learn
is that green people
can't actually
receive this treatment.
It will see that because
they can't afford it,
they never actually
receive the treatment.
And this will mean
that it will never
recommend the treatment
for green people,
and it will never know
whether it works or not.
And it will create this cycle
where we won't actually know
the answer to that question.
If we want to get a little
bit more realistic here
and take a population
of people where there
are some green people
who have better
eyes and can see their
money and hold onto it,
and they all receive a drug that
works in about 20% of people--
not all of them.
But 75 blue people
receive the drug,
and three green people
receive the drug,
and it works in
about 20% of people.
There's still a greater
than 50% chance
that it never works in this
population of green people.
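That "greater than 50% chance" figure is just a quick probability check, which a few lines make concrete:

```python
# If a drug works in 20% of people and only 3 green people receive it,
# what is the chance we never observe a single success?
p_success = 0.20
n_green = 3
p_no_success = (1 - p_success) ** n_green
print(round(p_no_success, 3))  # 0.512 -> greater than 50%

# With 75 blue people, a run of zero successes is vanishingly rare.
n_blue = 75
print((1 - p_success) ** n_blue)  # ~5e-8
```

So with three patients, the model is more likely than not to see zero successes and conclude, wrongly, that the drug fails in that group.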
So under this
situation, we might
learn something even worse.
The model might
learn that the drug
doesn't work in green people.
We might be biased by the
small sample, where the machine
learning model is never seeing a
successful case because there's
such a small sample of
people who are actually
receiving the drug.
And this could be even worse
than never recommending it
because it might say that
it's a bad recommendation.
So the question is whether
this is a realistic situation.
It's a toy example
that we put together
to illustrate this point.
And we know that
people aren't green
and people don't
carry cash anymore.
But if we start to
look at the real world
and some actual cases,
we can see differences
among things such as insurance.
Insurance can be the gateway
to receiving treatment.
It can give you-- it
can really lay out
what options you can have.
It can lead to disparity
of health care.
It will determine what things
are realistic treatment
options for you.
A couple of the key things that
I'd like to point out here,
first of all, is that
among the Medicaid
and self-pay populations,
in 200 million
inpatient admissions,
people who self-identified
as black were twice as
likely to have Medicaid
or self-insurance,
self-insurance meaning
they don't have insurance.
They're paying
for it themselves.
This is one example within
these two categories,
but we can't in this database
even look at other racial
groups, because in some areas
of the country,
the numbers are so
low that if you
look at that group,
it risks privacy
for the individuals.
There's a risk that you could
actually re-identify people
within that population.
So there are a lot of
groups in a data set
as big as this that we may
not even be able to study.
So what does this translate to?
One shocking statistic
is something that
the CDC put together
between 1987 and 2014, which
showed that black women had
mortality during pregnancy
at more than three
times the rate of white women.
And when we take
this into research
and start to look at
other areas, and at
the different things that
are going to be training these
artificial intelligence models,
one example is
genetic studies.
And there's two main takeaways
I want to make from this figure
that I know can be a
little bit hard to see.
But the first is
that the European
population represents about 80%
of the genetic tests that have
been performed, associated,
and indexed for
researchers to work with.
And if we look at potentially
the most interesting
genetic group,
the African group,
because of the long history
in Africa and the way
that different migration
patterns happened,
it only represents 2% of
the genetic tests that
are available for researchers.
Similarly, if we look at
clinical trial participation
by race, the US FDA reports
that 86% of clinical trial
participants are white.
So what does this tell us?
It tells us that we
have a pretty good idea
of whether things are working or
not among the white population.
And among other populations,
we have much smaller sample
counts.
So all of a sudden, that group
of three green people receiving
a drug becomes a lot
more realistic, as we have
these smaller sample
counts where we may not
be able to tell whether a
drug is working or not
among that population.
What does this lead
to in the real world?
Here's one example.
So the government of New Zealand
put in place a computer vision
algorithm to recognize
people's faces
to determine whether their
pictures were adequate quality
for passport photos.
This man uploaded a photo to
it and got a message saying
that his eyes were closed.
So if this was you, how
does it make you feel?
And this is the
case where, likely--
it's New Zealand.
Again, there's probably a bias
in the training population
of the algorithm,
and it just doesn't
work for this particular case.
Again another example
is an algorithm
that was developed
by a private company
to predict the risk of
recidivism, the risk
that a criminal would re-offend
and commit another crime
after leaving jail.
If we look at this, it sounds
like a really noble goal.
We know that humans are biased.
We know that judges are biased.
We know that there's different
people in different places.
And so maybe we can take
it all, turn it into math,
use data to power
our decisions, and we
can take out the human element.
It sounds like an
incredibly noble goal.
But when we look
at the algorithm,
we start to notice some
interesting trends.
Among the people who
do not re-offend,
if we look at the
predicted risk,
we find that these are all
people who did not re-offend,
and black defendants were
given a risk score of double
what white defendants were.
If we look at this
from another angle
and take the group
that were deemed
to be low risk of
re-offending, black defendants,
again, re-offended at about half
the rate of white defendants
in the same risk group.
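The comparison being made here, error rates computed separately within each group rather than a single overall accuracy, is easy to run once you have the counts. The numbers below are invented for illustration, not the real recidivism data:

```python
# Sketch: compare false positive rates across groups.
# A "false positive" here is someone labeled high risk who did NOT re-offend.
# All counts below are invented for illustration only.

groups = {
    # (labeled_high_and_reoffended, labeled_high_no_reoffend,
    #  labeled_low_and_reoffended,  labeled_low_no_reoffend)
    "group_1": (300, 400, 100, 200),
    "group_2": (300, 200, 100, 400),
}

def false_positive_rate(hi_re, hi_no, lo_re, lo_no):
    # Among people who did not re-offend, how many were labeled high risk?
    return hi_no / (hi_no + lo_no)

for name, counts in groups.items():
    print(name, round(false_positive_rate(*counts), 2))
```

With these made-up counts, one group's false positive rate is double the other's, even though the overall number labeled high risk could look reasonable in aggregate. That is why fairness audits slice error rates by group.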
So what can be done?
So we need to start
to think about,
how can we fix some
of these problems?
How can we recognize
bias and work to
eliminate the issues?
And so the easiest
solution would be,
let's remove race
from the classifier.
Let's not pass race
in as a variable.
This is something that sounds
like a very easy solution
to this question.
This was something
that has been tried.
A famous example of
this is Amazon, which
had an algorithm to
score job applicants.
And as they were using this, one
of the things that they noticed
is it consistently ranked
male applicants higher
than female applicants.
So their answer to that was,
let's get rid of genders
from being passed in as inputs.
And what they then found
was that all of a sudden,
the algorithm was ranking
applicants who used words
such as "executed" and "performed"
in their CVs or resumes higher.
And when you look
at it, those terms
were used much more
frequently by men than women.
And so it was essentially
getting around the fact
that you were no longer passing
in gender, and learning it
a different way.
And a lot of this was built
up because, obviously, there
are gender inequality
issues in the tech industry.
And if you're training it on
historical data where there
are more men than
women, you continue
to see this pattern
over and over again.
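The failure mode in the Amazon story, where a dropped attribute leaks back in through correlated features, shows up even in a tiny made-up model. Everything here (the hiring history, the word usage, the scoring rule) is invented:

```python
# Toy illustration of a proxy variable: gender is never given to the
# scorer, but a word frequency that correlates with gender leaks it in.
# All data and the scoring rule are invented.

# (uses_word_executed, was_hired_historically) for past applicants.
# In this made-up history, the word correlates with the applicants
# who were historically hired more often.
history = [
    (1, 1), (1, 1), (1, 1), (1, 0),   # word users: mostly hired
    (0, 0), (0, 0), (0, 0), (0, 1),   # non-users: mostly not hired
]

# "Train" the simplest possible model: hire rate conditioned on the word.
def hire_rate(word_value):
    rows = [hired for word, hired in history if word == word_value]
    return sum(rows) / len(rows)

score_with_word = hire_rate(1)     # 0.75
score_without_word = hire_rate(0)  # 0.25

# The model never saw gender, yet it reproduces the historical skew
# through the correlated word.
print(score_with_word, score_without_word)
```

Dropping the protected attribute changes nothing here: as long as some observed feature correlates with it, the model can reconstruct the same ranking.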
So where do we start?
We have to think
about AI and machine
learning starting from
how we frame the problem.
We have to think
about it like, if we
are talking to a salesperson
and giving them a task,
and they have two groups of
people they could possibly
sell to, and we tell them
that if they sell to one group
they're going to double
the commission of selling
to the other group, what's
that salesperson going to do?
They're going to immediately
sell to the group
where they get
double the commission
and fully optimize to that.
They'll completely ignore
the other group, no matter
how important it is
to your business.
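The salesperson analogy maps directly onto the objective a learning system optimizes. A minimal sketch, with invented commission numbers:

```python
# Toy reward maximizer: with a doubled "commission" for one group,
# a greedy policy allocates all effort there and ignores the other
# group entirely, no matter how much it matters in reality.
# The numbers are invented for illustration.

reward_per_sale = {"group_A": 2.0, "group_B": 1.0}

def greedy_allocation(budget_calls):
    """Spend every call on whichever group pays the most per sale."""
    best = max(reward_per_sale, key=reward_per_sale.get)
    return {g: (budget_calls if g == best else 0) for g in reward_per_sale}

print(greedy_allocation(100))  # {'group_A': 100, 'group_B': 0}
```

Nothing in the objective tells the optimizer that group_B matters, so it spends zero effort there, which is exactly the behavior the analogy warns about.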
And we have to think
about AI algorithms
as if they are that salesperson.
They're going to solve the task
that you put in front of them.
Unfortunately, it can be
really hard to define that task
as a holistic, wide-ranging
view of things
where you're considering
all the other possibilities.
In this case, it could be
trying to eliminate bias.
It can be really hard to
mathematically frame bias.
Another thing we
need to look at
is ensuring that
the population that something
is being used on actually
matches the training
population.
So this is the example of the
New Zealand passport image.
But if we are
looking at a training
population and a
real population here,
and we say that these
are two distributions,
and these actual graphs don't
mean anything other than to say
they're different groups--
And if we train on this red group, and then we see a person from the real population who is otherwise very average-- right in the middle of the actual population-- would we really expect the algorithm to work?
Would we expect
the model to work?
And so this starts
at the basis of,
where are we getting
the training data from?
And so I'd like to bring this back around after telling you all of this-- and I don't mean to fear-monger, because I do think AI can actually help with a lot of this stuff.
So one of the things we can do, because we can now look at this, is mathematically model bias in these systems.
We can say, what happens if we
change the gender of someone?
What if we change
the race of somebody?
What if we change
different factors
and we look at the
output of a model
to see what is actually driving
the AI, the machine learning
model's decision?
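That probe can be written down directly. The following is a minimal sketch, with invented data and feature names: train a toy model on deliberately biased historical labels, then flip the sensitive attribute for the same individuals and measure how far the predictions move.

```python
# Counterfactual bias probe on a toy model (all data simulated).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000
gender = rng.integers(0, 2, n)             # stand-in sensitive attribute
skill = rng.normal(size=n)                 # stand-in legitimate feature
# Historical labels partly driven by gender -- the bias we want to detect.
y = (skill + 0.8 * gender + rng.normal(scale=0.5, size=n) > 0.5).astype(int)
X = np.column_stack([gender, skill])
model = LogisticRegression().fit(X, y)

# The probe: identical people, gender flipped.
X_flip = X.copy()
X_flip[:, 0] = 1 - X_flip[:, 0]
shift = model.predict_proba(X_flip)[:, 1] - model.predict_proba(X)[:, 1]
print("mean |prediction shift| when gender flips:", np.abs(shift).mean())
```

A large average shift tells you the sensitive attribute, not the legitimate feature, is driving the model's decisions.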
The other thing is that eliminating bias is going to require a much more inclusive scientific and medical community.
It's going to
require that we make
sure that the studies
that we do are
achieving a more diverse group.
And this is something
that is very easy
to criticize but in
practice can be very
hard, because scientists are
looking for the smallest sample
size that they can get to
determine whether an effect is
real or not.
And the best way to
do that is to get
people who are very
similar to each other,
because then you're
measuring one effect.
You don't have other
potential effects going on.
And so I see the need
to counter biases
as potentially a tool
for us all to argue
for more inclusive, larger
studies where we can
look at some of these factors.
And so with that, I would like to thank you all for coming.
I do want to say--
[APPLAUSE]
Really quickly,
there are two things
that I think, as a researcher,
you can really appreciate.
And the first is
that we would hope
to actually build
something or come
to some conclusion that actually
has an impact in a patient's
life.
And the other is
that people actually
care about what you do.
So something like
this truly does
mean a lot coming from
this side, so thank you.
[APPLAUSE]
Slides going to switch.
Just waiting for the
slides to come on.
Well, good evening everyone.
My name is Kat Liao.
I'm actually a rheumatologist
at Brigham and Women's Hospital.
And I actually see patients,
but I also, almost a--
over a decade ago
started working with Zak.
And since then, we've
been doing a lot of work
on clinical applications of AI.
So I might be taking
a slightly deeper dive
into the nuts and bolts of what
we're doing in these research
projects.
So hopefully I'll
keep you all awake.
So let's see.
So I'd actually like to
start with a cab drive story.
So I called a cab because I
needed a ride to South Station
last month.
And I got in the cab,
and I got a chatty cabby.
He says, what do you do?
And I said, well, I'm a
doctor, and I also do research.
And he said, well, you know, actually, I just didn't have a great experience with one of the hospitals in Boston.
And so what happened is
he had a recent cancer
diagnosis made on biopsy.
And in the first
hospital, he was
told he had a pretty severe
high-grade cancer on biopsy
when they looked at his cells.
And he, like everyone,
rightfully so,
went to another hospital
and got a second opinion.
And there they said,
you have moderate-grade.
You definitely have a
cancer, but you may only
need six weeks of
chemotherapy and not the 12
weeks of chemotherapy
and radiation
that was recommended
by the first hospital.
And so he actually went back
to both institutions and said,
hey, there is this
difference of opinion.
And so the pathologists,
the doctors
that review the slides from
the biopsy, re-reviewed it.
They actually had somebody
else review the slides,
and they came to the same
difference in opinion.
And he asked me, how
could this happen?
How could something
like this happen?
In my head, I was thinking, it
actually happens all the time.
And that's because, as many
of you are probably aware,
there's a lot of gray
areas in clinical medicine.
And so what I'm showing you
here is a complete cartoon,
but of cells.
This is a normal cell, and
this would be an abnormal cell
that you would see
in high-grade cancer.
But oftentimes, people
have a lot of things
in between-- gray area.
So you might say this is normal.
This is mildly abnormal,
moderately abnormal,
and highly abnormal.
And I don't know
exactly what happened.
I didn't get involved
in that case.
But I could see how he could
have a difference in opinion
because things like this
happen all the time.
So let's say the cab driver,
he had a biopsy done,
they looked at the cells, and it
was 50/50, right in the middle.
So those physicians,
those pathologists,
have to pick one or the other.
And that has to do with
practice or opinion
when you don't
have a lot of data.
And in fact, in many
situations, in this gray zone,
there is no right answer.
The reason there's
a gray zone is
because we don't know
what the best answer is.
But from this story, you
can tell the implications
for this patient are very
different based on how
the data were interpreted.
So one hospital said,
you need 12 weeks
of chemotherapy and
radiation, and the other said,
you need 6 weeks.
And he said, 12 weeks would
put me out of the job.
I'd have such a hard time.
It would really just affect
my life in such a big way,
and I can't believe it
can be so different.
And so ultimately, the cab
driver did undergo treatment
at hospital two.
He had chemotherapy
for six weeks.
He was doing very well.
But in reality, we
actually need more time
to know if this was
actually adequate therapy.
So I want you to hold
this story in your mind,
and this theme will come up
again, themes from this story,
when we talk about how
we might be applying
AI in clinical medicine.
And so why AI for clinical medicine? Suffice it to say, it's a very exciting time.
You heard from Zak and Brett
about all these technologies
that are changing.
For me as a physician, I started
training with paper charts.
So a classic case of a
72-year-old man comes
into the hospital
with his daughter,
and his daughter's
like, I think--
he's confused.
He can't tell us anything.
And the daughter
says, I think he
might have had a
stroke three years ago
and was admitted
at this hospital.
So what that meant when I was an intern was that I go down to the basement.
I request the charts.
I get a stack this high.
And I'm trying to flip
through it to find out
where in this past three to five
years was he admitted and why.
And so as you can tell,
that's very labor intensive.
Just for one patient, it's very
hard to recreate that history
and synthesize the data.
Then, if you take
it a step further,
on the research
side, when you're
trying to learn about
relationships between diseases
or how a treatment
may impact an outcome
or may be good to
prevent stroke,
you have to do
these chart reviews
for thousands of patients.
And in fact, before
now, we literally
had teams of people reviewing
stacks and stacks of paper
charts to figure out who had
a stroke, who had high blood
pressure, who is on what drug to
figure out these relationships.
Now, with electronic
health data,
I might say that we
almost have too much data.
We're drowning in a data deluge where we actually can't find the information we need.
The good thing is it's
in there somewhere.
And obviously, this
is why EHRs are here.
It's the opportunity to improve
the efficiency of health care.
But as physicians, now when someone comes into the hospital, someone says, it's all on the computer, and I say, I know, but I can't find it.
And so our goal now is, how
do we get this information out
of there?
And particularly for medicine,
when we think about research,
there's a lot of information
for us to understand, again,
the relationship
between diseases.
What treatments are effective?
And it really has enabled us
to do these large population
studies and change the way
and the types of questions
we can ask.
But before we can do that,
we have to figure out
who has what disease.
And so Brett and Zak both went
through some applications of AI
in medicine.
And what I'm going to focus on is the one I think we as physicians think about the most: how can AI help us make the diagnosis-- assist in making the diagnosis, or actually predict that someone is going to get the disease?
And what I want to hammer home
is that before we can do that,
we have to figure out,
in all these data,
how do we define who
has what disease?
And I see the research studies--
this is the realm where I
live--
as a first step.
And in fact, the clinical
Electronic Health Record data
has enabled us to try
to ask this question.
You don't want to test
AI on the patient.
You don't want to be the test subject in the clinic to see if AI is working.
But the clinical EHR
data gets you as close
as you can get to the patient
without actually testing it
on the patient or
ourselves, and that's
because this is all the
data that's generated
as part of clinical care.
And so this phenotyping, or
knowing who has what disease,
is really the foundation
for useful applications
in making the diagnosis
as well as all the studies
we do asking about--
does a treatment work?
What are the side effects?
What kinds of-- does smoking
increase risk of lung cancer?
Which we know it does.
So why is making the
diagnosis so hard to do,
and why is it so
hard to teach AI?
So phenotypes are
actually a spectrum.
So phenotypes themselves
are measurable attributes.
And so they can be
physical characteristics,
such as eye color.
Or it can be certain
diseases, such as stroke
and rheumatoid arthritis.
So for stroke, someone can have
a small blockage of an artery
and have damage of a few brain
cells, have a facial droop,
get to the hospital in time, get
treatment, completely recover.
That's a stroke.
Another patient with
a stroke is someone
who had a blockage
of a major artery,
massive damage to
the brain cells,
and complete paralysis
on the left side.
That's also a stroke.
So I'm a rheumatologist.
Many of my patients
have a condition
called rheumatoid
arthritis, the most common
inflammatory joint disease.
There is a blood test
that's associated
with rheumatoid arthritis
called rheumatoid factor.
So someone with positive
rheumatoid factor,
two swollen joints,
and about an hour
of morning stiffness,
that's rheumatoid arthritis.
Another case, on
the extreme, you
can have negative blood
tests of rheumatoid factor,
have five swollen
joints, and complete
destruction of the joints.
That's also
rheumatoid arthritis.
So these are-- as you
can tell, the spectrum
comes in many
different combinations
and characteristics.
And it's hard to-- as humans,
I think our intuition-- we
can integrate all these data and
say, this person has a stroke
and this person has RA.
But how do you teach
a machine that?
Do you have to give it all
the different combinations?
It's very hard to explain that.
The other challenge is,
where do you do that cut?
I showed you the
spectrum of the cells,
and you have to make a cut
to say, this is abnormal,
and this is normal.
In every disease, you
have the spectrum,
and somebody has
to decide at what
point that you say
someone has a disease
and needs this treatment versus
they don't have the disease
and perhaps you
don't need treatment.
And so this is where
I wanted to just make
the point that artificial
intelligence is
very different from
human intelligence.
Working with this
kind of technology,
it's very different, and the
goals are very different.
So in medicine
right now, at least
in terms of trying to
understand the diagnoses,
we've been using something
called machine learning.
And I'm sure many
of you probably--
I think they use
this word in ads now.
When I'm driving to work
listening to the radio,
they say, machine learning
for this and that.
This is a technology that we've
been using to try to see--
can this machine learning,
artificial intelligence,
help us to make better
diagnoses and more accurate
diagnoses sooner?
And as Brett and Zak mentioned,
it requires data to train.
So you can't just give it data and say, OK, intuit-- unlike a human, where you can give someone data and say, OK, figure out who has RA.
You have to say who you think
has rheumatoid arthritis
and have it train on that.
And I'm actually going
to go through some
of the gory details of
this in the next slide.
So I'm going to give you a
real scenario that we went
through almost a decade ago--
over a decade ago now.
And that was Zak had--
he was very visionary.
He said, OK, we've got all
these Electronic Health
Records coming on.
There's all this data in there.
We should be using
it for research.
And so he got a
bunch of us together,
clinical researchers
such as myself,
but also bioinformaticians,
biostatisticians,
people working in natural
language processing.
He said, there's all this data.
Now figure out how to
do something with it.
And so at the time, we
had seven million patients
in Electronic Health Records.
And as a researcher, I was
interested to know, who has--
I wanted to study
rheumatoid arthritis,
so the first step was trying to
identify who has the disease.
In the general
population, it's 1%.
So it literally is like looking
for a needle in a haystack.
And so those of you who
have some familiarity
with the medical field,
you're probably saying, well,
why don't you just use the billing codes, because they're called diagnosis codes?
And so what we did is we
started and we randomly
selected 100 patients with
at least one code for RA.
And what we found-- we had
three rheumatologists review
the charts, and we found out
only 19 of the 100 actually
had RA.
So you can't do any study
with this if you're only 20%
correct.
I just want to say,
it's not because people
are miscoding on purpose.
The way billing works is
when someone comes in,
when you go in to
see a physician,
something has to be billed.
You're being ruled out.
You're being assessed for x.
You're being assessed for heart
disease, for RA, for stroke.
It doesn't mean you have it,
but you need that code to say,
this is what you're
being worked up for.
So then we said, OK, well
let's do three codes.
That got us to about 50%.
So it's almost a coin
toss at this point.
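The chart-review numbers just quoted are a positive predictive value (PPV) calculation: the share of code-flagged patients who truly have the disease. As a quick worked example using the figures from the talk:

```python
# PPV from the talk's chart-review numbers: of 100 randomly sampled
# patients flagged by billing codes, how many truly had RA?
def ppv(true_cases: int, flagged: int) -> float:
    return true_cases / flagged

print(ppv(19, 100))   # at least one RA code: 0.19
print(ppv(50, 100))   # roughly three RA codes: 0.5, the "coin toss"
```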
And you imagine,
if you're trying
to do a study understanding
the association
between whether a treatment
is effective and the outcome--
you're trying to understand if
it's effective for preventing,
like let's say a stroke,
and you're only 50% correct,
you'll never see a signal.
The other thing I want to point
out here is in this exercise,
we took 100 random patients,
and what we were doing
is we were slicing and dicing.
We were saying, OK, we
have codes and medications,
and how can you get
some kind of algorithm
or very simple algorithm
that's accurate in defining
the disease?
And this is where things
were over a decade
ago in how we were
defining conditions
for studies in large data sets.
And you're limited to maybe about 5 to 10 variables, because after that, there are too many combinations for you to manage.
So let's talk about how machine
learning might help us here.
And so I'm showing you
one data set first.
This is a very small
data set of data
you can typically pull out of
the Electronic Health Records.
You have an ID, age, gender,
diagnosis code, and a lab.
On the right side
here, I have what
we would call a gold standard.
This is where a physician reviews the charts of these eight patients and says, you have or do not have this disease.
So for this particular group of eight patients, there's only one patient with the disease.
This is not something
that machine learning
can help you with because
there's not enough data.
And as Brett was
mentioning earlier
with the clinical trials
data and the people who
were being included
in the studies,
if you don't have
enough people, you
don't have the
right training sets.
This is a terrible training set.
So let's go to the next one.
So now we have
another training set.
Eight people.
50% have this disease.
And if you look closely, you
might say, OK, most of these
are women.
So this disease is--
let's say this is
rheumatoid arthritis, which
is what I modeled it after.
It's mostly women.
Most people have the diagnosis code, and if this lab-- we'll say it's rheumatoid factor-- is roughly above 30, you have a good chance of this person having the disease.
So we as humans can handle this.
There's literally four
variables on here.
But you are limited in how well
you can define a disease when
you only have four variables.
Now, the beauty of the EHR is that you now have thousands of variables, if not-- depending on what you use-- millions, if you include the genetics.
And so let's say a typical
training set has 200 patients.
So you have 200 rows.
But now you have, on the
columns, 500 to 1,000 columns.
And so even if you had people reviewing the charts-- because the clinical experts can, by reading the notes, say who has what disease, and that's part of the training--
But we can't see the pattern.
There's just too
much data in there.
And this is really
where machine learning
has been very helpful to us.
We just can't process
all that data.
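What the machine can do here that a human reviewer can't is sift hundreds of columns at once. The following is a sketch under made-up data, mirroring the shape just described (roughly 200 labeled patients, hundreds of candidate EHR columns): an L1-penalized model shrinks most coefficients to zero and surfaces the few columns that actually carry signal.

```python
# Sketch: 200 chart-reviewed patients, 500 simulated EHR columns.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n_patients, n_cols = 200, 500
X = rng.normal(size=(n_patients, n_cols))
true_coef = np.zeros(n_cols)
true_coef[:5] = 2.0                      # only 5 columns are truly informative
y = (X @ true_coef + rng.normal(size=n_patients) > 0).astype(int)

# L1 penalty drives most coefficients to exactly zero.
model = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)
kept = int(np.count_nonzero(model.coef_))
print(f"columns kept by the model: {kept} of {n_cols}")
```

A human can juggle four variables; the penalized model quietly discards most of the 500 and keeps a handful, which is the pattern-finding the speaker is describing.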
So I don't have to
spend a lot of time
on this slide, why getting
the phenotypes right
is important,
especially when you're
going to use it in the clinic.
So there's no question
that misdiagnosis in clinic
has just tremendous
impact on the patient.
But misclassification in research is also really detrimental.
So if you don't
get it right, you
don't see the relationships.
Again, I use the
example of stroke.
If you're looking at the
relationship between blood
pressure-- high blood pressure
we know is related to stroke.
But if you can only classify
stroke right 50% of the time,
you're just seeing noise.
You're not going to
see that association.
You're not going
to know that you
need to target blood
pressure to reduce
the risk of future stroke.
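That "just seeing noise" point can be simulated directly. This sketch invents a true blood-pressure-to-stroke association, then replaces the stroke labels with ones that are right only at chance, and the correlation vanishes.

```python
# Simulated illustration: label noise drowns a real association.
import numpy as np

rng = np.random.default_rng(5)
n = 5000
bp = rng.normal(140, 20, n)                                     # blood pressure
stroke = (bp + rng.normal(scale=40, size=n) > 160).astype(int)  # true outcome

clean_r = np.corrcoef(bp, stroke)[0, 1]
noisy = rng.integers(0, 2, n)            # "classified right 50% of the time"
noisy_r = np.corrcoef(bp, noisy)[0, 1]
print(f"correlation with correct labels: {clean_r:.2f}")
print(f"correlation with chance labels:  {noisy_r:.2f}")
```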
And so this need to get either the diagnosis or the phenotype correct is really important, because it's what we call powering the study.
Your study has no power
to see any relationships
if the data are too noisy.
And I know this has already come
up, that the algorithms really
rely on these training sets.
The training sets have to
reflect the population you're
going to be running it on.
And it also relies
on the reviewers.
Those gold standards-- when I
talked about this chart review
here, the machine
is trying to mimic,
is trying to predict what
you tell it to predict.
It's not going to
go beyond that.
There's no intuition there.
So I wanted to share a little
bit of what we learned in terms
of using machine learning
in clinical research
using the Electronic
Health Record data.
So I'm not going to go
into this in detail.
This is probably version
12 of what we've worked on
in trying to start
with the EMR data
and getting to this probability
or this phenotype yes/no.
And what I want to point
to in the center here
is that we found that machine
learning methods have actually
been very useful
and very well suited
to dealing with the
complexity of the EHR data
and helping us to accurately
define the disease.
And that at the center here,
you have the gold standard.
So we still have about--
you start with a set
of 200 to 400 patients
where you pull out hundreds
of variables or columns.
But you review the charts on
these patients and you train.
You have the machine train
on this gold standard
and find the pattern.
Then you take that
mathematical model
developed based on
that pattern and run it
on the EMR of now
millions of patients.
And that's how you get this
yes/no, who has what disease.
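The pipeline just described can be condensed into a few steps. This is a sketch with simulated data and stand-in features, not the group's actual algorithm: fit on a few hundred chart-reviewed "gold standard" patients, score the full cohort, and threshold the probability into yes/no.

```python
# Condensed phenotyping-pipeline sketch (all data simulated).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)

def cohort(n):
    X = rng.normal(size=(n, 20))          # 20 stand-in EHR features
    y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n) > 0).astype(int)
    return X, y

X_gold, y_gold = cohort(300)              # chart-reviewed gold standard
model = LogisticRegression().fit(X_gold, y_gold)

X_all, _ = cohort(100_000)                # stand-in for millions in the EMR
prob = model.predict_proba(X_all)[:, 1]
has_disease = prob >= 0.8                 # threshold set with a human check
print("patients classified as having the disease:", int(has_disease.sum()))
```

The 0.8 cutoff is exactly the kind of thresholding step the speaker later says needs a human check.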
But right now, it's
for researchers only.
And that's because there are a lot of things that we can't study using data.
There are lots of things
going on in the clinic that
are not captured in the
Electronic Health Record data.
So there are some
challenges to translating AI
into the clinical setting.
I know there are many
people working on this now.
We already talked
about the training set.
Who are going to be
the clinical experts?
Who's going to define the gold standard?
And adapting to new diagnoses,
new inputs, and new therapies.
Brett mentioned you're
training these algorithms
at one point in time.
How do you know it's going to
be useful 10 years from now?
How do you reassess it?
When do you retrain it?
And the stakes vary greatly depending on the situation.
Are you using it
for screening, where
you're then going to have--
it's going to be
very sensitive, it's
going to capture anyone
who possibly has a disease,
and then you confirm
it with a physician?
Or is it going to be the
actual diagnostic tool?
And last but not
least, as a clinician,
I think a lot about, how are
we going to use these tools?
Ultimately, the
clinical team is going
to be responsible for the
final diagnosis and treatment.
And when we make
that decision, it's
not based simply on an answer.
It's not like, you
have this disease.
It's-- you have this condition.
Here are the treatments.
But what's all the
other stuff going on?
What are your other
medical issues?
What are your other
social factors?
Can you tolerate this
type of chemotherapy?
So those kinds of more intuitive data-- data that aren't captured in the EHR-- are very important in making decisions about treatment.
And so a theme that has come up, I think, is that the research we're doing on the clinical EHR data may mirror how we might move into the clinical realm.
So what I showed you, this is very much what we call a semi-supervised, semi-automated pipeline where you move through processes.
And I showed you
that machine learning
and artificial intelligence
is at the center.
But what we found,
taking this algorithm,
implementing it in
multiple other institutions
and across 20 to 30 diseases
now, you need a check.
You need a human check.
And each of these stars is an area where things went very wrong because of some blip in the data-- something that the machine's not going to know intuitively is not supposed to be there.
And so each of
these steps is where
we've built in human checks.
And right here, this is a check
to say, where do we threshold?
Where did we say
someone has a disease
or doesn't have the disease?
And I do, I strongly
believe that we're
going to need a
similar paradigm when
this AI comes into the clinic.
And so in summary, I hope I've demonstrated how AI could be a powerful tool to assist us
in clinical medicine, where
it's not necessarily replacing
a lot of the things
we do, but it's
able to do other things such
as integrate large volumes
of information that we
simply can't process.
But it is limited
by the training data
and how good the reviewers are.
But ultimately, this might be a cool new tool, but we shouldn't bring it into the clinic unless it actually improves how we take care of patients-- unless it actually improves care.
And so I believe that
you have to combine
the artificial intelligence
with a human intelligence,
because any diagnosis
and downstream treatment
has large implications
for patients.
And so we still have a
lot of future work ahead
that may need to be actually
tested in the clinic.
Medicine changes over time.
How often should we
be reassessing it?
So I just took my board
exam, which we have
to take every couple of years.
I get reassessed.
I think the machines
need to be reassessed.
And in fact, the algorithm that we developed 10 years ago with Zak in those early studies-- we are reassessing it now to see how well it runs.
It was built on historic data.
Now we have a new EMR.
We've got new treatments.
How well is it working?
And then, ultimately,
the responsibility
is with the human
clinical care team,
and that in this
rapidly changing world,
that team needs to
understand how this AI came
to that decision or that result and how
to integrate it into the care.
So with that, I'd like to
thank you for your time.
[APPLAUSE]
Thank you very much for
those very good [INAUDIBLE]
I'll grab a water, actually.
Think I left my water here.
I'll grab this one.
So this is I think the more
interesting part of the session
today, which is where
we get questions
from the audience,
which you have been kind
enough to forward to me.
If you start getting
bored of the questions
that we've selected, I
will entertain hands up.
First of all, let's get to the most important comment.
This is not mine.
Fashion Police says,
nice shoes Gina.
[LAUGHTER]
But that same card has an easy-to-answer question.
It says, essentially, are there
MRI data sets linked to cancer,
linked to genetics, so we could
do machine learning on those?
And it's an easy answer
because in fact there
is a data set available
courtesy of your tax dollars.
The National Cancer
Institute has
something called the Cancer
Genome Anatomy Project, where
you have MRI images, CT scan
images, and pathology images,
and the genomics both of the
individuals and their tumors
and a variety of
other measurements.
So you can-- there are
whole fields of research
that could be done with that.
I'm going to now start
picking on my colleagues here.
So there is a question for Brett, which is: since one way of looking at race is as a skin color, why not factor that out of the analysis, and wouldn't we be better off?
Yeah.
I think the first
example is going
to be similar to the
example where Amazon
tried to do that with gender.
But the other
thing is skin color
is not necessarily indicative
of [INAUDIBLE] genetics,
but it's highly correlated with them.
And so it can be
a useful feature.
It can be helpful in
actually diagnosing a disease
or picking a treatment without--
especially when you don't
have the genetic test done.
The other side, I
think, is that it
can be a marker in certain
areas for socioeconomic status
and other markers, where
we see the differences
between insurance
and other things
like that that do play
a key role in outcomes.
Thank you.
So we received a question
via YouTube from Ireland,
and also a few
questions that are
very much like this one
from local audience members.
And they basically
are asking, are we
going to be put out of a job,
the diagnostic radiologists,
the pathologists, and
the ophthalmologists
and the dermatologists?
And so let me tell you a
little story first of all.
So I saw an AI program
that was published
with a study describing
it, published
in the Journal of the
American Medical Association.
I called up my cousin,
my first cousin, who's
a very proud ophthalmologist.
I said, ha!
Look, we're going to replace--
what do you think
of this program that
can look at a picture
of an eye and just
diagnose in a few microseconds
whether you have retinopathy
or not?
And he said, fantastic.
This is actually great.
I hate looking at those images.
I'd much rather be in the
operating room doing surgery
and have an AI program do that.
Meanwhile, I'm
seeing more patients,
I'm getting more money,
and I'm having more fun.
So that's one version of it.
I'd say the big picture
is if the doctor is not
seeing the patient at all,
it becomes much easier
to replace them.
So you may or may not know
that if you get an X-ray done
in several hospitals
in the United States,
those x-rays get
interpreted while we're
sleeping during the daytime
in India and Australia
by doctors who
are very competent
but have never seen us.
That kind of expertise
can be completely replaced
by computing.
And so from my
perspective, and I
think it's a growing
understanding, is
we value the human contact,
not just for the warm and fuzzy
part, but because of what
Dr. Liao was talking about,
which is we know how to
weigh not only the diagnosis
but what are the things
they're going to tolerate?
What are the things that
you might want to balance?
By the way, Kat, I'm just impressed by the conversations that you have with the cab drivers.
They never--
[LAUGHTER]
--never want to
talk to me about--
I get the chatty cabbies.
Yeah.
So the short answer is for those
doctors who don't see patients
at all, they're at a much
higher risk of being replaced.
Doctors that see patients
have a lot of value
that will cause them to be
sought after for some time to come.
So there's been
several questions
that I'll direct to you,
Kat, about essentially
looking at these programs as
if they were diagnostic tests.
They talk about things
like false positives
and false negatives.
How do we think of
these programs in terms
of how well they perform?
You already hinted at
this issue by saying
you want to update
the algorithm that we
did for rheumatoid arthritis.
But this is a very
interesting question
because the Food and Drug
Administration, the FDA,
has just approved
two AI programs,
one for the retina and the
other one was for chest x-rays.
Yeah.
And it's already approved.
So the question is, will
it continue to be updated?
And the question I have for you is, how do you think about evaluating them-- what are the performance metrics?
Yeah.
I think, at least to
start, we should probably
evaluate them similarly to how
we evaluate current humans.
And that is with-- and it might not be exactly the same, but-- reassessment of these models over time: adding new inputs and checking them against gold standards, with humans as the gold standard, to see that they continue to meet those benchmarks, as a start.
That's a start.
And medicine's going to
change, so just retesting it
on medicine, on real data,
I think will be part of it.
Yeah.
I know that Kat talked
about the diagnostic codes
being a part of that.
The diagnostic codes
completely changed in 2015.
So that's the type
of example which
will break a current algorithm
that will require retraining.
Many of the, I think,
harder ones to catch
are going to be much
quieter than that.
That was an easy one because
it's something that everybody
sees coming and can adapt to.
So there's an
interesting question
from Gainesville, Florida.
They say, can AI be used
to train doctors, nurses,
and other health care workers?
And here's an interesting thing.
Because of privacy concerns or
appropriate privacy concerns,
we can't share a lot of data.
But I don't know if you've
seen on the internet
these things whereby I can say,
I want to see Kat as a blonde
or I want to see her
in a different dress,
or I can see her rendered
in a certain painter style.
And so these deep learning models can not only recognize images, they can generate them.
So for example, we can generate
millions of broken bone images.
We can generate
millions of skin lesions
that are actually not
anybody's skin lesions
but look exactly like it.
So we can provide a lot
more training materials
that previously have been very,
very limited because of privacy
concerns, and frankly
because some people view them
as their intellectual property.
So Kat?
Yes?
This is an interesting
question, and there
are several questions
on this theme, which
the theme is privacy threats.
Large data sets.
Who's watching them?
For what purpose?
This is one version of it.
Is there any fear that patient info and data shared over the internet can be hacked into and shared with the wrong people, or misused by others, like insurance companies?
Well, I think that's
always a fear.
And so at our
institutions, they really
take this very seriously.
And so their data's
behind firewalls.
They're locked in
these server farms.
So this is taken very seriously.
They do the best they can.
And for research
purposes and for--
so research is one level.
There is a whole set of
rules of how we don't even
use the actual patient numbers
when we're doing the analysis.
We can't even send out data with dates on there, because that might help to identify patients.
And so for clinical care,
there's another level to that.
So I think the
health providers are
doing as much as we can to
prevent that from happening.
One thing I'll add
to that is this
is not a new
threat, necessarily.
There's actually a federal government site that tracks breaches affecting over 500 patients.
And if you look at it,
you'll see, shockingly,
that more than 50% of these
are hard copy breaches.
And it's people having left
hard copy of patient records
or other things in places
that they shouldn't, or just
losing them.
It's not always something
where it's been clearly taken
by somebody else.
But I would raise that
this is not a new thing,
but I think it's an
incredibly important thing.
Yeah.
The fact is, if you walk
into most hospitals wearing
a white coat and look like
you know what you're doing,
you could walk out
with a lot of data.
[LAUGHTER]
And that's just a reality.
But also, I want
to point out this
is something that you should
think about as citizens.
Studies were done where the public was asked, who are you worried about seeing your data?
So unsurprisingly, they said, I
don't want commercial companies
to see my data.
And they also said,
and this surprised me,
I don't want public
health to see my data.
But I want researchers
to see my data.
But the irony is the
commercial companies
have contractual
rights to your data.
Public health authorities have
a legal right to see your data.
The only group that has major blocks to seeing your data is the researchers.
And so on the one hand, we're putting up this huge dam to prevent data leakage, while on the side, it's just flowing out to these other parties that we may not want it to go to.
By the way, I actually--
here's an interesting factoid
for my audience.
Indigo was the principal blue dye for hundreds of years.
I did not know that.
It was one of the major crops in British India.
And Duke Ellington had a great song, "Mood Indigo."
I share that for your cultural edification.
[LAUGHTER]
This is a question for you, Kat.
And I think it's raised and
it's legit by the taxi story.
You see, stories are important.
Is it better to go with the
highest course of treatment
in order for a much
better outcome based
on the patient diagnosis
when in the gray area?
Yes.
Yeah.
And I actually had
this conversation
with the taxi driver.
I said, I think that one hospital might have wanted to err on the side of caution.
But chemotherapy is not--
every treatment comes
with a side effect.
So he could have neuropathy, which is losing the sensation in his toes and fingers.
So yes, in general
we do think that way,
but it isn't always the
case because-- especially
when you're working
with very toxic drugs.
And for this
particular cab driver,
he was afraid that 12 weeks
of chemotherapy and radiation
would mean that
he loses his job.
And if six weeks was
enough and he kept his job,
then that is a really
big difference for him.
Thank you.
This is also a good question.
Why did you get into this area
of artificial intelligence
meets medicine?
So you, Kat,
started in medicine.
You started in artificial
intelligence, computer science.
So you both answer it,
and then I'll answer it.
Why don't you start?
Yeah, I can start.
So I was actually
working in consulting
for financial services doing
something that actually felt
pretty meaningful, and
it was assessing damages
from the mortgage crisis
in 2008 and trying
to figure out who was
wrongly foreclosed on
and which individuals were
harmed so that the banks could
be made to pay some form
of restitution to them.
After about a year of doing
this, everything got settled.
Everything ended.
Everybody got even payouts.
People who were more
wronged than others
got no more than
the average person.
And it all felt
kind of worthless.
And so in thinking about
where I wanted to be
and where I wanted to try
to be applying skills,
medicine was the
natural next route.
And what was your
first hook into that?
I did study computational
biology as an undergrad.
I had initially
thought that I was--
my parents are
probably disappointed
I didn't go the MD route, and
fortunately, my younger brother
did, so they're content.
[LAUGHTER]
I was actually watching
a neurosurgery,
and ended up getting kicked
out of the operating room
because I thought I
was going to pass out.
[LAUGHTER]
And this is at the
age of 18 where
I was a testosterone-filled
young man who wouldn't leave
on his own without the
neurosurgeon actually
asking me to leave
because he didn't
want to operate on me next.
[LAUGHTER]
Love it.
Kat, you've had time to
think about this answer.
Well, maybe not as exciting.
So I was trained as a clinician.
I thought I would
mainly see patients.
Got into research.
And then, a lot of my--
as a clinical researcher,
my questions come
from the clinic.
And I realized that
there were some questions
I couldn't answer.
So I'm a rheumatologist.
I study a lot of autoimmunity.
And I said, we need to
look at bigger data sets,
and we need to know
a lot of diagnoses
and really look at really
complex relationships.
And we just couldn't do it at
the time I was coming through.
So when I heard about Zak's
project and the scope of it
and the amount of data, working with millions of patients,
I got really excited
and jumped on board.
And now you're a
great leader in it.
So I'll answer for me.
I had no one in my
family who was a doctor,
so I didn't know what
medicine was about,
and I didn't have
any mentor telling me
how to figure that out.
So I just applied
to medical school,
got into medical school.
And then I realized after
the first year, wow,
this is a very noble profession.
It's a profession.
It's a trade.
But it's not really a
science, and I thought
I was going into science.
So then I panicked,
and I dropped out the ambitious way, which is that I dropped out and got my PhD in computer science.
And then I went
back to medicine,
and I've completed my training
in pediatric endocrinology.
And all the while,
I started seeing
all the holes in medicine,
all the mistakes that
are being made, all the
slowness that's happening,
all the things that
make Netflix look better
than medicine in terms of
recommending the next step.
And frankly, it made me enraged.
And so I channeled that rage
into grant writing, which
is something that I've
become quite good at,
and started research groups and
research in this arrow, which
allowed me to work with smart
young people like the two
you just heard.
All right.
So let's-- ah.
There was a question,
a reasonable question.
Hey, can I get
that iPhone program
that allows me to recognize
melanoma lesions on my skin
or other people's skin?
And the short answer is this thing really works, but it was never really deployed,
and speaks to another question
that we got from the audience.
Anybody want to guess why
it's not yet available?
[INAUDIBLE]
What?
They don't know
how to [INAUDIBLE]
[LAUGHTER]
Getting close.
Unfortunately, cynicism might
be the order of the day.
It's who is going to be
medically legally liable when
this thing makes a mistake.
You need a company behind this.
And some random
Stanford researcher
is not going to say, hey, use
this, and if it works for you,
send me a car.
Because the cab driver
is going to say,
hey, you made me do
this therapy because--
and it turned out I
didn't have melanoma.
And so you really have to
have, A, a company that
takes on medical
legal liability, that
educates physicians about it,
and that gets FDA approval.
Big, big challenges.
And those challenges are as big
as the scientific challenge,
perhaps bigger than the
scientific challenge of getting
the software distributed.
One quick question.
Will AI be able to
detect pancreatic cancer?
Any of you want
to deal with that?
Come on.
Punting to you, Zak.
[INAUDIBLE]
So my answer is I
don't believe we're
measuring the
things that we would
need to measure in order
to be able to diagnose
pancreatic cancer.
Right now, we tend
to measure things
that are associated
with pancreatic cancer
very, very late, like
right up at diagnosis
or after diagnosis.
I could imagine a
future where, if you're
genetically prone to
have pancreatic cancer,
we'll measure a bunch of
things like circulating cells.
But this is not an AI question,
it's a measurement question,
in my opinion.
So I think we've really
answered all the questions,
and there's nothing wrong
with ending before time.
Is there--
[INAUDIBLE]
Any other-- I will entertain--
yes, a question from--
What if it turns out that
Boeing designed the software?
Well, that's a very good point.
So I actually shared a
very sad story from--
who-- Ralph Nader.
So Ralph Nader's grandniece
was on one of the flights that
crashed with a 737 MAX.
And what really happened
will be determined,
but we know some
things that were true,
which is the designers put a lot
of faith in automated controls
and made it very
hard for the pilots
to go with their intuition.
So on the one hand, yes, pilots
get drunk, they fall asleep.
And doctors make mistakes,
and doctors fall asleep and they get drunk.
And so you create
software to avoid that.
But what you're also
doing is making it harder
for doctors and pilots
to use their intuition.
And so if you're a great
doctor and a good doctor
and an alert doctor,
you may not be enabled.
You may be prevented from doing
the right thing because there's
something very confusing
going on that's not intuitive.
And so the plane was
actually actively
fighting the responses because
a program had been imposed.
And Ralph Nader's
comment is his grandniece
died because of some
hubristic assumption
that the computer was
always going to be right.
And I think it is a
good cautionary tale.
And I think it is
a reason why it
may be that
computers and AI will
be used to watch
for errors, will
be used to make
automated diagnoses,
but I in my own
care of my family
and myself will
always hope that there
is a smart, intuitive,
commonsensical doctor who's
at the helm, and that she's
making sure that something
obvious and stupid--
Because AI programs
can be very, very good
at what they're
doing, but they're not
intelligent in the
sense of human beings.
And so for example,
one of my students
just published a
paper in Science
where you take an image
of a retina or of a mole,
and you just add a
little noise to it.
And to you and me, it looks
like the same picture,
so you can still make the same
diagnosis as you would before.
But the person who
added the noise
knows something about
the computer program,
so that little bit of noise
completely confuses the program
and it completely changes the diagnosis from melanoma to not melanoma, or vice versa.
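The attack being described is what the machine learning literature calls an adversarial example, often built with the fast gradient sign method (FGSM). Here is a toy sketch of the idea using a hand-made logistic "melanoma score" in place of a real deep network; the weights and the image are synthetic assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)

# Stand-in for a trained model: a logistic "melanoma score", sigmoid(w . x),
# over a 64-pixel image. The weights are random, not a real classifier.
w = rng.normal(size=64)

# An "image" the model calls melanoma (score > 0.5) with some margin.
x = rng.normal(size=64)
if w @ x < 0:
    x = -x  # flip so the clean prediction is "melanoma"

clean_score = sigmoid(w @ x)

# FGSM-style perturbation: nudge every pixel by the same small amount eps
# against the class, i.e. along -sign(w). Each pixel barely changes, but
# the score moves by eps * sum(|w|), enough to cross the decision boundary.
eps = 2.0 * (w @ x) / np.abs(w).sum()
x_adv = x - eps * np.sign(w)

adv_score = sigmoid(w @ x_adv)
print(clean_score > 0.5, adv_score < 0.5)  # both True: the label flipped
```

The key point matches the speaker's: the attacker needs to know something about the model (here, the weights w) to pick a perturbation that is invisible to a human but decisive for the program.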
The point is, these programs look like they think like us, but they don't think like us, and they certainly don't have common sense.
So just because someone can--
a machine that can play chess
at the grandmaster level
is still not going to be
the machine that can
tell you reliably--
do you want this
treatment that's
at a higher risk long term
but is more likely to get you
to your daughter's wedding,
or this other treatment
which is higher risk
for the short term,
but overall a better
chance of survival?
That's a human kind
of judgment question
that maybe one day, in
a science fiction sense,
computers will be able to
do, but we're far from that.
Right now, we're in this amazing era where AI does the things that human beings don't do well, like looking at images and seeing that little spot on a mammogram that maybe was missed and might be associated with cancer, or looking at pathology images and making sure that you don't miss any of the cancer cells.
It's very good at
that kind of detailed
work in a very high throughput,
systematic, reliable way.
Because again, remember what
I said at the beginning.
Pathologists will disagree with one another on the same sample maybe 30% of the time,
but when it comes
to decision making,
you're really on target
to bring up the 737.
We should not put
ourselves in the position
where the computer program
is deciding on therapy.
With that, thank you.
[APPLAUSE]
