[Music]
Kathleen DeRose: Thanks for your patience. So I'm so
excited to introduce the next, the next
moderator for our panel. It's a great
segue actually to find out how platforms
and technology work because the
moderator of our next panel, Arun
Sundararajan is the world's expert
on the sharing economy, having just
written a book of that title. So
he's going to introduce his panelists, so
Arun, why don't you come on out. Thank
you so much!
Arun Sundararajan: Thank you. Yeah I'm gonna
help you move the tables forward as one
of the distinguishing characteristics of
the FinTech conference here is that the
furniture moves sort of dynamically. You
know the real reason why they have me
moderating this panel is like, you know,
my ability to sort of move tables
seamlessly between sessions. So Vasant
and Michael, why don't you come on
out as well? So I'm Arun Sundararajan
and I'm a professor here. I've been a
professor here for a while
and today's session, like, you know, the
next 45 minutes, is about automation and
financial services. This is not new,
like automation has been happening in
financial services for, like, you know, at
least two decades if not more, but it
seems like we've hit some tipping points
over the last few years, certainly sort
of in the artificial intelligence
technologies, like you know, the move away
from coding and towards induction, the
move away from programmer defined
features and towards sort of systems
generating their own features on which
they learn, the progress that we've made
in solving the problem of perception. So
all of these have been sort of tipping
points that make us wonder, "Well, will the
pace of automation pick up?" We've also
seen the presence of sort of the growing
presence in financial services of large
tech platforms like Apple and Amazon and
Facebook and Google and Alibaba. These
are some of, the sort of, the largest
concentrations of AI talent that exist,
and so it makes you also wonder whether,
like you know, the pace of automation
will pick up on account of sort of, like
you know, the
increased talent concentration here. I guess
what this leads to is a question about,
you know, when do we delegate our
decisions? What kinds of things are
societally okay to delegate to the
algorithms? And when we do delegate
things to algorithms, what kind of
societal issues do these raise? And that
will really be the focus of what we
talked about. And we're gonna structure
our session in the following way, you
know, my colleague, Professor Dhar, will
sort of kick things off with a few
opening remarks for a few minutes and
then Professor Michael Kearns from Penn
will speak for about 15 minutes on
algorithmic bias and how that applies to
financial services. We'll have a couple
of questions over here and then we'll
throw it open to the audience. So without
further ado, let's get started. Professor
Dhar, Vasant, has been my
colleague for many years, has been in
FinTech for 25 years long before it was
called FinTech. He ran our Center for
Business Analytics for a while, he's the
editor of the journal "Big Data", he's
published widely, sort of at the
intersection of finance and machine
learning, and he was also the founder of
one of the first purely machine learning
based hedge funds that, like you know,
started almost fifteen years ago and
he's the senior adviser to McKinsey on
issues relating to artificial
intelligence. So, Vasant?
Vasant Dhar: Thanks Arun! So
I'm going to take about ten minutes to
just briefly summarize what I see as
the two major drivers in FinTech and
I've asked Arun to shock me at eight
minutes which means I need to look at
him eight minutes from now.
Arun: That's the other technology they give us when we moderate
[audience laughs]
Vasant: So you know so the first big driver is the
emergence of FinTech platforms, and this
is not new. This has been a sort of a key
driver of the internet economy, and you
know, just last month my colleague Roger
Stein and I published a paper in the
Communications of the ACM called "FinTech
platforms and strategy", and I'll just
take a few minutes to describe what we
mean by that, both by platforms and
strategies. So essentially we draw on the
literature on platforms and point
out that they have
three properties or components. One is open
access, easy participation. The second is
that they implement some key business
process, and typically that business
process sort of enjoys network effects,
you know the more people that sign on
the more valuable the platform becomes,
the more data it generates, so it's sort
of a virtuous cycle, and then the
business process is implemented through
some key technologies, so automation is
really the kicker that turns the
platform into overdrive. In the FinTech
space, you know, one of the things we
point out is that historically, financial
services platforms have been incomplete
in that they've lacked one or more of
these components, and this is either for
technological or regulatory reasons, and
that sort of strategy in the space is
something that we call platform
completion. That is, you actually complete
one or more of those functionalities, and
make the platform more complete and open
and more accessible and essentially you
can think of that as sort of being
arrows pointing to the center, and we've
seen the emergence of certain kinds of
platforms, in the Robo advising space for
example, and also in the peer-to-peer
lending space. And, you know, there's an
attractiveness to being in the center of
the platform, you know, one of the things
about the Internet economy, sort of the
winner-take-all sort of aspects of these
platforms, so once you're sort of lodged
in the center it's not easy to dislodge
you, right? So that's a big motivation for
platform completion and getting into the
center of the platform. The second
strategy that we point out is something
called "component replacement" where you
know some component of this platform
gets replaced by a superior technology
and, you know, one of the more
compelling examples of that is blockchain,
which could essentially displace a lot
of the intermediary functions that are
currently performed by institutions.
Right, so that's an example of a
technology that could replace existing
sort of infrastructure and we're seeing
the weaknesses of, you know, some of the
sort of existing infrastructure, you know,
increasing number of hacks. Like a lot of
these institutions were just not designed
for internet era kinds of, you know,
businesses, and so this is, you know, one
of the, you know, big reasons for the
excitement behind a platform that would
just streamline the whole sort of post
trade or post transaction part of the
infrastructure. Right, so that's a
component replacement strategy. And later
on in the afternoon we're gonna see
other examples of platforms in the
payment space, for example, and also in
India where there's a lot of, sort of,
really interesting stuff going on around
platforms related to authentication,
payments, and, you know, data, having to do
with sort of privacy and sharing of data.
And so there's platforms emerging for
all of these things that are really
exciting, so it's a great space to be in
and that's one of the major drivers. The
other driver of course is data, and all
of the associated technologies that come
with it including specifically machine
learning, and one of the things that I
point out to people is that we're seeing
this rise of autonomous learning systems,
where systems have the capability to
learn automatically from data and this
is something, as Arun said, I got
started in many years ago, in trying to
you know, ask myself whether I could
design a machine that could learn to
trade at a similar level to humans,
based on access to data, and that
sort of turned into a really interesting
exercise because I realized that, you
know, while I was using machine learning
methods to find patterns in data, in
fact, at every, you know, at the end of
every sort of cycle, like every month, I
would be faced with a decision of like
"How much?" You know, "Which models do I
trust? How much risk do I allocate to
various things?" and over a period of time,
I realized that that was a thankless
exercise and I had no ability to really
do that in any superior way, that I was
often doing worse than random and that
really caused me to think about this as
a process, that you're really designing
a process, and I wrote about it in the
paper called "Should you trust your money
to a robot?" where the question I asked is,
you know, "Might it be easier to find a
good robot relative to a good human?" Like,
Warren Buffett right, so it's an
interesting question. Of course I answered
that in the affirmative because, you know,
humans just tend to be poor investors,
right? We just tend to sort of do the
wrong things, like I did over the last
two weeks. I sold my Apple stock, I sold
Nvidia several months ago, just like
terrible decisions, you know. [laughter] I feel like
a real, you know, I feel like a chump for
for doing that, you know, and yet, and so I
realized I really should give my money
to the machine and so I started doing
more and more of that, other than Warren
Buffett, you know. But the same question
applies increasingly to a number of
problems, you know. Should you trust your
tax return to a robot? And mine is simple
enough that I probably would trust a
robot to do that. Okay. And then
there are a number of other questions: "Should you
trust your child's education,
transportation, etc etc etc?" and the
question is, how do we make this decision?
And so I proposed a simple way to think
about it which is, in terms of two
dimensions, and this was published in the
Harvard Business Review, and if any of
you are familiar with that, you know that
unless you can explain your problem in
two dimensions, it's not going to make it
and so fortunately I could explain it in
two dimensions. One of them was
predictability, so every problem lies on
a spectrum from complete randomness to
complete determinism and I've laid out
these problems there, and so you might
think that, you know, high signal problems
are the ones we should automate, whereas
low signal ones are human, but
that's just not true, right? High
frequency trading is automated, and,
you know, the winning rates there are
just north of 50 percent, and yet
driverless cars aren't. I took my first
drive, you know, autonomous driving
exercise in Manhattan the other day and
it took me all of 15 minutes to convince
myself that the car wouldn't kill me, but
that's only because I did have access to
the controls, you know. But I suspect that
you know, years from now, as we see that
this stuff seems to work, we'll sort of
cede control to the machine, and so the
question is, what really matters here? And
the answer is,
that it depends on not just predictability but
the cost of error. Right? So the x-axis is,
"How often do things go wrong?" and the
y-axis is, "When things go wrong, how badly
do they go wrong?" and this sort of, this
gives us sort of this, you know, what I
call a frontier, an automation frontier,
where to the bottom right, we'll, you know
we tend to trust machines, and you know,
towards the top left, we don't. That's the
domain of human beings, not that human
beings do it well necessarily, but it's
just that we don't really trust machines
with, you know, in that zone. So problems
are not static: more data and better algorithms
move problems to the right and
towards the lower part of this figure,
and that's the challenge for many data
scientists, is "How do you really move these
problems towards the bottom right and
make them amenable to automation?" which
is another way of asking whether you can
trust the machine to make the right
kinds of decisions. Have I gone past
eight minutes? Okay, so let me just
summarize and I'll say that, you know,
autonomous learning systems encode a
process and that's ultimately what, you
know, we put our trust in. The
challenge is how we move problems to the
lower right. The framework of course says
nothing about important things like
regulation, like financial services is
the most regulated industry in the world,
so the issues that really go
into fielding these kinds of systems in
reality have to do with concerns like,
"Are they fair? Is the process fair? Is the
process perpetuating
existing bias in the data? Is it morally
acceptable? Is it violating privacy?" Right,
so these are all really vexing problems
that we're trying to address at the
moment that are essential to how we
actually end up using these kinds of
systems. So these are the kinds of
critical issues we need to think about
because at the end of the day, all of
these systems are systems where we're
handing over control to the machine,
which is making decisions at scale. Thank
you, all right. Arun: Thank You Vasant. That
sort of sets us up really nicely for
Michael's talk. So we're delighted to have
Professor Michael Kearns join us for
this session.
Michael's bio is sort of so long that
like, you keep scrolling through the
webpage and you think you're done and
like, you know, his list of
accomplishments is ah... so I put what I could
onto a little card. He's the National
Center Chair of Computer and Information
Sciences at the University of
Pennsylvania, where he also has
affiliations with economics, statistics,
and with the Wharton School, like you
know, those of us who are familiar with
Michael's research know him as sort of
the quintessence, sort of the
quintessential interdisciplinary scholar
who sort of pulls from different
disciplines to sort of give us a new
perspective onto, like, you know, sort of
the most vexing research problems that
we face. He's also the Chief Scientist of
MANA Partners, which is a trading tech
and asset management firm, like you know,
in Midtown, and advises a wide variety of
organizations that include Microsoft
Research, FINRA and the Alan Turing
Institute, and Michael will be talking to
us about fair algorithms for machine
learning and sort of their applications
to financial services.
Michael Kearns, University of Pennsylvania, National Center Chair and Computer Science Professor: Okay, thanks Arun, and thanks to Arun and
Vasant for inviting me to speak. I've
known them for many years and done a lot
of fun events here with them. So I want
to kind of follow up in a slightly more
technical vein on some of the things
that Vasant said at the end of his
presentation, and I want to talk in
particular about fairness or fair
algorithms for machine learning. And I'm
not gonna cast it in the context
specifically of FinTech or consumer
finance, but I think you'll be able to
easily map this onto that because I
think one of the main things that's
changed in a sort of modern
machine learning era is not just the
things you read in the New York Times
about, you know, machine learning solving
important problems like detecting cats
in images and playing Go, but really the
more important development is the fact
that, it is now the case, that on a
regular basis and at scale, machines are
making autonomous decisions about things
that are of consequence to specific
citizens' lives, like whether you get a
loan, what criminal sentence you receive,
what college you get into, whether you
get the job that you want. And so, it's
not surprising that kind of quickly, on
the heels of that, in addition to the
articles about AlphaGo, for instance,
there's now a plethora of articles about
instances in which machine learning
exhibits discrimination or bias against
some particular group, whether it be
gender-based or racial or what-have-you,
whether it's in the domain of what ads
you're shown on Google to what criminal
sentence you receive, this is happening
quite often now and it's, you know, kind
of the dark side if you like of AI
machine learning, or one of the dark
sides. And so I've spent a lot of time
thinking about this problem, both from a
technical standpoint and also from more
of a legal and regulatory standpoint,
over a couple of the past couple of
years, and one of the things I find when
I talk to people who are outside of the
machine learning community is this kind
of belief that they have that the code
underlying most machine
learning systems is some incredibly
opaque, complicated tangle of spaghetti
code that might, let's say, rival the
complexity of Grand Theft Auto, and this
slide is meant to show you that actually,
nothing could be further from the truth.
This is lifted straight from the
Wikipedia page on stochastic gradient
descent, which is the simple algorithm
that underlies the training of neural
networks, for instance in deep learning.
This is basically the
code for deep learning I'm showing you,
and so the first thing I want you to
notice is that it's incredibly simple. I
mean, of course, if you had to really
implement this it would get a little bit
longer, but not by an order of magnitude
and nothing could be more basic and
objective and scientific than this piece
of code. I've highlighted in red kind of
the action part of this code, you know,
for each example in some data set
compute the error of your current model
and adjust the parameters of your model
so as to reduce the error. Okay?
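The loop he's describing can be written out in a few lines. Here's a minimal sketch, assuming a logistic-regression model; the function name and data format are illustrative, not from the talk or the slide:

```python
import math
import random

def sgd_logistic(data, lr=0.1, epochs=200):
    """Minimal stochastic gradient descent for logistic regression.

    `data` is a list of (features, label) pairs with label in {0, 1}.
    This mirrors the highlighted pseudocode: for each example, compute
    the error of the current model, then adjust the parameters so as
    to reduce that error.
    """
    dim = len(data[0][0])
    w = [0.0] * dim
    for _ in range(epochs):
        random.shuffle(data)  # "stochastic": visit examples in random order
        for x, y in data:
            z = sum(wi * xi for wi, xi in zip(w, x))
            pred = 1.0 / (1.0 + math.exp(-z))  # current model's prediction
            err = pred - y                     # error on this example
            # gradient step: nudge each weight to reduce the error
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w
```

Note there's nothing group-aware anywhere in it, which is exactly his point: whatever bias comes out was never written into this loop.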
And so not only is this incredibly
simple, but there's nothing blatantly
discriminatory in it. It doesn't say, you
know, "If the applicant is from this
racial group, then branch over here." Okay?
So if there's discrimination and bias in
modern machine learning systems it's not
sort of directly detectable from this
code or attributable to kind of the
complexity and opaqueness of this code.
It's got to be, you know, somewhere else,
or, more than one place perhaps, and so
you know, one question is how does
discrimination and bias arise? And you
know, I'm here to tell you the good news
is that there are
multiple places that this can happen. So
first of all it could happen kind of at
the input, in the data. So, you know, most
data sets that are used for, you know, sort
of consequential decisions about
individual citizens are gathered from
some existing process that might have
predated the era of machine learning. So,
for example, in things like predictive
policing, you know, you're much more
likely to make more arrests where you've
already put more police in the past and
so if you use that data and feed it for
instance to the code that I gave you on
the last slide, it wouldn't be surprising
if the resulting model learned that you
should kind of continue to put more
police where you've been putting them,
not because there's sort of inherently
more crime or bad people there, but maybe
just because you've generated bias in
the data through your own collection
process. There can also be bias at the
output, right? So there can be
discriminatory treatment of
subpopulations just from the
demographics or the relative weighting
of the data. So, in general, if you have a
large population and there's a minority
subgroup by definition you have less
data on the minority and if you're
building a model that's designed to fit
the data well it's probably going to fit
the properties of the majority well. Okay
so, many of you may have read recent
articles about medical studies that, you
know, tend to be highly biased towards
subjects who are white males, and so the
results of the models are strongly
biased towards, you know, the specific
properties and medical and biological
properties of that group. Okay? And so the
third way that this can happen of course
is in between the input and the output,
which is in the algorithm itself, and one
way that the algorithm itself can
generate bias is that a lot of times, for
instance in consumer lending decisions,
there's sort of
counterfactual data that you don't see.
So if you have an algorithm that's
making mortgage decisions for instance,
you find out whether the people you gave
mortgages paid them back or not, but you
don't find out whether the people you
denied mortgages would have paid them
back if you'd given them to them. Okay?
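The missing-counterfactual problem is easy to see in a toy simulation (my own illustration, with made-up assumptions): if the lender only records repayment outcomes for applicants it approved, the repayment rate in its data overstates the population's true rate.

```python
import random

def observed_vs_true_repayment(n=100_000, approve_threshold=0.5, seed=0):
    """Toy simulation of the missing-counterfactual problem in lending.

    Each applicant gets a score uniform in [0, 1]; assume their true
    probability of repaying equals that score. The lender only learns
    outcomes for applicants it approves (score >= threshold), so the
    repayment rate it observes is biased upward relative to the
    whole population's rate.
    """
    rng = random.Random(seed)
    true_repaid = 0
    observed_repaid = observed_total = 0
    for _ in range(n):
        score = rng.random()
        repaid = rng.random() < score
        true_repaid += repaid
        if score >= approve_threshold:  # denied applicants are never observed
            observed_total += 1
            observed_repaid += repaid
    return true_repaid / n, observed_repaid / observed_total
```

With these assumptions the population repays about half the time, but the approved pool repays about three quarters of the time, so a model trained only on observed outcomes sees a rosier world than the one it acts on.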
And one point I might make is that a lot
of traditional ways of thinking about
fairness and discrimination in
statistical modeling
tend to kind of primarily care about
just the output, sort of like, is the
model that's been trained by the process
fair in some statistical sense? Okay? And
I think that this is a bit of a fiction
these days because it kind of harkens
back to, if you've taken like a basic
statistics course that teaches you like
train/test methodology, so you have a
data set and you use some of it to train
the model, but you save some of it out to
test the model, because of course the
reason you're fitting a model to data in
the first place is not to fit the data
that's in front of you well, but to
generalize out-of-sample. Okay? And the
observation I would make these days
is that that separation between a
training period and a testing period is
basically a fiction these days, right?
Where learning is happening perpetually,
so every time you click on an ad on
Google, that is immediate feedback that's
directly incorporated right away into
the improvement of the models that are
making those decisions about, let's say,
what ads you're shown. Okay? And so if
you kind of forgive a machine learning
system for some prefix of its existence
for making unfair decisions during the
"training process" you're basically like
letting it off the hook forever.
Okay? And so it's important to not just
think about algorithms that produce
fair output at the end, but about
algorithms that are operating and
adapting perpetually and need to make
fair decisions all along the path of
their existence. Okay? And so, what a bunch
of colleagues and I have been doing for
the last couple of years or so is trying
to make a science out of this, is
actually trying to make a science out of
the design of fair learning
algorithms. Okay? And so at a high level
the methodology involves picking a
particular framework, like whether it's
a supervised learning problem or more of
a sequential decision making problem or
a clustering problem or the like, and
pick some technical definition of
fairness. I'll say a little bit more
about possible definitions shortly. And
then basically to explore the
consequences of that definition in that
particular learning framework and try to
design algorithms that have some sort of
guarantee about fairness. Okay? And more
importantly, or equally importantly, not
just to design those algorithms but to
understand what the costs of fairness
are, and I'll come back to that in a
minute, because there will be costs to
fairness, right? We can't,
you know, the idea that you can expect to
have an optimally accurate model and a
fairer model simultaneously is a fiction,
okay? And I'll make that point clear at
the end. Okay, so just to quickly say, you
know, the interest in
fairness in machine learning is at a
high point right now, but concerns about
fairness in statistical
modeling have been around for a while
and just to give a little bit of a
flavor of how one might try to be a
little bit more quantitative about what
you mean by "fairness" in machine learning
or in a learned model, you know, one
definition is a kind of statistical
parity. So that would be a very crude
notion where you say like, "Well, the rate
at which I give loans to Group A has to
equal the rate at which I give loans to
Group B." Okay? A more refined
notion would be kind of equality of
false rejections for instance. It's
okay if I have different rates at which
I'm giving loans to two groups as long
as the rate at which I make kind of
consequential mistakes is the same. So,
the fraction of the time that I
deny a loan to white people who are actually
creditworthy has to match the
fraction of black people who are
creditworthy that I deny a loan. Okay? So,
this is some kind of notion of fairness.
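Both notions just described can be checked directly from decision records. A small sketch (the triple format and function name are mine, for illustration; it assumes exactly two groups appear in the records):

```python
def fairness_gaps(records):
    """Compute two group-fairness gaps from lending decisions.

    `records` is a list of (group, approved, creditworthy) triples,
    with approved and creditworthy in {0, 1}. Returns the statistical
    parity gap (difference in raw approval rates) and the false
    rejection gap (difference in the rates at which creditworthy
    applicants are denied) between the two groups.
    """
    groups = sorted({g for g, _, _ in records})
    approval, false_rej = [], []
    for g in groups:
        rows = [(a, c) for gg, a, c in records if gg == g]
        approval.append(sum(a for a, _ in rows) / len(rows))
        worthy = [(a, c) for a, c in rows if c]  # actually creditworthy
        false_rej.append(sum(1 for a, _ in worthy if not a) / len(worthy))
    return abs(approval[0] - approval[1]), abs(false_rej[0] - false_rej[1])
```

A model can score zero on the first gap and still be badly off on the second, which is why the cruder parity notion alone isn't enough.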
It doesn't promise anything at the
individual level, right? So if you are one
of the people in one of those two racial
groups that was denied a loan
and you should have been given one, it's
kind of cold comfort to you that it's
like, "Okay well we made a mistake on you,
but trust us, we're making the same
fraction of mistakes on this other group
over here, so you should feel better
about that." Okay? So this is a flavor of
the kind of work that many of us are
doing these days and let me just quickly,
you know, summarize what we know. It's
still early, so we don't know a lot yet.
There is older work from the statistics
and data mining community that tends to
be a little bit more heuristic and
doesn't make kind of, you know, formal
promises about fairness. And one
thing I might mention is that, one thing
we do know is that unfortunately there's
a little bit of definitional messiness
in fairness. Okay, so what do I mean by
that? I mean that, so for those of you
that have an econ background
and have heard of things like Arrow's
Impossibility Theorem, there's sort of
impossibility theorems for fairness, so
there's been a couple of papers in
the past two years or so that
basically have the following form. The
paper starts out and says, "Well look, here
are three properties that we can all
agree any definition of fairness should
meet, right?" And you look at those
properties and you say, "Yes, yes, these are
you know, these are minimal properties. We
would definitely want these and probably
stronger things." And then, you know, the
punchline of the paper is, "Well guess
what? Here's a theorem that says you
cannot possibly achieve all three of
those criteria simultaneously except in
trivial cases that will never arise in
real problems." Okay? So unfortunately
there's not going to be some single
definition of fairness, and it means that
when you pick one definition of fairness,
you might be trading it off against
another notion of fairness that you
think is equally valid. Okay? But that's
the way life is, and you know one has to
proceed with the hand that one is dealt.
Maybe the most important message I want
to give is just in this slide and then
I'll wrap up so we have some time for a
Q&A and discussion. This is a plot from
an actual data set in which fairness is
a major concern. It's the so called
COMPAS Criminal Recidivism Data Set. I
don't know if some of you that might
have followed the whole controversy
about, you know, the use of statistical
modeling in criminal sentencing
decisions that's taken place over the
past couple of years. This is that same
data set. And I won't describe in
detail what the two axes are other than
to tell you that the x-axis is a measure
of how unfair a statistical model is and
the y-axis is a measure of the
predictive error of that model.
Okay? And I'm showing a curve here where
each red circle corresponds to a
different model. Okay? And note that where
you'd really like to be in this picture
is in the lower left-hand corner.
You'd like to have the smallest possible
unfairness and the smallest possible
error in your predictions, and notice
there are no red points down in that
lower corner. Okay? And so, you know, those
of you who have a finance background,
this is like the efficient or Pareto
frontier of accuracy versus fairness and
you just cannot get off of
this curve. You cannot do better than this curve. You
can choose to have a model like the one
on the far right which minimizes your
predictive error, okay, but then you're
gonna have the highest unfairness. You
can also choose to be at the upper left,
where your predictive error is high, but
you have a very fair
model, or you can pick anywhere you want
in between, but you can't get off of this
curve. Okay? And so, an important thing I
think for society to think about in a
kind of application by application way
is there's always going to be a curve
that looks something like this, and you
have to pick where you are on it and
where you pick should probably depend on
how important the decisions you're
making are. For instance, maybe if what's
at stake is criminal sentencing you
really want to be more to the upper left
there and be sure that you're being fair
to subjects even if it's costing
accuracy, but it will cost you accuracy.
Whereas maybe if it's what ads people
are shown on Google,
we're less concerned about fairness. Okay?
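The curve he's describing is a Pareto frontier, and it's straightforward to extract one from a set of candidate models. A sketch (the numbers in the usage example are made up, not from the COMPAS plot): given each model's (unfairness, error) pair, keep only the models no other model beats on both axes.

```python
def pareto_frontier(models):
    """Keep only the non-dominated (unfairness, error) points.

    `models` is a list of (unfairness, error) pairs, one per candidate
    model. A model is dominated if some other model is at least as
    good on both axes and strictly better on one. The survivors form
    the tradeoff curve: you can move along it, trading fairness for
    accuracy, but you cannot get below it.
    """
    frontier = []
    for u, e in models:
        dominated = any((u2 <= u and e2 <= e) and (u2 < u or e2 < e)
                        for u2, e2 in models)
        if not dominated:
            frontier.append((u, e))
    return sorted(frontier)
```

For example, `pareto_frontier([(0.1, 0.5), (0.3, 0.3), (0.5, 0.1), (0.4, 0.4), (0.2, 0.6)])` discards the last two points, which are each beaten on both axes by another model.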
So, let me stop there and just say that
more generally, this type of research is
part of, I think, a pretty important trend
within the machine learning research
community to try to start thinking about
machine learning and social norms and AI
and social norms and not just worrying
about them, because, you know, every
article I pick up on this topic in the
New York Times, Wall Street Journal, or
wherever is worrying about this topic,
but the question is what can one do
about it? Okay? And the traditional
response to this would be like, "Well, we
need legal, regulatory, and watchdog
frameworks to sort of police the models
and algorithms that are used in these
systems." And I don't disagree that that's
going to be, continue to be, as it always
has been historically, an extremely
important piece of the puzzle, but
unfortunately, those types of methods, you
know, to use computer science terms, don't
scale. If we're gonna count on human
based organizations to actually audit
and regulate by sort of like looking at
the behavior of models "by hand" so to
speak, it won't scale. Okay? And so I
think a big part of the answer here is
to endogenize these social norms into
the algorithms themselves. And just to be
clear, I'm not proposing that machines
decide what fairness means or what
privacy means or what morality means. I
think that those decisions should and
always will be made by humans, but once
you've committed to what you mean by
them, then there's a chance you could
actually encode them in your
algorithms and this is where I think
the kind of important research work lies,
at least. Thank you. Arun Sundararajan: All right, thank you
Michael. That's awesome. [audience applause] So we have
something approaching 20 minutes left
for discussion, and I'm gonna do the
following - I'm gonna ask each of you one
question, and then I'm gonna open it up
to the audience. So my question for you
Vasant is, I mean, like, you know, many of
us are fans of your framework. I
mean we're all thankful to Harvard
Business Review for having sort of
forced you to simplify into that two
by two. But if you sort of look at that
mapping and you look at the kinds of
decisions that are being made in
financial services industries today, and
you look forward, like, you know, the next
five to seven years, what are the
financial services domains in which you
will see an increase in the rate at
which we start to delegate decisions to
robots? Vasant Dhar: So, you know, I think the key to
that is to look at those areas where
there's more and more data becoming
available, and broadly speaking I'd
say that it would fall into the
following categories. I think you're
going to see more in the investment side
and you're already seeing that with robo
advisors, you know, that at the moment do
very simple kinds of things like
portfolio optimization that humans tend
to be sloppy at doing, right? So they
take the grunt work out
of that and keep your portfolio balanced.
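The rebalancing grunt work he mentions is mechanical enough to sketch (an illustration with a hypothetical function; real robo-advisors also handle taxes, fees, and fractional-share rules):

```python
def rebalance(holdings, prices, target_weights):
    """Compute the trades that restore a portfolio to target weights.

    `holdings` maps asset -> shares held, `prices` maps asset -> price,
    and `target_weights` maps asset -> desired fraction of total value
    (fractions should sum to 1). Returns asset -> shares to buy, where
    a negative number means sell.
    """
    total = sum(holdings[a] * prices[a] for a in holdings)
    trades = {}
    for asset, weight in target_weights.items():
        target_shares = total * weight / prices[asset]
        trades[asset] = target_shares - holdings.get(asset, 0.0)
    return trades
```

This is exactly the kind of well-specified, repetitive, high-signal decision that sits comfortably in the automate-it region of Vasant's framework.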
You know, but they'll gradually start
creeping up the value chain and start
doing sort of more of the stuff that I
started with, which was the, you know,
making the decision-making part. Arun: Yeah.
Vasant: Whether people trust robo advisors
with that kind of thing
remains to be seen. You know, it'll depend
on how they work out, the quality of
those decisions, but I would expect those
platforms to start trying to creep up
the value chain, in terms of like, you
know, not just how to optimize, but what
to do or to invest in. Lending is going
to be another big space. There's
lots more data in that, so you're gonna
see, you know, more sort of automated
decision-making in the lending space. The
concern there has to do with sort of the
opacity of the models. You know the
concerns I see out there are, "If I put an
automated learning system out there
that's making lending decisions, how do I
know that people won't start gaming it
if they find out the basis on which the
system is making decisions?" So those are
the barriers, but I'd expect, you know,
that notwithstanding, to be more activity
in that space. Arun: And that's a game we've
been playing with spam filter systems. Vasant: Exactly!
Arun: With like, you know, sort of search engine
ranking systems. Vasant: Right, same problem,
different context, right? So we'll see
more of that. I think we'll see more in
the targeting space, right? We're just
seeing more payment flow, you know all
kinds of transactional flows, so there'll
be more targeting going on. And then of
course there's sort of processes in
general, right? And you're seeing
that in sort of, you know, if you look at
the typical sort of investment portfolio
of a VC, you know, you see these areas
and you see, you know, regtech, process tech,
where people are sort of you know
addressing small parts of the process:
compliance, regulation. Just, those kinds
of things that are repetitive, but once
you have enough data you can actually
learn how to do those things better so
we're seeing innovation in that space
already. Arun: Okay, I mean it's
interesting because there's
this breaking up into three pieces, right?
I mean one is, where will the automation
take place? Where will decisions be
delegated to robots? And there's an
associated problem, which is, as these are
delegated, will the individuals trust the
systems? And, you know, I think in some
cases that are
consumer-facing, that's an important
determinant of the first issue. In
other cases, where it's sort of
buried deeper in the
organization, it's less of an issue. Then
there's the third question, which is like,
you know, how is this going to sort of
affect societal outcomes? Right? As we
sort of move more. I mean we talked about
what, robo advising, lending, mortgages,
you know, payment systems. So my
question to you, Michael is actually
really simple. What you
talked about was really fascinating. Like,
you know, we, as economists,
understand this sort of trade-off between
optimality and some notion of fairness,
and as a community we've sort of ignored
the second issue in favor of the first,
right? I mean, we maximize total surplus
and don't worry about distributional
effects. But then you came up with a
couple of definitions of, you know,
different kinds of fairness, like the
rate at which you, you know,
come up with positives, the rate at which you
make certain kinds of errors, and
then as you concluded,
the message was that this
would not scale if human beings are
involved in sort of assessing the
fairness. So my question is, what
do we do? I mean, like, you know,
I think it's clear from the hearings
that are going on on Capitol Hill in a
different context that algorithmic bias
can sort of have like you know huge
societal consequences. I mean based on
the research that you're seeing and
you're doing, I mean like you know, where
do you see the solution lying for the
financial services space? Michael: For financial
services per se? Arun: Or just in general? Michael: Yeah,
I mean I think we're at the point
where, you know, the concerns about
discrimination and unfairness and
statistical modeling sort of transcend
any one particular sector, but generally
are highest in things that, you know,
touch consumers' lives in a
very direct way. I mean in terms of what
should, what needs to be done, I think
from the research side the kind of work
that I was talking about is exactly the
right kind of work. One thing I
might, you know, sort of point out is that
I think when you're trying to make
promises to
users or people about social norms being
preserved or defended by an algorithm,
there's a particularly
important role for a theory to play. And
what I mean by that is, you know, how do
you give people promises that
an algorithm meets some social norm,
right? So an analogy I like to make is
that of cryptography.
So you know, for like two thousand years
cryptography proceeded by code makers
basically, you know, taking messages and
devising some algorithm or scheme that
kind of scrambled things around and you
know, saying, "Well, sure looks random to me!"
Until somebody comes along and, you know,
reverse engineers that scrambling and
then suddenly it doesn't look random at
all.
Okay? And cryptography, you know, sort of
made a huge leap forward in the 70s when
it moved to public key cryptography and
the algorithm didn't have to be kind of
obfuscated anymore, and the kind
of security of the cryptography was
based on sort of a kind of a hard
mathematical promise and proof,
right? So this was kind of, you know,
what led to RSA and
this is what underlies HTTPS, so you know,
again, this doesn't mean that there
aren't hacks available or poor
implementations of modern cryptography,
but the point is that you had,
you know, two millennia
of people just doing heuristic
things that they couldn't make any
promises or guarantees about and that
would subsequently be broken. I think
kind of, sort of, having firm mathematical
grounding for these problems is sort
of, relatively important in settings
where decisions are touching individual
civilians and consumers. Arun: But to me the
paradox is then, and just so that in
the interest of sort of being efficient,
if you do have a question for either
panelist, you know there are
microphones on both sides. And so if you
want to just sort of start lining up,
that will allow you to ask your question
first and it'll also save us a little
time. You know, when you
concluded, you talked about how we don't
want to delegate these decisions
to humans, which sort of means
that we want to delegate them to
algorithms, right? We were talking about
YouTube and how a lot of regulation has
been delegated to YouTube and they are
making these decisions algorithmically
now. But if these algorithms are
inherently sort of subject to the kind
of bias that we're trying to prevent, I
mean, what's the fix? Is it
sort of the absorption of these norms? Michael: Yes, so
again, I'm not sort of saying
humans shouldn't be in the loop here, but
if literally human beings were
the ones, if human beings at
regulatory and watchdog agencies
actually have to catch all instances of
malfeasance directly, I don't think
they're gonna be able to keep up with it.
I think this is already failing, okay?
And so maybe a better way of putting it
is that you have to put kind of the
right algorithmic tools in the hands of
regulators and auditors, right? And so you
know, one can actually formulate the
problem of kind of detecting whether a
statistical model has bias as sort of an
instance of machine learning itself,
right? So you know because you're
basically looking for like a subgroup or
a correlation between the input features
in which let's say the false positive
rate conditioned on those features is
very different than the background rate.
And that itself is a machine learning
problem, right? So you know, not to make it
sound like it's going to be kind of
battle of machine learning algorithms
but, you know, I think that those
kinds of tools are going to be important.
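The auditing idea Michael sketches, treating bias detection as a machine learning problem in its own right, can be illustrated with a toy search over feature-defined subgroups for the one whose false positive rate deviates most from the background rate (the records and feature names below are synthetic; a real audit tool would be far more sophisticated):

```python
# Toy fairness audit: among subgroups defined by feature values, find the
# one whose false positive rate (FPR) deviates most from the overall FPR.
# Each record is (features, true_label, model_prediction); data is synthetic.

def fpr(records):
    """False positive rate: P(prediction = 1 | true label = 0)."""
    negatives = [r for r in records if r[1] == 0]
    if not negatives:
        return 0.0
    return sum(1 for r in negatives if r[2] == 1) / len(negatives)

def worst_subgroup(records):
    """Return (feature, value, gap) maximizing |subgroup FPR - overall FPR|."""
    overall = fpr(records)
    worst = (None, None, 0.0)
    for f in records[0][0]:
        for v in {r[0][f] for r in records}:
            group = [r for r in records if r[0][f] == v]
            gap = abs(fpr(group) - overall)
            if gap > worst[2]:
                worst = (f, v, gap)
    return worst

records = [
    ({"region": "north", "age": "young"}, 0, 1),  # false positive
    ({"region": "north", "age": "old"},   0, 1),  # false positive
    ({"region": "south", "age": "young"}, 0, 0),
    ({"region": "south", "age": "old"},   0, 0),
]
feature, value, gap = worst_subgroup(records)  # flags the "region" split
```

Searching single feature values like this is cheap; the hard statistical and computational problems appear once you must search conjunctions of features, since that space grows combinatorially.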
And by the way you know one thing I
would mention is that one very kind of
bothersome thing about traditional
definitions of fairness is they sort of
assume that you've predetermined what
subgroups you're trying to protect, like
racial, like I'm trying to protect a
racial subgroup, and this lends itself to
the possibility of what I call kind of
fairness gerrymandering, where you meet
the technical definition of fairness by
discriminating against a subgroup of the
group that you want to protect. So for
instance, you know, if I'm forced to show
an ad for my Country Club to some group
that I don't really want joining my
Country Club, but I'm forced to show it
to them at some rate, okay,
I can try to use machine learning to
find the subgroup of that group that I
know can't afford my country club, and
meet my quota by showing it to them. So
now I've kind of subdivided the group
that you told me we wanted to protect
and deliberately targeted a group that I
know you know won't be able to take me
up on the offer that I make them, okay?
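Michael's country-club example can be made concrete with a little arithmetic (all group sizes and rates below are hypothetical): a group-level parity constraint is satisfied even though one subgroup is excluded entirely.

```python
# Fairness gerrymandering, numerically: the advertiser must show the ad to
# group B at the same rate as group A (parity on the marginals), but
# concentrates B's impressions on the subgroup least able to respond.
# All group sizes and impression counts are hypothetical.

group_a = {"size": 1000, "shown": 200}              # 20% of A sees the ad
group_b = {
    "affluent": {"size": 500, "shown": 0},          # deliberately excluded
    "modest":   {"size": 500, "shown": 200},        # quota met entirely here
}

rate_a = group_a["shown"] / group_a["size"]
rate_b = (sum(s["shown"] for s in group_b.values())
          / sum(s["size"] for s in group_b.values()))

assert rate_a == rate_b == 0.2                      # marginal parity holds...
assert group_b["affluent"]["shown"] == 0            # ...yet a subgroup gets nothing
```

The group-level metric reports perfect parity; only an audit that also examines subgroups can see the exclusion.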
And so I think that, you know, it's
even worse than sort of protecting
against false positive or false negative
disparities between groups. You need to
like be worried about kind of an
explosively combinatorially large
combination of subgroups. Arun: Okay, so we seem
to have three questions so far. Why
don't we take all three? One after the
other. We all have pens and little index
cards and then I'll invite the sort of
like you know the panelists to answer
any subset of those three questions. So
why don't we start on this side? Audience member: Yes in
the conversation about robo investing
and even ethics here it seems to me it's
all very much oriented around passivity,
meaning machines are making decisions on
behalf of institutions for consumers and
I guess I'm interested in understanding
what your thoughts are on
allowing machine learning to empower the
individual to be a component of those
ethical decisions, meaning you're
filtering information to allow them to
be more active participants in a
decision process as opposed to passive.
Arun: Okay, great and what was your name? Audience member: My
name is Emile Westergaard. Arun: Okay, thank you.
So, that's the first question, passivity,
and like making algorithms components of the decision. Audience Member:  I'm Paul Motty from MES Finance.
My question is really, isn't the sweet
spot for AI application and financial
services payments fundamentally because
we have the convergence of digitization
and payments, we have the convergence of
deregulation where there's a mandate, PSD2
in Europe, and we have a whole
movement globally to instant payments, so
you have at least 20 companies in 20
countries that are moving to instant
payments, fast payments. Here in the U.S.
TCH, The
Clearing House, is moving to RTP,
real-time payments. There's no way human
intervention can address instant
payments from a compliance standpoint, so
you need the application of AI to that
area to be able to interrogate all of
that activity. Arun: Okay great, and your name
was?
Audience Member: Paul Mahady from MAS. Arun: Okay thank you. Audience Member: Hi, my
name is Ahmad.
I run a consulting firm. I'm a programmer,
and I have programmers who build systems
for financial services, most of the big
banks, and I want to ask in specific in
terms of AI, I started computer science
bachelor's and master's and most of my
time talking to clients is not really
writing code, but managing expectations
because of the hype that people have
about AI and consciousness and
philosophical, intellectual BS sometimes...
Arun: It's us professors who are setting those
expectations that you have to manage. [audience laughs] Audience member: So maybe along those lines,
but for the computer science professor,
if you studied Marvin Minsky's work
from the eighties and the AI winters, if
you really think about it, really nothing
new has been really extremely developed
in AI. It's mainly more computing power,
but nothing in the theory of it, and so
yeah, I mean what's your comment on that?
You know, I've always
argued with people about how AI is part
of computer science, not part of
philosophy. There's no machine... but yeah I
mean, have there been really big
changes if you think about from a
technical perspective, or is it just the
fact that we can upload stuff to the
cloud and process better in terms of
cognizance and what decisions
are being made? How simplified, how structured
they are versus, you know, complex human
decisions. Anyway... Arun: Okay great, thanks.
We'll take one final question and then we'll open for answers.
Audience Member: Hi, I'm Maria D'Albert. My question
actually relates to some of the
questions you raised about fairness. I'm
particularly interested in areas of
financial inclusion and I was curious about
your thoughts on, beyond not being able
to keep up... I think there's an
assumption, almost inherent in
the framework, that there is a baseline
of fairness in the data sets
that exist and in fact some of the
challenges and in the tightness of
regulation these days is that they
actually are reasserting
the limitations on what data is used,
how do you know who somebody is, the
runaround with KYC, and so I was curious about
your thoughts on how does AI partner to
open up the understanding of what is
what is a fair set of data, fair set of
algorithms, and decision trees? Because
those seem actually regressive in many
ways, because of the current environment.
Arun: Okay great, thanks. So I'll just
summarize. The first question was about
whether we should reject passivity, and
make machine learning a component of,
like you know, sort of some of the
decisions that we're talking about. The
second question was, is payments the
sweet spot for sort of AI applications
because, like you know, we need instant
payment and, like you know, you can't have
humans do compliance there? The third
question, asked for a comment on, like you
know, is the only new thing about AI
today the fact that we've got sort of
massive new gobs of computing power? And
the final question was about, like you
know, sort of creating a baseline for
fairness in sort of Financial Inclusion
and sort of more definitional issues. So
I throw it open to the two of you. Michael: Could
be here all day. Arun: So we do have, what, seven
more hours before the reception.
Vasant: Well I guess I can start with the one
about managing expectations and whether
there's anything new. I think it's a
little uncharitable to say that there's
nothing new. I think that, you know...
I would not, and maybe it's
just because I have low expectations, but
I would not have imagined that I would
see in my lifetime the system where you
type in one language and the correct
answer comes out in another language
instantly, right? I would not have
imagined that that would happen in my
lifetime, and I've been in AI, you know
for 38 years,
you know, indicating my age here. I would
not have imagined that... Arun: You started when you were born. Vasant: I would have started,
yeah, exactly, close to it. Arun: In the womb. [audience laughs]. Vasant: You know I
would not have imagined that, you know,
we'd have cars driving us around
autonomously, you know, and the famous
old arguments that, you know, machines
could not be intelligent because they
couldn't perform basic tasks like
driving. So I think the proof is in the
pudding, right? We're seeing some really
powerful technologies emerge and
powerful capabilities, so there must be a
basis for why all of this stuff actually
seems to work. I have a harder time
describing it in one sentence or saying
here's the theory behind it, but I think
collectively there's been a lot of
progress in bits and pieces, and we're
seeing the impacts of that. And what do
you think, Michael?
Michael: Well, let me, since you answered that one,
so eloquently let me try to just move
around here a little bit and maybe
address these pieces of the first and
the fourth questions, which I think are
related in some ways. So if I understood
the first question, it was about, you know,
what about the possibility of providing
machine learning or algorithmic tools to
consumers to participate in some of
these decisions that seem to be moving
towards automation? And, you know, I'm
personally all for that. Of course one of
the difficulties is that, you know, it's
not just the machine learning tools, it's
kind of who controls the data, right? And
so this would involve also, you know,
giving access to the data that's being
used to make your mortgage decision to
you. Much of that data is proprietary and
you know, these days your online behavior
often gets fed into credit scoring
models and things like this. But
I'll take, you know, maybe an even more
philosophically extreme question in the
same direction, or comment in the same
direction, which is, you know, I think it's
entirely reasonable and important for
society as a whole to, you know, kind of
collectively decide what types of
decisions they think machines
should and shouldn't make. And so you
know, this sort of subfield of AI that
thinks about ethical questions, there is
debate about this. And to give one
example of you know, an extremal point
you know, I think many people believe
that in automated warfare for instance,
the decision to kill another human being
should never be made by a machine or an
algorithm, even if the machine or
algorithm was much more "accurate" in some
sense than human decision makers, just
because there's like a moral agency to
that decision that a machine can't
possibly share, and so even if the human
being is gonna make more mistakes, you
don't want, you know, you
don't want algorithms making the
decision. And so I think that this is
actually sort of an important discussion
for society to be having as a whole.
Arun: I mean it's interesting you bring that up.
Michael: It's about sort of drawing boundaries, right?
Arun: Because, you know, because I'm
reminded of Wendell Wallach and his sort
of like, you know, his effort over the
last five years to try to get
governments to commit to this. Michael: Right.
Arun: unsuccessfully, and sort of, and this
and this, seems like an obvious one, right? Michael: Yeah.
yeah. Arun: And so does that sort of highlight
a more general problem? Michael: Yeah, and I mean
you know, many of you may have seen these
kind of philosophical conundrums around
self-driving cars, where your
self-driving car has, you know,
lost its brake system and so it has a
choice of either saving your life by
plowing into a crowd of schoolchildren
or taking a left turn into a wall and
killing you, you know. What should the
algorithm do? And so these things are
kind of parlor games in some ways, but
they're getting at, you know, I think
important issues for us to be thinking
about. On the last question about
you know, kind of fairness and inclusion
in finance, which again I think is a very
important issue, I think we should
remember that things like discrimination
and bias were with us before there were
computers, right? And so it is important
to remember that, you know, these
trade-offs that I talked about, you know,
the only thing that's kind of new here
is our ability to quantify those
trade-offs in a scientifically precise
way and study them and decide where we
want to be on the trade-off. But these
trade-offs were always there when human
beings were making criminal sentencing
decisions or loan decisions or what have
you, and so it's important to, it's
important that we not try to expect too
much from algorithms and machines, right?
We, I mean, the, you know, we can't expect
them to be perfectly fair and perfectly
accurate, but we can at least as a
starting point ask that they, you know,
not be worse than the discrimination and
bias that human decision makers made. I
think one of the biggest concerns about
sort of moving from human to algorithmic
decision making in these sensitive areas
really has to do with the scale and
uniformity of decision making. So, you
know, there have always been like racist
loan officers, right? And so,
you know, in the old days you'd go to
your neighborhood bank and you know the
loan officer would deny your loan and
they wouldn't tell you it's because of
you know your race. They would make up a
bunch of other reasons. But at least in
that era there was the chance that if
you were denied the loan at one bank
maybe you could walk to the bank next
door and get a completely different
decision from somebody who wasn't racist.
Now, you know, if racism and
discrimination are baked into
statistical models, and for instance
there's really only a couple of credit
scoring companies, you know, that
generate all of our scores, these days if
you're rejected for a loan from
one bank, the conditional probability
that you're going to be rejected by all
of the banks is extremely high, because
they're all using the same models
derived from the same data. And so,
you know, just the scale and
uniformity of the decision-making that's
possible with algorithms
I think presents kind of a special
danger that we need to be very vigilant
about defending. Arun: Okay so I'm
getting signals that we need to wrap up, so
Vasant do you have any sort of
closing thoughts before I wrap up? Vasant: I was actually going to
respond to one more question, which is
that yeah, I agree that payments is a
sweet spot, in fact, it's also one of
the vehicles for inclusion in India,
right? You're seeing more payments happen,
and so you're seeing the potential for
now people to actually make loans to
these people making payments, so those
two things actually can even go
hand-in-hand. Arun: Ok, so just to sort of
wrap up in 30 seconds it seems like
we've sort of seen both a framework for
thinking about when will computers make
decisions and when will humans, and like you
know, what kind of, like you know, how will
that frontier move? And sort of like you
know, sort of the beginnings of what
seems to be a tremendously important
sort of discussion about as we sort of
start to delegate decisions more and
more to robots in the financial services
space, how do we integrate sort of issues
of fairness? Seems like there are
industries ranging from sort of robo
advising to mortgage lending all the way
down to financial compliance, which I
think is going to sort of be automated
at a pace that is greater than, like you
know, many others and not just in
payments, where we're going to sort of
see a lot of automation over the next
ten years,
and you know, I heard Michael sort of
call for more theory in thinking about,
like you know, sort of how to address the
issues of fairness. I heard Vasant
call for lowered expectations about, like
you know, what
we place in these AI systems. Michael: Yes, those two go well together.
Arun: Yes, and you know I just want to
conclude by thanking Kathleen and Raghu
for, like you know, putting together such
an amazing program that you're going to
sort of see play out after the
break. I was thankful that you introduced
me, Kathleen, and not Raghu because he
used to be my professor when I was in
grad school, and every time he introduces
me he tells an embarrassing story about
me and grad school and so it was nice to
sort of see the upside and the more
recent sort of, a more recent view of me.
But like, you know, thank you again and lets wrap up.
[Applause]
Kathleen: Okay so,  that was amazing. So we're
running like 15 minutes behind schedule, so if
you guys could be back in here by
quarter of? Which is time for a quick
coffee, bathroom break, whatever. Come back
here a quarter of, no later, okay? Thank
you!
