Our first speaker is Robert Stojnic. He's from Atlas ML, and Atlas ML is all about democratizing machine learning in the open-source community. He also runs something called Papers with Code, which is taking a data-driven approach to seeing what is hot at the moment in machine learning, so this will be a really good opening presentation. Welcome, Robert.

Thank you, thank you very much to everyone who came here on this horrible day, withstanding the cold and the damp and everything; hopefully it will be worth it for you.
All right. So, my name is Robert, I'm CEO of Atlas ML, and we created this website, Papers with Code, essentially to be able to keep up with ML. So what is the problem with keeping up with ML? Well, every day there are something like a hundred papers published, and probably as many new GitHub repositories; it's just this constant avalanche of information: new methods, new results, new this, new that. So some time last year, in July, we created this website where we essentially said: we want to know which current papers the community is talking about, and we only want to focus on papers that actually have a code implementation, so there is something to try, not just something people have claimed they've done. We automatically linked papers to code, and we present them on a trending page. If you go to paperswithcode.com it will look different, because this is a screenshot from last week, but it lets you monitor what is happening with the latest research. But
that wasn't enough. Another problem we found when we tried to keep up with machine learning is that we never really knew what the best method for a specific thing was, because there are always new results coming out. Even when I went to a researcher and asked them what the current state of the art in semantic segmentation is, they would say: oh yeah, I heard about this paper, but I'm not sure, and so on. So what we then did, in February this year, was create the biggest database of machine learning leaderboards, tasks, datasets and papers with code, so essentially we have all of this information in one place,
so you can more easily track progress in machine learning. For instance, if I'm interested in computer vision, and specifically in semantic segmentation, I can click on that card and it will lead me to a page that looks like this. It gives me a brief description of what semantic segmentation is, then it gives me all the different datasets that people use for semantic segmentation, and what is currently the best method on each of those datasets, with direct links to the paper and to the code. If I want to go even a bit further, say I'm specifically interested in the Pascal Context dataset, I can click on that and get a graph that looks like this: the green line is the state of the art and how it evolved over time, showing the progress we made as a community on this specific benchmark. If you're a researcher, you might want to understand these sudden jumps and the insights that enabled them; if you're a practitioner, you might want to look at a couple of the best methods and see how they perform on your dataset, or whether there are ideas you can use from them. And once again, for each of these you get links to papers and to code, so you can try them out. So this is Papers with Code right now: a lot of the
data aggregation is automated; however, sprinkled around the website there are also edit buttons, so if you find an error, or there is a result that hasn't been included, maybe your result or someone else's, you can go in and edit it. We made everything completely open to the community, and don't worry, everything is versioned, so don't worry about adding stuff. We also have a pretty big community, on Slack and elsewhere, of people who add results, review them and so on, and we are entirely indebted to this community for keeping this up. All the data is open: if you go to the About page you can download all of it under a CC BY-SA licence, so just like Wikipedia you can use it for whatever purpose you want, as long as you credit us. So essentially this website aims to bring this information together and make sure it is accessible to everyone.

All right. In the second half I want to tell you about some of the insights we got from the data.
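Since the whole dataset is downloadable, you can compute insights like the ones below yourself. Here is a minimal sketch; note that the `year` and `repos` field names are assumptions for illustration, not necessarily the real dump's schema:

```python
import json
from collections import Counter

def code_coverage_by_year(papers):
    """Fraction of papers per year that link to at least one code repository."""
    total, with_code = Counter(), Counter()
    for paper in papers:
        total[paper["year"]] += 1
        if paper.get("repos"):
            with_code[paper["year"]] += 1
    return {year: with_code[year] / total[year] for year in total}

# Inline stand-in for the real dump; normally you would json.load() the file.
sample = json.loads("""[
    {"title": "A", "year": 2013, "repos": []},
    {"title": "B", "year": 2018, "repos": ["https://github.com/example/b"]}
]""")
print(code_coverage_by_year(sample))  # {2013: 0.0, 2018: 1.0}
```

On the real dump you would replace the inline sample with the parsed contents of the downloaded file.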
So first of all, you might want to know: what are the three most popular papers with code this year so far?

Number three: GANs. Here is work from a group in Korea: you can take a picture of a bald person, draw some color for hair, and the model is going to inpaint the actual hair, and you can see multiple examples of that. Two and a half thousand stars on GitHub; that's number three.

Number two: GANs again. What's happening here is sketch-to-image synthesis: you can draw a sketch of the kind of scene you want synthesized, and the GAN is going to take your sketch, so here there was no shore, we drew a shore, and it inpainted it. Four and a half thousand stars on GitHub.

So what's number one? Not GANs; something even more dangerous, and that is GPT-2. This is the work from OpenAI: six thousand stars, and it made it into the Guardian as well. The code for this is actually available, so you can get it and play around with it, and the smaller models are available too; they just haven't released the very big model, the very big language model that essentially breaks the state of the art on various benchmarks, essentially by using more data than everyone else. However, there is some chatter on social media that somebody has reproduced the biggest model and is going to release it soon, so we'll see whether those predictions come true or not with these staged
releases. You might also want to know, given this whole OpenAI story: is the community actually getting more open, or less open? Are we going in the right direction as a community? I have some good news for you there. If we look at the percentage of machine learning papers that have code over time, that percentage in 2013 was just under 3%, and last year it was more than 15%. So as a community I think we're definitely going in the right direction. Obviously we're still far off 100%, where I think we should be, with every paper having some sort of code, something you can try out, and all the data and models published alongside it, but I think we're going in the right direction.
And our mission at Atlas ML is to help accelerate those developments further. Papers with Code is the first thing we built, and we're going to release a bunch of new stuff, so if you want to stay up to date with what's going on, you can follow me, or Papers with Code, on Twitter to see the latest and most interesting papers. Thanks very much.
Thank you for that, Robert. Next up we've got Jane Wang from DeepMind. She's a senior research scientist there, with a background in computational cognitive neuroscience, which will make perfect sense when you hear her talk about causality. Thank you, Jane.

Thank you very much. So, I'll be talking today about meta-learning causality in humans and AI.
Now, artificial intelligence, as we all know, has yielded some pretty impressive advances in recent years, but one way in which it still lags the way humans are intelligent is in abstract and causal reasoning, and that's why I'm talking about this today. Why is this important? Well, in the real world we are able to learn much faster if we have a notion of what are called priors: if we are able to learn from our past experience and apply it to our current situation. For instance, I can take my knowledge of physics, or of temporal continuity, and apply it when I'm learning a new softball game, say. Even more useful are causal priors: knowledge of relationships in terms of cause and effect. Having a notion of causality is important because, in the real world, which is very interactive and in which my own actions can actually cause effects on the world around me, it is very useful to be able to model those cause-and-effect relationships, and that can help us with faster adaptation and control, exploration, planning and experimentation. In general, it essentially makes everything a lot easier. All right, so this
is why it's such a fundamental part of human intelligence, and it's important for two reasons. The first is that causal reasoning is essentially automatic: it's something we just do without really thinking about it. Here's an example. This is me sitting at my desk, maybe at night; this has happened to me before, where the door just kind of opens as if by itself, and it looks really creepy, just opening by itself. I could actually get scared, because maybe I'd think my house is haunted and I have a paranormal-activity situation here. But I don't, and the reason is that I have a cat, and I know pretty much automatically that I don't have anything to be alarmed about, because he's learned how to open the door. Even though I can't see him, I can automatically make the inference that my cat caused the door to open. Another reason why having causal reasoning might be important is that it's
useful for making important decisions, for instance in medicine. Say that I'm a medical doctor, which I'm not, so take what I'm about to say with a grain of salt. If I were a medical doctor and I had a patient with high cholesterol, I might think about giving them drug A, knowing that drug A lowers cholesterol; but I also know that drug A can reduce clotting, the blood's ability to keep clotting. Drug B increases clotting, but it also increases blood pressure, which has subsequent effects on the first two. So this is quite a complex system of interactions, but if I had access to all of these connections, then I should be able to make a good decision about which drugs to prescribe. Now, I've presented it this way because this is what's called a causal graph, or a causal network. It's essentially a shorthand, an easy way for us to visualize all the complex interdependencies that exist in this system, and it allows us to reason about and draw conclusions about
the decisions we can make in this situation. There are a lot of formal tools and algorithms that have been developed to make explicit use of such causal graphs. The simplest version really is just this two-node causal graph here, where the nodes represent variables, which are essentially things you can measure in your environment: for instance, B could be cholesterol and A could be my diet. Essentially, I can observe the values of these variables, and something called a conditional dependency, and from that I can infer the existence of the arrows. I can't directly observe the arrows, the causal relationships themselves, but I can make an inference based on these observations.
A conditional dependency is just, for example: if I know that when I observe A I always observe B, but when I observe B I only observe A 50 percent of the time or so, then I can say with more confidence that the arrow of causality probably goes from A to B and not the other way around. There are much more sophisticated ways to derive causal relationships, and much more complex causal graphs, from just observing these conditional dependencies and frequencies.
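The "A always brings B, but B only sometimes brings A" heuristic just described can be sketched in a few lines. This is an illustrative toy, not a real causal-discovery algorithm, and the data-generating numbers are made up:

```python
import random

def orient_edge(samples):
    """Compare P(B|A) with P(A|B) over boolean (a, b) observations and guess
    the arrow's direction: the higher conditional frequency wins."""
    n_a = sum(1 for a, b in samples if a)
    n_b = sum(1 for a, b in samples if b)
    n_ab = sum(1 for a, b in samples if a and b)
    p_b_given_a = n_ab / n_a if n_a else 0.0
    p_a_given_b = n_ab / n_b if n_b else 0.0
    return "A->B" if p_b_given_a > p_a_given_b else "B->A"

# Generate data where A really does cause B: B always fires when A does,
# and also fires on its own half the time.
random.seed(0)
data = []
for _ in range(1000):
    a = random.random() < 0.5
    b = a or random.random() < 0.5
    data.append((a, b))
print(orient_edge(data))  # A->B
```

Here P(B|A) is 1 while P(A|B) is about two thirds, so the heuristic recovers the true direction.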
But there are a lot of situations in which observation alone isn't enough, and that's when one of your variables is unobserved, hidden. For instance, if there were some underlying factor, like an illness in my patient, that is causing the high cholesterol, and I don't observe it, then that's going to lead me to erroneous conclusions about the causal relationships. In that case, what you have to do is make what's called an intervention. All an intervention does is set one of the nodes to a specific value and disconnect it from all the nodes pointing into it. This makes it a lot easier to reason about the directionality and the weights of the subsequent arrows coming out of this particular node, particularly if you compare against the case where I don't intervene on that node, or I intervene on a different node. The popular version of this, which you might have heard of, is the randomized controlled trial, which is of course used a lot in medicine and pharmacology, and is sort of the gold standard for trying to understand causal relationships. But of course, humans are not conducting randomized controlled trials on a day-to-day basis in order to make decisions and understand causality.
So that leads to the next part of my talk: how does causality develop in humans? When does it show up? Is it something that exists from the time we're born, or something that develops with experience? To foreshadow a little bit, to spoil it a little: I'm going to show you some evidence demonstrating that it's actually a bit of both. It not only develops along a set developmental trajectory, but later in life it also starts to become more and more influenced by experience, and by the kinds of environments we inhabit, so that we sort of veer off of this set trajectory a bit.
So, starting from the beginning, with babies: although we know a lot about what babies can and cannot do, they've been studied a great deal, what's relevant for causality is that they don't seem to demonstrate causal knowledge. They have rudimentary senses of physics, of objects, and even of numbers, but they don't seem to have notions of causality. Two-year-olds, on the other hand, can learn simple causal predictive relationships, essentially between two events, pretty simple relationships, but they can't spontaneously make interventions based on causal understanding: they can't try to change something in the world in order to get more causal knowledge. By the time we're three or four years old, we can start to infer more complicated causal graphs just from observing them, and we can even infer unobserved causes, like that example with my cat: I'm not able to see my cat, but I can infer that he caused the door to open nevertheless. So children start doing these kinds of things, consistent with a more optimal, Bayes-net-style formalism. At four to five, children start
to make informative and targeted interventions based on causal knowledge, so now this is where we can actively seek out information in the environment to develop causal knowledge. Interestingly enough, children have been shown, in certain situations, to be able to learn causal relationships that are more complicated than adults can. Alison Gopnik and her lab have done really wonderful work on this, where they essentially put both children and adults in situations where, from the data, the correct inference to make is a more complicated causal conclusion; nevertheless, adults tend not to come to that conclusion, and instead infer a simpler relationship than there really is. That's because adults tend to have pretty strong priors for simpler causal relationships, and as you'll see, this prior tends to be more useful over a wider variety of situations. Children don't have these priors yet, and that's why they can essentially do better in these situations. Adolescents are starting to
develop more strategies for causal learning, in such a way that there is more individual variability: you start to see differences from person to person, evidencing the influence of experience. There is also evidence for certain biases starting to show up, and when I say biases, I just mean deviations from what an optimal agent would do. I'm not going to get into the details of this particular bias, called the associative bias, but what's important to note is that an optimal model, something that does the best that can be done in this given situation, would make the predictions indicated by these solid lines, whereas humans make predictions according to the dotted lines; you can see these deviations from what is optimal. So overall, just
to sum up really quickly: children develop along a pretty set developmental trajectory from the time they're born, developing increasingly optimal causal reasoning from observations. At the same time, as we get older, we start to be able to perform causal interventions and actively seek out information, and this comes with individual variability, an increased influence of past experience and priors, and also these apparent deviations from optimality. So I want to get into a
little bit here: what are the reasons for these deviations from optimality? Again, by deviations I mean deviating from what a formally optimal agent would predict. These formal approaches to causal reasoning typically tend to be optimized for specific situations: essentially, they rely on specific assumptions, for instance that the model class is a certain way and is not changing. Humans, however, are not optimized for specific causal graphs, but rather for the real world, and one thing we know about the real world is that it's very dynamic. It's constantly changing, and we can't know for sure what the underlying causal graph is: there is a lot of uncertainty over its form, as well as over which variables are relevant. So essentially what we have to do is learn a set of useful priors from experiencing this world, and we also have to operate under the constraints of bounded rationality and limited time: I can't spend forever making a decision, I have to be able to do it quickly. Under these conditions, with dynamic environments, perhaps the apparently optimal, rational approaches are not really optimal, and given these sorts of constraints, maybe humans are the more optimal ones. But of course, the universe is also not completely random; it
is structured. So what is useful for this set of structured tasks that we encounter on a day-to-day basis is to have structured priors, and having a notion of a universe of structured tasks on which we're training is exactly the setup for the next topic I'll be talking about, which is meta-learning. This is now going to relate more to machine learning and artificial intelligence. Meta-learning is a topic that was introduced in the 90s but has become very, very popular in recent years; this is just a small slice of the amount of work being done on it right now, and everybody seems to have a slightly different perspective on what meta-learning is. To me, the commonality among all of these frameworks is that you essentially have two nested loops of learning, where the outer loop is trying to learn priors that are useful such that, in the inner loop, you learn much more quickly. The inner loop is learning task specifics, whereas the outer loop is learning over a distribution of tasks: it's learning that structure.
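Those two nested loops can be made concrete with a deliberately tiny, made-up example: each task is "estimate this task's mean", the task means share structure (they cluster around 5), the inner loop adapts from three observations, and the slow outer loop distils that shared structure into a prior:

```python
import random

random.seed(0)
prior = 0.0   # the meta-learned prior; starts uninformed
lr = 0.05     # outer-loop learning rate

for _ in range(500):                                # outer loop over tasks
    mu = random.gauss(5, 1)                         # sample a task's true mean
    obs = [random.gauss(mu, 2) for _ in range(3)]   # inner-loop data (3 shots)
    est = 0.5 * prior + 0.5 * sum(obs) / len(obs)   # fast inner-loop adaptation,
                                                    # shrinking toward the prior
    prior += lr * (est - prior)                     # slow outer-loop update

print(round(prior, 1))  # close to 5: the structure shared across tasks
```

The shrinkage weight and learning rate here are arbitrary illustrative choices; the point is only that the outer loop ends up encoding the task distribution, which is what lets the inner loop do well from just three observations.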
So in the next part of my talk I'm going to discuss how AI can learn causality in ways similar to humans, through this meta-learning framework. This is what the setup looks like in our machine learning experiments: we have reinforcement learning on both the inner and the outer loop, and reinforcement learning just means that we're learning from reward. The environment passes the agent an observation and a reward at every time step, the agent passes back an action, and it is trying to get as much reward as it can. We use a recurrent deep neural network called an LSTM, and all that's important to know about it is that it's able to integrate over a past history of observations, rewards and its own actions, and to pass on a hidden state that essentially serves as a memory.
Now, the agent itself is parametrized: it's determined by these weights theta, which are tuned by the outer loop of learning. The weights theta essentially operate as the priors that help the agent learn more quickly on the inner loop, and this is all optimized using something called policy gradient, which tries to maximize the total amount of reward. A key part of this is that on every episode, where an episode is just a limited number of steps of interaction with the environment, we sample a different task from a distribution of environments; this is just that universe of structured tasks I was discussing earlier. So the question you want to ask is: in this setup, can we get our meta-RL agent, as I'm going to call it, to learn causal priors? More precisely: given different types of experience, can the same meta-RL architecture that I showed you earlier learn to use information to display causal knowledge at different levels? An analogy would be training three different humans, where each of them gets a different kind of experience for their entire lifetime, and then seeing how well each can perform causal reasoning, and at what level. If it's true that with meta-learning our agents can learn these useful structured priors, then giving them different types of experience and measuring how well they do is a good way to test this. All right, so really
quickly, I'm just going to go through an example episode. Oh, and I forgot to mention that the three types of experience we give the agent are: learning from pure observation alone; learning from interventional information, where it can intervene on these graphs; and additionally having access to noise information, which I'll explain in a bit.

So, for the agent with interventional experience, this is an example episode. We have five nodes in our causal graph, and we sample the graph randomly, randomizing the arrows between the nodes. The agent can only observe the values of the nodes, not the arrows, and one node is hidden: it can't observe that fifth node's value at all. For four time steps it is able to select any one of the nodes to intervene on and then look at the observed values, and on the fifth time step we test it: we set a known node to a previously unseen value and ask the agent to pick the node with the highest value. This is actually quite difficult, because of the limited amount of information: the agent only gets four observations before it's tested on a graph it has never seen before, and there is random noise added at every time step.
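An episode of this kind can be sketched as follows. This is a simplified stand-in for the actual environment: a random five-node DAG, a do-style intervention that ignores incoming edges, a random policy in place of the trained agent, and made-up weights and noise levels:

```python
import random

N = 5  # nodes in the causal graph; in this sketch node 0 plays the hidden node

def sample_graph():
    """Random weighted DAG: upper-triangular adjacency, weights in {-1, 0, 1}."""
    return [[random.choice([-1, 0, 1]) if j > i else 0 for j in range(N)]
            for i in range(N)]

def node_values(w, do_node, do_value, noise=0.1):
    """Node values in topological order; the intervened node is set directly
    and its incoming edges are ignored (the do-operation)."""
    v = [0.0] * N
    for j in range(N):
        if j == do_node:
            v[j] = do_value
        else:
            v[j] = sum(w[i][j] * v[i] for i in range(j)) + random.gauss(0, noise)
    return v

random.seed(3)
w = sample_graph()

# Four information-gathering steps: intervene on a visible node and observe
# the visible node values (a random policy stands in for the learned agent).
for _ in range(4):
    target = random.randrange(1, N)
    observed = node_values(w, target, 5.0)[1:]

# Test step: a node is set to a previously unseen value and the agent must
# pick the remaining node with the highest value.
test_vals = node_values(w, 1, -5.0)
choice = max(range(2, N), key=lambda j: test_vals[j])
print("pick node", choice)
```

The real environment and agent differ in many details, but this shows the shape of one episode: a freshly sampled graph, a few interventions, then a single test query.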
So, getting to the results now: in order to interpret them, I have to compare against two baselines, essentially asking what is the best I can do given this amount of information. The optimal associative baseline is how well I can do if I only have associative, not causal, information: if I'm not taking causality into account at all, but just know which things happen together. And I'm normalizing everything against the best I can do if I know the true underlying causal graph. This is how well an agent does if it has been trained on only observational data: you can see that it's doing better than the optimal associative baseline, meaning it's able to make some kind of causal inference from observational data alone. Given interventions, though, agents can do almost as well as if they had access to the ground-truth causal graph.
Now I'm just going to copy this over and shift the x-axis, because I want to compare against an optimal counterfactual agent. This is an agent that has access to the noise: in any given situation there is going to be noise, things you can't account for, and if I can rewind time and think back, "if only I had done that in that particular situation", then I can do much better than if I just know the underlying causal graph. That's what this baseline is, and giving our agents access to that noise information allows them to do better than before.
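Why noise access helps can be seen in a tiny toy model (everything here, including the relation C = 2B + noise, is invented for illustration): reusing the episode's recorded noise lets an agent answer "what would have happened" exactly, rather than only on average.

```python
import random

# Toy model, invented for illustration: outcome C = 2 * B + noise.
def outcome(b, noise):
    return 2 * b + noise

random.seed(2)
noise = random.gauss(0, 1)            # the episode's particular randomness
observed = outcome(1.0, noise)        # what actually happened with B = 1
counterfactual = outcome(3.0, noise)  # what would have happened with B = 3

# Because the same recorded noise is reused, the comparison is exact:
print(counterfactual - observed)  # 2 * (3 - 1) = 4, up to float rounding
```

Without the recorded noise, the agent could only predict the average effect of changing B; with it, the noise cancels and the counterfactual is exact.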
I'm running out of time now, so I'm just going to flash up these takeaways and say thank you very much to my collaborators and colleagues, especially Ishita Dasgupta, who led the work I've been talking about here. Thank you very much.

That was wonderful. We've got one
more talk before we move on to the panel, and this is Peadar Coyle. He is the founder of a start-up, let me get this right, Aflorithmic Labs; they do hyper-personalized, bespoke podcasts. That sounds absolutely fascinating to me, but it's not what he's talking about today. He also spends a lot of time as a core contributor to a library called PyMC3, which is a probabilistic programming library, so he's going to talk about that now, and I suggest we try to flag him down after this session to talk about the bespoke podcasts. Welcome to the stage.

Thank you very much.
This is beautiful British weather, so thank you very much for coming into a tent. I won't keep you too long; I'll probably be under the 20 minutes. So, this is myself, Peadar Coyle, and I'm not talking about podcasts. This is me on the internet, and if you want to learn more about probabilistic programming, if this interests you, there is a course I happen to run; London's expensive, so do reach out on Twitter. Just to say a bit about one of my open-source contributions, PyMC3: we recently released 3.7. It's Python 3 only, so we finally caught up with current reality; we use ArviZ for plotting; we got a really nice Data class; and there are a lot of under-the-hood improvements which you won't really notice, but which make things a lot easier to adopt. That's the link there. So, what is this, and why does it matter?
So: uncertainty is everywhere in the world. As Jane already talked about, we deal with uncertainty all the time, and predicting the future is hard, as everyone who has ever tried to do sports betting has discovered, and if anyone's ever seen one of my PyMC talks, you'll know all about that. There are a lot of small-data problems in the world: surveys, A/B testing, etc., and that's the kind of thing we'd like to handle. Building Bayesian methods can be hard, but it can also be really hard to build a deep learning model; and I'm not anti deep learning, I jumped on that a long time ago. So, isn't everything a machine learning problem? This is one of the first things you discover when you become a senior data scientist and you have a junior data scientist who gets very excited, runs off to paperswithcode.com, and comes back thinking: oh, this is our solution; when you have a hammer, everything looks like a nail. So not all problems are machine learning problems. You have things like heterogeneous data, real life is very complicated, you have small-data problems, and if you throw a random forest or an XGBoost at those you'll get substandard results. So this is the hill I will die on. So one
of the questions is: what are the applications, and why does anyone care? Is this just industry? We live in a capitalist society, for good or ill, but in any case you want to understand uncertainty. I did a survey on this: something like eighty percent of the use cases of PyMC3 in the wild were A/B testing, and about 15 percent were things like risk modelling or financial modelling; if you have capital to allocate, you want to allocate it in an efficient manner.
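A/B testing, the dominant use case in that survey, is a nice illustration. In PyMC3 itself this would be a few lines with a `pm.Beta` prior, a `pm.Binomial` likelihood and `pm.sample`; as a dependency-free sketch, the same conjugate Beta-Binomial model can be sampled with just the standard library (the conversion counts below are made up):

```python
import random

def prob_b_beats_a(conv_a, n_a, conv_b, n_b, draws=20000):
    """P(variant B's conversion rate > variant A's) under uniform Beta(1, 1)
    priors: the posterior is Beta(1 + conversions, 1 + non-conversions),
    which the stdlib can sample directly in this conjugate case."""
    wins = 0
    for _ in range(draws):
        p_a = random.betavariate(1 + conv_a, 1 + n_a - conv_a)
        p_b = random.betavariate(1 + conv_b, 1 + n_b - conv_b)
        wins += p_b > p_a
    return wins / draws

random.seed(0)
# Made-up data: 120/1000 conversions for A versus 145/1000 for B.
print(prob_b_beats_a(120, 1000, 145, 1000))  # roughly 0.95
```

The output is a full probability statement, "B is better with about 95% probability", rather than a bare point estimate, which is exactly the kind of uncertainty quantification these tools are for.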
There are other things like price modelling too. The common thread that links these together is that you have a lot of uncertainty, and uncertainty is the key thing you care about here. So that brings us to what's called a
probabilistic programming language. I think when I started working on this a few years ago there weren't very many, but now there are loads; this slide is already out of date, which is the best part about modern-day innovation. You've got Pyro from Uber, which is an excellent library based on PyTorch; you've got PyMC3, on which I publicly have no opinion whatsoever; you've got Stan, which is a domain-specific language written in C++, although I think it's moving to OCaml, and it has a really advanced community, though from an adoption point of view I think it's relatively complicated; you've got Edward, which was written by Dustin Tran, and I think there's an Edward2 now, I have trouble keeping up with him; and there's Rainier, which is by my friend Avi Bryant in the Scala community. He's at Stripe, and they're applying these things to areas like identity modelling: trying to predict whether someone is fraudulent or not, given whatever priors you have about their previous behaviour. So one of the
questions is: who is actually using this in the wild? This is a slide I'm going to regret putting up, and it's probably out of date as well: this is who uses Stan. You've got people at Google; Facebook uses Stan for Prophet, their time-series modelling tool, and there's a really good paper on that by Sean Taylor and a few others; you've got insurance and reinsurance companies; and Amazon is using it too. You'll notice a lot of these are in regulated environments, where explainability is really important; this is going to come up later, probably on the panel: how do we explain AI, and how do you explain your statistical models, and so on. And who uses PyMC3? This is a much
more interesting slide, no bias here whatsoever. You've got Google, Stripe, Salesforce and various others, and Quantopian, where Thomas Wiecki and others work on PyMC3. At Quantopian they use it for modelling portfolios: Quantopian is a platform for doing financial trading and learning how to do quantitative finance, and they have a lot of applications there. So, touching on this briefly: this is what a modern Bayesian workflow looks like, and you'll notice a difference between this and a machine learning workflow. Here you're using criticism of your model to improve it, whereas in a machine learning model you're often trying to optimize a particular objective function that you've decided on beforehand. So, one of the questions
is what's coming next there is gonna be
a pine c4
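As a rough sketch of that criticize-and-revise loop (not code from the talk; the coin-flip data, the Beta-Binomial model, and all numbers here are invented purely for illustration), in plain Python:

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up data: 100 coin flips, 62 heads.
n, heads = 100, 62

# 1. Model: Beta(1, 1) prior on the bias p, Binomial likelihood.
#    Conjugacy gives the posterior in closed form: Beta(1 + heads, 1 + tails).
a_post, b_post = 1 + heads, 1 + (n - heads)

# 2. Inference: draw samples from the posterior over p.
p_samples = rng.beta(a_post, b_post, size=5000)

# 3. Criticism: posterior predictive check. Simulate replicated datasets
#    and compare a test statistic (number of heads) with the observed one.
heads_rep = rng.binomial(n, p_samples)
ppc_pvalue = np.mean(heads_rep >= heads)

# An extreme value (near 0 or 1) would signal misfit, prompting us to
# revise the model and loop again; a middling value is consistent.
print(f"posterior mean: {p_samples.mean():.3f}, ppc p-value: {ppc_pvalue:.2f}")
```

If the check fails, you change the model and run the loop again, which is exactly the contrast with a fixed, pre-decided objective function described above.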
Everything has to go up by a number. We had a summit recently in Montreal, which had some great support from the TensorFlow Probability team at Google, so I'm very glad to see this support of the open-source community, because I've never been paid to write any of this code, and so it's a very important thing. I think, as we move forward: next-generation PPLs being able to use things like GPUs, and next-generation hardware, which is something I know nothing about, but it's very interesting and you can have a look there if you're interested. Um, there's a really good analysis piece, quite a high-level piece, where they interviewed a number of stakeholders who are using PPLs; it's got a lot of interesting stuff in there, it's a data set that I haven't found anywhere else in the world, otherwise I would have done it myself. So thank you very much, and thank you for listening to my talk.
If Jane and Robert could come back up on the stage as well and take a seat, that's fine, thank you.
So I think we've got about 20 to 25 minutes to have a little bit of a jog through questions of research frontiers. All of the panelists have kindly said that they're happy to stay behind afterwards to take questions from the audience as well, so if I don't cover a burning question now, you will have your chance. I'm going to start really generally and then maybe we can drill down, but I wanted each of you to talk about some research from the last year: what is the best research you've seen, what is the worst, and why? So, who would like to start? Absolute silence.
I'm gonna go best research, so we know what's most popular: so GPT-2 is definitely most popular. I actually like all this meta-learning stuff and transfer learning stuff; I love all of this work from Ruder and so on. I think all of this stuff is super, super interesting, and in my mind that's going to be the future, like, that's the future I would like to see: where you get the model, you kind of tune it a bit, and with a couple of examples it does what it should be doing. And I think once that happens, that's really going to revolutionize everything, so everything that moves in that direction I find super mega interesting. What's the worst? I don't know, I can't really comment on that, sorry.
No, I think everything's useful. Yeah, so for the best I was actually also gonna say the GPT-2 and the GANs stuff, which I think, I mean, I think that they're very impressive and they're gonna be impactful, and I can't say necessarily whether that's gonna be a good or a bad kind of an impact, but they're definitely very impactful. For the work that I think shouldn't have been done, it's more a general class of work: AI work that helps to perpetuate discrimination or inequality. There's a couple of examples of this, I'm not gonna name names, but there are ones that are trying to, for instance, classify sexuality or criminality based on just faces, or work that's increasing surveillance, increasing, like, basically power imbalances. I think that this kind of work is being undertaken and it really needs to be scrutinized; it needs to be done with the permission of all those that are impacted by it.
Thank you. Perhaps we'll come back to that question actually, but let's push on to this question, and I want a worst one as well. Yeah, can't really argue with that. And so, how ironic, you know, people who work for an ad-tech-based company complaining about surveillance, but we'll park that. I'm sorry, but, yeah, so, you know, definitely the discriminatory and kind of biased stuff, and some of the kind of fake-news kind of stuff I think was a bit misjudged. In terms of the best stuff I've seen, um, I'd probably go with the same answer, the GPT-2. Um, there's been a lot of really interesting stuff in probabilistic programming, like in the variational inference stuff, to allow things to scale to larger data sets; I can't think of any paper off the top of my head. Thank you. So I'm gonna stick a
little bit with questions of fairness and ethics. This isn't the ethics stage, as you know, but it's worth spending a little bit of time on. Do you think that the ethical challenges for sustainable progress in AI dwarf the technical ones, and is enough attention being shown to them? So, I think one of the interesting things, I think Benedict Evans, who's at a16z, said this: when you talk to the man in the street, the thing they're worried about is self-driving cars. Like, I'm very bearish on self-driving cars, no disrespect to anyone who's at Oxbotica or anyone like that in the audience, but I'm very concerned about things like bias. I think we don't do enough to talk about, you know, how powerful some of these things are, and how we even think about our data collection processes, because often people will say the algorithm is biased when actually the data collection in the first place was biased, by whatever structural factors were there in the first place. So I think it's very important to think of these things. You know, tech is changing the world; if we want to take part in this, we should, you know, develop some responsibility and consider these things really important, you know, in how we think about things like ethical frameworks. And we've had really good work on that; I think we have more to do, but we're going in the right direction.
Yes, I think there's a couple of issues there; I'd separate two things. So one is, like, how open the community is, which I think is very important, because, you know, if you get too much concentration of knowledge, data, compute, models and so on in, like, one place, that just lends itself to abuse, like in any totalitarian system and so on. So, you know, the reason why we do our stuff is because we kind of care about this. So one is, like, concentration of power, which I think at this point is concentrated, because for ML you need to invest lots of money, you need lots of GPU time and so on. It's not like Linux, where you can just, like, you know, hack around and just create Linux, right? You can't just hack around and create AlphaGo; you know, like, you need loads of money. So I think there is a concentration of power, and I think as a community we need to kind of push back and make sure things remain open and stay open. In terms of, like, bias, ethics and so on: everything humans do has ethical concerns with it, and I think, in my mind, the right way forward is changing incentives. So one thing we looked at, we had a collaboration with somebody else, and we were looking at the number of GitHub stars for repositories that specifically look at bias, ethics and so on, and, you know, they get like a hundred stars, 50 stars, and some GAN gets like thousands of stars. So I think, you know, until regulation and incentives change, these problems can't really be solved; so I think we need to change regulation. Yeah.
I mean, I think that absolutely this is something that should be almost one of the most important things that we're thinking about when we're doing research. I think that right now a lot of research is being done kind of in these siloed situations, where we do not have to deal with the impact of this work and the technologies that can be created from it, and I absolutely think that the incentives are set up a little bit incorrectly right now. I don't know what the solution to that would be, but anything that can sort of tie together doing research with thinking also about how that research is going to impact those in the world, where, you know, AI is gonna be everywhere; it is everywhere. And so we need to have people working on this that are representative, or can be representative, of those people that it's gonna be impacting, and one way we can do this is by ensuring that you have a greater diversity of backgrounds in the people that are doing research in this, and basically always having an ethical charter, or thinking about ethics, at any research company that you're at, or in academia as well. Thank you.
Couldn't agree more with that, but let's stick then with open source. So deep learning has progressed in an unprecedentedly open way over the last few years, since 2012: as soon as cutting-edge research is done, it seems to be available, and yet there seems to be an army of volunteers, like yourselves, who are making software open source and maintaining it. What is the role of the open-source community going forwards, and how can it be sustained? I'm looking at you. I mean, so open source is, in my mind, super, super important, because it's our common infrastructure that we all use, and without it, like, none of what you see around us would essentially exist. And, you know, in my previous life I was actually one of the early developers of Wikipedia, which was also, like, kind of open-sourcing knowledge, so I think that was also, you know, super, super important. I think, yeah, so GitHub, for instance, has this new thing, we're, like, trying it now, where if you see a dependency, you can put up some money and they'll pay people, and so on. I hope that's going to work; I think if anybody's going to make that work, like actually funding people who write code, they're going to make it work, but, I mean, other people have tried before, and, you know, sometimes it works, sometimes it didn't. Yeah, I think how we think about it as a company is essentially: we want to make something that's useful for the community, we want to make something that's useful for everyone, and we think that companies are part of everyone. So essentially there are certain things that only companies need, and this is kind of how we can make it sustainable: essentially have an open-source arm, because lots of the work is common for everyone, and then have a kind of enterprise arm that essentially makes the whole thing sustainable. So, yeah, we're still figuring it out; we're a very small startup trying to figure out how we can make this work, but hopefully we will. Yeah, so I pretty much
echo a lot of Rob's comments. You know, the GitHub stuff I'm really, really excited about; um, the advantage they probably have now is, you know, Microsoft's been very supportive of the open-source community over the last year, which is words I didn't expect to say in my career, actually. So just to take a step back: I've just said that Microsoft has been very supportive of the open-source community, you know, something that under Ballmer, or more previous leaders, wouldn't have been possible at all. So, um, you know, there's been tremendous reach, and, you know, Nat Friedman, who's now CEO of GitHub, has been really, like, positive about it, and, you know, the integration. But I still think that we're very naive about this, right? So, you know, there was a really good article called Roads and Bridges, which talks about, and I forget the author's name, it talks about, like, the underlying infrastructure of the digital age, and, you know, where these contributions should come from, and the fact that most of these people are unpaid. And, you know, we only discover these things when we have security leaks; when we discover, for example, I think last year there was an actual hack on Node.js, by someone basically cheating his way into the community and putting in, like, a kind of security backdoor. You know, so these are becoming bigger concerns. If you run a company, if you're a manager, look at, like, getting involved and doing some sort of open-source contributions. You know, there's various, like, NumFOCUS, who's very involved, who for PyMC3 is very supportive, and they allow these things to be tax-deductible, and so there is stuff out there. So I think we are depending too much on something that might bite us. Thank you.
And I wanted to talk about probabilistic programming a bit more, and in particular one of the themes that has come up again and again today has been the requirement to have some sort of explanation for machine learning models. In what way does being able to better model uncertainty help with that, if it does? So I think it's very context-specific; I think a lot depends on, like, what discipline you're in. You know, I don't think any of these things are, you know, they don't solve all problems, and there was research showing that in the medical community we care more about accuracy and less about explainability, whereas in finance we care more about explainability, so, you know, there seems to be a natural divide there. Um, you know, probabilistic programming allows, you know, a better understanding of uncertainty, and that allows a certain element of incorporating prior understanding, so that can be quite helpful. But, you know, although I'm very fond of these methods, I don't think they are one-size-fits-all, and I think that's one thing that's very important for our community: how we allow multiple, like, threads, you know, multiple attacks on different problems.
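As a hedged illustration of those two points, uncertainty and priors (the trial counts and the two Beta priors below are entirely made up for the example), a small Python sketch:

```python
import numpy as np

rng = np.random.default_rng(1)

# Made-up data: 10 "positive" outcomes in 40 trials.
n, k = 40, 10

# Two Beta priors on the underlying rate: a weak one, and an informative
# one encoding domain knowledge that rates near 20% are typical.
priors = {"weak": (1, 1), "informative": (20, 80)}

for name, (a, b) in priors.items():
    # Conjugate update, then sample the posterior.
    samples = rng.beta(a + k, b + n - k, size=10_000)
    lo, hi = np.percentile(samples, [2.5, 97.5])
    # A 95% credible interval is a direct, explainable uncertainty statement:
    # "given the model, the rate lies in this range with 95% probability".
    print(f"{name}: rate in [{lo:.2f}, {hi:.2f}]")
```

With only 40 trials, the informative prior visibly narrows the interval; with enough data the two posteriors would converge, which is one way prior knowledge and data trade off.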
Okay, let's move on. I don't know whether, Jane, you'll be able to address this in a couple of minutes, but you were talking about learning priors, and there's been a bit of a discussion in the community generally about what should be learnt and what should just be incorporated. Could you discuss that a little bit, please? Yes, this is something that actually came up at ICLR as well. So this relates to Rich Sutton's recent manifesto, I don't know if people have seen this, talking about how basically deep learning models have been able to just learn everything: any handcrafted knowledge that you put into them will eventually be superseded by just throwing enough data at it, and so it seems like you don't actually need to use any human knowledge to put into these systems. And I agree with that to an extent, because I think that in a lot of cases you can structure your task environments in such a way that the only way your model can do well on that task is to essentially learn that abstraction, or that structured prior, such that it can do well on this and related tasks. However, I think that you've sort of just pushed off the problem to defining the task appropriately, or to defining that task structure, which is also really difficult. So there's a lot of situations in which you might want to use a prior that's general enough such that it's not just sort of handcrafted human knowledge, but that it's really, you know, maybe the way that a human learns, or, you know, that it's helpful for real systems that exist, that we know exist because we've studied them in neuroscience. I think that these kinds of things, they have to work, because they do work in the real world. So I think that, yeah, we have to be careful about the things that we build in, but I think that, you know, it's not like we should be in the either/or. Alright, I think we've
got about three minutes left, so I'm going to go back out to general again and ask you to think about one research question that you think will be solved in the next couple of years, and why it's important. Jane, would you like to start? Well, I don't know if this is actually gonna be solved in the next couple of years, but I think it's something that should be solved, and that's being able to interpret our models, having interpretability; I think actually you mentioned something about that as well. And the reason is that we are starting to rely more and more on these AI systems and models, which are essentially governing a lot of parts of our lives, and without being able to know why they're recommending certain things, or what they're optimizing for in the decisions that they make, you can get into a really dangerous cycle of us kind of just trusting these models too much. And so I think explainability, interpretability, and being able to ensure that, essentially buys you safety guarantees, because you know sort of the reason your models are making certain decisions. Can you give any kind of indication of when you think that might be possible, if it's possible? So, the "when" I don't like to predict; I think people are already working on this actually, in industry labs, and so I'm hopeful that we'll start to see some solid progress being made towards it in the next couple of years. I don't think it'll be solved in a couple of years, but, as I said, I think if it's not solved then it could be quite bad, so maybe more sort of effort will be made to solve it if it hasn't been solved in a couple of years. So I largely agree with the AI safety thing; the thing I'm gonna answer with, I think, is more another thing that I think is gonna be very interesting: more, better, like, API designs, right? So currently it's actually quite difficult for researchers to write certain models and certain classes, and if we could get kind of better at, you know, our symbolic understanding of that, you know, we would, like, improve productivity. So there's, you know, one. Thank you, and Robert,
your thoughts? Yeah, I'm going to default back to my old answer, and that is essentially anything in representation learning or transfer learning. Yeah, if we can learn better representations of the world, I think that would be amazing, and if we can interpret them, that'll be even more amazing, so, yeah, I'm really looking forward to that. I'm not gonna let you off that easily: so are there any specific strands of research along those lines that you're interested in, and when might they be fulfilled? I don't know, so, like, so one thing that I was reading about in representation learning is being able to disentangle representations, right? So, essentially, it's just like, so when you're learning how to represent a person, you might entangle different things; so, for instance, if I want to change someone's face, it might change other parts of the physique. So we want to learn a representation where we know that this parameter just, like, controls the face, and this parameter controls the arms, and this parameter controls, I don't know, the movements, and so on, stuff like that. But when AIs learn representations, everything is kind of mangled, and if you've ever looked at these kinds of visualizations of AI stuff, everything seems kind of mangled, and it's really interesting sometimes to just, like, see what the AI actually sees in a picture, which kind of seems quite hallucinogenic. So if we could somehow kind of make that more human, I think that could maybe help with interpretability as well, because then we will know essentially what it's thinking and why it's making certain decisions.
That wasn't even 18 seconds over, I think you did pretty well. And I think it's clear to everyone here that we've only touched the surface of research frontiers. It's been an absolutely fascinating discussion; thank you so much for taking part in it. I hope you will join me in thanking our speakers today.
