Laura: Thank you. Thanks, I 
haven't even said anything of importance yet.
So my name is Laura, and I've been studying
various things for the past 11 years, actually,
and I'm still not done.
But I've also organized Rails Girls Berlin
for quite a while. Thank you, because you're
kind of making sure this community keeps growing,
and that's really important, so give yourself
a round of applause.
OK, so let's get this ship started. Whether you get my reference or not, that's fine. So in 1998, Lauryn Hill released her debut album, The Miseducation of Lauryn Hill, and you might notice this has something to do with the title of the talk today. Her works all make the point of how the US American school education system has been indoctrinating black communities with white supremacy instead of teaching those communities, and everyone else, about black history and black presence. So the question of who educates and who produces knowledge is a question of who holds power over a seemingly universal truth.
The Miseducation of Lauryn Hill is still one of my favorite albums today, and Kaja's, right? And this title will guide us through the next 30 minutes, maybe only the next 25 if I forget to breathe.
Today we explore how not just humans but machines learn, and I promise that I will leave out all the math, so you should all be able to follow. So let's take one more step back in time, into 1950, when Alan Turing published a paper, and one of the questions he asked basically was whether machines can think. Instead of answering that, he asked whether machines can imitate thinking, and this is something you might have heard of as the imitation game, where we try to figure out if a human can actually recognize a difference between interacting with another human or a machine.
Now, for thinking this seems to be hard, but for machine learning we have kind of accepted the term, and I would like to question it, obviously. So, spoilers: I will argue today that machines are not fully capable of learning, but really good at making us think that they are learning. And while it was hard for Turing to define what thinking means, let's think about what learning means. You might have noticed these three steps in yourself, but I think Melanie also mentioned at least two of them. The first one is where you just learn to reproduce knowledge. It's like learning vocab when you're learning a new language: you need to get the terms right, same as in programming. Then you do the remixing part, and that was actually where we had the Pokémon characters created; that's remixing knowledge, taking knowledge from one domain and putting it onto another domain. And then the final, or most interesting, level of acquiring knowledge is the reflection part, where you learn to understand where the limits of your knowledge are, and what the knowledge can do, what it cannot do, what it should be doing, and all of those things.
We'll get into that later. So we will now
look at three key elements that are involved
in machine learning and this is data, obviously,
algorithms, obviously, and maybe as a surprise
to people who don't know me, humans.
And I will show you that data on its own cannot
go beyond the level of reproduction. Algorithms
on their own cannot go beyond the level of
remixing and then humans cannot -- well, we
need the humans to do the reflection part.
So now you know what this is all about. You
may enjoy the sun. First a quick content warning,
though, when we get to the last part of this
talk, I want to give you an example of something
that is hate speech and it contains racism
and sexual violence so I will not read it
out loud. You can look at it or not look at
it, but I will also let you know right before
it comes up.
So let's get started with the reproduction part of machine learning, or knowledge. Machine learning is quite good at reproducing what we as a human collective already know: it can identify spam messages, it can make music recommendations, or translate from one language to another. And machine learning can also take that knowledge that we as a human collective have and make it available to a much larger set of individuals, right.
However, machines can also learn the wrong thing. This is Caroline Sinders, I don't know if you've seen her talk at re:publica this year, but she mentioned how she once put too much focus on breakup music and that has screwed up her algorithm ever since, and Spotify is just suggesting Mumford & Sons all the time. Now, I don't have a great taste in music, so I'm not challenging that at all. The only thing she can do is stay away from that band forever, because the recommendations will not go away at all.
You might have heard of some of these cases, because machines are really good at learning bias. They apply bias in recruiting tools, such as one that Amazon once built. They apply bias in the justice or, well, the injustice system. And they apply bias in image classification. Because humans do bias quite well, it will be found in most datasets, and thus machines learn to reproduce it. But it's also more than that. Hito Steyerl puts it this way: those who own past data colonize the future. We always use the past in order to learn or predict something about the future, so whatever has been will always be taken into the future. So that's hopefully our first
learning for today. Data as one element of
machine learning will not learn beyond that
level of reproduction.
How about the next step of learning? Remixing.
And I don't mean like the scratching DJ thing.
Well, before we can continue this talk, I
need to know whether you are human.
And I'm asking you to please select all the
images with a bike. And since I have no proper
interface for this, we have to agree on something
and I have a suggestion and I hope you're
all down for this. So does everyone know what
jazz hands are? It's like you're wiggling
your fingers and you can do it up here or
something similar, whatever feels OK for you.
When you identify a bike, please give me your
best jazz hands or spirit fingers and if not,
just don't move. So does this image contain
a bike? Does this image contain a bike? Did
someone bring their bike and it's secretly
hidden in there? Does this image contain a
bike?
And does this image contain a bike? Mm-hm,
OK. This also contains some Rotterdam architecture
that you might want to check out after this
talk because it's really awesome, those cube
houses. So congratulations, you're not a robot.
However --
[cheers and applause]
Or you just tricked us really well.
However, I was the one asking the questions,
so we still don't know if I am. So let's pretend
that I actually am a machine, and you classified
some bikes for me and so obviously now I'm
going to learn how to classify bikes, because
that's what I was built for and I'm going
to start doing so by first simplifying this image a little bit, because I am not the most resourceful machine and this has a really good resolution. So I'm going to lower the resolution and drop the colors, and this is now much smaller, but as a resolution it's still way too much. I'm going to lower the resolution even more. Can you still identify something? You know what it should be, so you can, right? OK. So I will start comparing
this image with very basic shapes, like horizontal
lines, and vertical lines. You've seen these
in your life before. So I'm going to use these
in order to identify parts of the bike that
are standing out. And I'm going to start small
with just these shapes and it's going to get
more complex but I didn't draw more of these.
So you as humans can identify some of these shapes in the image, right? If I walk
around here and you can tell me which shape
is where, does anyone have an idea? You can
just yell at me.
AUDIENCE: This one.
>> This one on the bottom left? The other
one.
>> This one? Yes, and look, I even measured and it's the same pixel size.
[applause]
Also, I think, thanks to Melanie, because she gave me a ruler. Right, so you as humans can identify that this is a horizontal line. You might also identify that over here is a vertical line, right? But me as a machine, I cannot. What do I need to do?
AUDIENCE: Numbers.
>> I need to do numbers, yes, but I promised to leave out all the math. So what I would do is start with one shape, start at the top, put it on there, and see if what I have is similar to that part of the image. Then I would move on one or two pixels and compare again, and then I would move on, and you know what machines do, what would I do then? Exactly, and I would do this for the entire image, and then I would use this other one and do that for the entire image. And while I'm doing this, I'm not identifying shapes. I'm just redrawing the image and creating a new one. So I create a new image that's based on the old image and the shapes that I have. And so if I apply this one, sorry, if I apply this one, the image is going to come out something like this; you as a human may not see much in it. And if I apply the horizontal filter, it's going to look something like that. And so I'm going
to do this with other shapes as well, and take these new images and apply combinations of shapes, and so on. This will continue to reduce the size as well, and the images will get more and more abstract and smaller, and you as a human will not be able to recognize them. And once I'm done with that for one image, I'm going to do all the other images. So these are the ones that you jazz-handed for me. And then I will remember the quintessence of them, and I will end up with some blobs that you definitely can no longer recognize, but I as a machine can now use them, and this is basically the intuition for a convolutional neural network. If you've never heard of it,
there you go. What I do is I extract, from images, different shapes that are defining for the shape of a bike, and then I end up with knowledge, and this can now help me to determine which shape is how important in a bike. And once I have these abstract ideas, you can hand me a new image and I can tell you whether it contains a bike or not.
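In code, the sliding-and-comparing idea looks roughly like this minimal sketch in Python with NumPy. The tiny image and the filter values are made up for illustration; a real convolutional network learns its filters rather than hard-coding them.

```python
# A minimal sketch of sliding a filter over an image, not a full CNN.
import numpy as np

def convolve2d(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Slide `kernel` across `image` one pixel at a time and record
    how similar each patch is to the kernel (a dot product)."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            patch = image[y:y + kh, x:x + kw]
            out[y, x] = np.sum(patch * kernel)
    return out

# A made-up 5x5 grayscale "image" with a horizontal line in the middle.
image = np.array([[0, 0, 0, 0, 0],
                  [0, 0, 0, 0, 0],
                  [1, 1, 1, 1, 1],
                  [0, 0, 0, 0, 0],
                  [0, 0, 0, 0, 0]], dtype=float)

# A filter that responds strongly to horizontal lines.
horizontal = np.array([[-1, -1, -1],
                       [ 2,  2,  2],
                       [-1, -1, -1]], dtype=float)

print(convolve2d(image, horizontal))  # strongest response along the line
```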
This one does not contain a bike, but it is
Matz holding my name at the airport so I needed
to include this somehow. But you could also
hand me this image and I could probably say
this might very well include the image of
a bike.
However, there are some problems with this approach, and I want to show you one example. There was a study last year, actually,
that tried to trick machines like me and I
believe that those researchers had a lot of
fun doing so, because what is this? Oh, yes, so their classifier actually classified this as a bike. Because the elements, or, from a machine's perspective, rather, the basic shapes of a bike exist, right? They're present. And it doesn't matter that in this setup the image makes no sense to a human and the bike, well, is kind of broken. But this shows us how machines
tend to overlook the bigger picture. They
can mainly look at images as a combination
of shapes but they fail to see them as a whole
most of the time. So, second learning for today: algorithms, as another element of machine learning, do quite all right at reproducing. They can extract and reproduce shapes and see if other images are similar in that respect, and quite literally this machine does well with remixing images. But remixing of course has a different meaning, too: in the context of so-called transfer learning, algorithms can use knowledge that they have learned from one context and apply it to a different context, but this is still in the making, kind of. However, algorithms do struggle with the bigger picture, and this gets us to the third part, and a drink of water.
So reflection: Imagine that I had asked you
to do a very different task. Imagine I had
asked you to identify hate speech. So while
detecting objects such as bikes is rather
easy, speech is much more complex, as we've
also heard today and within speech, hate speech
is even more complex.
So I'm currently assisting in a research project that aims to analyze hateful communication on social media and in commentary sections of German news media, in order to identify comments and dynamics of hate speech. So we're looking at comment sections of German news media on the topic of migration, so you can imagine there are a lot of terrible things going on, and we're trying to develop methods and software that can recognize hate speech early and that can also suggest strategies for deescalation. So everyone who was going, like, uh, freedom of speech: it's about deescalation, it's not about deleting. So let me tell you, finding
hate speech is a lot harder than finding bikes, and unlike other projects that have tried to do that, we involve both communication science and computer science, and they come together in this project. So I will show you how each of these perspectives tries to learn hate speech. Let's get started with the computer science part, because this might be close to what you already know. So to identify hate speech, machines do what we call supervised learning, right? Some of you know this, have heard of the word? OK. So what happens in supervised learning is that we take a set of data that already has the information on whether a particular statement is hate speech or not. And then we split it into two sets, and you can only see the training set here right now. One is for training and the other one is for testing. So the training begins.
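As a minimal sketch of that split, this is one common way to do it in Python with scikit-learn; the toy comments, labels, and the 80/20 ratio here are made-up assumptions for illustration.

```python
# Hold back part of the labeled data so we can test the classifier later.
from sklearn.model_selection import train_test_split

comments = ["you are wonderful", "all X are terrible",
            "nice bike!", "go back to where you came from"]
labels = [0, 1, 0, 1]  # 1 = hate speech, 0 = not (tagged by humans)

train_x, test_x, train_y, test_y = train_test_split(
    comments, labels, test_size=0.2, random_state=42)
```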
And the training part can actually also be done with convolutional neural networks, so we can use the same idea to determine hate speech, as well. Unfortunately, text only works in one dimension. Text is either horizontal, right, or it goes down, and this English text only goes from left to right. So we're not comparing our text to shapes, and I promised to leave out all the math, but you can imagine that each item in our dataset, which is bigger than this one, has some kind of relation within the text. And now, instead of looking for horizontal or vertical lines like we did before, we will look at how words or even characters relate to each other. The size of these relations can vary: we can look at the character level, we can look at how two or three characters relate to each other, or, as you can see here, at how two words relate to each other.
And then we move our two-word filter over
our text, like this.
And that.
And next we could choose a differently sized filter that, for example, looks at three words. And so on. And eventually we will have numbers, which I did not include, that show what kinds of words have what kinds of relations within hate speech statements, as opposed to what kinds of words have what kinds of relations in non-hate-speech statements, according to the different filter sets we apply.
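To make the word-filter idea concrete, here is a minimal sketch in Python. Real text classifiers slide their filters over word embeddings, that is, over numbers; this sketch only collects the word windows a two- or three-word filter would see.

```python
# Collect every n-word window a sliding filter would look at.
def sliding_ngrams(text: str, n: int = 2) -> list[tuple[str, ...]]:
    words = text.split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

print(sliding_ngrams("all X are something terrible", n=2))
# [('all', 'X'), ('X', 'are'), ('are', 'something'), ('something', 'terrible')]
print(sliding_ngrams("all X are something terrible", n=3))
```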
So instead of having a bike classifier, we might now have a classifier for hate speech. And in order to check how good that classifier actually is, we're going to use the second part of our dataset, the test set, and we will check how well our classifier can tell hate speech from non-hate speech. Since we humans already know from our dataset whether a statement is hate speech or not, we know what the machine should decide, so we can compare what it should decide to what it actually decided. But usually the result is not very satisfying, especially not in the beginning, so we need to go back to the training part and, well, actually adjust numbers. We tweak some numbers here and there, and if you're looking for a scientific explanation, I cannot give you one, because most of the articles just say: adjust parameter x or y until it looks good. Thanks, science. And we do this
until we're satisfied with our results and
we can calculate some measures for quality and end up, for example, with a classifier that will give us a green light for probably not hate speech, or a yellow light for eh, I don't know, could be, could be not, and then a red light for yeah, this is probably hate speech.
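A minimal sketch of that testing step, in Python; the scores, the labels, and the 0.4/0.7 cut-offs for the traffic light are made-up assumptions, not values from the project.

```python
# Compare what the machine decided with what humans already decided.
def traffic_light(score: float) -> str:
    """Turn a hate-speech score between 0 and 1 into a verdict."""
    if score < 0.4:
        return "green: probably not hate speech"
    if score < 0.7:
        return "yellow: could be, could be not"
    return "red: probably hate speech"

test_scores = [0.1, 0.55, 0.92]  # pretend output of the trained classifier
test_labels = [0, 0, 1]          # what the human coders said

predictions = [1 if s >= 0.7 else 0 for s in test_scores]
accuracy = sum(p == y for p, y in zip(predictions, test_labels)) / len(test_labels)
print(accuracy, [traffic_light(s) for s in test_scores])
```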
Sounds good so far, but where do machines
fail? Well, unfortunately machines cannot
tell us themselves where they fail, not in
this case, at least.
So luckily there are humans who don't take
machine results for the truth and they question
them thoroughly.
And this is a paper that was published within
the project that I work for, and to test the
waters, they took a huge Wikipedia dataset
and one from Twitter and both of them were
tagged not for hate speech but for toxicity,
but I want to give you one example. This is
not the content warning, this is just an example.
"Oh, I feel like such an asshole now. Sorry,
bud," this was tagged by the algorithm as
toxic, because based on the training data
that the classifier received, it very quickly
learned that swear words were an indicator
of hate speech. Well, in this case it was
actually someone apologizing for their behavior,
I would guess. So statements that include
swear words but do not actually attack someone
are apparently hard for machines.
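You can see why with a deliberately naive baseline; this sketch in Python flags any comment containing a swear word, and the one-word list is a made-up assumption.

```python
# A naive swear-word detector, sketched to show how it misfires.
SWEAR_WORDS = {"asshole"}

def naive_is_toxic(comment: str) -> bool:
    """Flags any comment containing a swear word, ignoring context."""
    return any(word.strip(".,!?").lower() in SWEAR_WORDS
               for word in comment.split())

print(naive_is_toxic("Oh, I feel like such an asshole now. Sorry, bud"))
# True, even though this is an apology, not an attack.
```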
There are other things, as well.
Hate speech, on the other hand, that does not contain swear words is difficult for machines to identify. Rhetorical questions are hard, metaphors and comparisons, idiosyncratic and rare words, sarcasm and irony, or quotations and references. So, not really optimal, because all of these things are used a lot in those statements.
And that's because the algorithm is so focused
on the words themselves, and it cannot look
beyond the text that it has been given, right?
And it has no context for the actual meaning
of words and how this changes within different
contexts.
But we kind of skipped the part about the underlying dataset on which machines learn. Those datasets don't just appear, obviously; they're created by humans. So how do humans learn to identify hate speech? Very often these datasets are created by crowd workers. Has anyone ever been a crowd worker for any task? OK. One person? Two people? OK. Thanks for doing that. So for the Google Perspective API, for example, rather random people were simply asked to rate an internet comment. They don't explain to us why they made a specific decision, right, they just clicked, with no context. So it's OK to do that for bikes, because most of the time it's not so bad if you misidentify a bike, but for hate speech it's a different thing. Social scientists in particular go about this differently.
The first approach is to start with theory, so let's define hate speech. I have brought you one example of a definition, and it goes like this: "Hate speech is understood as public communication that contains deliberately and/or intentionally discriminatory messages; thus, it is not a question of hate nor a question of speech."
Are you confused? A little bit. I think this
is a very good definition, because it points
out that hate speech is not about an individual's
emotion or about emotions at all, really,
and it's also not limited to speech, because
hate can be expressed in many ways and it
can appear in rather rational ways but we
still need a way to identify this. So we're going to keep this definition for now, and once we actually have it, we have to see how it applies to specific statements, so we need to take the theory into the methodology.
So according to our definition, we could take public user comments, then we have the public communication, and say that verbal aggression, so things like insults, negative comparisons or metaphors, negative circumscriptions, is hate speech; that dehumanization, calling for, threatening, or legitimizing violence against people is hate speech; and that negative generalizations, such as "all X are something terrible," are hate speech.
Now, we will take these practical understandings
and put them into a code book. It's kind of
a tutorial on how to find hate speech and
it contains the definition that you've just
seen and the more practical understandings,
but also lots and lots of examples in context.
And so what we do in our project is we give this code book to humans, but not to as many as the crowd workers, rather to a smaller number, like three to seven people, maybe, and they're called coders. They read this book, and together in a workshop with the researchers they discuss a lot of examples and work together on them. And then they take even more examples and the codebook home, rack their brains on these examples, and meet again to discuss their questions, because this is where a lot of questions come up.
And I want to now show you an example of why it's important to discuss all the questions, to first of all have questions, and then discuss them and come to a decision on them. And so now, this is the content warning for racism
and sexual violence in words. It's going to
be on the next two slides, I'll let you know
when they're over.
This is a translation of a comment that was originally in German, and I translated it word by word, basically, for now, so you may not understand what this is supposed to mean. Does anyone kind of understand what it means? No, OK, because the language is really coded. It's using references to words or phrases that, well, German politicians have made in the context of migration. But it's also turning them around, and it's a really weird sentence, so unless you're quite familiar with this topic within the German context, you'd probably be rather unsure whether this is hate speech or not, or you might just dismiss the comment, right?
I can tell you that we actually find three
indicators of hate speech within this very
short statement. It's legitimizing violence,
that's this part. It's also making an ironic
and negative generalization. That's the second
part, and then there's the third part that
is turning humans into objects again with
a note of sarcasm.
Now, keep this in mind, and I will show you the translation by idea. So this is translated again, but into what it actually means, so I think you now understand why this is hate speech, after all that we've been through. And that's it for the content warning. So once the coders have asked and discussed all the questions, we usually need to add better descriptions and examples to the codebook. And this is such an important step that it can actually go on for one or two more rounds, right? And then eventually the researchers are confident in the coders, their calculated agreement is high enough, and they will actually code a larger dataset, and that is where the algorithm can start to learn. But where do humans fail? Well, no matter how good the discussions and the trainings are, the dataset will never be perfect, because humans, too, have trouble understanding irony or sarcasm.
But more importantly, language and specific words can change over time, and sometimes their meaning can change quite quickly. However, we as humans have the capability to pay attention to this, and that is now of the utmost importance, because humans are the ones who can stop, take a step back, and look at the bigger picture.
Humans are the ones who actually complete
the entire process in machine learning. We're
the ones who educate or miseducate the machine,
so I would like to finish up with a quote
from Lauryn Hill's album now: "Consequence
is no coincidence."
Because the knowledge that machine learning produces is no coincidence. It is a consequence of what data we as humans choose to feed to a machine. It's the consequence of how we as humans build an algorithm. And finally, machine learning is a consequence of what we as humans decide to question, reflect upon, or take with a grain of salt. So I'm asking you now, whether you program machines that learn, (air quotes) learn,
