YUFENG GUO: In recent
years, new research
into understanding how
neural networks work
has really revolutionized
our ability
to interpret how they
operate, providing the tools
and building blocks for
explaining why they make
the predictions that they do.
Stay tuned, as we dive into
convolutional neural networks
and try to understand
how they see the world.
[MUSIC PLAYING]
Welcome to "AI
Adventures," where
we explore the art, science,
and tools of machine learning.
My name is Yufeng Guo,
and on this episode,
we're going to see how to use
Lucid to better understand
what neural networks
are looking for.
Convolutional neural
networks, or CNNs,
have significantly
advanced the state
of the art of image recognition.
They've achieved
incredible accuracy
across a wide variety of tasks.
But understanding
what's happening
between the inputs
and the outputs
in those hidden layers,
that's been slow going.
So today, we'll look at a
number of different approaches
to try to gain some
intuition behind what
it is that makes a CNN
work, ranging from focusing
on individual neurons,
all the way to looking
at the response of
an entire layer.
Before we dive in,
I want to get us
all on the same
page about why it
is that we care about how
a neural network arrives
at the prediction that it does.
Isn't it enough that the
final prediction is right?
Well, not quite.
In real-world
production use cases
of machine learning image
recognition systems,
a lot of additional value can be
gained by understanding the how
and the why of a
given prediction.
This helps the end
user of the predictions
better understand
whether they're being
made for the right reasons.
Let's take a
hypothetical example,
say, in medical imaging,
where small details can really
play a crucial role
in differentiating
between different
types of results.
What if, let's say,
a model not only
gave the right
prediction for an image,
but it also provided information
about which parts of that image
were the primary contributors
that led to its conclusions?
Then, a doctor could
see whether this model
was coming to these conclusions
for the expected reasons.
Moreover, this approach could
even potentially highlight
new patterns, which
hadn't yet been identified
as useful in these images.
So what kinds of
tools and approaches
would we need in
order to construct
this kind of understanding of
an image model's predictions?
Well, let's set
the stage by first
understanding the structure of
a convolutional neural network.
Convolutional neural
networks get their name
from their many
convolutional layers.
And they work in
concert to understand
the different
details of an image.
In the early layers
close to the inputs,
the convolutional
layers are expected
to be looking for your basic
lines, and simple shapes,
and patterns.
Further along in the
network, the work
from these early layers
is passed onwards, and the layers
respond to more
understandable-looking inputs.
We might see images that
appear similar to various parts
of real world
animals or objects.
This information then
continues propagating onwards
through the network,
and eventually reaches
those final layers,
which put it all together
to generate the network's
final outputs.
These tend to, of
course, look more
directly like the
categories that we
are expecting as the outputs.
There are a number
of parts that make up
any single convolutional layer.
At the lowest level is the
single individual neuron.
Neurons are the most
fundamental building blocks
of a neural network,
and the strength
of each neuron's
response to input
is what we will use to
understand their behavior.
So now, let's zoom
out a bit and see
how multiple neurons
can be connected
in a specific channel.
So a single channel is made up
of a two-dimensional rectangle
of neurons, which are used to
process the previous layer's
outputs.
There are many
channels in a layer,
each channel stacked on top of
the other-- much like a layer
cake.
And within a whole
layer, all the channels
receive the same values as
inputs from the previous layer.
But they each process that
input slightly differently,
looking for different features.
All of these outputs are
then combined and passed
onwards to the next layer.
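To make the channel idea concrete, here is a minimal sketch in plain Python. It is not code from the episode or from Lucid. The two 3x3 filters, their names, and the 5x5 input are made up for illustration. Both channels receive the same input, each applies its own filter, and the layer output is the stack of the resulting 2D grids.

```python
# Hypothetical toy layer: two channels that see the same input but apply
# different 3x3 filters. The filter weights are invented for illustration.

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding) and return a 2D grid of activations."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

# A 5x5 input with a bright vertical stripe in the middle column.
image = [[0, 0, 1, 0, 0] for _ in range(5)]

kernels = {
    "vertical_line": [[-1, 2, -1], [-1, 2, -1], [-1, 2, -1]],
    "horizontal_line": [[-1, -1, -1], [2, 2, 2], [-1, -1, -1]],
}

# Every channel receives the same input; each one produces its own 2D grid.
layer_output = {name: conv2d_valid(image, k) for name, k in kernels.items()}

for name, grid in layer_output.items():
    print(name, grid)
```

The vertical-line channel responds strongly where its filter is centered on the stripe and negatively just beside it, while the horizontal-line channel stays at zero everywhere for this input.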
Now that we have a
basic understanding
of how a convolutional neural
network's layers are assembled,
let's take a look
at how to find out
what a given neuron, or
perhaps a group of neurons,
is quote unquote "looking for."
When an image is passed
through the network in what
is called a forward pass, each
neuron responds or activates
to a different degree.
Now, we can measure that
activation's strength-- well,
because it's literally just
the number that comes out.
The greater the
magnitude of that number,
the stronger the activation.
Now, these activation values
can be positive or negative.
So we can have a strong
positive activation, as well as
strong negative activations.
So let's start by seeing what a
specific neuron in our network
responds most
strongly to by looking
at how it activates when
the network is presented
with an image input.
If we take this trained model,
hold all the weights
static-- so they can't change--
run an image
through that network,
and measure what a particular
neuron's activation value
is, we can begin to
optimize this input.
Initially, we'll start with
an image of pure static noise.
And because neural networks
are differentiable,
we can figure out how to adjust
that input image, that noise--
there's just nothing there--
to get an even higher activation
value for that specific neuron
we're interested in.
So we tweak that
image a little bit.
And if we're doing it right,
the activation of the neuron
will be even higher
when we pass that image
through the second time.
So then, we can do it again.
We can adjust the image
to boost that activation
of the neuron's response to
the image and pass it through.
Over and over again,
we repeat this process.
And eventually, you
end up with an image
that has been optimized to
activate one particular neuron
of the network maximally.
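The loop just described can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not the procedure Lucid actually runs. The "neuron" here is a hypothetical linear unit with fixed, made-up weights over a 3x3 image, so its gradient with respect to each pixel is simply the matching weight. A real CNN neuron is a nonlinear function of many layers, and its gradient would come from automatic differentiation.

```python
import random

# Hypothetical frozen neuron: one fixed weight per pixel of a 3x3 "image".
# The weights never change during optimization; only the input does.
WEIGHTS = [1.0, -1.0, 1.0,
           -1.0, 1.0, -1.0,
           1.0, -1.0, 1.0]

def activation(image):
    """Forward pass: the neuron's activation is just the number that comes out."""
    return sum(w * x for w, x in zip(WEIGHTS, image))

def gradient_wrt_input(image):
    """For a linear neuron, d(activation)/d(pixel_i) equals WEIGHTS[i]."""
    return list(WEIGHTS)

def maximize_activation(steps=50, step_size=0.05, seed=0):
    rng = random.Random(seed)
    image = [rng.random() for _ in WEIGHTS]  # start from random noise
    history = [activation(image)]
    for _ in range(steps):
        grad = gradient_wrt_input(image)
        # Gradient ascent on the pixels, clamped to the valid range [0, 1].
        image = [min(1.0, max(0.0, x + step_size * g))
                 for x, g in zip(image, grad)]
        history.append(activation(image))
    return image, history

image, history = maximize_activation()
print("first activation:", history[0])
print("final activation:", history[-1])
print("optimized image:", image)
```

Each pass nudges the pixels in the direction that raises the activation, so the recorded activations never go down, and the final image is the checkerboard pattern this toy neuron prefers.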
The result is an image with
a focused region surrounded
by kind of a more
blended background.
Now, these patterns can
be all over the place.
And thankfully, Lucid has made
this really easy to compute.
There's no need to write an
optimization loop yourself.
You can just choose what
layer, channel, and neuron
you're interested in.
And it'll take care of the rest.
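As a rough sketch of what that looks like in code, the function below follows the pattern used in Lucid's public tutorial notebook. The wrapper name `visualize_feature` and its parameters are hypothetical. The default layer `mixed4a_pre_relu` and channel 476 are example values for the InceptionV1 model from Lucid's model zoo, not values named in this episode. Lucid depends on TensorFlow 1.x, so the imports sit inside the function and nothing is loaded until it is called.

```python
def visualize_feature(layer="mixed4a_pre_relu", channel=476, single_neuron=False):
    """Render an optimized image for a whole channel, or for one neuron in it."""
    # Imported lazily: Lucid requires TensorFlow 1.x, which may not be installed.
    import lucid.modelzoo.vision_models as models
    import lucid.optvis.objectives as objectives
    import lucid.optvis.render as render

    # Load a pretrained model from Lucid's model zoo (downloads the graph).
    model = models.InceptionV1()
    model.load_graphdef()

    if single_neuron:
        # One neuron of the channel (by default the one at the spatial center).
        objective = objectives.neuron(layer, channel)
    else:
        # Every neuron in the channel at once.
        objective = objectives.channel(layer, channel)

    # Lucid runs the optimization loop and returns the rendered images.
    return render.render_vis(model, objective)
```

Calling `visualize_feature()` in an environment with Lucid installed should run the optimization and display the result; `visualize_feature(single_neuron=True)` targets a single neuron instead of the whole channel.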
Now if we instead look to
optimize the activation of not
one single neuron, but instead
that entire channel of neurons,
we can get a pattern that
is a bit more consistent
and fills in the edges too.
And this is what we'll
be primarily seeing
throughout the rest
of our experiments.
Now, let's take a
look at some ways
to combine the different
neurons together,
allowing us to see how
neurons activate in pairs.
This is showing the
optimized input images
when optimizing for two
different channels of neurons
simultaneously,
rather than just one.
What ends up happening is
that the images show up
as a blend of the two channels
that they originated from.
And finally, let's see
how we can interpolate
between two different neurons.
Rather than just optimizing
for both equally,
we can weigh them
unevenly, giving us
images that are along the
spectrum between the two
channels.
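Blending and interpolation can both be expressed as one weighted objective. The sketch below is an illustration under stated assumptions, not Lucid's implementation. It uses two hypothetical linear "channels" over a 4-pixel input, mixes their activations with a weight `t`, and adds a small penalty on pixel magnitude (an assumption made here so the optimum is a smooth blend rather than a saturated image). It starts from a blank image instead of noise to keep the output deterministic.

```python
# Two hypothetical frozen linear channels over a 4-pixel input.
CHANNEL_A = [1.0, 1.0, 0.0, 0.0]  # responds to the left two pixels
CHANNEL_B = [0.0, 0.0, 1.0, 1.0]  # responds to the right two pixels

def optimize_blend(t, steps=60, step_size=0.5):
    """Gradient ascent on (1 - t) * A(x) + t * B(x) - 0.5 * |x|^2.

    t = 0 optimizes only channel A, t = 1 only channel B,
    and t = 0.5 weighs both channels equally.
    """
    mix = [(1 - t) * a + t * b for a, b in zip(CHANNEL_A, CHANNEL_B)]
    image = [0.0] * len(mix)  # blank starting image for reproducibility
    for _ in range(steps):
        # Gradient of the objective with respect to each pixel.
        grad = [m - x for m, x in zip(mix, image)]
        image = [x + step_size * g for x, g in zip(image, grad)]
    return image

spectrum = {
    t: [round(p, 3) for p in optimize_blend(t)]
    for t in (0.0, 0.25, 0.5, 0.75, 1.0)
}

for t, img in spectrum.items():
    print(t, img)
```

At `t = 0.5` the optimized image is an even mix of the two preferred patterns, and the other values of `t` fall along the spectrum between them.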
This gives us even more context
about what sorts of patterns
these channels are
able to detect.
So far, we've gotten
some background on CNNs.
And we've seen some examples
of feature activation
at the neuron and channel level.
In the next episode
of "AI Adventures,"
we'll take these ideas
of feature visualization
and apply them to
the entire image
to create activation grids.
And then, we can use that to
produce an activation atlas--
which will give us insight not
into just one neuron, channel, layer,
or image.
But it'll allow us to see how
the entirety of the network
operates as one cohesive unit.
If you want to learn more
about feature activations
and to experiment with
Lucid on your own,
check out this
Distill.pub article--
really fantastic.
And try out the Lucid
Library on GitHub.
Thanks for watching this episode
of "Cloud AI Adventures."
And if you enjoyed it, be
sure to hit that Like button
and subscribe, so you can get
all the latest episodes right
when they come out.
For now, go check
out Lucid on GitHub.
[MUSIC PLAYING]
