Dear Fellow Scholars, this is Two Minute Papers
with Dr. Károly Zsolnai-Fehér.
We have recently explored a few new neural
network-based learning algorithms that could
perform material editing, physics simulations,
and more.
As some of these networks have hundreds of
layers, and often thousands of neurons within
these layers, they are almost unfathomably
complex.
At this point, it makes sense to ask, can
we understand what is going on inside these
networks?
Do we even have a fighting chance?
Luckily, today, visualizing the inner workings
of neural networks is a research subfield
of its own, and the answer is, yes, we learn
more and more every year.
But there is also plenty more to learn.
Earlier, we talked about a technique called
activation maximization, which tries to find
an input that makes a given neuron as excited
as possible.
This gives us some cues as to what the neural
network is looking for in an image.
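If you would like to see the essence of this idea in code, here is a minimal sketch of activation maximization written with PyTorch and a torchvision VGG-16; the particular model, layer, and channel are my own assumptions for illustration, not the exact setup from the papers we covered.

```python
# A minimal sketch of activation maximization: gradient ascent on the input
# image so that one chosen neuron (channel) fires as strongly as possible.
# The model, layer index, and channel are assumptions for illustration.
import torch
import torchvision.models as models

model = models.vgg16(weights="IMAGENET1K_V1").eval()
for p in model.parameters():
    p.requires_grad_(False)  # we only optimize the input image

target_layer = model.features[10]  # a convolutional layer (hypothetical choice)
target_channel = 42                # a neuron/channel in that layer (hypothetical)

# Capture the activation of the chosen layer with a forward hook.
activation = {}
def hook(module, inp, out):
    activation["value"] = out
target_layer.register_forward_hook(hook)

# Start from random noise and apply gradient ascent on the input image.
image = torch.randn(1, 3, 224, 224, requires_grad=True)
optimizer = torch.optim.Adam([image], lr=0.05)

for step in range(200):
    optimizer.zero_grad()
    model(image)
    # We want to maximize the mean activation of one channel,
    # so we minimize its negative.
    loss = -activation["value"][0, target_channel].mean()
    loss.backward()
    optimizer.step()

# After optimization, `image` is a rough picture of what this neuron looks for.
```

After a couple of hundred steps, the optimized image gives a rough impression of what excites this neuron the most.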
A later work that proposes visualizing spatial
activations gives us more information about
the interactions between two, or even more,
neurons.
The dots you see here show that it provides
a dense sampling of the most likely activations,
and this leads to a more complete, bigger-picture
view of the inner workings of the neural network.
This is what it looks like if we run it on
one image.
It also provides a great deal of extra value,
because so far, we have only seen how the
neural network reacts to one image, but this
method can be extended to show its reaction
to not one, but one million images!
You can see an example of that here.
Later, it was also revealed that some of these
image detector networks can assemble something
that we call a pose-invariant dog head detector!
What this means is that it can detect a dog
head in many different orientations, and…look!
You see that it gets very excited by all of
these good boys…plus, this squirrel.
Today’s technique offers us an excellent
tool to look into the inner workings of a
convolutional neural network, a learning method
that is very capable of image-related operations,
for instance, image classification.
The task here is that we have an input image,
for instance of a mug or a red panda, and the
output should be a decision from the network
as to whether what we are seeing really is
a mug, a red panda, or something else.
These networks apply something called a convolutional
filter over the image, which tries to find interesting
patterns that differentiate objects from each
other.
You can see how the outputs are related to
the input image here.
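To make the idea of a convolutional filter concrete, here is a tiny sketch in PyTorch with a hand-crafted edge-detecting kernel; real networks learn their filters from data, so this particular kernel is only an illustrative assumption.

```python
# A minimal sketch of a single convolutional filter sliding over an image.
# The kernel below is a hand-crafted vertical-edge detector; in a real
# convolutional network, the filter values are learned from data.
import torch
import torch.nn.functional as F

# A toy 1-channel "image": a bright square on a dark background.
image = torch.zeros(1, 1, 8, 8)
image[0, 0, 2:6, 2:6] = 1.0

# A 3x3 filter that responds strongly to vertical edges.
kernel = torch.tensor([[[[-1.0, 0.0, 1.0],
                         [-2.0, 0.0, 2.0],
                         [-1.0, 0.0, 1.0]]]])

# Sliding the filter over the image produces a feature map:
# large values appear where the pattern (an edge) is found.
feature_map = F.conv2d(image, kernel, padding=1)
print(feature_map[0, 0])
```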
As you see, the neurons in the next layer
will be assembled as a combination of the
neurons from the previous layer.
When we use the term deep learning, we typically
refer to neural networks that have two or
more of these inner layers.
Each subsequent layer is built by taking all
the neurons in the previous layer, selecting
for the features relevant to what the next
neuron represents, for instance, the handle
of the mug, and inhibiting everything else.
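Here is a minimal sketch of that idea in PyTorch; the feature names, weights, and the "mug neuron" itself are hypothetical, purely to illustrate how positive weights select relevant features and negative weights inhibit the rest.

```python
# A minimal sketch of how one neuron in the next layer combines the previous
# layer: positive weights select relevant features, negative weights inhibit
# the rest. All names and numbers here are made up for illustration.
import torch

# Activations of three previous-layer "feature detectors".
previous_layer = torch.tensor([0.9,   # mug handle detector
                               0.8,   # cylindrical body detector
                               0.1])  # wheel detector (irrelevant for a mug)

# Weights of a hypothetical "mug" neuron in the next layer.
weights = torch.tensor([1.5, 1.2, -2.0])  # the negative weight inhibits wheel evidence
bias = -0.5

mug_neuron = torch.relu(previous_layer @ weights + bias)
print(mug_neuron)  # a high value means the next layer "sees" a mug
```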
To make this a little clearer, this previous
work tried to detect whether we have a car
in an image by using these neurons.
Here, the upper part looks like a car window,
the next one resembles a car body, and the
bottom of the third neuron clearly contains
a wheel detector.
This is the information that the neurons in
the next layer are looking for.
In the end, we make the final decision as
to whether this is a panda or a mug by adding
up all the intermediate results. The bluer
this part is, the more relevant this neuron
is to the final decision.
Here, the neural network concludes that this
doesn’t look like a lifeboat or a ladybug
at all, but it looks like pizza.
If we look at the other sums, we see that
the school bus and the orange are not hopeless
candidates, but still, the neural network
does not have much doubt that this is indeed
a pizza.
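As a rough illustration of this final step, here is a small PyTorch sketch that adds up hypothetical per-neuron contributions for each class and turns the sums into a decision; the class names match the example above, but all numbers are made up.

```python
# A minimal sketch of the final decision: per-neuron contributions (class
# evidence) are added up, and the class with the largest sum wins.
# The contribution values below are invented for illustration.
import torch

classes = ["lifeboat", "ladybug", "school bus", "orange", "pizza"]

# Hypothetical contributions of three intermediate neurons to each class.
contributions = torch.tensor([
    [0.1, 0.0, 0.4, 0.3, 1.2],   # neuron A
    [0.0, 0.1, 0.5, 0.6, 1.5],   # neuron B
    [0.2, 0.0, 0.3, 0.2, 1.8],   # neuron C
])

evidence = contributions.sum(dim=0)             # add up the intermediate results
probabilities = torch.softmax(evidence, dim=0)  # turn sums into confidences

for name, p in zip(classes, probabilities.tolist()):
    print(f"{name:10s} {p:.2f}")
print("decision:", classes[evidence.argmax().item()])
```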
And, the best part is that you can even try
it yourself in your browser if you click the
link in the video description, run these simulations,
and even upload your own image.
Make sure that you upload or link something
that belongs to one of these classes on the
right to make this visualization work.
So, clearly, there is plenty more work to
do for us to properly understand what is going
on under the hood of neural networks, but
I hope this quick rundown showcased how many
facets there are to this neural network visualization
subfield and how exciting it is.
Make sure to post your experience in the comments
section, whether the classification worked
well for you or not.
And if you wish to see more videos like this,
make sure to subscribe and hit the bell icon
to not miss future videos.
Thanks for watching and for your generous
support, and I'll see you next time!
