Dear Fellow Scholars, this is Two Minute Papers
with Károly Zsolnai-Fehér.
Hold on to your papers because these results
are completely out of this world, you'll soon
see why.
In this work, high-resolution images of imaginary
celebrities are generated via a generative
adversarial network.
This is an architecture where two neural networks
battle each other: the generator network is
the artist, who tries to create convincing,
real-looking images, and the discriminator
network is the critic, who tries to tell a fake
image from a real one.
The artist learns from the feedback of the
critic and will improve itself to come up
with better quality images, and in the meantime,
the critic also develops a sharp eye for fake
images.
These two adversaries push each other until
they are both adept at their tasks.
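To make this adversarial game concrete, here is a minimal sketch of the idea, not the paper's actual model: a toy one-dimensional GAN with hand-derived gradients, where the "artist" G(z) = w·z + b tries to mimic data drawn from a normal distribution centered at 3, and the "critic" D(x) = sigmoid(a·x + c) tries to tell real samples from fakes. All parameters and the data distribution are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda u: 1.0 / (1.0 + np.exp(-u))

# Toy parameters (scalars): artist fake = w*z + b, critic D(x) = sigmoid(a*x + c)
w, b = 1.0, 0.0
a, c = 0.1, 0.0
lr, batch = 0.05, 64

for step in range(2000):
    real = rng.normal(3.0, 1.0, batch)   # samples from the true data
    z = rng.normal(0.0, 1.0, batch)      # latent noise
    fake = w * z + b

    # Critic update: push D(real) -> 1 and D(fake) -> 0.
    d_real = sigmoid(a * real + c)
    d_fake = sigmoid(a * fake + c)
    # Gradients of -log D(real) - log(1 - D(fake)) w.r.t. a and c.
    ga = np.mean((d_real - 1.0) * real) + np.mean(d_fake * fake)
    gc = np.mean(d_real - 1.0) + np.mean(d_fake)
    a -= lr * ga
    c -= lr * gc

    # Artist update: push D(fake) -> 1; the gradient flows through the critic.
    d_fake = sigmoid(a * fake + c)
    du = (d_fake - 1.0) * a              # d(-log D(fake)) / d(fake)
    w -= lr * np.mean(du * z)
    b -= lr * np.mean(du)

print(f"generator output is now centered near {b:.2f} (data mean is 3)")
```

The critic's feedback is exactly the gradient signal that improves the artist, which is the push-and-pull described above.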
A classical drawback of this architecture
is that it is typically extremely slow to
train, and these networks are often quite shallow,
which means that we get low-resolution images
that are devoid of sharp details.
However, as you can see here, these are high
resolution images with tons of details.
So, how is that possible?
The solution comes from scientists
at NVIDIA.
Initially, they start out with tiny, shallow
neural networks for both the artist and the
critic, and as time goes by, both of these
neural networks are progressively grown.
They get deeper and deeper over time.
This way, the training process is more stable
than using deeper neural networks from scratch.
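Here is a rough sketch of that growing process, assuming nothing about NVIDIA's actual layer architecture: images start at 4×4, and each new stage doubles the resolution. The new stage is blended in with a weight alpha that ramps from 0 to 1, so the network is never shocked by a sudden change. The `refine` function below is a made-up stand-in for the newly added learned block.

```python
import numpy as np

def upsample2x(img):
    """Nearest-neighbour upsampling: each pixel becomes a 2x2 block."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def refine(img):
    """Stand-in for the newly added learned block (here: a mild blur)."""
    padded = np.pad(img, 1, mode="edge")
    return (padded[:-2, 1:-1] + padded[2:, 1:-1] +
            padded[1:-1, :-2] + padded[1:-1, 2:] + img) / 5.0

def grow(img, alpha):
    """Fade the new stage in: blend the old upsampled output with the new one."""
    up = upsample2x(img)
    return (1.0 - alpha) * up + alpha * refine(up)

rng = np.random.default_rng(0)
img = rng.random((4, 4))          # the tiny initial network's 4x4 output
img_8 = grow(img, alpha=0.5)      # mid-fade: half old output, half new stage
img_16 = grow(img_8, alpha=1.0)   # new stage fully faded in, now 16x16
print(img_8.shape, img_16.shape)  # (8, 8) (16, 16)
```

Only the blending schedule is shown here; in the real method, both the artist and the critic grow in lockstep while training continues.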
It not only generates pictures, but it can
also compute high resolution intermediate
images via latent space interpolation.
It can also learn object categories from a
bunch of training data and generate new samples.
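The interpolation trick can be sketched like this, with a dummy generator standing in for the trained network (the real one maps a latent vector to a face image): pick two latent codes, walk between them, and generate an image at every step. Spherical interpolation (slerp) is shown because it is a common choice for Gaussian latents; the generator here is purely illustrative.

```python
import numpy as np

def slerp(z0, z1, t):
    """Spherical interpolation between two latent vectors."""
    cos_theta = np.dot(z0, z1) / (np.linalg.norm(z0) * np.linalg.norm(z1))
    theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))
    if theta < 1e-6:                       # vectors nearly parallel
        return (1.0 - t) * z0 + t * z1
    return (np.sin((1.0 - t) * theta) * z0 +
            np.sin(t * theta) * z1) / np.sin(theta)

def dummy_generator(z):
    """Placeholder for a trained generator: latent vector -> 'image'."""
    return np.outer(z, z)[:8, :8]          # just something image-shaped

rng = np.random.default_rng(1)
z0, z1 = rng.normal(size=64), rng.normal(size=64)
frames = [dummy_generator(slerp(z0, z1, t))
          for t in np.linspace(0.0, 1.0, 10)]
print(len(frames), frames[0].shape)        # 10 frames of 8x8 "images"
```

Because nearby latent codes map to similar images, this walk produces the smooth morphing between faces shown in the video.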
And, if you take a look at the roster of scientists
on this project, you will see that they are
computer graphics researchers who recently
set foot in the world of machine learning.
And man, do they know their stuff and how
to present a piece of work!
And now comes something that is the absolute
most important part of the evaluation, one
that should be a must for every single paper in
this area.
These neural networks were trained on a bunch
of images of celebrities, and are now generating
new ones.
However, if all we are shown is a new image,
we don't know how close it is to the closest
image in the training set.
If the network were severely overfitting, it
would essentially copy and paste samples from
the training set.
Like a student in class who hasn't learned
a single thing, just memorized the textbook.
Actually, what is even worse is that this
would mean that the worst learning algorithm
that hasn't learned anything but memorized
the whole database would look the best!
That's not useful knowledge.
And here you see the nearest neighbors: the
images in this database that are closest
to the newly synthesized images.
It shows really well that the AI has learned
the concept of a human face extremely well
and can synthesize convincing looking new
images that are not just copy-pasted from
the training set.
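The check itself is simple to sketch, here on made-up data rather than the celebrity dataset: for each generated image, find the training image with the smallest pixel-space distance. A distance near zero would mean the network is just memorizing.

```python
import numpy as np

def nearest_neighbour(generated, training_set):
    """Return (index, L2 distance) of the training image closest to `generated`."""
    diffs = training_set.reshape(len(training_set), -1) - generated.ravel()
    dists = np.linalg.norm(diffs, axis=1)
    idx = int(np.argmin(dists))
    return idx, float(dists[idx])

rng = np.random.default_rng(2)
training_set = rng.random((100, 16, 16))   # 100 made-up "training images"
sample = rng.random((16, 16))              # a genuinely new "generated" image
idx, dist = nearest_neighbour(sample, training_set)

copied = training_set[7]                   # simulate a memorized sample
idx2, dist2 = nearest_neighbour(copied, training_set)
print(idx2, dist2)                         # 7 0.0 -> a pure copy has distance zero
```

A genuinely novel sample keeps a healthy distance from every training image, which is exactly what the paper's nearest-neighbor figures demonstrate.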
The source code, pre-trained network, and one
hour of imaginary celebrities are also available
in the description. Check them out!
Premium quality service right there.
And, if you feel that 8 of these videos a
month is worth a dollar, please consider supporting
us on Patreon.
You can also get really cool additional perks
like early access, and it helps us to make
better videos, grow, and tell these incredible
stories to a larger audience.
Details are available in the description.
Thanks for watching and for your generous
support, and I'll see you next time!
