Dear Fellow Scholars, this is Two Minute Papers
with Dr. Károly Zsolnai-Fehér.
In the last few years, we have seen a bunch
of new AI-based techniques that specialize
in generating novel images.
This is mainly done through learning-based
techniques, typically a Generative Adversarial
Network, a GAN in short, which is an architecture
where a generator neural network creates new
images and passes them to a discriminator network,
which learns to distinguish real photos from
these fake, generated images.
These two networks learn and improve together,
and generate better and better images over
time.
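To make that a bit more concrete, here is a minimal sketch of this adversarial training loop in PyTorch. The tiny networks, the sizes, and the random stand-in data are illustrative assumptions, not the actual architecture from any of the papers discussed here.

```python
# Minimal sketch of an adversarial (GAN) training loop.
# The tiny MLPs and random "real" data are placeholders for illustration.
import torch
import torch.nn as nn

latent_dim, image_dim = 16, 64  # assumed toy sizes

generator = nn.Sequential(
    nn.Linear(latent_dim, 128), nn.ReLU(),
    nn.Linear(128, image_dim), nn.Tanh(),   # fake "image" in [-1, 1]
)
discriminator = nn.Sequential(
    nn.Linear(image_dim, 128), nn.LeakyReLU(0.2),
    nn.Linear(128, 1),                      # real/fake logit
)

opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

for _ in range(1000):
    real = torch.rand(32, image_dim) * 2 - 1   # stand-in for real photos
    noise = torch.randn(32, latent_dim)
    fake = generator(noise)

    # Discriminator: learn to tell real photos from generated ones.
    d_loss = bce(discriminator(real), torch.ones(32, 1)) + \
             bce(discriminator(fake.detach()), torch.zeros(32, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator: learn to fool the discriminator.
    g_loss = bce(discriminator(fake), torch.ones(32, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```

The key design point is that the generator never sees real photos directly; its only learning signal is how well it fools the discriminator, which is why the two networks improve together.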
What you see here is a set of results created
with a technique by the name of CycleGAN.
This could even translate daytime into nighttime
images, reimagine a picture of a horse as
if it were a zebra, and more.
We can also use it for style transfer, a problem
where we have two input images, one for content,
and one for style, and as you see here, the
output would be a nice mixture of the two.
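As a quick illustration of what a "mixture of the two" means, here is a sketch of the classic style transfer objective in the spirit of Gatys et al. This is not CycleGAN's mechanism, and the placeholder feature tensors stand in for activations that a real system would extract with a convolutional network.

```python
# Sketch of the classic style transfer objective: the output should match
# the content image in deep features and the style image in feature
# statistics (Gram matrices). Feature tensors below are placeholders.
import torch

def gram(features):                     # style statistics of a feature map
    b, c, h, w = features.shape
    f = features.view(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

content_feat = torch.randn(1, 64, 32, 32)
style_feat = torch.randn(1, 64, 32, 32)
output_feat = torch.randn(1, 64, 32, 32, requires_grad=True)

content_loss = torch.mean((output_feat - content_feat) ** 2)
style_loss = torch.mean((gram(output_feat) - gram(style_feat)) ** 2)
total_loss = content_loss + 1e3 * style_loss   # the weight is a free choice
total_loss.backward()   # gradient descent on the output blends the two
```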
However, if we use CycleGAN for this kind
of style transfer, we’ll get something like
this.
The goal was to learn the styles of a select
set of famous children's book illustrators
by providing input images of their work.
So, what do you think about the results?
Well, the style is indeed completely different,
but the algorithm seems a little too heavy-handed,
as it did not leave the content itself intact.
Let’s have a look at another result with
a previous technique.
Maybe this will do better.
This is DualGAN, which refers to a paper by
the name of Unsupervised Dual Learning for
Image-to-Image Translation.
This uses two GANs to perform image translation,
where one GAN learns to translate, for instance,
from day to night, while the other one learns
the opposite, night to day translation.
This, among other advantages, makes things
very efficient. But, as you see here, in these
cases it preserves the content of the image
perhaps a little too well, because the style
itself does not appear too prominently in
the output images.
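The coupling between the two translators is easiest to see as a reconstruction requirement: going from day to night and back to day should give us the original photo. Here is a sketch of that idea; G and F are placeholder networks, and real training would add adversarial losses on both sides.

```python
# Sketch of the dual/cycle idea: translating day -> night -> day should
# reconstruct the original photo. G and F stand in for the two translator
# networks; a real system adds adversarial losses in both domains.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(64, 64))  # placeholder day -> night translator
F = nn.Sequential(nn.Linear(64, 64))  # placeholder night -> day translator
l1 = nn.L1Loss()

day = torch.rand(8, 64)               # stand-in for a batch of day images
night = torch.rand(8, 64)             # stand-in for a batch of night images

# Reconstruction (cycle) losses in both directions:
loss_day = l1(F(G(day)), day)         # day -> night -> day
loss_night = l1(G(F(night)), night)   # night -> day -> night
cycle_loss = loss_day + loss_night
```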
So CycleGAN is good at transferring style,
but a little less so for content, and DualGAN
is good at preserving the content, but sometimes
adds too little of the style to the image.
And now, hold on to your papers, because this
new technique by the name GANILLA offers us
these results.
The content is intact, checkmark, the style
goes through really well, checkmark.
It preserves the content and transfers the
style at the same time!
Excellent!
One of the key reasons why this happens is
the use of skip connections, which help preserve
content information as we travel deeper into
the neural network.
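In its simplest form, a skip connection just adds a block's input back to its output, so fine content details can bypass the transformation. Here is a minimal residual-style sketch; GANILLA's actual architecture differs in the details, for instance by also feeding lower-level features forward.

```python
# Minimal sketch of a skip connection: the block's input is added back
# to its output, so content details can bypass the transformation and
# survive deeper into the network.
import torch
import torch.nn as nn

class SkipBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, x):
        return x + self.conv(x)   # skip connection: input added to output

block = SkipBlock(16)
image_features = torch.randn(1, 16, 32, 32)
out = block(image_features)       # same shape, content path preserved
```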
So, finally, let’s put our money where our
mouth is: take a bunch of illustrators, marvel
at their unique styles, apply them to photographs,
and see how the algorithm stacks up against
previous works.
Wow.
I love these beautiful results.
These comparisons really show how good the
new GANILLA technique is at preserving content.
And note that these are distinct artistic
styles that are really difficult to reproduce,
even for humans.
It is truly amazing that we can perform such
a thing algorithmically.
Don’t forget that the first style transfer
paper appeared approximately 3 to 3.5 years
ago, and we have come a long way since then!
The pace of progress in machine learning research
is truly stunning!
While we are looking at some more amazing
results, this time around only from GANILLA,
I will note that the authors also conducted
a user study with 48 participants, who favored
this technique over previous ones.
And, perhaps leaving the best for last, it
can even draw in the style of Hayao Miyazaki.
I bet there are a bunch of Miyazaki fans watching,
so let me know in the comments what you think
about these results!
What a time to be alive!
This episode has been supported by Weights
& Biases.
In this post, they show you how to easily
iterate on your models by visualizing and
comparing experiments in real time.
Weights & Biases provides tools to track your
experiments in your deep learning projects.
Their system is designed to save you a ton
of time and money, and it is actively used
in projects at prestigious labs, such as OpenAI,
Toyota Research, GitHub, and more.
And, the best part is that if you are an academic
or have an open source project, you can use
their tools for free.
It really is as good as it gets.
Make sure to visit them through wandb.com/papers
or just click the link in the video description
and you can get a free demo today.
Our thanks to Weights & Biases for their long-standing
support and for helping us make better videos
for you.
Thanks for watching and for your generous
support, and I'll see you next time!
