Dear Fellow Scholars, this is Two Minute Papers
with Károly Zsolnai-Fehér.
This is a recent DeepMind paper on neural
rendering where they taught a learning-based
technique to see things the way humans do.
What's more, it has an understanding of geometry,
viewpoints, shadows, occlusion, even self-shadowing
and self-occlusion, and many other difficult
concepts.
So what does this do and how does it work
exactly?
It contains a representation and a generation
network.
The representation network takes a bunch of
observations, a few screenshots if you will,
and encodes this visual sensory data into
a concise description that contains the underlying
information in the scene.
These observations are made from only a handful
of camera positions and viewpoints.
The neural rendering, or "seeing," part means
that we choose a position and viewpoint that
the algorithm hasn't seen yet, and ask the
generation network to create an appropriate
image that matches reality.
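To make this two-network pipeline a bit more concrete, here is a minimal sketch in Python with PyTorch. This is not the paper's actual architecture, which uses a more elaborate encoder and a recurrent latent-variable generator; the layer sizes, the 7-dimensional camera pose format, and the module names RepresentationNetwork and GenerationNetwork below are all simplifying assumptions. The sketch only mirrors the overall data flow: encode each observation together with its camera pose, pool the codes into one scene representation, then decode that representation plus a query pose into an image.

```python
import torch
import torch.nn as nn

class RepresentationNetwork(nn.Module):
    """Encodes (image, camera pose) pairs into a scene code.

    Hypothetical, simplified stand-in for the paper's encoder.
    """
    def __init__(self, r_dim=256):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3 + 7, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(128, r_dim, 3, stride=2, padding=1), nn.ReLU(),
        )

    def forward(self, images, poses):
        # Broadcast each 7-D camera pose (an assumed position +
        # orientation encoding) over the image plane and concatenate
        # it to the image as extra channels.
        b, _, h, w = images.shape
        pose_maps = poses.view(b, -1, 1, 1).expand(b, poses.shape[1], h, w)
        return self.conv(torch.cat([images, pose_maps], dim=1))

class GenerationNetwork(nn.Module):
    """Decodes the scene code plus a *query* pose into an image.

    A plain deconvolutional decoder; the paper's generator is a
    recurrent latent-variable model, so this is only a sketch.
    """
    def __init__(self, r_dim=256):
        super().__init__()
        self.deconv = nn.Sequential(
            nn.ConvTranspose2d(r_dim + 7, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, r, query_pose):
        b, _, h, w = r.shape
        pose_maps = query_pose.view(b, -1, 1, 1).expand(b, 7, h, w)
        return self.deconv(torch.cat([r, pose_maps], dim=1))

# Toy usage: three observations of one scene, then render a novel view.
rep_net, gen_net = RepresentationNetwork(), GenerationNetwork()
images = torch.rand(3, 3, 64, 64)   # a few "screenshots" of the scene
poses  = torch.rand(3, 7)           # their camera positions/orientations
# Sum the per-observation codes into one scene representation,
# keeping a batch dimension of 1 for the single query below.
r = rep_net(images, poses).sum(dim=0, keepdim=True)
novel_view = gen_net(r, torch.rand(1, 7))  # image from an unseen viewpoint
print(novel_view.shape)  # torch.Size([1, 3, 64, 64])
```

The one design point this sketch does try to preserve is that the per-observation codes are summed into a single scene representation, so the result does not depend on the order of the observations and any number of them can be folded in before querying a new viewpoint.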
Now, we have to hold on to our papers for
a moment and understand why this is such a
crazy idea.
Computer graphics researchers work so hard
on creating similar rendering and light simulation
programs that take tons of computational power
to compute all aspects of light transport
and in return, give us a beautiful image.
If we slightly change the camera angle, we
have to redo most of the same computations, whereas
a learning-based algorithm may just say "don't
worry, I got this" and, from previous experience,
guess the remainder of the information perfectly.
I love it.
And what's more, by leaning on what these
two networks learned, it generalizes so well
that it can even deal with previously unobserved
scenes.
If you remember, I have also worked on a neural
renderer for about 3000 hours and created
an AI that predicts photorealistic images
perfectly.
The difference was that this one took a fixed
camera viewpoint and predicted what the object
would look like if we started changing its
material properties.
I'd love to see a possible combination of
these two works. Oh my, I am super excited for
this!
There are links in the video description to
both of these works.
Can you think of other possible uses for these
techniques?
Let me know in the comments section!
And, if you wish to decide the order of future
episodes or get your name listed as a key
supporter for the series, hop over to our
Patreon page and pick up some cool perks.
We use these funds to improve the series and
empower other research projects and conferences.
As this video series is on the cutting edge
of technology, of course, we also support
cryptocurrencies like Bitcoin, Ethereum, and
Litecoin.
The addresses are available in the video description.
Thanks for watching and for your generous
support, and I'll see you next time!
