Dear Fellow Scholars, this is Two Minute Papers
with Dr. Károly Zsolnai-Fehér.
In early 2019, a learning-based technique
appeared that could perform common natural
language processing operations, for instance,
question answering, text completion, reading
comprehension, summarization, and more.
This method was developed by scientists at
OpenAI, and they called it GPT-2.
The goal was to be able to perform these tasks
with as little supervision as possible.
This means that they unleashed this algorithm
to read the internet, and the question is,
what would the AI learn during this process?
That is a tricky question.
And to be able to answer it, have a look at
this paper from 2017, where an AI was given
a bunch of Amazon product reviews and the
goal was to teach it to be able to generate
new ones, or continue a review when given
one.
Then, something unexpected happened.
The finished neural network used surprisingly
few neurons to be able to continue these reviews,
and upon closer inspection, they noticed that
it had built up knowledge of not only language,
but had also developed a sentiment detector.
This means that the AI recognized that in
order to be able to continue a review, it
not only needs to learn English, but also
needs to be able to detect whether the review
seems positive or not.
If we can tell from a small snippet that the
review we have to complete is positive, we
have a much easier time doing it well.
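As a rough illustration of that idea, and not the 2017 paper's exact setup, which trained a character-level LSTM on Amazon reviews, here is a minimal sketch of the underlying trick: take a pretrained language model's internal representation of a review and fit a simple linear classifier on top of it to read off sentiment. The model, library, and toy data below are assumptions chosen only for illustration.

```python
# Minimal sketch of a "linear probe": features from a pretrained language
# model plus a simple linear classifier for sentiment.
# Illustrative only; the 2017 paper used a character-level LSTM trained
# on Amazon reviews, not GPT-2.
import torch
from transformers import GPT2Tokenizer, GPT2Model
from sklearn.linear_model import LogisticRegression

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2Model.from_pretrained("gpt2")

def review_features(text):
    # Use the hidden state at the last token as a fixed-size summary of the review.
    ids = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**ids).last_hidden_state  # shape: (1, seq_len, 768)
    return hidden[0, -1].numpy()

# Tiny toy training set; real experiments use thousands of labeled reviews.
reviews = [
    "Absolutely loved it, works perfectly.",
    "Terrible quality, broke after one day.",
    "Great value, would buy again.",
    "Waste of money, very disappointed.",
]
labels = [1, 0, 1, 0]  # 1 = positive, 0 = negative

probe = LogisticRegression(max_iter=1000)
probe.fit([review_features(r) for r in reviews], labels)
print(probe.predict([review_features("Fantastic product, highly recommend.")]))
```

The point of the sketch is that the sentiment information is already encoded inside the language model; the tiny linear probe merely reads it out.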
And now, back to GPT-2.
As it was asked to predict what comes next
in sentences of not just reviews, but text
of any kind, we asked: what would this neural
network learn?
Well, now we know that, of course, it learns
whatever it needs to learn to perform this
text completion properly.
And to do this, it needs to learn English
by itself, and that’s exactly what it did!
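To make this concrete, here is a minimal sketch of that kind of text completion, using the publicly released GPT-2 weights through the Hugging Face transformers library. This is an illustration under those assumptions, not the exact setup used in the video.

```python
# Minimal sketch of text completion with the publicly released GPT-2 weights,
# via the Hugging Face transformers library.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = "Dear Fellow Scholars, this is Two Minute Papers with"
input_ids = tokenizer.encode(prompt, return_tensors="pt")

# Sample a continuation one token at a time, conditioned on everything so far.
output = model.generate(
    input_ids,
    max_length=60,
    do_sample=True,
    top_k=50,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Each new token is chosen based on everything written so far, which is exactly the task that forces the model to pick up grammar, facts, and style along the way.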
It also learned about a lot of topics to be
able to discuss them well.
What topics?
Let’s see.
We gave it a try, and I was somewhat surprised
when I saw that it was able to continue a
Two Minute Papers script, even though it seems
to have turned into a history lesson.
What was even more surprising is that it could
shoulder the Two Minute Papers test; in other
words, I asked it to talk about the nature
of fluid simulations, and it was caught
cheating red-handed.
But then, it continued in a way that was not
only coherent, but had quite a bit of truth
to it.
Note that there was no explicit instruction
for the AI apart from it being unleashed on
the internet and reading it.
And now, the next version appeared, by the
name GPT-3.
This version is now more than 100 times
bigger, so our first question is, how much
better can an AI get if we increase the size
of a neural network?
Let’s have a look together.
These are the results on a challenging reading
comprehension test as a function of the number
of parameters.
As you see, at around 1.5 billion parameters,
which is roughly the size of GPT-2, it learned
a great deal, but its understanding is nowhere
near the level of human comprehension.
However, as we grow the network, something
incredible happens.
Non-trivial capabilities start to appear as
we approach a hundred billion parameters.
Look!
It nearly matched the level of humans.
My goodness!
This was possible before, but only with neural
networks that are specifically designed for
a narrow task.
In comparison, GPT-3 is much more general.
Let’s test that generality and have a look
at 5 practical applications together!
One, OpenAI made this AI accessible to a lucky
few people, and it turns out, it has read
a lot of things on the internet, which contains
a lot of code, so it can generate website
layouts from a written description.
Two, it also learned how to generate properly
formatted plots from a tiny prompt written
in plain English.
Not just one kind - many kinds!
Perhaps to the joy of technical PhD students
around the world, three, it can properly typeset
mathematical equations from a plain English
description as well.
Four, it understands the kind of data we have
in a spreadsheet, in this case, population,
and fills the missing parts correctly.
And five, it can also translate a complex
legal text into plain language, or, the other
way around, in other words, it can also generate
legal text from our simple descriptions.
And as you see here, it can do much, much
more. I left a link to all of these materials
in the video description.
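For a sense of what these prompts look like in practice, here is a minimal sketch of how one of these tasks, turning legal text into plain language, could be prompted through the OpenAI API as it looked during the GPT-3 beta. The engine name, prompt wording, and access details are assumptions for illustration, and the API has changed since.

```python
# Minimal sketch of prompting GPT-3 through the OpenAI API, as it looked
# during the 2020 beta. Engine names and access details may have changed;
# this is an illustration, not the exact setup shown in the video.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

prompt = (
    "Translate the following legal clause into plain English:\n\n"
    '"The party of the first part shall indemnify and hold harmless '
    'the party of the second part against any and all claims."\n\n'
    "Plain English:"
)

response = openai.Completion.create(
    engine="davinci",   # GPT-3 base model in the beta API
    prompt=prompt,
    max_tokens=60,
    temperature=0.3,
)
print(response["choices"][0]["text"].strip())
```

The same pattern works for the other examples: the task is described in plain English, and the model completes the text from there.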
However, of course, this iteration of GPT
also has its limitations.
For instance, we haven’t seen the extent
to which these examples are cherry-picked,
or in other words, for every good output that
we marvel at, there might have been one, or
a dozen tries that did not come out well.
We don’t exactly know.
But the main point is that working with GPT-3
is a really peculiar process where we know
that a vast body of knowledge lies within,
but it only emerges if we can bring it out
with properly written prompts.
It almost feels like a new kind of programming
that is open to everyone, even people without
any programming or technical knowledge.
If a computer is a bicycle for the mind, then
GPT-3 is a fighter jet.
Absolutely incredible.
And to say that the paper is vast would be
an understatement: we have only scratched the
surface of what it can do here, so make sure
to have a look if you wish to know more about
it.
The link is available in the video description.
I can only imagine what we will be able to
do with GPT-4 and GPT-5 in the near future!
What a time to be alive!
Thanks for watching and for your generous support,
and I'll see you next time!
