[LOGO MUSIC]
GURU SOMADDER: We are going to
be talking about the streaming
tech behind Stadia as well
as behind Project Stream
and show you why that tech
works as well as it did.
I'm Guru Somadder, an
engineering leader at Google
and part of the team
that brought you
Project Stream and now Stadia.
I came into cloud gaming almost by accident, an accident that I'm still trying to recover from, eight years and counting. I also spent a decade and a half building carrier-grade networking products.
ROB MCCOOL: Hi, I'm Rob.
I'm the tech lead on
our streaming system
and my background is
in network servers
for the web and structured AI.
I started working on
cloud gaming in 2007
when a friend called
me and said, dude,
you got to see this.
And a lot of people do that,
but this time they were right.
I took a two year detour in 2012 to work on the semantic understanding systems in Google Search.
KHALED ABDEL RAHMAN: Hey, I'm Khaled.
I've been our product
manager at Google
for about five years with a
background in game development,
most recently on Project
Stream and now Stadia.
GURU SOMADDER: So we are part of an internal subteam that we call playability.
It is a measure of the
overall quality of gameplay
as perceived by players.
And there are many variables that affect playability: the game being played, the environment it is being played in, the controls, the display, and many more.
Our team is focused on ensuring that the experience delivered by Stadia is consistent, enjoyable, and as exact in its delivery as intended by the developer of a particular title.
This means we are focused on designing and building our real-time rate adaptation and streaming infrastructure, also known as the streamer; building the APIs and tools that allow you to interface with our platform; and, of course, optimizing the player's overall gameplay experience, all at Google scale.
As our team was
being formed, we were
given what would be considered
a pretty straightforward goal,
deliver the best
gaming experience.
Simple, right?
Well, as it turns
out, not so much,
especially as we are trying
to establish a completely
new avenue of gaming.
In this talk we want to
walk you through our journey
and our efforts in
achieving that goal.
So here's what we are
going to cover today.
We know that not
everybody here is
conversant with the problem
space and challenges
of cloud gaming.
So we're going to establish some common understanding. We'll talk about the extensive research we undertook to understand this space; the streamer, which is the realization of that research; and Project Stream, which we chose as a wide-scale test for it.
And finally, wrap up
with some of the APIs
and tools we have available
for even deeper optimization.
But first, some background to
establish common understanding.
To be successful, a
cloud gaming system
has to satisfy three
fundamental requirements.
It has to be real time,
consistent, and exact.
Players expect the game
to react to their input
with expected changes in a
natural and real time manner.
It must also provide a
consistent experience
across a variety of
player environments.
And it must provide
an experience
that is as exact in its delivery as the game developer intended.
So how is this different from
your typical video streaming
service like YouTube?
To better understand that, let's look at one possible and greatly simplified architecture used for passive video streaming.
Media to be transmitted
is typically
hosted in remote
data centers and can
take advantage of offline
encoding and optimizations.
Communication and transmission is typically done over a reliable protocol like HTTP, sent in 2 to 10 second chunks.
Real time latency is
less of a consideration.
The chunked nature also lends itself really well to standard internet caches along the way.
Clients fetch content on
demand and cache as needed.
These caches make the client
playback smooth and very
resilient to network failures.
So how is cloud
gaming different?
First of all, it's
interactive, which
means that latency
requirements are simply
too stringent to allow
buffering of any kind.
We also cannot afford any computationally intensive bitstream compression to achieve quality. The open-world nature of games also makes caching of the content they provide very difficult, if not impossible.
And network impairments
can be highly visible
and must be avoided
at all costs.
A cloud gaming platform must
maintain an optimal experience
under all possible conditions.
So what does your typical cloud
gaming platform look like?
Well, it possibly looks
like something like this.
Games are hosted in
remote data centers.
Players typically have thin clients that, by themselves, are not capable of hosting such games
or providing the unique
experiences enabled by a cloud
gaming platform.
A game's raw audio and video are captured, encoded, packetized, and streamed over the internet, all under the watchful eye of a real-time rate adaptation engine.
This process reverses
on the client side.
You depacketize,
decode, and send it
to the display or
the controller.
In a similar manner, a
player's input and voice
are taken and sent
to the remote game.
Of course, all of this is
happening in real time.
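To make the shape of that loop concrete, here is a toy sketch of one server-side iteration, in Python. Every class and name is an invented stand-in, not Stadia's actual internals.

```python
# Toy sketch of one server-side iteration of the pipeline above.
# Every class and name is an invented stand-in, not Stadia's internals.
import random

class ToyEncoder:
    def encode(self, frame, target_bitrate_bps):
        # Real encoders overshoot/undershoot by design; emulate with jitter.
        size_bits = int(target_bitrate_bps / 60 * random.uniform(0.8, 1.2))
        return {"frame": frame, "size_bits": size_bits}

class ToyRateAdaptation:
    def __init__(self, start_bps=10_000_000):
        self.target_bps = start_bps
    def observe(self, packet_lost):
        # Crude stand-in: back off on loss, otherwise probe upward.
        self.target_bps = int(self.target_bps * (0.7 if packet_lost else 1.02))

encoder, adaptation = ToyEncoder(), ToyRateAdaptation()
for frame_number in range(5):                 # one iteration per frame
    encoded = encoder.encode(frame_number, adaptation.target_bps)
    lost = random.random() < 0.1              # pretend client feedback
    adaptation.observe(lost)
    print(frame_number, encoded["size_bits"], adaptation.target_bps)
```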
So why is all this so difficult?
Well, key to be
successful at cloud gaming
is to be able to deliver
a consistent player
experience at scale.
This means scaling across the
entire spectrum of endpoints,
being able to deal with
their hardware and software
fragmentation and
non-uniformity.
Scaling across all
locations that players
access the service, and scaling
across all network types,
whether they are wired,
wireless, lossy, or have
variable or high latency.
So what's the first thing
that comes to your mind
when you think
about cloud gaming?
What about latency?
Well, let's talk about latency.
We have spent a lot of time
doing extensive research
on the effects of latency.
Here are some examples.
Typical HDMI
transfers to your TV
take anywhere between
16 to 33 milliseconds.
Typical monitor refresh
rates are on the order
of 4 to 16 milliseconds.
USB polling for peripherals is on the order of 8 milliseconds. Even in the USB polling case, gaming peripherals provide specialized hardware and software that try to mitigate this latency, but it is not zero.
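Adding those endpoint numbers up shows how much of the budget is spent before a single network hop. The arithmetic below uses the figures just quoted; the 100 millisecond budget is an illustrative assumption, not an official target.

```python
# Back-of-the-envelope endpoint latency from the ranges quoted above.
# The ~100 ms "imperceptible" budget is an illustrative assumption.
hdmi_ms    = (16, 33)   # HDMI transfer to the TV
display_ms = (4, 16)    # monitor refresh
usb_ms     = (8, 8)     # USB polling for peripherals

best  = sum(low  for low, _  in (hdmi_ms, display_ms, usb_ms))
worst = sum(high for _, high in (hdmi_ms, display_ms, usb_ms))
print(f"endpoint hardware alone: {best}-{worst} ms of a ~100 ms budget")
```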
So did you walk into a talk on neuroscience?
Nope.
This is all about
cloud gaming still.
But here are some interesting findings from a neuroscience publication by S.J. Thorpe in 2001, characterizing the response times of the brain and nervous system.
It takes at least 13
milliseconds for you
to process visual stimuli.
It takes anywhere between
70 to 180 milliseconds
for your brain's command
to reach your fingers.
Yes, even you have latency.
And here is another
example, all of you
have seen lightning
and heard thunder.
But you know that light
travels fast, really fast,
186,000 miles per second.
But did you know that we use that to our advantage? It takes less time for a packet from a nearby data center to reach you than it takes for your brain's command to reach your fingers.
Mitigating the effects
of latency is key.
Our goal is to make it
imperceptible to players.
Visual fidelity is
another challenge.
Fundamentally,
it's because we are
trying to maintain a fine
balance between quality
and responsiveness,
but in real time.
To understand cloud
gaming, you need
to understand the impact
of codecs and encoders
and decoders and
select them wisely.
Codecs typically need
a significant amount
of implementation
specific optimizations
to maintain both real time
quality and responsiveness.
Encoders convert media
based on the selected
codec and the encode strategy.
But they are inherently
laggy and they
overshoot or
undershoot by design
to maintain optimal quality.
And finally, decoders
take this converted media
and convert them back
into a format consumable
by playback devices.
But this is challenging
on many fronts.
The decoder landscape is fragmented and non-uniform: chipsets vary in driver compatibility, capability, and performance. Dealing with this consistently is key for scaling.
So here is a practical
example of how
this plays out in real life.
As you can see, the demands of menu navigation, where the game is running at a much higher frame rate but is not as visually rich, are very different from those of a visually rich cinematic scene that usually runs at a lower frame rate. Imagine you are flipping between your menu and this scene in a game, and imagine the encoder needs to maintain consistent quality while dealing with this in real time.
And of course, networking
is a challenge, right?
You need to deal with
network impairments.
You need to deal with
competing traffic.
You need to detect and mitigate all of this in real time.
For example, if your family
starts watching YouTube,
you need to be able to get
your fair share of bandwidth.
And unlike DASH players, we cannot take advantage of client-side buffers. So loss concealment and mitigation is very tough, and you need to head problems off.
And if this was not enough, playability adds a whole new dimension of complexity to this, affected by the game, the player, and, of course, the underlying network.
As can be seen in a paper published by S. Wang and S. Dey in 2009 on cloud gaming playability, the factors affecting it are both subjective and objective.
Frame rate, resolution,
bandwidth, delay, quality,
smoothness, all play into it.
So now I am going
to hand it over
to Khaled, who is
going to walk you
through the next
phase of our journey.
The foundational research
that forms the basis
of our cloud gaming platform.
KHALED ABDEL RAHMAN: So with all
the different factors affecting
playability, we really wanted to
understand gameplay experiences
from a data driven perspective.
And while we did
understand a lot
about what makes for a good
game or a good experience
in general, we really
wanted to dig deeper
into the science of it all.
We decided to
start running tests
across a wide
variety of conditions
for each of the
factors identified
as affecting playability.
We analyzed industry accepted
techniques based on MOS,
or mean opinion
score, as guidance
for evaluation in these tests.
And we used it as our subjective measure of quality of experience.
In most of our studies players
had either a short form
or long form play
session, followed
by a quick survey where they
rated the experience on a one
to five scale.
And they were also able to
give us free form feedback
when needed.
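As a rough illustration of how such surveys become a mean opinion score, here is a toy computation; all of the ratings are made up.

```python
# Toy MOS computation over 1-5 survey ratings; all data is made up.
sessions = {
    "baseline (local, max settings)": [5, 4, 5, 4, 4, 5],
    "streamed @ 20 Mbps":             [4, 4, 5, 3, 4, 4],
    "streamed @ 5 Mbps":              [3, 2, 3, 3, 2, 3],
}
for condition, ratings in sessions.items():
    mos = sum(ratings) / len(ratings)   # mean opinion score
    print(f"{condition}: MOS = {mos:.2f}")
```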
And for each game
we tested, we also
did a baseline study
in which we identified
thresholds at which the game
itself is considered good.
The baseline studies always
ran with the highest possible
graphics settings at the
highest possible frame rate.
And in its most
basic form, our goal
was to deliver a
better experience
but do so in a browser, even when the local machine we're comparing to is a $6,000 maxed-out PC.
So we isolated our research
into the factors that
are relevant to our platform.
Items like different available
bandwidth, different codecs,
and games running at
different frame rates.
And while each of these factors had a different weighted impact on the overall experience, together they had a significantly higher correlation.
So the research
needed to experiment
with each of these
factors individually,
as well as in conditions
where they were
interacting with each other.
And we did a lot of research.
We built custom
tooling that allowed
us to simulate a variety
of network conditions, test
different prototypes, different
rate adaptation techniques,
codecs, and infrastructures.
And our approach to
testing was two pronged.
The first is in
controlled environments
with applied variables.
This happened in one
of our many labs.
It involves hands
on guided testing.
And we worked with
over 1,500 players
to test tens of thousands of structured gameplay sessions prior to Project Stream's launch.
For the second approach we
looked at wider scale testing.
And we utilized the
wide gaming community
that we have working at Google.
And we asked them to
just play on the platform
and use it just like they
would with anything else.
This allowed us to gather
proper feedback about the end
to end experience from a
variety of endpoints, network
conditions, and locations.
And for Assassin's
Creed alone, we
were able to do A/B tests on
some 24,000 hours of gameplay
prior to launch.
This was by far one of
the biggest resources
that we had at our disposal.
And we can't thank the Googlers
that participated enough
for their help and feedback.
So delving deeper
into what we tested.
Latency is the
first word we hear
when we say we're working
on game streaming.
And in case you're wondering, yes, we did a lot of research about latency.
And as a matter of fact, we
have constant weekly research
sessions about every single
aspect of latency, in addition
to our wider scale testing.
And of course, you will never, ever, ever experience anything like this on our platform.
So to understand
latency, we first
need to understand its impact
within different games, genres,
and mechanics.
It's important to note
different mechanics have
different latency requirements
even within the same game.
And even the same mechanic
across two different titles
can have wildly different
behaviors and requirements.
On the other hand,
latency perception
is also closely tied to the
player's level of experience
with a particular title.
A casual player can
fully enjoy a sports game
without significant
latency perception
even when we dialed those
numbers up to absurd levels.
On the other hand, a
more experienced player
might be looking to do some
skill shots, special passes,
and things that are not feasible
at high or inconsistent latency
levels.
So we tested multiple games,
genres, and mechanics.
And we did so across
a wide audience
of casual and
experienced players.
We started mimicking
network impairments
that affect gameplay and
asking players to rate
the experience without
telling them what was active.
And our targets naturally started aligning with the mechanics that had the highest temporal and spatial precision, required by the most skilled of players.
And over the course
of the research,
we not only
understood the details
of how latency
impacted gameplay,
but we also defined our
own window of operation.
The exact values we need to target to receive an input, run a game loop, and deliver a frame to the endpoint.
And the green shaded zone that you see here, from tests like these, informed our decisions on rate adaptation and encoding behaviors, which Rob will talk about later.
Another example pillar we
delved into is video quality.
And while it doesn't come
up as much as latency
when we talk about
game streaming,
you can see that it correlates directly with the overall quality of a gameplay session.
Within the general
area of video itself,
there are also multiple
factors that impact quality.
Using different codecs has an immediate and well-known impact on quality. We ultimately decided on VP9 as our primary codec, providing scalability as well as high-quality 4K performance.
And then bitrate itself is
another one of those factors.
In general, higher bitrates mean higher visual fidelity on most codecs.
But in reality, any
research into bitrate
must also take into account
screen size, the resolution,
frame rate, and every
other encoder variable.
And while we do look at generic metrics like PSNR and SSIM, we were really more interested in the player-specific impact.
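For reference on those metrics, PSNR is a log-scale ratio of peak signal power to the mean squared error between a source frame and its decoded counterpart. A minimal version of the computation:

```python
import math

def psnr(reference, decoded, peak=255):
    """PSNR in dB between two equal-length sequences of 8-bit pixels."""
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return float("inf")  # identical frames
    return 10 * math.log10(peak ** 2 / mse)

print(psnr([100, 120, 140, 160], [101, 118, 143, 158]))  # ~42 dB
```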
And we started wondering what that impact would be when we're tweaking those variables.
And a lot of times these
results were expected.
But other times they
were surprising.
Like how 30 FPS sometimes performed better at lower bitrates. While that didn't end up meaning that we'd drop to 30 FPS when the available bitrate is lower, it did push us to explore that solution and eventually come up with different things, like dynamic resolution switching, that would ensure smooth 60 FPS gameplay as well as sustained high fidelity.
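Here is a simplified picture of the kind of decision dynamic resolution switching makes; the bitrate thresholds are invented for illustration and are not Stadia's real tuning.

```python
# Toy dynamic resolution selection: keep 60 FPS fixed and drop the
# encode resolution when bandwidth shrinks. Thresholds are invented.
LADDER = [(35_000_000, "4K"), (15_000_000, "1440p"),
          (8_000_000, "1080p"), (0, "720p")]

def pick_resolution(available_bps):
    for min_bps, resolution in LADDER:
        if available_bps >= min_bps:
            return resolution

for bps in (40e6, 12e6, 5e6):
    print(int(bps), "->", pick_resolution(bps), "@ 60 FPS")
```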
So you noticed there that
a result on video quality
did not immediately imply that
the overall gameplay is better,
nor that the overall quality of
a gameplay session is better.
So the example from
before, video quality
mattered a lot in the
overall experience, but so
did a high frame rate, smooth
gameplay, and low latency.
If video quality was the only factor that we were looking into, then 24 FPS might prove to be more cinematic, and taking seconds to buffer would not be an issue at all.
The biggest takeaway
from our research
was window targets for each
of our playability factors.
As you saw, some
of these variables
are directly proportional,
while others,
like video quality and latency,
are inversely proportional.
And, collectively,
all of these factors
together contribute
to the quality
of a gameplay experience.
The challenge is to deliver an experience in which each factor stays within its defined window: a delicate balance that allows you to deliver a data-validated, excellent gaming experience to a player.
Rob will be telling
you about the streamer
and how we achieve that balance.
ROB MCCOOL: Hi everybody.
Let's talk about the streamer.
But, before we begin,
I'd like to clear up
some misconceptions.
There've been some
rumors floating around.
First thing is we don't
utilize quantum entanglement
in the implementation of this system.
There is no spooky action
at a distance in Stadia yet.
So that means that our
latencies are strictly non-zero.
Second, we have not yet found a way to channel tachyons through either copper or fiber conduits. Tachyons are, of course, for you sci-fi geeks, particles that move backwards in time. So our latencies are still strictly non-negative.
Finally, we support
many different network
types, many open standards,
but we do not support RFC 1149.
Who's got 1149?
You got your phones
out, I can see it.
Come on, somebody.
IP over avian carriers,
pigeons, geese, stuff like that.
We tried it, huge mess.
The facilities department still
won't return my phone calls.
It was an April Fool's
joke, look it up.
It's fun.
Anyway, the streamer
is a delivery engine
that's purpose built
for game streaming.
It's a program that
runs alongside the game
and takes information
from the client,
as well as from Google's
content delivery network.
And it makes real time
decisions about how
to maximize quality while
keeping latency imperceptible.
It's built on open standards,
such as WebRTC,
and supports many platforms,
including browsers, living room
devices, and mobile phones.
Today I'm going to talk about
rate adaptation for game
streaming.
We continually
monitor and estimate
current network conditions,
such as bandwidth, packet loss,
and delay.
And we adapt the game's transmission and media stream accordingly in real time.
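As a sketch of what monitoring and estimating can look like, here is a toy estimator that folds per-packet feedback into smoothed bandwidth, loss, and delay figures. The exponentially weighted moving average is my simplification; the real streamer's filters are far more elaborate.

```python
# Toy network estimator: fold per-packet feedback into smoothed
# bandwidth, loss, and delay estimates with an exponentially weighted
# moving average. The real streamer's filters are far more elaborate.
class NetworkEstimator:
    def __init__(self, alpha=0.1):
        self.alpha = alpha
        self.bandwidth_bps = 10_000_000
        self.loss_rate = 0.0
        self.rtt_ms = 30.0

    def _ewma(self, old, new):
        return (1 - self.alpha) * old + self.alpha * new

    def on_feedback(self, bytes_acked, interval_s, lost, rtt_ms):
        self.bandwidth_bps = self._ewma(self.bandwidth_bps,
                                        8 * bytes_acked / interval_s)
        self.loss_rate = self._ewma(self.loss_rate, 1.0 if lost else 0.0)
        self.rtt_ms = self._ewma(self.rtt_ms, rtt_ms)

est = NetworkEstimator()
est.on_feedback(bytes_acked=150_000, interval_s=0.1, lost=False, rtt_ms=28)
print(est.bandwidth_bps, est.loss_rate, est.rtt_ms)
```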
So cloud game streaming
is a complex system
and has many actors
and many variables that
are not directly observable.
How many people have played
a real time strategy game out
here?
Warcraft?
StarCraft?
Somebody.
Yeah, there we go.
We got some.
All right, so if you've
played a game like that,
there's a mechanic
called fog of war
where you can't see
something until you
send a unit to go observe it.
Cloud gaming is actually
a lot like that.
There's a bunch of information,
but you haven't seen it
until you go observe it.
And you don't know if it's
changed since the last time you
looked at it.
And just like a real
time strategy game,
you have to make
decisions all the time.
What are you going to focus on?
Are you going to
focus on economy?
Are you going to
focus on units of war?
And in gaming, are you going to focus on quality or are you going to focus on latency?
Because if you don't
have enough bandwidth
and you focus on quality,
you can fall behind.
So there isn't a
single right answer.
There is no single optimization
to make the perfect cloud
gaming streaming technology.
It's always balancing
a series of trade offs.
So if you improve one
thing, oftentimes it
comes at the expense
of another thing.
So the solution is not
an elegant algorithm
that you can describe with a
few lines of Greek symbols,
maybe get a master's
thesis out of it.
It looks very different.
So the right algorithm is
an important foundation.
So speaking of real-time strategy games, how many people in here have ever implemented pathfinding in a game?
There's got to be somebody.
There we go.
We got some.
All right, so when you implement pathfinding, you have to start with the foundation of a good algorithm. A* tends to be one of the ones that people use.
So, once you
implement it though,
you'll find there's a
bunch of edge cases.
You've got units
walking into walls.
You've got different
things happening.
The algorithm needs
a little bit of help.
So you put A* together, you add some other algorithms, you throw in some heuristics, and then you have a working product.
So let's talk about why
Google is uniquely positioned
to solve these sorts
of complex, messy,
and honestly, sometimes
very ugly problems.
Search shaped Google's
culture very substantially.
Many of the approaches
that search uses, including
blending many algorithms, using
heuristics, the use of humans
as well as automated metrics
to evaluate quality, modeling
complex, real world
situations with machine usable
simplifications, predictive
systems, and reinforcement
learning are all things that are
vital to creating a streaming
platform.
Search follows the
same algorithm pattern
I've described elsewhere.
So do we have any PageRank fans out here? Come on, Google I/O, give it up for PageRank. Let's go.
I got a couple, all right.
So PageRank is actually very simple.
It's a very powerful, very
easy to define algorithm.
And it's an
important foundation.
It's been that way
for a long time.
But it's not enough to use PageRank by itself.
You have to combine
it with other things
to make it work
across all situations.
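For anyone who hasn't seen it, the core of PageRank really is short enough to fit on a slide. A minimal power-iteration sketch over a toy four-page web:

```python
# Minimal PageRank by power iteration over a toy 4-page link graph.
links = {0: [1, 2], 1: [2], 2: [0], 3: [2]}  # page -> pages it links to
n, damping = 4, 0.85
rank = [1 / n] * n
for _ in range(50):
    new = [(1 - damping) / n] * n
    for page, outgoing in links.items():
        share = damping * rank[page] / len(outgoing)
        for target in outgoing:
            new[target] += share
    rank = new
print([round(r, 3) for r in rank])  # page 2, with the most inlinks, ranks highest
```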
So balance is important as well.
In search you can
tune for precision,
in which case, you'll
filter out relevant results
but you'll be sure that
everything you're showing
is correct.
Or you can focus
on recall, where
you show them a lot of
results, but some of them
might be wrong.
So, just like pathfinding or cloud gaming,
you have to balance.
You have to go into the
middle and find what
you want your product to be.
Search is the same way.
Google has a lot of
experience with this.
And that's part of
cloud gaming too.
So how do we create a
cloud gaming engine?
So first we have to choose
an algorithm as a foundation.
And in a few minutes
I'll tell you about that.
Next, we have to balance.
So one of the important
balancing acts
is latency versus quality.
What Guru likes to say is
that if all you care about
is quality, you can do that.
Just stream one
frame per second.
Your latency will be horrific,
but it's going to look good.
If all you care about
is latency, just stream
at 2 megabits all the time.
Quality is going
to be horrendous,
but man, it'll be
real responsive.
So we have to find
some middle ground,
and that's the balancing act.
We have to use incomplete
and noisy signals
to determine what's
happening and then make
a choice for what to do
based on that information.
But you don't always
have to take away
from one thing you want
to increase another thing.
Technology improvements
are a rising tide
that lift all boats.
If you consider the
improvements like a new video
codec, such as AV1,
then with this codec
you can achieve better
quality for the same bitrates
and that improves the
experience across the board.
So here's a basic overview
of how video encoders work
and what different types
of video frames there are.
In order to use the
lowest number of bits,
a video encoder will
first send a frame
that has everything in it.
And then after that, it will
send only what's changed.
These full frames are called I-frames, and they're like the JPEG images that your digital camera or your phone can produce.
They're very large,
so after that encoders
produce frames that
contain only differences
from the frames before them.
Those are called
P-frames and you
can continue to display the
video as long as you don't
lose any of those P-frames.
If you lose one, you have to send an I-frame to start decoding again.
There is another
type of frame called
a B-frame, which
can't be decoded
until some frame in
the future arrives.
And using those would
increase our latency
to perceptible levels and
so we can't use those.
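In receiver terms, the dependency rules those three frame types impose look roughly like this toy state machine; a real decoder is vastly more complex.

```python
# Toy receiver-side rule for the frame types described above. A real
# decoder is vastly more complex; this only captures the dependencies.
def can_display(frame_type, have_reference):
    if frame_type == "I":       # self-contained, always decodable
        return True
    if frame_type == "P":       # needs the frames before it
        return have_reference
    if frame_type == "B":       # needs a *future* frame: adds latency,
        return False            # so a cloud gaming stream avoids it

have_reference = False
for ft in ["I", "P", "P", "LOST", "P", "I", "P"]:
    if ft == "LOST":
        have_reference = False  # a lost P-frame breaks the chain...
        continue
    ok = can_display(ft, have_reference)
    have_reference = ok         # ...until the next I-frame restores it
    print(ft, "displayable" if ok else "must request I-frame")
```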
So encoders actually have
a really hard job to do.
So as an example, imagine
you're in the menu in the game
and then you choose the
action to start the game.
And then you press a button
and the whole screen explodes.
And there's glass shards and
there's fire and explosions,
and maybe there's, like, an
epic guitar chord going bram--
OK, that's not important
to this discussion.
But those transitions are called scene cuts.
And the encoder has to produce
a steady stream regardless
of whether the stream changed
a little or it changed a lot.
And so in order to preserve
consistent visual quality,
the encoder is
allowed to produce
frames that are sometimes
larger and sometimes smaller.
Tuning that behavior
is extremely
important for cloud gaming.
So many encoders also include powerful, computationally expensive, and very slow pre-analysis steps
that try to figure out the
best encoding strategy.
One of the more prevalent
approaches to that
is to encode a frame multiple
times with multiple strategies
and then pick the
right one afterwards.
And at the extreme case, the
people who encode Blu-rays
will actually have a
combinatorial explosion,
where they'll try all of
the different possibilities.
And then they'll walk
a path between them
to get the best possible quality
at the lowest possible bitrate.
So they have weeks to do that.
By comparison, we have a
handful of milliseconds.
So we have to be really,
really careful about the types
of analysis that we enable and
which things that we don't.
So here's my simplified overview
of how a network actually
fits together.
There are many hops between an
edge data center and a player.
Our data centers have to be
on the edge of the network
as close as possible to the
players, because as Guru
mentioned, speed of
light does factor
into our latency calculations.
The network consists
of a heterogeneous set
of interconnected devices.
Each of these devices has
their own characteristics
and their own
buffering strategies.
The connections between
each of these components
each have a capacity that's
measured in bits per second.
And they span a
physical distance which
introduces a minimum latency.
Each device also typically has
a buffer attached to it in order
to absorb packets that can't
yet be sent because the link speed is too slow.
Some of these
buffers are shallow,
such as the ones in data
centers, and some are deep.
The deep ones led to a term that was popular a little while back for a phenomenon called bufferbloat.
When these buffers fill, that's congestion: packets are waiting in buffers and not being sent, so you get latency.
If these buffers overflow,
then packets are lost
and they have to
be retransmitted.
So on the receiver side, we use WebRTC extensions provided by our team in Sweden to disable buffering and display things as soon as they arrive.
For our application, congestion control that keeps buffering as low as possible is vital.
We'll see in the
following slides
a couple of strategies
for doing this.
So here's one of the more
prevalent congestion control
schemes on the internet today.
It's called CUBIC.
It's actually very old.
But it's very prevalent,
because it works very well.
There are some complexities
to the way it operates,
but for the purposes of this
discussion, the way it operates
is that it keeps increasing
the transmission rate until it
causes packet loss.
When it sees that, it dials
back its rate a little bit
and then chooses that
as its steady state.
So it's kind of like a bull
in a China shop that way.
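A heavily simplified loss-driven controller in that spirit looks like the sketch below. This is the classic additive-increase, multiplicative-decrease shape; real CUBIC grows its window along a cubic curve, which is not modeled here.

```python
# Heavily simplified loss-driven rate control in the spirit described
# above: grow until packet loss, then back off. Real CUBIC grows the
# window along a cubic curve; this is just the basic shape.
import random

rate_mbps = 30.0
for second in range(12):
    lost = rate_mbps > 40 or random.random() < 0.02  # toy "link limit"
    if lost:
        rate_mbps *= 0.7          # back off on loss
    else:
        rate_mbps += 2.0          # keep probing upward
    print(second, round(rate_mbps, 1), "LOSS" if lost else "")
```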
Many video streaming sites, like YouTube, still use TCP with the CUBIC congestion control algorithm.
But if we use this
for cloud gaming
it would be catastrophic.
Because every time
we cause packet loss,
the player would
experience a frozen frame
until they could
restore a picture.
And that would take
hundreds of milliseconds.
CUBIC is a relatively
old algorithm,
but it's obviously a
good one for its purposes
since it's still in use.
In the next slide, we'll see
something more appropriate.
So let's turn the clock
back, way back to the 1980s.
I was still playing with GI Joe.
I was saving the
world from Cobra.
There was a computer scientist named Van Jacobson,
and he was saving the internet
from congestion collapse.
I think I had more fun though.
So Van works at Google now.
And what Van realized
is that CUBIC
was developed at a
time when routers
had much smaller buffers.
Packet loss as a signal
makes a lot of sense
only if your
buffers are shallow.
So, in recent years, he's
developed a new congestion
control scheme called BBR,
which stands for bottleneck
bandwidth and round trip time.
BBR works by using delay as a
signal instead of packet loss.
So you first transmit
below the speed
of the slowest link in the
chain, called the bottleneck
bandwidth.
And then you keep transmitting
higher and higher bit rates
until the RTT increases.
Now you know that you've
started filling a buffer
and you've reached the
bottleneck bandwidth.
You choose that as your
steady state transmission
and you keep the latency
low while utilizing the link
as best as you can.
So this is actually very
close to what we need.
BBR balances right at
the edge of inducing
buffering but not beyond.
So we get the best
possible bit rate
while keeping the
latency imperceptible.
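A toy version of that delay-based probing is sketched below. The constants are illustrative, and real BBR cycles through distinct probing and draining phases that this ignores.

```python
# Toy delay-based probing in the spirit of BBR: raise the sending rate
# until round-trip time inflates, which signals a buffer is filling.
# Constants are illustrative; real BBR cycles through distinct phases.
BOTTLENECK_MBPS, BASE_RTT_MS = 25.0, 20.0

def measured_rtt(rate_mbps):
    # Below the bottleneck, RTT stays flat; above it, queues build.
    excess = max(0.0, rate_mbps - BOTTLENECK_MBPS)
    return BASE_RTT_MS + excess * 4.0

rate = 5.0
while True:
    rtt = measured_rtt(rate * 1.25)      # probe 25% above current rate
    if rtt > BASE_RTT_MS * 1.05:         # RTT inflated: buffer filling
        break                            # settle just below that point
    rate *= 1.25
print(f"steady state near {rate:.1f} Mbps, RTT ~{measured_rtt(rate):.0f} ms")
```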
So let's look at the traditional
streaming media system
that Khaled and Guru
mentioned, called
DASH, dynamic adaptive
streaming over HTTP.
DASH is a very resilient, very
versatile, very widespread
technology.
It's agnostic to latency increases, and it's stable in many different adverse situations, such as network migration from Wi-Fi to cellular.
And it splits the
video and audio
into a set of chunks,
usually timed at intervals
of 5 to 10 seconds.
Each of these chunks is
transmitted in a burst
at the maximum line speed.
So that means that even
if your video is only
encoded at 10 megabits, it'll
send bursts of 150, 150, 150,
if your line supports
a speed like that.
When the network can't
deliver data fast enough
for the client to buffer it,
then a re-buffering event
occurs.
DASH players will then change
to a lower bitrate stream,
but these quality
changes tend to be
at a relatively coarse
time interval measured
in tens of seconds usually.
DASH streams also have to be seekable and therefore contain frequent I-frames.
You can see that as orange
lines on the graph over there.
All of these
characteristics are very
inappropriate for cloud gaming.
So let's look at what's
more appropriate.
So here we see how a cloud
gaming engine actually works
during good network conditions.
We can see that at each frame
the congestion controller
is requesting a
specific target bitrate.
It chooses the
target based on what
it perceives as the current
state of the network
buffers along the path.
Unlike DASH, P-frames
are used consistently
unless something bad happens.
I-frames are sent only
on demand and they're
tuned to be very
small, so there is
a small quality hit for that.
The controller is constantly
adjusting the target bitrate
based on per packet
feedback signals that arrive
from the receiver device.
The controller uses a
variety of filters, models,
and other tools.
And then synthesizes
them into a model
of the available bandwidth
and chooses a target bitrate
for the video encoder.
So when buffers are filling,
like near the fifth frame
in this example, the
controller notices and reduces
its transmission rate
to drain the buffers.
Unlike a traditional
congestion control algorithm,
our congestion controller must
track media frames, not bytes.
Because if it were to track bytes only, it might delay the second half of a video frame in the middle of transmitting it.
And then that would
introduce latency.
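The frames-versus-bytes distinction shows up at the pacing level: a byte-oriented pacer might pause with half a video frame unsent, while a frame-aware one finishes the frame it started. A toy comparison, with invented sizes:

```python
# Sketch of frame-aware pacing: a byte-oriented pacer may pause
# mid-frame when the budget runs out, stranding the second half of a
# video frame; a frame-aware pacer completes the frame it started.
def pace(frames_bytes, budget_bytes, frame_aware):
    sent = []
    for i, size in enumerate(frames_bytes):
        if budget_bytes <= 0:
            break
        if size > budget_bytes and not frame_aware:
            sent.append((i, budget_bytes))   # partial frame stalls...
            break                            # ...adding display latency
        sent.append((i, size))               # complete frame delivered
        budget_bytes -= size
    return sent

frames = [30_000, 30_000, 45_000]
print("byte pacer :", pace(frames, 80_000, frame_aware=False))
print("frame pacer:", pace(frames, 80_000, frame_aware=True))
```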
So here we see an example
of something bad happening.
So the congestion controller
is taking swift action
to remediate it.
So in this case, near the transmission of frames 3 and 4, a competing
flow has filled the buffers
despite our attempts
to reduce transmission
rate from 20 megabits to 15
megabits to 12 megabits to 6.
We find that at frame 5 we've
just lost too many packets.
And we can't restore
them quickly enough
to preserve
imperceptible latency.
So we must request an I-frame to drain the buffers and restore a full-quality experience at frame 6.
The congestion
controller has to deal
with many different situations,
many different networks,
and many different
receiver devices.
And it has to deal
with them blindly
without explicit information
as to what's going on.
So it uses this model to
preserve a high quality
experience with
imperceptible latency.
So Google's experience
designing algorithms, modeling
extremely complex
situations in code,
and working at scale
with millisecond latency
puts us in an ideal
position to enable
this experience for players.
We tune at the millisecond and
sometimes microsecond level
and make timely informed choices
to keep latency imperceptible
while maximizing quality.
We blend many different
signals, models, feedback,
active learning, and sensors,
in a tight feedback loop
to produce a precisely
tuned experience.
We work closely with our
hardware and video encoders
to tune their operation in a network-aware and network-sensitive manner.
We use predictive models
and advanced algorithms
to know what to expect on
chaotic Wi-Fi and mobile
networks and prepare
for it, while not
overreacting or acting with
excessive conservatism.
And now Guru is
going to tell you
about the infrastructure
that enables this experience.
We're behind a little bit.
We're behind.
GURU SOMADDER: Thanks Rob.
So you've heard about our extensive user research, and about tachyons and carrier pigeons, also known as our streaming infrastructure.
Now let's talk about what makes
the underlying foundation so
strong.
Everything you have seen and heard so far is built on Google's three pillars of excellence: unparalleled infrastructure that controls the player's experience end to end.
We've also been making continuous investments in technologies like hardware, data processing, AI, and security.
And, finally, over the years,
we have built the blueprint
to launch products that
allow our partners to succeed
all with a 10x mentality
towards inventing the future.
As you can see from this map, this is a representation of Google's network infrastructure: 90 internet exchanges, 100 interconnection facilities worldwide, and more than 7,500 edge nodes deployed deep inside network operators and ISPs.
These are located
close to players
and enable us to provide a
consistent experience at scale.
Our cloud gaming platform
also leverages our continuing
technology investments.
Hardware acceleration technologies that offset the diminishing returns of Moore's law,
leveraging data across Google
services for deep insight
into player latency for an
optimal player experience,
and, finally, machine learning
to improve all aspects of game
development.
It's not just our technology
but how we apply it
that gives us a unique
ability to build
an always on, always available,
and highly secure cloud gaming
platform.
And, finally, ingrained into all of this is Google's DNA to scale, which Google engineering is built on.
We have an unparalleled
ability and willingness
to throw some of
our best engineers
at the toughest problems
building on an existing
culture of success.
For example, YouTube, Maps, and Photos.
We also have world-class endpoint reach to players on the web, in the living room, and on mobile.
And, finally, Google over the years has invented and deeply influenced a wide range of internet protocols and algorithms: HTML5, QUIC, WebRTC, VP9, and BBR.
To sum up, our cloud gaming platform is built using some of the same core concepts, best practices, and technologies that power some of our most successful and scalable products.
I'm now going to hand
it over to Khaled
to continue on with the rest
of the journey, Project Stream.
KHALED ABDEL RAHMAN: So
after getting our streamer
into a solid state,
our natural next step
was to test it out, validate
it, and optimize where needed.
Assassin's Creed was
the perfect test.
It had a wide variety
of environments,
meaning varying
bitrate requirements.
It had a ton of
different mechanics,
meaning varying
latency requirements.
And there was a ton of
excitement for the game.
Everyone wanted to play it.
So we would be able to get a
broad audience across the US.
Internally, we wanted to get a few things out of Project Stream.
We wanted to test and
improve our streamer.
And we wanted to be hands on
with an optimization phase
that developers go through
and build the tools for it.
So upon receiving the first build of the game,
we were pretty impressed
with the overall quality.
We were excited to have a
title at this particular stage
in development.
We wanted to try
so many new things
and luckily we got
the chance to do so.
The Ubisoft team in particular
was very collaborative and open
to exploring these ideas.
And our team had always been a fast-moving one, but we were also very much grounded in research-validated results. So we made sure that both our platform optimizations and our developer tools came from hands-on and real-world-validated results.
Some of the more interesting results involved manipulating perceptual quality, where all synthetic benchmarks showed no improvement, but every tested player said that they saw one.
And another exciting thing to note was that, because our platform was live, it could constantly improve.
So we were able to do A/B tests,
not just before we launched,
but also after we launched.
And, eventually we improved
the visual fidelity
while the game was live, meaning
that players one day logged
in to play their game
and their experience
was just better without
the need for any updates
or any disruption
of the service.
We ended up with
really good results
and an overwhelming response.
But in reality what
we were excited
about most was what we
were able to achieve
and how Project Stream is now allowing us to scale to the platform you now know as Stadia.
Through Project Stream we were
able to learn and eventually
improve our streamer,
now capable of higher
quality at lower
bitrates, faster network
adaptability, 4K, HDR,
and surround sound,
amongst other things.
On the other hand,
we expanded our tools
to include the playability
toolkit for optimization.
I'm going to give you a
quick overview of some
of these tools.
So we did make sure that
our platform provides
an excellent experience as
is, but a natural extension
was to provide tooling for developers that want deeper optimization or customization on the platform.
At the center of our
tools is our test client.
From a developer portal
you're able to launch
any build of your game,
and you can immediately
access a wide variety of
tools directly through the UI.
Some of the tools here include network emulation, where you can try different network impairments yourself, like different DSL and Wi-Fi models, and see the behavior under different network conditions.
Another utility available
is a quick capture tool,
allowing you to record video
of a rendered, encoded, and
decoded frame at the same time.
This allows you to debug
the entire video pipeline
in one click.
So quickly going over some of our APIs: the Stream Profile API allows you to set encoder preferences for your title. This aligns directly with what we noticed while launching Project Stream.
Assassin's Creed really relied
heavily on encode utilization.
And we've made three
profiles available so far
with the intent to provide
more in the future,
including custom profiles with
full encoder settings exposed.
The MediaStream API
gives you direct access
to all relevant values from
the stream in real time--
things like current
resolution, RTT, and more.
This can be used to dynamically
adjust behavior or mechanics as
needed.
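As a purely hypothetical illustration of that kind of dynamic adjustment, the sketch below queries invented placeholder fields; none of these names are the actual Stadia SDK surface.

```python
# Hypothetical illustration only: get_state() and its fields are
# invented placeholders, not the actual Stadia SDK surface.
class FakeStream:
    def get_state(self):
        return {"resolution_height": 720, "rtt_ms": 95}

def adjust_for_stream(stream):
    state = stream.get_state()
    hud_scale = 1.3 if state["resolution_height"] < 1080 else 1.0
    relax_timing = state["rtt_ms"] > 80    # loosen strict input windows
    return hud_scale, relax_timing

print(adjust_for_stream(FakeStream()))     # (1.3, True)
```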
The stream capabilities
API sends a signal
about the particular
endpoint being used
and all of its
relevant capabilities.
This allows for some
amazing play anywhere
experiences, where
you can seamlessly
move between a less capable
device and a full setup that
has HDR and surround sound
within the same session.
The game would be able to auto-detect it and switch these modes on during a suspend and resume.
The frame token API,
on the other hand,
allows you to associate
a delivered frame
with a particular
input, allowing
for easier
compensation mechanics
similar to what
you might be used
to from traditional
multiplayer games.
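Conceptually, associating inputs with the frame the player was actually seeing is classic lag compensation. A sketch with invented names, not the real API:

```python
# Sketch of frame-token style lag compensation, with invented names.
# The server stamps each delivered frame with a token and keeps a short
# history of world state; an input arrives tagged with the token of the
# frame the player was looking at when they pressed the button.
history = {}          # token -> world snapshot at that frame

def deliver_frame(token, world_state):
    history[token] = world_state
    if len(history) > 120:                       # keep ~2 s at 60 FPS
        history.pop(next(iter(history)))         # drop the oldest entry

def handle_input(token, action):
    seen_state = history.get(token)
    if seen_state is None:
        return f"{action}: token too old, use current state"
    # Resolve the action against what the player actually saw.
    return f"{action}: resolved against frame {token} ({seen_state})"

for t in range(3):
    deliver_frame(t, f"enemy_at_x={100 + t}")
print(handle_input(1, "fire"))
```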
So these were just a few of
the many, many APIs and tools
we provide as part
of this toolkit.
We've been working
for years to make
sure our streamer has
unparalleled performance,
able to provide excellent
experiences for players
at scale.
We've also made sure
that a lot of these tools
are at your disposal,
allowing you
to take full advantage
of our platform
and deliver your true
vision for your game.
We're so excited for you to join
us in this new era for gaming.
Please apply to be a
developer today at Stadia.dev.
[LOGO MUSIC]
