[MUSIC PLAYING]
MAT SCALES: Hey, everyone.
So as you heard, my
name is Mat Scales.
I'm with the Web Developer
Relations team at Google.
Today, I'm going to talk
to you about creating media
on the web.
So you heard earlier
from [? Tal ?]
about how there are billions
of users coming online,
and how the mobile web is
a great way to reach them.
One of the things those
users are going to want to do
is to create and share media.
Now, the web has always been
a great platform for sharing--
sharing what we're doing,
things that we've found,
things that we've made.
And you just post a
link to whatever it is,
and anyone can just click
it and go straight to it.
But until very
recently, creating
photos, recording
videos, editing
and filtering these
results was pretty tough,
if not impossible.
Now, thanks to some
pretty good progress
over the last
couple of years, you
can do all of this on the
mobile web right on the device.
New APIs have landed that let
you create rich media content.
So I'm going to talk today a
little bit about the challenges
that are still being worked
on, and I'll look ahead
to some even more
exciting things that
are coming in the future.
But first, I'd like to introduce
Jenny and Peter from Instagram,
who are going to
tell us how Instagram
are using these features
on the mobile web today.
[APPLAUSE]
JENNIFER LIN: Thanks, Mat.
Instagram's mission is to
strengthen relationships
through shared experiences.
As we continue to connect
more of the world,
our biggest opportunities are
emerging markets, countries
where more and more
people are starting
to use the internet
through mobile devices.
To this end, Instagram is
investing in mobile web
support, to make it
easier for people
to use Instagram where
devices have limited storage
or connections are unreliable.
As a media-heavy
application, how
do we deliver the
Instagram experience
within the limitations of the web?
Today, we are going to share
some of the best practices.
Most of you probably
recognize Instagram
as a native application.
What most of you probably didn't realize
until this conference is that this year,
we started building out
our mobile web experience.
This is Instagram.com.
Thanks to new APIs,
Chrome will prompt you
to install a progressive web
application like ours
to the homescreen
when certain criteria are met,
and it installs in the background.
Now, I'm going to show you a
demo of our progressive web
application.
Here's the native app
that you're familiar with.
Like the native app, you can
see that the progressive web
app gets its own icon.
When you click it, it
loads the same experience
as Instagram.com, but without
looking like it's in a browser.
This allows users, particularly
in emerging markets,
whose phones or
connections may limit them
from downloading, using, or
wanting to use the native app,
to get a true app-like
experience using our web
product.
You have stories, and you
have the content you follow,
with the people in the feed.
And you can take photos as well.
So I'd like to take a photo
of all you lovely people.
Can we get the house
lights on, please?
All right.
And then you can see
that we have the filters.
So let's post this.
I'm not the best at Swype.
And then share it out.
And there you go.
So--
[APPLAUSE]
Back to slides, please.
Oh, thanks.
Now we're going to talk about
the technical details about how
we implemented several
of these features.
Specifically, we're
going to deep-dive
into how to add
the progressive web
app to the homescreen,
video playback using
adaptive streaming, image
capture and filters on web,
as well as offline support.
One thing to note
is that the majority
of the features we're talking
about today are Android only.
However, because our target
audience is in emerging markets,
and emerging markets are almost
exclusively Android,
this didn't really
limit our reach.
So let's talk about how
to get the phone to prompt
us to add to homescreen.
There are a few requirements
to make this work.
First, you need a web app
manifest, like this one.
The required fields
are name, which
is used in the prompt;
short_name, which
is used on the homescreen;
start_url, which is loaded on launch;
and the icons used on the homescreen.
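For reference, a minimal manifest covering those fields might look like this (the names, URL, and icon path are illustrative placeholders, not Instagram's actual values):

```json
{
  "name": "My Photo App",
  "short_name": "Photos",
  "start_url": "/?source=pwa",
  "display": "standalone",
  "icons": [
    {
      "src": "/icons/icon-192.png",
      "sizes": "192x192",
      "type": "image/png"
    }
  ]
}
```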
In addition, you need to have
service workers registered
on your site.
For this, we use Workbox, a set
of libraries and instructions
for service workers that
Jeff talked about yesterday,
and [? Eva ?] mentioned as well.
And here you can see a call
to router.registerRoute.
As a prerequisite for
using service workers,
your site must be
served over HTTPS,
something we are already doing.
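As a rough sketch, the page-side registration plus a Workbox route might look something like this (the route pattern and caching strategy here are assumptions for illustration, not Instagram's actual setup):

```javascript
// Page side: register the service worker. Guarded so the sketch
// is inert outside a browser.
function registerServiceWorker() {
  if (typeof navigator === 'undefined' || !('serviceWorker' in navigator)) {
    return null; // No service worker support in this environment.
  }
  return navigator.serviceWorker.register('/sw.js');
}

// Inside sw.js, a Workbox runtime-caching route might look like:
//
//   import {registerRoute} from 'workbox-routing';
//   import {CacheFirst} from 'workbox-strategies';
//
//   registerRoute(
//     ({request}) => request.destination === 'image',
//     new CacheFirst({cacheName: 'images'})
//   );
```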
And finally, the person
coming to the site
needs to trigger an
engagement metric.
Right now, Chrome
has it set to around 20
to 30 seconds on the site
for the prompt to trigger.
For testing and
development, though, you
don't want to be waiting 20
or 30 seconds every time,
so you can get the prompt to
trigger without the engagement
metric by going to chrome://flags
and turning on the flag that
bypasses the user engagement checks.
This is what the
prompt looks like.
Unfortunately, it doesn't
allow for much customization,
and doesn't display information
about what you are adding.
Now, Owen did mention
yesterday that there
is a new add-to-home
modal flow coming,
but since that
isn't available yet,
and despite requiring
an additional click,
we implemented our own modal
to give more information
to the users before
having Chrome prompt them.
In order to accomplish this,
we added an event listener
before registering
the service worker.
This event listener listens for
the beforeinstallprompt event,
prevents the default
prompt from showing,
and instead saves the event
off to be triggered later.
So we deliver a modal, and
when the user clicks Add,
we then show the Chrome prompt
which was previously deferred.
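Putting the listener and the deferred prompt together, a sketch might look like this (showInstallModal and the modal wiring are hypothetical placeholders for our own UI):

```javascript
// Holds the deferred beforeinstallprompt event until the user
// clicks "Add" in our own modal.
let deferredPrompt = null;

function handleBeforeInstallPrompt(event) {
  event.preventDefault();   // Stop Chrome from showing its prompt now.
  deferredPrompt = event;   // Save it off to trigger later.
  // showInstallModal();    // Hypothetical: show our own modal instead.
}

// Called when the user clicks "Add" in our modal: replay the
// previously deferred Chrome prompt.
function onModalAddClicked() {
  if (deferredPrompt) {
    deferredPrompt.prompt();
    deferredPrompt = null;
  }
}

// Browser-only wiring; guarded so the sketch runs anywhere.
if (typeof window !== 'undefined') {
  window.addEventListener('beforeinstallprompt', handleBeforeInstallPrompt);
}
```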
And that's how you can give your
mobile web users the feeling
of being in the native app.
Now, Peter will
talk about how we
made this experience more
engaging with optimized video
performance.
[APPLAUSE]
PETER SHIN: Thanks, Jenny.
In areas where network
resilience and reliability
issues are commonplace,
video playback
is generally a poor experience.
You have a choice between
a low-quality video
or a higher-quality video
that buffers and stalls.
One solution for this experience
is adaptive bitrate streaming.
With ABR, the video is
broken up into a sequence
of segments, where each segment
is encoded in multiple bit
rates.
Here we have high,
medium, and low.
A manifest is then
created which contains
detailed information of each
segment and how to fetch it.
Now, using the
manifest, clients
are then able to decide
whether to switch
to a higher or lower bit rate,
depending on current network
conditions.
Now, our first
step of integration
was to choose a
client-side player.
We were looking for something
that was open source,
supported open standards,
and was extensible.
We found Shaka Player was
a great out-of-the-box solution
for our initial experiments.
So this video is a
comparison between our existing
video player on the left and
Shaka Player on the right.
They're actually playing
at the same speed,
but you can see the video on
the left buffering and stalling
as we transition to
2G network conditions.
The adaptive video,
on the other hand,
is able to continue
with smooth playback.
Now let's look at
how we integrated.
So this is a
standard integration.
You'd create a new
Shaka Player instance
and pass it a video element.
You'd then load the manifest
file that I described earlier.
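A standard integration, sketched out (the manifest URL is an illustrative placeholder; the shaka object is passed in here only so the sketch is self-contained):

```javascript
// Standard Shaka Player integration: create a player around a video
// element, then load the manifest by URL. In a real page you'd have
//   import shaka from 'shaka-player';
// and reference shaka directly.
async function initPlayer(shaka, videoElement, manifestUri) {
  const player = new shaka.Player(videoElement);
  // This load() is where the extra round trip happens: the player
  // fetches the manifest itself over the network.
  await player.load(manifestUri);
  return player;
}
```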
So with this approach, there's
actually another round trip
just to fetch the manifest file.
But we wanted to avoid this.
In our case, before even
creating the Shaka Player,
we already had the
manifest content.
So we had to approach
this a little differently.
Fortunately, Shaka has a
plugin system available.
We create a custom networking
plugin, in this case.
So we came up with a scheme--
here it's IGW-- where the
video ID is in the path.
We then get the manifest
content from an existing store
using the video ID.
We then create a response
with the manifest content
and return it through a promise.
Finally, we register
our custom scheme
through the networking engine.
Going back to the
integration, we simply
load the player with the
URI with the custom scheme
that we created
earlier, and we're done.
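A simplified sketch of such a scheme plugin (the igw scheme name follows the talk; the in-memory store is an illustrative assumption, and the plugin signature is simplified -- Shaka's real networking plugin receives more arguments than just the URI):

```javascript
// Illustrative store mapping video IDs to manifest text that we
// already have on hand, so no network fetch is needed.
const manifestStore = new Map();

// Custom networking plugin for a made-up 'igw' scheme. The video ID
// is the last path segment of the URI.
function igwScheme(uri) {
  const videoId = uri.split('/').pop();
  const manifest = manifestStore.get(videoId);
  // Build a response with the manifest content and return it
  // through a promise.
  return Promise.resolve({
    uri,
    data: new TextEncoder().encode(manifest).buffer,
    headers: {'content-type': 'application/dash+xml'},
  });
}

// In a real page, register the scheme with the networking engine,
// then load using the custom-scheme URI:
//   shaka.net.NetworkingEngine.registerScheme('igw', igwScheme);
//   player.load('igw://manifests/12345');
```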
So we obviously wanted to
measure the impact of ABR,
so we tested with three
different variants.
First, there are the
default Shaka settings.
This is just vanilla Shaka.
And then we added
our custom settings.
So we added some overrides.
And finally, we added
a custom ABR manager.
So in the second variant, one
of the properties we changed
was the switch interval.
The switch interval is
the minimum amount of time
before switching bit rates.
We wanted to test how a more
aggressive switching strategy
would impact user experience.
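That override is a one-line configuration change. A sketch, where abr.switchInterval is Shaka's configuration name but the 4-second value is purely illustrative, not our actual setting:

```javascript
// Override Shaka's ABR settings to switch bit rates more aggressively.
function applyAbrOverrides(player) {
  player.configure({
    abr: {
      // Minimum seconds between bit rate switches. Lowering it makes
      // the switching strategy more aggressive.
      switchInterval: 4,
    },
  });
}
```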
In the third variant, we
added our custom ABR manager.
So this gave us a
lot more flexibility.
With the custom manager,
we had more control
over how we measured bandwidth
and when we switched bit rates.
In the feed page, we actually
have multiple instances
of Shaka Player
at any given time.
Now that we had control over
how we measured bandwidth,
we also could keep track
of the latest measurement.
Using a feedback
loop, we can then
pass that back in to
newly-created Shaka instances,
ensuring we have an
accurate default bandwidth.
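A minimal sketch of that feedback loop, assuming a simple module-level tracker (abr.defaultBandwidthEstimate is Shaka's configuration name; the tracker itself is illustrative):

```javascript
// Latest bandwidth measurement shared across player instances,
// in bits per second. The starting value is an arbitrary fallback.
let latestBandwidthEstimate = 500000;

// Called whenever an existing player reports a new measurement.
function recordBandwidth(bitsPerSecond) {
  latestBandwidthEstimate = bitsPerSecond;
}

// Seed each newly created player with the latest measurement so it
// starts from an accurate default instead of a cold guess.
function configureNewPlayer(player) {
  player.configure({
    abr: {defaultBandwidthEstimate: latestBandwidthEstimate},
  });
}
```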
We try to follow
some best practices
during these experiments.
In our feed, we made sure to
only instantiate players when
needed, and to actively use
the Destroy API when we didn't.
We understood that it
would take some time
to find that sweet spot for
how frequently we'd switch
or how much we'd buffer.
Either being too
aggressive or too passive
would change results.
And finally, we were mindful
of the types of videos
that should be supported.
Instagram has videos as
short as three seconds.
So depending on the
video, it might not
make sense to support ABR.
While we expect to continue
to iterate on our experiments,
in general we have
high hopes for ABR
and its positive impact
on user experience.
Now Jenny is going to talk
about our experience adding
image capture and filters.
[APPLAUSE]
JENNIFER LIN: When we started
working on the Instagram
mobile web initiative at
the beginning of the year,
it was important to us
to bring the mobile web
users into the
ecosystem of Instagram.
As you heard in our
mission statement earlier,
we're looking to
strengthen relationships
through shared experiences.
It is difficult to
share experiences simply
by watching other
people's experiences.
You need to be able to
create media that captures
your own experiences as well.
Since this was one of the
first features we added,
we're actually not
using the newest APIs,
which Mat will be talking
about in a little bit.
We're simply using the
image capture tag--
ah, I'm sorry, the input tag.
This is what it looks like.
Now, it's also possible
to add a capture field,
but we purposely
left this out so
that the user can either upload
the photo or take their own.
We originally launched the
creation flow without filters
so that our mobile users could
start sharing their experiences
as soon as possible.
But since filters are an
important part of the Instagram
brand, it was important for
us to implement them as well.
We used WebGL to
implement our filters.
Since our native
app uses OpenGL,
we were actually able to
reuse the same shaders, which
let us bring over the same
exact filters as the app.
As you can see, though,
the filter previews
are done differently.
This is because it is too
slow to calculate and load
all the filters.
So we use a standard balloon
image, no matter the photo.
At Instagram, like with
Peter's video experiment,
we A/B test everything
before launching.
So we put out filters to
some percentage of users,
and we actually found that
several of our key metrics
dropped significantly.
When investigating why, we
considered a few different UI
flows, including seeing
if there was a way
to test whether or
not the balloon images
themselves were the problem.
But then we took a
step back, and we
decided to test an even
more basic assumption.
We took our control,
which was the creation
flow without filters at all,
and then we tested a variation
where we did all the WebGL
processing in the background
with no user-facing changes.
This test taught us a lot,
because this variation
took the same hit in the
metrics as the variation
with the user-facing filters UI.
The performance hit
of WebGL was what
caused the metrics to drop.
So our next step was
improving the performance.
We started with logging
the timing of everything.
Instagram always crops
photos into a square,
and we learned that
creating the initial crop
was a significant bottleneck.
Specifically, we were doing
it with a computationally
expensive data URL
and blob conversion.
Since WebGL's texImage2D
accepts an HTML canvas element,
we're able to return
a canvas directly.
And WebGL will read that
canvas as the source
pixels of the texture, like so.
When the user is done
selecting the image,
we can then call
canvas.toBlob,
which we found to be two times
faster than canvas.toDataURL,
and then generate the
data URL from the blob.
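Sketched in code, the two halves of that change might look like this (the helper names are ours for illustration):

```javascript
// Half 1: hand the crop canvas to WebGL directly. texImage2D accepts
// an HTMLCanvasElement as its pixel source, so no data-URL/blob
// round trip is needed to create the texture.
function uploadCropToTexture(gl, cropCanvas) {
  const texture = gl.createTexture();
  gl.bindTexture(gl.TEXTURE_2D, texture);
  gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, gl.RGBA, gl.UNSIGNED_BYTE, cropCanvas);
  return texture;
}

// Half 2: only when the user commits, convert via toBlob (which we
// measured ~2x faster than toDataURL) and derive the data URL from
// the blob afterwards.
function canvasToDataUrl(canvas) {
  return new Promise((resolve) => {
    canvas.toBlob((blob) => {
      const reader = new FileReader();
      reader.onload = () => resolve(reader.result);
      reader.readAsDataURL(blob);
    }, 'image/jpeg');
  });
}
```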
This reduced the time to
first WebGL draw by 35%
and reduced the
time to transition
to the next step of
the flow by about 85%.
And this was just one of our
performance improvements.
Another performance improvement
we made was to lazy-load
all of our shaders--
that is, our filters--
instead of compiling
them all up front.
This was our original code.
And as you can see, we would
create all the filters on init,
and then return them when
we needed a filter program.
We refactored the code into
a helper function that
initializes a filter only
if it doesn't already exist,
so each filter is compiled
only when it's needed.
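The refactor boils down to memoizing compilation. A minimal sketch, with compileFilter standing in for the real WebGL shader setup:

```javascript
// Cache of compiled filter programs, keyed by filter name.
const filterPrograms = {};

// Compile a filter's shader program the first time it's requested,
// then reuse the cached program on every later request.
function getFilterProgram(name, compileFilter) {
  if (!(name in filterPrograms)) {
    filterPrograms[name] = compileFilter(name); // Compile on first use only.
  }
  return filterPrograms[name];
}
```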
This reduced load
time by about 68%.
These performance
improvements really
improved our filters
experience, bringing filters
to our mobile web users.
Next, we made sharing
work even while offline.
Peter will talk about why
and how we made this happen.
[APPLAUSE]
PETER SHIN: So I know this
is the second-to-last talk,
but for a moment,
let's imagine you just
arrived at Chrome Dev Summit.
You're super excited
to hear all the talks
and see all the demos, and you
want to share this experience.
So you take out your phone,
take a photo, maybe a selfie,
and you hit Share.
Unfortunately, the
request times out,
because all the
access points near you
are completely overloaded.
But it's OK.
You can turn off Wi-Fi,
and then you use your 4G,
and you're good to go.
But what if you
couldn't simply do that?
What if that wasn't
even an option?
What if every day
was like being stuck
at a conference with
really bad Wi-Fi?
When we think of
offline support,
we don't think of someone
getting on an airplane.
We think of someone who deals
with poor network conditions
as a part of daily life.
In those moments,
when they really
want to share their
experience, there
is so much friction right now.
So let's look at a demo of how
we're trying to solve this.
So here we have the PWA.
You saw it earlier.
And let's go offline
with the Airplane mode.
So we get a toast
telling us we're offline,
but we can still post.
So let's post a photo.
[LAUGHTER]
You add some caption,
maybe "offline demo."
[LAUGHTER]
And we get a toast telling us
that when we connect again,
it will be posted.
So let's go back online again.
Let's go back to the slides
and see how we did this.
So first, we wanted to notify
the user that they're offline.
So we listen to
the Offline event.
The offline event is
actually a little bit unreliable.
On some phones,
battery-save mode
is considered
an offline state,
so the event will get triggered
even when the network is fine.
To guard against this, we
added a lightweight get request
to ensure that
we're truly offline.
And then on error,
we show the toast.
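A sketch of that guard, with the ping URL and the toast function as illustrative placeholders:

```javascript
// The 'offline' event can fire spuriously (e.g. battery-save mode),
// so verify with a lightweight GET before telling the user.
// Resolves to true if we're truly offline.
function handleOffline(fetchFn, showOfflineToast) {
  return fetchFn('/ping')
    .then(() => false) // Request succeeded: not actually offline.
    .catch(() => {
      showOfflineToast(); // On error, show the offline toast.
      return true;
    });
}

// In a real page:
//   window.addEventListener('offline', () =>
//     handleOffline(fetch, showOfflineToast));
```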
With Workbox, as Jenny
mentioned earlier,
we then register a
POST request route.
Service workers can't
cache POST requests,
so you actually need
to store the request
in a client-side store.
Here, we use IndexedDB, where we
break down our request object
and store it.
We create an offline
post helper function
that will reconstruct
the request that we just
stored earlier and send it.
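A simplified sketch of that queue-and-replay idea (the store is abstracted here; a real implementation would back it with IndexedDB):

```javascript
// Break a POST request down into plain data that can be persisted,
// since service workers can't cache POST requests themselves.
function serializeRequest(url, body) {
  return {url, method: 'POST', body, timestamp: Date.now()};
}

// Offline post helper: reconstruct each stored request and send it,
// removing it from the queue once it has gone through.
async function replayQueuedPosts(store, fetchFn) {
  const queued = await store.getAll(); // e.g. an IndexedDB object store
  for (const item of queued) {
    await fetchFn(item.url, {method: item.method, body: item.body});
    await store.delete(item);
  }
}
```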
We then use the
Background Sync API that
was mentioned in earlier talks.
Now, earlier in the
demo, if it worked,
we were planning to send
the sync event manually
through dev tools, which would
have then triggered the sync
event.
In practice, it's up to
the browser and device
to actually trigger
this event, and it's
based on a number of conditions.
So in the callback,
we then check
if there's any
items in the queue,
and then we call our
offline post helper,
and [? show them ?]
the notification.
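A sketch of both sides of the Background Sync flow (the tag name and helper functions are illustrative):

```javascript
// Page side: ask the service worker registration to sync when
// connectivity returns.
async function requestSync(registration) {
  await registration.sync.register('offline-posts');
}

// Service worker side: the browser fires 'sync' when it decides
// conditions are right; we can't force the timing in production.
// sendQueuedPosts stands in for the offline post helper that drains
// the queue and shows the notification.
function handleSyncEvent(event, sendQueuedPosts) {
  if (event.tag === 'offline-posts') {
    event.waitUntil(sendQueuedPosts());
  }
}

// In sw.js:
//   self.addEventListener('sync', (event) =>
//     handleSyncEvent(event, sendQueuedPosts));
```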
So as you can see, recent
advances in the web
have enabled us to create
a feature-rich experience
for mobile web users, especially
those in emerging markets.
We're actively
testing the features
that we covered today, and are
continuing to iterate and learn
along the way.
It's been incredibly
challenging, exciting,
and humbling to work on
features and handle cases
that we might not think
about in our day-to-day.
We mentioned shared experiences
several times during this talk,
so it was really great to be
able to share our experience
building out these features.
I wanted to give a shout
out to our teammates who
aren't on stage with us.
They've all done amazing
work to get us to this point,
and we hope to continue the
momentum we've already built.
And now back to Mat, who
will describe the latest APIs
for media creation.
[MUSIC PLAYING]
[APPLAUSE]
MAT SCALES: Good job.
Cool.
Thank you, Jenny and Peter.
So we've seen some of
what the web can do,
and it looks pretty great.
But we can do more
with new APIs.
So one of the things
that Jenny mentioned
is that Instagram are delegating
image capture to the input
element.
But you can actually do
all of this inside your own app.
Now, it used to be a
little bit limited.
We've had access to the
camera for quite a long time
through getUserMedia,
which is part of WebRTC.
It allows you to get a stream
from a camera or the microphone
or both, and it's a
pretty simple API.
But you couldn't really do
too much with it before.
You could use it
for WebRTC, which
is what it was designed for.
But other than that,
you could present it
back to the user
using a video tag,
or you could grab
an image from it.
And the way you'd do that is
you'd take the stream and put it
into a video element
as the source,
then draw one frame of the video
into a canvas, then take the canvas
and turn it into a blob,
and then take the blob
and turn it into an image,
which is pretty longwinded.
And it was also just limited by
the APIs, because getUserMedia
streams are limited to 1080p HD,
regardless of what
your camera can do.
But we now have a new API
called the Image Capture
API, which makes this a little
bit easier and much better.
So it takes the stream that
you get from getUserMedia
and gives you a new object
back, an image capture object.
And it gives you a
Take Photo method.
And when you call this, it
tells the physical device,
the camera, to take a
full-resolution photo
and just give you
straight back a blob.
So none of this drawing
canvas nonsense in the middle.
You also get some extra options.
So you can see here
that in this example,
I'm passing through a
fill light mode setting.
This is setting what
the flash should do,
so here I'm saying that the
flash should be set to auto.
You can also set the
automatic red eye reduction
through this method.
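Putting the stream, the ImageCapture object, and the photo settings together, a sketch might look like this:

```javascript
// Image Capture API sketch: wrap the camera stream's video track in
// an ImageCapture and ask the hardware for a full-resolution photo,
// returned directly as a blob -- no canvas round trip.
async function capturePhoto() {
  const stream = await navigator.mediaDevices.getUserMedia({video: true});
  const track = stream.getVideoTracks()[0];
  const imageCapture = new ImageCapture(track);
  return imageCapture.takePhoto({
    fillLightMode: 'auto', // Let the device decide whether to fire the flash.
    redEyeReduction: true, // Automatic red eye reduction, where supported.
  });
}
```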
Similarly, for audio and video,
there's the MediaRecorder API.
Again, you take a getUserMedia
stream and pass it in,
and you say what MIME
type of output you want.
Then you get a dataavailable
event every time
that the recorder has buffered
up enough data to give to you.
And at the end, you can
reassemble this all up
into a blob, for either
the video or audio
file that you're creating.
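A sketch of that recording loop (checking MediaRecorder.isTypeSupported and error handling are left out for brevity):

```javascript
// Record a stream with MediaRecorder: collect the dataavailable
// chunks and reassemble them into one blob when recording stops.
function recordStream(stream, onDone) {
  const recorder = new MediaRecorder(stream, {mimeType: 'video/webm'});
  const chunks = [];
  recorder.ondataavailable = (event) => {
    if (event.data && event.data.size > 0) chunks.push(event.data);
  };
  recorder.onstop = () => onDone(new Blob(chunks, {type: 'video/webm'}));
  recorder.start();
  return recorder; // Call recorder.stop() to finish and get the blob.
}
```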
And as well as
using getUserMedia
to get these
streams, you can also
get them straight from a
canvas, or from Web Audio.
So this is how you'd do
things like live filters.
You could take a video, draw
each frame into a canvas,
apply your filters, and then
use a stream from the canvas
to create new video
which is the output.
And if your canvas was applying,
say, Instagram's filters,
you could get a filtered
video out of it.
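A sketch of that live-filter pipeline, with applyFilter standing in for the real filter pass:

```javascript
// Live filters: draw each camera frame into a canvas, apply a filter,
// and record the canvas's own captured stream as the output video.
function startFilteredRecording(videoElement, canvas, applyFilter) {
  const ctx = canvas.getContext('2d');

  function drawFrame() {
    ctx.drawImage(videoElement, 0, 0, canvas.width, canvas.height);
    applyFilter(ctx);               // Stand-in for the filter pass.
    requestAnimationFrame(drawFrame);
  }
  drawFrame();

  // The canvas is itself a media source: its stream feeds the recorder.
  const filteredStream = canvas.captureStream(30); // 30 fps
  return new MediaRecorder(filteredStream);
}
```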
Now, not everything
here is perfect.
There are still things
that we need to work on.
One of the issues that you'll
have trying to do these things
is simply device performance.
I mean, as I said,
[? Tal ?] has been
talking about how many
of the users
that are coming online
have devices that aren't
the same as the devices
that we use.
Many of them are
not as powerful.
And things like drawing
Instagram's filters
are extremely
computationally intensive.
There are also limitations
in the APIs themselves.
So as an example, let's
talk about something
that I tried to do.
So I wanted to create
a boomerang effect.
What I wanted to do was
take a recorded video
with MediaRecorder,
straight from the camera,
and I wanted to then create
an output video which
played the video forwards and
then played it backwards again.
And then it would loop.
It's called a boomerang effect.
So I tried to do something
that was pretty simple.
I'd play the video,
and on each frame,
I would set where in the video
I wanted to be in that frame.
And as soon as it got to the end,
it would set the direction
to minus one, and the video
would come back again.
And then the idea was to
record this with MediaRecorder.
And it's awful.
It's trash.
This is a video that I took,
and this is the full quality
that I got.
It's extremely jerky.
It's difficult to tell exactly
when it's going forwards
and when it's going backwards.
Why did this happen?
There are a couple of reasons.
One of them is that
at the moment, when
you use MediaRecorder
to record a video,
the output is in a WebM, which
is optimized for streaming.
And this means that it doesn't
put in the index of where
in the file each frame appears.
It just assumes
that you're going
to play it through right
from the beginning,
and then it would just
iterate through the file.
So if you want to
seek, then it has
to go right back to the
beginning of the file
and work its way through until
it finds the correct location.
Now, there are
some optimizations.
If you're playing forwards,
then it can make a rough guess.
Oh, you got this far,
and it was this time,
so it's somewhere after that.
Going backwards, you have to
start again from the beginning.
It also means that you can't do
the even simpler trick of just
saying set the play
rate to minus one,
because that also doesn't work.
It would have the same issue.
You can fix this
with a library, which
will take the video
that you've created
and actually manipulate the
bytes to put in that index
information so that you
can then make it seekable.
But it's pretty low level,
so a pretty chunky library.
It would be better if the web
platform did this for you.
Another issue is that
recording with MediaRecorder
is always real time.
So I thought I could fix this
issue by taking the video,
lining up exactly
where I wanted to be,
and then saying to the
MediaRecorder API, hey,
I want one frame.
Just record one
frame, and then wait
until I'm ready
for the next one,
and then say, OK,
take another frame.
That doesn't work like that.
It always records at
exactly real time.
So if it's janky when it
goes into your canvas,
then it will be janky
when you record it.
It also means that if I tried
to take a one-minute video
and do a boomerang of it, it
would always take exactly two
minutes to create.
I can't do it faster
than real time either.
So it's not a perfect
solution right now
for some of the things
that you might want to do.
But of course, we want to
see the web, the mobile web,
as a great platform
for media creation.
So we're working hard to
address all these things.
Another new thing that's coming
that we're looking forward to
is WebAssembly.
You heard about this
earlier from Alex.
One of the things that
this allows you to do
is take native libraries,
recompile them for the web,
and then use them in your page.
And people have already been
experimenting with native video
libraries, like FFmpeg, to do
media manipulation on the web.
Video doesn't-- oh well.
We're also excited about
the Shape Detection API.
This lets you detect things
like text, bar codes,
and faces inside an image.
This is currently behind a
flag, but Francois [INAUDIBLE]
demonstrated this
at I/O earlier,
and actually had a demo
out in the forum area,
which you might have seen.
[INAUDIBLE] taking pictures,
so just wait a minute.
If you'd like to know
more about the things
that we've been
talking about here,
I've been creating
a sample application
that I called Snapshot.
The source code is
available on GitHub,
and I've been documenting my
experiences in a video diary,
which is available on YouTube.
In summary, I think that
what companies like Instagram
are doing with media on
the web is incredibly cool.
I hope you're all as excited
about the future for this
as I am.
And thank you very
much for coming.
[MUSIC PLAYING]
