[MUSIC PLAYING]
[APPLAUSE]
DRU KNOX: Hi, everybody.
So as they said, my
name is Dru Knox.
I'm a PM on the Storage Team.
I work on a few other projects,
but storage is really what
I'm here to talk about today.
Before I get started,
though, my mom
told me right before I
came on that my grandma was
going to be watching this talk.
So please laugh at
all of my jokes,
otherwise it will be cripplingly
embarrassing, just throughout.
So again, please,
thank you for that.
So before I get started, I
want to do a show of hands.
I'm the talk right after lunch,
so you guys are all probably
in food comas, not
really paying attention,
catching up on email.
So a little calisthenics
to get you guys going.
First, how many of you are still
capable of raising your hands?
You didn't eat too
much, you can--
[LAUGHTER]
Good.
A few people lost,
but that's OK.
Now onto the real question.
How many of you have
used client-side storage
in a meaningful way,
not just playing around
with Service Worker in a demo
app in one of your sites?
Now, keep your hand
up if you view that
primarily as a critical
performance optimization,
not offline.
About right.
So when we look at
these kind of numbers
through Chrome usage metrics,
we see that about 2.5%
of page traffic uses things
like IndexedDB or cache storage.
So my goal today is to
convince all of us--
so hopefully you'll all
have your hands up next time
around at CDS-- that
client storage is
the most important performance
optimization you can make
for load time in all browsers.
And most importantly,
because we all
know that caching and
all this is important,
I want to convince you
that it's available today
and that it's kind of a
low-hanging fruit for you
to pick up everywhere
for all of your users.
So why is this important?
We've heard this number
repeated a lot, which
is that you lose
half of your users
if your site takes more
than three seconds to load.
I won't belabor the point,
but it's pretty scary.
It's kind of a horror movie.
But when you think about it,
it is actually a lot worse
than that, because on
the average 2G network,
it takes three seconds
just to get the first byte.
So we're kind of already hosed.
We are fighting
an uphill battle.
And what's worse,
320 milliseconds
is how long it takes to load
1 megabyte off the network.
This is really hard.
The deck is kind of
stacked against us.
So we need some tools to help
us not just improve our loading
performance, but avoid the
need to hit the network at all.
So we know it's important.
We know we've got
to do something.
But I don't just want
to preach horror movie.
I want to give you guys
some actionable steps.
So in my talk today
I want to walk
through how you should
reason about spending
your time on client storage,
where are the biggest
wins, the least amount of
work for 80% of the value,
some technologies
that you can use,
along with some libraries that
make it easier, more economic,
how much storage space
you have available.
And then if you guys
are all really good
and you laugh at all
my jokes and my grandma
is really proud
of me at the end,
I'll give you guys
a view of some
of the future things
we're looking at that
are kind of exciting.
Now, before I move
on, I was told
I should explain the first line,
because nobody thought these
emojis were conveying "how you
spend your time."
My girlfriend said
it made no sense.
But she's also an iOS developer,
so what does she know?
[APPLAUSE]
How are you going
to invest your time?
Web developers are pulled in a
thousand different directions.
Lots of us are
full-stack engineers.
Unfortunately, we're
working in a place
where flexbox is still one
of our most powerful layout
primitives.
We don't really have
infinite resources
to do infinite things.
So before I get
started, I want you,
for the purposes of this
talk, to think about storage
as cache, not offline support.
Offline support is
really important,
and it's something that's
been touched on a lot.
I just want to focus on cache as
a performance optimization here
today.
As a cache, you kind of have
this spectrum of investment
that you can make.
Browser cache, all
the way on the left,
is sort of the default.
It's relying on the browser
to get things right for you,
hoping that your responses are
cached and that they aren't
cleared before the next time
the user visits.
And on the other end,
you're building a spaceship.
This is Service
Worker, cache storage.
You're optimizing
everything to the nines.
You're hitting like
three seconds' load time.
This is sort of what you've been
hearing from a lot of folks.
So looking at the first one,
browser cache, doing nothing.
It does have some real benefits.
You speed up repeat
visits for your users.
And that's not insignificant.
But unfortunately, it only
works for network responses.
It's unpredictable.
You don't know when it's
going to be cleared out.
And it's got pretty
coarse granularity.
It's the level at which
you served up files.
So this is not great.
It's kind of sad.
Optimized browser cache is
probably what a lot of you
are doing today, and
it's a really good step.
This allows you to not only
get repeat visits sped up,
but you can actually get
some proactive page load
improvements using things
like Link Rel Equals Import
or any number of things
to try and load things
before they come in.
But it still only works
for network responses.
So these are a lot
of the optimizations
you've seen people
suggesting sort of off-handed
as they've been giving talks.
It's still unpredictable
because, again, you're
relying on where the
browser is storing things.
And there's not much
granularity here either,
because it's still on the
level of network resources
that you've served up.
It's still not great.
Content caching is
where the first big step
function can come in, in terms
of improving performance.
You get proactive page load
improvements like before,
but now you can work
for all response types.
So when I say content
caching, I mean
things like saving image Blobs
in cache storage, if it's
available, or in
IndexedDB, and then
serving your image
tags with a Blob URL.
All kinds of things like that.
You have some predictability
because the things that you're
storing in cache
storage or in IndexedDB,
you have control over it.
But you're still using
network responses
for some other things, so
it only gets a yellow here.
It's not perfect.
You have content granularity--
and this is really important.
Granularity is
something where you
want to be able to
change something and not
have to re-download
your whole bundle.
So the more you
can break things up
and have your cache invalidate
for only small pieces,
the better.
So again, you get granularity
for content but not
for your network requests.
So it only gets a yellow.
This is still pretty
valuable, though.
It gets a Smiley Face from me.
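To make that concrete, here is a sketch of what content caching can look like: fetch an image once, keep the Blob in a promise-based store, and serve your image tags from a Blob URL on later visits. This is illustrative only: cacheImage and imageSrcFor are made-up names, and `store` stands in for whatever wrapper you choose (IndexedDB via a library, or anything with the same get/set shape).

```javascript
// Sketch of content caching: persist fetched image Blobs keyed by URL,
// then serve <img> tags from a Blob URL instead of re-hitting the network.
// `store` is any promise-based key-value wrapper (get/set); cacheImage
// and imageSrcFor are hypothetical names for this example.

async function cacheImage(store, url, fetchFn = fetch) {
  const response = await fetchFn(url);
  const blob = await response.blob();
  await store.set(url, blob);       // persist the Blob for next visit
  return blob;
}

async function imageSrcFor(store, url, fetchFn = fetch) {
  // Prefer the cached copy; fall back to the network and cache it.
  const blob = (await store.get(url)) || (await cacheImage(store, url, fetchFn));
  return URL.createObjectURL(blob); // usable as an <img> src
}
```

On a repeat visit, imageSrcFor never touches the network at all.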
Full cache control,
the spaceship.
This is a lot of work.
I'm going to be
honest with you guys.
I've never effectively done
it, and I work on this team,
and I should theoretically
be able to say
I've done 1,000 of these.
So it's great when
you can nail it,
and you've seen a lot of really
big production apps that have.
But with this comes
a lot of work.
You get proactive page load
improvements like before.
You get all response types.
Again, that's great.
It's fully predictable
because now you're
pulling in even the network
responses into a cache
that you control and
that's really valuable.
You can guarantee your user
a certain performance level.
You also have
content granularity
for your network
requests and your content
that you're serving.
And as a major bonus, you get
offline support, which people
have talked about a lot.
So I would be
crying tears of joy
if everyone would start building
their apps like this today,
but I understand that
that's pretty hard.
Realistically, I
think you guys are
going to want to sit somewhere
between the optimized browser
cache that most people are doing
today, and content caching.
This is kind of the sweet spot.
If you can serve all of your
content from IndexedDB or cache
storage, or something
like that now,
you really have access to
storing all of your content,
even if you're not building that
spaceship with Service Worker.
So you can get full performance
levels for your site, not just
the app shell, or
something like that.
I've talked a lot about how
this is maybe not too hard.
It's kind of low-hanging fruit.
It's really important.
But let me put my
money where my mouth is
and dig into some code.
So first of all, as
I put this together
I really fretted
about whether or not
I should put my thens on
a new line or attached.
I was afraid I would get
flamed one way or the other.
I went with their own line,
but, please, don't hurt me.
That's my best go.
So here I am using a redux app.
It can work with any framework.
You could use your own
view-binding library.
I just create my
store, get it set up.
Here's the magic part.
I'm using an IndexedDB wrapper
library to store my state,
essentially whenever
there is a change.
And this is done
asynchronously, so it's not
going to block the main thread.
And then later, when
I'm re-inflating
my state, instead of
hitting the network
or pulling from Firebase,
or something like that,
I grab it and fire a
database-loaded event,
which will reinstate my state
without having hit the network.
So this kind of
pattern, as you can see,
was only three to
five lines of code.
And it can avoid an entire
network hop for your whole app.
You have your entire app state
all saved to local disk really,
really easily.
So this is a pattern
that I think works really
well with redux-style apps.
But it really can
work for anything.
Maybe it's a little
more work if you
don't have a single object
you're trying to save.
If you wanted to
tweet this, here's
a good slide with all
the syntax highlighting.
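If you want the shape of that pattern in code, here is a sketch under a few assumptions: `storage` is any promise-based IndexedDB wrapper exposing get and set (idb-keyval has this shape), and persistStore, rehydrateStore, and the DATABASE_LOADED action type are invented names for this example, not part of Redux.

```javascript
// Sketch of the state-persistence pattern: write the Redux state to
// IndexedDB (via a promise-based wrapper) on every change, and reload
// it at startup instead of hitting the network. Names are hypothetical.

function persistStore(store, storage, key = 'app-state') {
  let scheduled = false;
  store.subscribe(() => {
    if (scheduled) return;            // coalesce bursts of actions
    scheduled = true;
    Promise.resolve().then(() => {
      scheduled = false;
      storage.set(key, store.getState()); // async write, won't block the main thread
    });
  });
}

async function rehydrateStore(store, storage, key = 'app-state') {
  const saved = await storage.get(key);
  if (saved !== undefined) {
    // Reinstate state without a network hop.
    store.dispatch({ type: 'DATABASE_LOADED', state: saved });
  }
}
```

That really is only a handful of lines for skipping an entire network round trip on every repeat visit.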
In general, there are
a few best practices.
I hinted at them but
just to make them clear,
when you're managing your
cache on the user's device,
you want to make sure you're
doing client-side chunking.
This means you might
pull in an initial bundle
and then kick off requests for
smaller more granular pieces
so that you can revalidate only
small chunks as things change.
This pattern is a
little more complex
but can really save
you network bandwidth.
You also want to preload
pages the user might
be about to visit.
So if you imagine
you're on some news site,
you might want to load
all of the articles that
are shown above the fold,
or something like that.
You also want to save
commonly repeated components.
So if you have a hero image,
a logo, anything like that,
just save as many
of them as you can.
Get rid of as many
network hops as you can.
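Here is one hedged way to act on those tips with the Cache Storage API. The cache name and URL list are invented for illustration; cache.addAll and caches.match are the standard Cache Storage methods.

```javascript
// Preload pages and repeated components into Cache Storage so later
// navigations skip the network. The cache name and URLs are examples.

const PRECACHE = 'precache-v1';

async function precacheLikelyResources() {
  const cache = await caches.open(PRECACHE);
  // Hero image, logo, and above-the-fold articles the user will
  // probably need next; addAll fetches and stores each response.
  await cache.addAll([
    '/img/hero.jpg',
    '/img/logo.svg',
    '/articles/top-story.json',
  ]);
}

async function cachedOrNetwork(request) {
  // Serve from the cache when we can; fall back to the network.
  const cached = await caches.match(request);
  return cached || fetch(request);
}
```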
What should you be using
to do this, though?
You guys might be aware
that the web is not
really one for having a
single answer to a problem.
There's lots of different
ways to do things.
But thankfully, it's pretty
simple in terms of what
you want to use on the browser.
So if your data is
URL addressable,
you should use the cache
storage where it's available.
It's really simple.
It's kind of like
a key value pair.
It works really great
with Service Worker.
So it's your
no-nonsense, easy solution.
If you've got structured
data, or if you
have a lot of users who don't
have access to cache storage,
IndexedDB is where
you want to go.
These two combine.
They're asynchronous.
They're modern.
They're getting lots of
attention from browser makers.
This is sort of your
bread and butter.
This is where you want to
be doing all of your work.
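That decision rule fits in a few lines. chooseBackend is a hypothetical helper, and the feature check is just whether caches exists in the global scope:

```javascript
// Pick a storage backend: URL-addressable responses go to Cache
// Storage where supported; structured data, or any fallback case,
// goes to IndexedDB. `env` defaults to the global scope so the check
// can be exercised with a stand-in object too.
function chooseBackend(isUrlAddressable, env = globalThis) {
  if (isUrlAddressable && 'caches' in env) return 'cache-storage';
  return 'indexeddb';
}
```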
Now, in terms of availability
of cache storage, I have here
a caniuse usage-weighted slide.
It's available in a
lot of places already.
So I know some of
you are thinking,
oh, I don't want
to deal with having
to do progressive enhancements
or fall back to IndexedDB.
But you can hit a lot of your
users with cache storage today.
And it's only going to
improve in the future.
So I have here just
a few libraries
that we think, on
the Chrome team,
are great for helping to
improve your interactions
with IndexedDB.
They all give Promise support.
Some of them give database sync.
Some of them even try
to recreate SQL syntax.
But these are all
great libraries
in terms of ergonomics.
But also they've thought
a lot about making
themselves minified
so they don't
impact loading performance.
So that is a really big one.
For cache storage,
it's a newer API.
There's not quite
as much available.
We heard from Jeff about
sw-toolbox and
sw-precache.
And Webpack has
the offline plugin.
But otherwise, there
aren't quite as many things
that are available now for use.
If I had to guess, I would
say this was probably
the area you guys were
most skeptical about when
I started the talk.
You are making websites.
We aren't using
device resources.
This is the whole
point of the web.
It's ephemeral.
It doesn't stick things around.
I'm sure that's been
changing as we've been
talking about service workers
and kind of convincing
you guys of offline.
But it's still a real question--
how much space do you get,
and how reliable is it?
So at first I started
looking at empirical ways
that I could do this, small demo
apps to try to fill the cache,
fill the storage partition,
and see what would happen.
But then I realized,
why don't I just email
the storage teams
at the different browsers
and ask them how much
space is available?
It turns out, that
worked way faster.
So the browser quota
limits today kind of fall
into two camps.
So we have percentage-based--
Chrome gives you
6% of free disk
space per origin.
Firefox gives you a little more.
It's 10%, shared across eTLD+1.
So this is like play.google.com
and movies.google.com,
which share storage.
Safari gives you at least
10% of free disk space.
And Edge is-- well, it's a
little bit more complicated.
But thankfully, it's
still fairly reasonable.
Edge is largely a
desktop browser,
so you can rely on being
in one of the higher tiers.
So you don't have to worry about
all four of these all the time.
And based on usage
statistics and sort
of looking into
our own telemetry,
we found that a simple
number, simple rule of thumb,
is that you have 50 megabytes
available on all devices
and all browsers today.
So this will get
higher as you're
working on higher end phones.
But you can think of this
as your minimum budget
that you can use to try
and improve performance
on your site.
If you remember my slide
before, or a ways back,
if it takes 320ms to load
a megabyte across the wire,
and you've got 50
megabytes available to you,
that's 16s of load time
that you can save averaged
across all your users' visits.
That's pretty huge.
Sixteen seconds could take that
19-second loading app down to 3
if you were able to condense
all of those network
hops into something
that you could cache.
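That arithmetic, spelled out:

```javascript
// Back-of-the-envelope math from the slide: at roughly 320 ms per
// megabyte over the wire, a 50 MB cache budget can absorb about
// 16 seconds of network time across a user's visits.
const MS_PER_MEGABYTE = 320;
const BUDGET_MB = 50;
const savedSeconds = (MS_PER_MEGABYTE * BUDGET_MB) / 1000;
console.log(savedSeconds); // 16
```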
If I could, I would take
this off and do a mic drop,
because I think
that's probably one
of the most exciting things to
me about client-side storage.
But with great power comes
great responsibility.
We're now using resources
on the user's device.
And this is kind
of reinventing what
we think is the contract
we make with users.
So first and foremost,
you need to make
sure you're measuring and
thinking about your app's
overall storage footprint.
So this is something like
figuring out your eviction
strategy to make sure you
don't just balloon up to 6%
after three visits.
But the whole
point was we wanted
to be using the user's device.
So we can't just
keep arbitrarily
lowering our storage footprint.
That would get us back
to where we are today.
So the second number that comes
in is your read:write ratio.
And this is something
that Chrome Storage Team
thinks about globally
as well, which
is where we try to make changes
to our eviction policies that
reduce the storage footprint
without lowering this ratio.
That means, we're trying
to clear out data that's
never going to be read again.
Now, sometimes you'll
clear something
and maybe it was going to
be read in three months.
And sometimes that's right.
But sometimes you actually do
want your cache to stick around
for three months.
So another metric
that you can look at
is when you store
something, check
to see if you had cached
the resource before.
And if you had, look
at the time difference
between the two, and
that will give you
a sense of how long
it was sitting there kind of
useless on the user's device.
So these are the three
numbers that Chrome looks at
and Chrome really cares about.
And I think it's a really useful
way to think about storage.
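Storage footprint aside, the other two numbers, the read:write ratio and the age of an entry when it gets overwritten, can be tracked with a tiny instrumentation class. This is a minimal sketch; CacheMetrics is a hypothetical name, not an existing API.

```javascript
// Minimal instrumentation for two of the metrics described: how often
// cached entries are read versus written, and how long a stored copy
// sat on disk before being replaced.
class CacheMetrics {
  constructor(now = Date.now) {
    this.now = now;                 // injectable clock, for testing
    this.reads = 0;
    this.writes = 0;
    this.storedAt = new Map();      // key -> timestamp of last write
  }
  recordRead() {
    this.reads++;
  }
  recordWrite(key) {
    this.writes++;
    const prev = this.storedAt.get(key);
    // How long the old copy sat there before being overwritten,
    // or null if this key was never cached before.
    const age = prev === undefined ? null : this.now() - prev;
    this.storedAt.set(key, this.now());
    return age;
  }
  get readWriteRatio() {
    return this.writes === 0 ? 0 : this.reads / this.writes;
  }
}
```

A ratio well below 1 suggests you are writing data that is never read back, which is exactly what an eviction strategy should target.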
But I would love to
hear from other people
if they have other
metrics that they
think are important to track.
It's a really interesting space.
Your eviction strategy is not
the only thing at play, though.
The other browsers have eviction
strategies of their own,
or at least some of them do.
So for Chrome and Firefox,
when the browser's storage
quota or the disk is full,
we evict the least recently
used domain from the list.
Now, it's important to
note, this is very rare.
Chrome clears a domain's storage
less than 0.1% of the time.
So for the most part,
when you store something
it sticks around.
But it is something
to keep in mind.
Safari and Edge, however,
don't clear IndexedDB.
So you can treat
that as persistent.
Now, on Firefox
and on Chrome there
is something to try and help
you work around the eviction
policies when it's
important, and it's
called persistent storage.
It's shipped in
Chrome 55, and it's
in development with Firefox.
So the way this works
is you essentially
request the persistent
storage permission,
and then Chrome will exempt
you from automatic clearing.
We'll also, when the user
clears browsing data,
pop up a prompt for
any persistent storage
sites that says, you're
also going to clear these,
is that OK?
So this is trying
to help make sure
that if you've entered a
contract with your user
that something should
be available offline,
you can guarantee that.
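Requesting that permission is a single call on navigator.storage. Here is a sketch, with the storage manager passed in so it degrades gracefully where the API is missing; requestPersistence is a made-up name.

```javascript
// Ask the browser to exempt this origin's storage from automatic
// eviction. navigator.storage.persist() is the standard API; the
// browser may grant or deny without showing the user a prompt.
async function requestPersistence(
  storageManager = typeof navigator !== 'undefined' ? navigator.storage : undefined
) {
  if (!storageManager || !storageManager.persist) {
    return false;                 // API unavailable: storage stays best-effort
  }
  const granted = await storageManager.persist();
  return granted;                 // true: exempt from automatic clearing
}
```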
Unfortunately, when
you do user surveys
and you ask them about storage,
it becomes pretty clear
that it's not something
they either want
or are able to effectively
reason about upfront.
If you ask a user, hey, I'd
like to store 100 megabytes
of your data, is that OK?
They'll either freak
out and say no,
or they just won't really
understand the question.
But they are very good
at reasoning about,
your storage is full, which
site would you like to clear?
So because of this,
we try to avoid
showing a permission prompt
when you request durable storage
or persistent storage.
Instead, we have a heuristic,
which we use to either
automatically grant or deny.
And the heuristic is essentially,
if you're treated like an app,
you'll get app-like
storage persistence.
And that means if you've
been added to home screen,
if you have push notifications,
if you've been bookmarked,
or if Chrome has tracked
that the user is engaged
with your site a lot.
So if any of those hold,
you'll get the permission.
Otherwise, it will be denied.
We recommend that you hold
off on showing offline UI
until you've received that
permission so that you
know it will be around.
It's certainly
not a requirement,
but it's a little bit
of a best practice.
And then use the quota
estimate API to make sure
that you aren't
ballooning your storage,
and so that if you have
a regression where you get
some kind of storage leak,
you can clear it up, because now
you can't rely on the browser
to protect you anymore.
You have to kind of take your
life into your own hands.
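And a sketch of that quota-estimate check. navigator.storage.estimate() resolves to usage and quota in bytes; the 80% threshold and the checkStorageBudget name are invented here for illustration.

```javascript
// Watch your storage footprint with the quota-estimate API so a
// storage leak doesn't silently eat the user's disk.
async function checkStorageBudget(
  storageManager = typeof navigator !== 'undefined' ? navigator.storage : undefined,
  threshold = 0.8
) {
  if (!storageManager || !storageManager.estimate) return null;
  const { usage, quota } = await storageManager.estimate();
  const fraction = quota > 0 ? usage / quota : 0;
  if (fraction > threshold) {
    // This is where you'd prune least-valuable cached data,
    // since the browser no longer does it for you.
  }
  return fraction;
}
```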
That's sort of the end of my
practical area of the talk.
You guys laughed at a few
of my jokes, so I like you.
I feel like we've grown
close over this time.
I want to give you
a look into what
we're thinking for the future,
because I think it's important.
I think it's going to change
the way we think about storage,
and apps, and offline,
and all these things.
So first and most
importantly, we
want to give you
guys more space.
We're kind of channeling
our inner Elon Musk
and getting more people
more space all the time.
This is kind of a new paradigm.
On Chrome we're
thinking about, how
can we start giving PWAs or
web apps in general access
to as much device storage
as native apps get?
And we want to do
this because we think,
as PWAs become more
common, the divide between apps
and PWAs and all these things
is going to become less clear.
And so we want to
give developers
all the tools that they need
to make great experiences.
Some of you might be
cringing or freaking out.
It is a little bit of a divide.
You could go to a
website, and it
could take all of your storage.
So this is something we're
thinking about very carefully.
And again, those three
metrics I talked about before,
we're tracking
those very closely.
And we're going to be ratcheting
this up slowly to make sure
that bad ecosystem changes
aren't coming into play.
So this is something, though,
to be looking for storage
increasing over time.
We're also thinking about
some new functionality.
And it kind of falls
into two stages.
We have one set of things
that are in development that
are actually pretty far along.
IndexedDB Observers
are a way for you
to help synchronize transactions
with IndexedDB across tabs.
So if you have a kind of UI
that uses something like that,
it's really effective for
making it quite simple.
Async Cookies are
going to be a big win.
They'll also be available
in Service Worker,
so that's pretty awesome.
Both of these are
WICG
specs that have some
degree of implementation,
so they're coming sooner
rather than later.
Then we have a couple
areas of exploration
that we're looking at.
So I mentioned a lot of
libraries that will give you
Promise support for IndexedDB.
We know it's great.
We know it's the
way that people want
to work with Async
code moving forward,
or at least a large
subset of people
want to work with it that way.
But it turns out layering
Promises on top of
IndexedDB transactions
is kind of thorny.
It turns out, to hit all
the edge cases, it's hard.
So this is an area
that, in terms
of baking into the
platform, we're
still thinking about
and figuring out.
The last one is
kind of exciting.
And it's personally
very, very cool to me.
We're calling it Writable Files.
It's the idea that we want
to start giving web apps
the ability to get reusable
handles to files on the user's
device.
So I'm sure a lot of you have
gone to some website where they
ask you to upload a file or
they offer to download one,
and every time you made
an edit you had to re-download
and then re-upload the file.
It's not a great user flow.
So instead, what we want to
do is create a way for apps
to get a handle to a file
such that they can just
track changes to it
like a normal app would.
Again, this is an
area of exploration.
We're figuring out how to
get the privacy and security
models right.
But it is a spec that's
available on the WICG
and we would love to get people
commenting there, telling us
what use cases they
might have for it,
what concerns or mitigations
they have planned.
It's all on GitHub.
We would love to get
collaboration from everyone.
I hope that I've made an
OK, hopefully great, case
for the idea that caching
with client-side storage
is a huge win that you have
available to you today.
It's not the easiest or the
first thing you'll jump at.
Performance optimization is kind
of-- in a lot of ways, at least
for me, when I'm developing--
a secondary impulse
to making it look right or
adding a cool new feature.
But it can be a huge win,
and it can really translate
to increased bottom line.
So it's very important.
And I hope that I've
convinced you guys
that there is stuff that you
can do via link rel=preload,
loading Blob URLs for images,
storing things in IndexedDB,
that work across all browsers
and that you can do today
and that it's not too much
work to get it working.
A few concrete takeaways.
First, storage
isn't just about offline.
Think about it as a
performance optimization
just as much, if not more.
Offline is amazing, but until
your page is loading quickly,
there's not going to be anything
available offline anyways.
Fifty megabytes is available
to you on all browsers,
on all devices.
And this is going to
go up as time goes on.
And it will go up when
you have users who
are using higher end devices.
But this is sort of
your bare minimum budget
you have for
improving performance.
IndexedDB is for
structured data.
Cache storage, where
it's available,
is for URL addressable data.
That's it.
Thank you, guys.
I really appreciate
you taking the time.
[APPLAUSE]
[MUSIC PLAYING]
