This is a brilliant tweet.
But I don't want you to pay attention
to the tweet.
It's good, sure, but I want you to watch the
numbers that are underneath it.
That's a screen recording,
and the numbers are going up and down,
all over the place.
They should steadily rise, but they don’t.
There aren't that many people tapping ‘like’
by mistake and then taking it back.
So why can't Twitter just count?
You'll see examples like this
all over the place.
On YouTube, subscriber and view counts
sometimes rise and drop seemingly at random,
or they change depending on which device you're
checking on.
Computers should be good at counting, right?
They're basically just overgrown calculators.
This video that you're watching,
whether it's on a tiny little phone screen
or on a massive desktop display,
it is all just the result of
huge amounts of math that turns
a compressed stream of binary numbers into
amounts of electricity
that get sent to either a grid of coloured pixels
or a speaker,
all in perfect time.
Just counting should be easy.
But sometimes it seems to fall apart.
And that's usually when there's a
big, complicated system
with lots of inputs and outputs,
when something has to be done at scale.
Scaling makes things difficult.
And to explain why,
we have to talk about race conditions, caching,
and eventual consistency.
All the code that I've talked about in The Basics
so far has been single-threaded,
because, well, we’re talking about the basics.
Single-threaded means that it looks like a
set of instructions
that the computer steps through
one after the other.
It starts at the top, it works its way through,
ignoring everything else,
and at the end it
has Done A Thing.
Which is fine, as long as
that's the only thread,
the only thing that the computer's doing,
and it's the only computer doing it.
Fine for old machines like this,
but for complicated, modern systems,
that’s never going to be the case.
Most web sites are, at their heart, just a
fancy front end to a database.
YouTube is a database of videos and comments.
Twitter is a database of small messages.
Your phone company's billing site is a
database of customers and bank accounts.
But the trouble is that a single computer
holding a single database can only deal with
so much input at once.
Receiving a request, understanding it, making
the change, and sending the response back:
all of those take time,
so there are only so many requests
that can fit in each second.
And if you try and handle
multiple requests at once,
there are subtle problems that can show up.
Let's say that YouTube wants to count
one view of a video.
It just has the job of adding one
to the view count.
Which seems really simple, but it's actually
three separate smaller jobs.
You have to read the view count,
you have to add one to it,
and then you have to write that view count
back into the database.
If two requests come along
very close to each other,
and they’re assigned to separate threads,
it is entirely possible that the second thread
could read the view count
while the first thread is still doing its calculation.
And yeah, that's a really simple calculation,
it's just adding one,
but it still takes a few ticks of a processor.
So both of those write processes would put
the same number back into the database,
and we've missed a view.
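That read, add, write interleaving can be forced deliberately in a few lines. This is a contrived single-process sketch, with a sleep standing in for those "few ticks" of calculation — not how any real database works:

```python
import threading
import time

view_count = 0

def register_view():
    global view_count
    current = view_count     # step 1: read the view count
    time.sleep(0.05)         # pretend the calculation takes a few ticks
    current = current + 1    # step 2: add one
    view_count = current     # step 3: write back, clobbering anyone who wrote meanwhile

threads = [threading.Thread(target=register_view) for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(view_count)  # 50 views happened, but far fewer get counted
```

Because every thread reads the old number before anyone writes a new one, almost all the updates are lost.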
On popular videos, there'll be collisions
like that all the time.
Worst case, you've got ten or a hundred of those
requests all coming in at once,
and one gets stuck for a while
for some reason.
It'll still add just one to the original number
that it read,
and then, much later,
it'll finally write its result back into the
database.
And we've lost any number of views.
In early databases, having updates that collided
like that could corrupt the entire system,
but these days things will generally at least
keep working,
even if they're not quite accurate.
And given that YouTube has to work out not
just views,
but ad revenue and money,
it has got to be accurate.
Anyway, that’s a basic race condition:
when the code’s trying to do two or more
things at once,
and the result changes depending on the order
they occur in,
an order that you cannot control.
One solution is to put all the requests in
a queue,
and refuse to answer any requests until the
previous one is completed.
That's how that single-threaded, single-computer
programming works.
It's how these old machines work.
Until the code finishes its task and says
"okay, I'm ready for more now",
it just doesn't accept anything else.
Fine for simple stuff, does not scale up.
A million-strong queue to watch a YouTube video
doesn't sound like a great user experience.
But that still happens sometimes, for
things like buying tickets to a show,
where it'd be an extremely bad idea to accidentally
sell the same seat to two people.
Those databases have to be 100% consistent,
so for big shows,
ticket sites will sometimes start a queue,
and limit the number of people accessing the
booking site at once.
If you absolutely must count everything accurately,
in real time, that’s the best approach.
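In a single process, that queue idea boils down to a lock: each read-modify-write has to wait for the previous one to finish. A minimal sketch of the principle, not any ticketing site's actual code:

```python
import threading

view_count = 0
lock = threading.Lock()

def register_view():
    global view_count
    with lock:                   # wait here until the previous update finishes
        current = view_count     # read
        view_count = current + 1 # add one and write back, with no one interleaving

threads = [threading.Thread(target=register_view) for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(view_count)  # exactly 50, every time
```

The count is now always right — at the cost of every request queueing up behind the lock.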
But for sites dealing with Big Data, like
YouTube and Twitter,
there is a different solution called
eventual consistency.
They have lots of servers all over the world,
and rather than reporting every view
or every retweet right away,
each individual server will keep its own count,
bundle up all the viewcounts and statistics
that it’s dealing with,
and just update the central system
when there's time to do so.
Updates don't have to be hours apart,
they can just be minutes
or even just seconds,
but having a few bundled updates that can
be queued and dealt with individually
is a lot easier on the central system
than having millions of requests all being
shouted at once.
Actually, for something on YouTube’s scale,
that central database won't just be one computer:
it'll be several, and they'll all be keeping
each other up to date,
but that is a mess we really don't want
to get into right now.
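The shape of that bundling can be sketched in a toy, single-process mock-up. Every name here (EdgeServer, central, "video123") is invented for illustration — real systems replicate across machines and survive failures, which this doesn't:

```python
central = {"video123": 0}   # stand-in for the central database

class EdgeServer:
    """A regional server that tallies views locally, then reports in bulk."""

    def __init__(self):
        self.pending = 0    # views seen here but not yet reported centrally

    def record_view(self):
        self.pending += 1   # cheap local work; no round trip to the centre

    def flush(self):
        # One bundled update instead of one request per view.
        central["video123"] += self.pending
        self.pending = 0

london, tokyo = EdgeServer(), EdgeServer()
for _ in range(3):
    london.record_view()
tokyo.record_view()

stale = central["video123"]    # still 0: the centre hasn't heard yet
london.flush()
tokyo.flush()
settled = central["video123"]  # 4: consistent, eventually
```

Until the flushes run, the central count is behind — but no views are lost, just delayed.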
Eventual consistency isn't right for everything.
On YouTube, if you're updating something like
the privacy settings of a video,
it's important that it's updated immediately
everywhere.
But compared to views, likes and comments,
that's a really rare thing to happen,
so it's OK to stop everything,
put everything else on hold,
spend some time sorting out that
important change, and come back later.
But views and comments,
they can wait for a little while.
Just tell the servers around the world to
write them down somewhere, keep a log,
then every few seconds, or minutes, or maybe
even hours for some places,
those systems can run through their logs,
do the calculations and update the
central system once everyone has time.
All that explains why viewcounts and
subscriber counts lag sometimes on YouTube,
why it can take a while to get
the numbers sorted out in the end,
but it doesn't explain the up-and-down numbers
you saw at the start in that tweet.
That's down to another thing: caching.
It's not just writing into the database that's
bundled up. Reading is too.
If you have thousands of people requesting
the same thing,
it really doesn't make sense to have them all
hit the central system
and have it do the calculations
every single time.
So if Twitter are getting 10,000 requests
a second for information on that one tweet,
which is actually a pretty reasonable
amount for them,
it'd be ridiculous for the central database to
look up all the details and do the numbers every time.
So the requests are actually going to a cache,
one of thousands, or maybe
tens of thousands of caches
sitting between the end users
and the central system.
Each cache looks up the details
in the central system once,
and then it keeps the details in its memory.
For Twitter, each cache might only keep them
for a few seconds,
so it feels live but isn't actually.
But it means only a tiny fraction
of that huge amount of traffic
actually has to bother the central database:
the rest comes straight out of memory
on a system that is built
just for serving those requests,
which is orders of magnitude faster.
And if there's a sudden spike in traffic,
Twitter can just spin up some more cache servers,
put them into the pool that's
answering everyone's requests,
and it all just keeps working without
any worry for the database.
But each of those caches will pull that information
at a slightly different time,
all out of sync with each other.
When your request comes in, it's routed to
any of those available caches,
and crucially it is not going to be
the same one every time.
They've all got slightly different answers,
and each time you're asking a different one.
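Two caches refreshing at different moments is enough to make a number look like it's going backwards. A toy sketch of the idea — the names, and the 5-second TTL, are made up for illustration, and this is the shape of a cache, not Twitter's real stack:

```python
import time

central_likes = 100   # stand-in for the central database's answer

class Cache:
    def __init__(self, ttl):
        self.ttl = ttl                  # how long a cached answer stays "fresh"
        self.value = None
        self.fetched_at = float("-inf")

    def get(self):
        if time.monotonic() - self.fetched_at > self.ttl:
            self.value = central_likes  # one trip to the central system...
            self.fetched_at = time.monotonic()
        return self.value               # ...then serve from memory until the TTL expires

cache_a = Cache(ttl=5.0)
cache_b = Cache(ttl=5.0)

a_sees = cache_a.get()   # 100 — cache A fetches now
central_likes = 150      # more likes arrive at the central system
b_sees = cache_b.get()   # 150 — cache B fetches after the update
a_again = cache_a.get()  # still 100 — cache A's copy is stale for up to 5 s
```

Route each request to whichever cache happens to be free and you'll see 150, then 100, then 150 again — the numbers bouncing around in that screen recording.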
Eventual consistency means that everything
will be sorted out at some point.
We won't lose any data, but it might take
a while before it's all in place.
Sooner or later the flood of retweets will
stop, or your viewcount will settle down,
and once the dust has settled
everything can finally get counted up.
But until then:
give YouTube and Twitter a little leeway.
Counting things accurately is really difficult.
Thank you very much to the Centre for
Computing History here in Cambridge,
who've let me film with all this
wonderful old equipment,
and to all my proofreading team
who made sure my script's right.
