- Some of the government's
most important websites
are crashing when we need them the most.
More than 22 million people have filed
for unemployment in the last month,
an unprecedented number driven by
the global coronavirus outbreak.
Now Congress has put aside
an extra 250 billion dollars
to handle the new applicants,
but as people go to the
state level systems to file,
a lot of those websites
are just timing out.
- Some say that applying
for unemployment benefits
is nearly impossible.
- The state computer system
is having some trouble.
- They need to fix the website.
- This isn't how the internet usually works.
Services like Netflix and Zoom have seen
a huge surge in traffic too,
but aside from a few hiccups,
you'd never know the difference.
Most web engineers plan
to be able to handle
ten times the regular traffic
without breaking a sweat.
But government systems
don't work that way.
And it's surprisingly
hard to shift them over.
A lot of that is because
of the backend programming,
most of which is written in a
coding language called COBOL
that dates all the way back to the 50s.
But to understand why
they're still using COBOL
and why it's such a
problem, you have to see
how these sites were originally built.
And most importantly, you have
to look at the big picture.
The story of COBOL starts in 1959,
way before personal
computers or the internet.
A corporation or university
might have a computer network,
but you were really only
going to run programs
within your specific system.
So each network developed
slightly different rules
and it became really hard
to transfer programs or data
from one network to another.
So a group of engineers,
including legendary
Navy programmer Grace
Hopper, started working on
a common programming language that could
bridge those networks
and be the main language
for businesses going forward.
They called it the Common Business
Oriented Language, or COBOL.
By the 70s, COBOL was the standard.
If you were managing a
huge database system,
you wrote all your code in COBOL.
And that dominance is a big part of why
it's still in use today.
This is by no means a dead language.
Certainly millions,
possibly billions, of
financial transactions
rely on COBOL on a daily basis.
- If you want to switch off COBOL,
you basically have to start from scratch.
So a lot of people just stuck with it.
It also locks you into a particular kind
of server architecture.
Running COBOL code meant
you were running everything
off a handful of servers
on your internal network.
When it was developed,
that was the only option.
And even later there were
real advantages to it.
You could teach your server special tricks
for handling your specific kind of data
and deploy programs to the whole network
without having to install them
on every specific machine.
But it was also putting a lot of weight
on that one server.
If that server goes down,
the whole network goes down.
And if you try to bring in a replacement,
you'll need to teach it
all those special tricks.
But when the internet happened,
you suddenly had to worry about
keeping your service running
in the face of huge shifts in usage
and constant code updates.
That meant treating your servers
in a completely different way.
As engineers started to put it,
they're not pets anymore,
now they're cattle.
When you've got 50 servers running,
it doesn't matter if
one of them goes down.
You just bring in another one
and you make sure they're all so dumb
and interchangeable
that you can cycle them
in and out without anyone noticing.
You don't train them, you just herd them.
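A rough sketch of the cattle model described above, in Python. The Server and Herd classes, the round-robin dispatch, and the health flag are all assumptions made for illustration; nothing here is named in the video, and real load balancers are far more involved. The point is only that a failed server gets swapped for a fresh, identical one and the caller never notices.

```python
import itertools


class Server:
    """A stateless, interchangeable worker (hypothetical, for illustration)."""

    _ids = itertools.count(1)

    def __init__(self):
        self.name = f"server-{next(Server._ids)}"
        self.healthy = True

    def handle(self, request):
        if not self.healthy:
            raise RuntimeError(f"{self.name} is down")
        return f"{self.name} handled {request}"


class Herd:
    """A pool of identical servers; any one can be replaced at any time."""

    def __init__(self, size):
        self.servers = [Server() for _ in range(size)]
        self._next = 0

    def handle(self, request):
        # Pick the next server in round-robin order.
        index = self._next % len(self.servers)
        self._next += 1
        # If it has gone down, swap in a fresh one. No special setup is
        # needed, because no server holds anything the others lack.
        if not self.servers[index].healthy:
            self.servers[index] = Server()
        return self.servers[index].handle(request)
```

With a herd of 50, marking one server unhealthy and sending a request still succeeds, and the pool stays at 50 servers. In the pet model there would be no equivalent of the one-line replacement, because the new machine would first have to learn all the special tricks.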
And because these are global web services,
that also means you can
distribute your herd
all around the world, scaling up or down
depending on how many people are visiting
the site that morning.
With cloud hosts like Amazon Web Services
or Microsoft Azure, you don't even need
to buy a whole server.
You can just rent one percent of a server
for a few hours, just to make it through
that morning's spike in demand.
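The scaling decision described above comes down to simple arithmetic, sketched here in Python. The function name, the per-server capacity figure, and the minimum of two servers are assumptions chosen for illustration, not how any particular cloud host works.

```python
import math


def servers_needed(requests_per_second, capacity_per_server, minimum=2):
    """Return how many interchangeable servers the current load calls for.

    capacity_per_server is an assumed figure: the number of requests one
    server can handle each second. minimum keeps a small herd running
    even when traffic is quiet.
    """
    if capacity_per_server <= 0:
        raise ValueError("capacity_per_server must be positive")
    # Round up: a fraction of a server's worth of traffic still needs a server.
    return max(minimum, math.ceil(requests_per_second / capacity_per_server))
```

If each server handles 100 requests a second, a normal morning of 1,000 requests a second calls for 10 servers, and a tenfold surge to 10,000 calls for 100. When the surge passes, the same calculation says to hand the extra servers back.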
Name any online service that's launched
in the last 20 years.
They basically all work
on the cattle model.
That means lots of
basically disposable servers
cycling in and out.
But a lot of these state
unemployment systems
have been running
continuously for 40 years,
processing thousands of
applications every week,
all on COBOL.
They never switched over
to disposable servers.
Which makes it hard to process
the kind of traffic surge
that YouTube or Netflix
would take in stride.
It's not that COBOL is a
bad programming language,
but it locks you into a bad
way of managing your network.
It forces you to treat
your servers like pets.
And because switching off
of COBOL is so much work,
a lot of government systems
have never been able
to make the leap to the cattle model.
- It's incredibly difficult
to even find workers
who know COBOL.
The language is old and some of the people
still fluent in it are even older,
with many approaching retirement age.
This has become a recipe for disaster
in states that still operate under COBOL.
Governors like New Jersey's Phil Murphy
have called for programmers
to come out of retirement
to help maintain their
overwhelmed systems.
- You can't really move a
COBOL program to the AWS cloud.
So it just sits there getting older
and a little harder to maintain each year.
Programmers call this technical debt.
And if you aren't spending
money on upgrades every year,
it piles up fast.
- For more than 10 years,
the federal government
has been pressuring
state Medicaid programs
to update their aging systems.
They've been handing
them large sums of money
to modernize, but it's
still an enormous lift.
- Before these folks retired, many of them
had been fired, they'd been laid off.
And then they'd actually
been brought back in
in crisis moments to fix and
upgrade the COBOL systems,
which ideally they should
have just been kept on
to maintain the entire time.
- The real problem is,
we just haven't been
spending money maintaining these systems.
We haven't wanted to or we thought
we could skate by without it.
And then when millions
of people suddenly need
unemployment checks, the entire system
is buried in technical debt.
It's a hard lesson, but
if we want the reliability
that we expect from web services,
we're gonna have to pay for it.
Thanks for watching.
If you want to know more about COBOL
and this whole saga, my
colleague Makena Kelly
wrote a great article, linked in the description.
And let us know in the comments
if there's anything else you
think we should be covering.
