The MD5 algorithm was
created by Ronald Rivest.
He is one of the
fathers of cryptography.
He's been doing this
for quite some time.
If you have an opportunity
to go out to YouTube
and look at some of the
presentations he's given,
he really is one of the
founders and brilliant thinkers
in cryptography.
The MD5 hash algorithm, itself,
was published in April of 1992.
As the name implies,
MD5 comes after MD4.
The MD5 message digest algorithm
produces a 128-bit hash value,
so the information that you
get once you hash something
is 128 bits long.
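To make the 128-bit figure concrete, here is a minimal sketch using Python's standard `hashlib` module. The helper name `md5_hex` is just an illustrative choice, and MD5 appears here only to show the digest size, not as a recommendation.

```python
import hashlib

def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a hexadecimal string."""
    return hashlib.md5(data).hexdigest()

# The raw digest is 16 bytes, which is 128 bits, whatever the input size.
short_digest = hashlib.md5(b"").digest()
long_digest = hashlib.md5(b"x" * 1_000_000).digest()

print(len(short_digest) * 8)  # digest length in bits
print(len(long_digest) * 8)   # same length for a much larger input
print(md5_hex(b""))           # 32 hexadecimal characters
```

The hex form is 32 characters because each hexadecimal character carries 4 bits.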
In 1996, however,
researchers discovered
collision weaknesses in MD5.
One of the things that you'll
notice as you examine
hash algorithms is that one
of the biggest challenges they
have is to make sure that it's
not practical to find two separate
pieces of information that end up
creating the exact same hash.
That's called a collision,
and in the world of hashing,
that's a bad thing.
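To get a feel for what a collision is, here is a toy sketch. It is not the attack used against MD5, which relies on specialized analysis. It assumes a deliberately weakened, hypothetical hash (`tiny_hash`, the first 2 bytes of SHA-256) so that a simple search finds two different inputs with the same digest almost instantly.

```python
import hashlib
from itertools import count

def tiny_hash(data: bytes, nbytes: int = 2) -> bytes:
    """Hypothetical weakened hash: keep only the first nbytes of SHA-256."""
    return hashlib.sha256(data).digest()[:nbytes]

def find_collision(nbytes: int = 2):
    """Hash distinct inputs until two of them share a truncated digest."""
    seen = {}  # maps truncated digest -> first message that produced it
    for i in count():
        msg = str(i).encode()
        h = tiny_hash(msg, nbytes)
        if h in seen:
            return seen[h], msg, h
        seen[h] = msg

first, second, shared = find_collision()
print(first, second, shared.hex())
```

With only 16 bits of output there are just 65,536 possible digests, so a repeat is guaranteed within 65,537 inputs. A full-length digest makes this kind of blind search impractical, which is why attacks on real algorithms exploit structural weaknesses instead.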
When they found
these in 1996, it
was a pretty bad set
of vulnerabilities,
and they realized this
particular algorithm is not
very resistant to these
types of collisions.
In fact, in December
of 2008, researchers
created a certificate
authority certificate--
this is a pretty big deal-- that
looked absolutely legitimate
when you did an MD5 hash
against that certificate.
And they were able to build
other certificates-- these
are the kinds that might
be used on web servers,
for instance-- that appeared
to be completely legitimate
and issued by a third-party
provider of certificates.
So someone, technically,
could take that certificate,
put it on their web
server, and your browser
thinks that that certificate
is absolutely valid,
and that is a very,
very bad idea.
That means that I could
pretend to be Microsoft.
I could pretend to be eBay.
I could pretend to be anyone.
To give you a feel for what
these collisions look like,
these are two separate
pieces of information.
Everything in red is text that
is different between them,
but everything else
is exactly the same.
But clearly, those are
different pieces of information,
and unfortunately, MD5 comes
up with exactly the same hash.
And that's our
collision right there.
That's what we're
trying to avoid.
And right after this,
it turns out they stopped
using this particular
method to create
these certificates.
RapidSSL decided not
to issue any more
certificates of that type
because of these vulnerabilities
that were found in MD5.
Another common hash algorithm
is the Secure Hash Algorithm
or SHA.
Some people say S-H-A. It's one
that was created in the United
States by the National
Security Agency, a government
agency within the United States.
It is also a Federal
Information Processing Standard.
So when the government
creates one of these standards,
they roll it out across all
of the federal agencies, and
that's the method that they
use to provide hashes
of their important information.
One that was widely
used is SHA-1.
This is a 160-bit
digest, so a little
bit bigger than the MD5
we were just looking at.
Unfortunately, again
a common problem
with hashing
algorithms, in 2005,
there was a
publication that talked
about collision attacks
that could occur with SHA-1.
So the natural
progression, then,
is to move to one that's
a little bit better,
and that's SHA-2.
This is now the
preferred variant
of this SHA hash algorithm.
SHA-2 is a family with bigger
digests, up to 512 bits.
The idea, usually, being that
if it's a longer number of bits,
it may be more difficult
to find collisions
between the different hashes.
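The digest sizes mentioned so far can be checked with a short sketch using Python's standard `hashlib` module. Here `sha256` and `sha512` are two members of the SHA-2 family, and the `sizes` dictionary is just an illustrative name.

```python
import hashlib

# Map each algorithm name to its digest length in bits.
sizes = {
    name: hashlib.new(name).digest_size * 8
    for name in ("md5", "sha1", "sha256", "sha512")
}

for name, bits in sizes.items():
    print(f"{name}: {bits} bits")
```

A longer digest raises the cost of a brute-force collision search, but it does not by itself protect against design flaws in the algorithm.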
SHA-1 is now retired for
most US Government use.
They've essentially said there are
problems with using SHA-1,
collisions are there.
Everybody please
start moving all
of your different applications,
all the development that you're
doing, and all the
products that you
use to provide this hashing
over to the more secure SHA-2
standard.
RIPEMD is an entire family of
different hashing algorithms.
It was created by
RACE, and this RIPEMD
stands for RACE Integrity
Primitives Evaluation Message
Digest.
That's a mouthful.
RACE stands for the
Research and Development
in Advanced Communication
Technologies in Europe.
So this is a European
agency that was really
created so that there
could be some standards
around communications
through all
the different
countries in Europe.
This is a centralized standard.
There is centralized
management associated
with the technologies
that they're creating.
So this hashing algorithm,
or sets of algorithms,
was created just
for this purpose.
The original version
of this, RIPEMD,
was found to have
collision issues in 2004,
and because of that, they've now
moved to RIPEMD-160, which,
to this point, does not
have any known collision
issues associated with it.
Its design is based on MD4,
but it has similar
performance characteristics
to SHA-1.
So there's a nice balance
there between the usability
of this hash and the speed at
which they're able to use it.
There are also other variants out
there, RIPEMD-128, RIPEMD-256,
and RIPEMD-320, and obviously,
the different hashes
might be used for
different things.
When you apply a hash algorithm
to a file or a document
or an email, you end up getting
this nice little signature
at the bottom.
So all you really know is the
document that you've received
is exactly the same as the
document that was sent,
but you can't really verify
who sent the document.
So this little technique,
which is the Hash-based Message
Authentication Code, or HMAC, is
one where you take a secret key
and you combine it with
the hashing process
so that on the other side, you
can apply the same key to it
and see if the person who sent
it really was the person you
were expecting,
because only two of you
would know what that key is.
This means that you're
not only able to verify
that the data has
not been changed,
but now, you also know that
it came from someone
who holds the shared key.
That is verified
just based on the hash.
Again, we're not
changing anything
with the text or the document
or the original piece
of information that was sent.
You don't need fancy,
asymmetric encryption.
This is a simple symmetric key.
You're using the same
key on both sides
to be able to determine
this information.
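Here is a minimal sketch of that idea using Python's standard `hmac` module. The helper names `sign` and `verify`, the key, and the messages are all placeholders chosen for illustration, and SHA-256 is an assumed choice of underlying hash.

```python
import hashlib
import hmac

def sign(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over message with the shared key."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign(key, message), tag)

shared_key = b"example-shared-secret"  # placeholder; use a random key in practice
message = b"wire $100 to account 42"
tag = sign(shared_key, message)

print(verify(shared_key, message, tag))                     # True: same key, same data
print(verify(shared_key, b"wire $900 to account 42", tag))  # False: data was changed
print(verify(b"wrong key", message, tag))                   # False: different key
```

The message itself travels unmodified. Only the tag is added, and the receiver needs the same key to recompute it.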
This is commonly seen,
actually, in IPsec.
It's commonly seen in TLS,
which is the successor to SSL.
And it's a simple process:
you combine this key with a
standard set of paddings
and hash that together
with the message.
It actually is one where
you have multiple passes
to finally come up with what
the final hash might be.
So you repeat this
process on the other side.
You simply go through
the same thing.
If you end up getting exactly
the same hash at the end,
then you know the other side
had that same secret key,
and you can feel very good
that the person who sent this
is now verified.
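The padding and the multiple passes can be sketched by hand. This follows the HMAC construction as defined in RFC 2104, with SHA-256 as an assumed underlying hash. The function name `hmac_sha256` and the sample key and message are illustrative, and in practice you would use the standard `hmac` module rather than this version.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # SHA-256 processes its input in 64-byte blocks

def hmac_sha256(key: bytes, message: bytes) -> bytes:
    """Hand-rolled HMAC-SHA256, for illustration only."""
    # Keys longer than one block are hashed first; all keys are
    # then zero-padded to exactly one block.
    if len(key) > BLOCK_SIZE:
        key = hashlib.sha256(key).digest()
    key = key.ljust(BLOCK_SIZE, b"\x00")

    # Mix the key with the two standard padding constants.
    inner_key = bytes(b ^ 0x36 for b in key)  # ipad
    outer_key = bytes(b ^ 0x5C for b in key)  # opad

    # First pass: hash the inner key followed by the message.
    inner = hashlib.sha256(inner_key + message).digest()
    # Second pass: hash the outer key followed by the first result.
    return hashlib.sha256(outer_key + inner).digest()

key = b"example-shared-secret"
msg = b"hello"
print(hmac_sha256(key, msg) == hmac.new(key, msg, hashlib.sha256).digest())
```

The receiver runs exactly the same two passes with the same key and compares the result with the tag that arrived alongside the message.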
