- (indistinct) I get to cheat
(coughing)
We're now Live.
We're Live.
All right.
So we're here with Mike Stay,
another fantastic speaker for
a virtual "DEF CON SAFE MODE."
He is covering how he
recovered a six-digit sum's
worth of Bitcoin from
an encrypted Zip file.
And I guess if you just wanna like
quickly go into your talk,
spend just like a minute or two
and then we'll start
asking you some questions.
- Yeah, sure.
So, short summary is
I used to work as a reverse engineer
back in the late 90's.
I broke the Zip encryption
that was used by Info-ZIP,
which was the open source version.
And so everybody except (indistinct)
based their encryption on that,
particularly WinZip,
that had like 95% of the market at the time.
And yeah, so then 20 years later,
somebody locked up their
Bitcoin in a ZIP file,
that they made on their Linux box
and forgot the password.
So they came to me and said,
"Hey, I found your paper.
What's the current state of the art,-
- (laughing)
- Can you help me with this?"
- And this is your first time
talking at DEF CON, right?
- It is, yes.
- Yep. So we've-
- (coughing)
- Given him fair warning,
but there is a tradition
for first-time speakers
at DEF CON: they get to take
a shot with us on stage.
This Q&A session is the
closest thing to a stage.
So thank you, Mike,
for providing DEF CON with
some wonderful content
and cheers to you man.
- Thank you for having me.
- Alright.
Okay. So I was actually kind of surprised,
so I have never thought
about ZIP encryption
as being something that would
be difficult to get around.
So you did go through
a couple of different types of encryption.
I was also surprised that, like, well,
I wasn't surprised that early Word
was as difficult as it was,
but later on, the 40-bit encryption
that was just really
difficult to brute force,
that one kind of surprised me.
Have you worked
with any other types of encryption
that have been surprisingly difficult
to, like, get into for being such a legacy,
weird proprietary protocol?
- Let's see.
- There were a couple where
they clearly knew enough
to be dangerous,
then completely screwed it up,
like early WordPerfect.
The founder of the company
had broken that one himself,
and then when they released
their new version saying,
"Oh, now we're using strong crypto,
nobody will be able to break this,"
we went in and found that
they took the password
and then ran it through
DES in the wrong way
and got out some vector,
and then just XORed their file with it.
It was something ridiculous like that.
So they had DES,
but they didn't use it right.
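
The failure mode described here can be sketched in a few lines of Python. This is only an illustration under a simplifying assumption: that the file is XORed with one fixed, repeating vector. The vector value, the sample text, and the function names are made up, and this is not the actual WordPerfect format.

```python
from itertools import cycle

def xor_with_vector(data: bytes, vector: bytes) -> bytes:
    """XOR every byte of data with a repeating fixed vector."""
    return bytes(d ^ v for d, v in zip(data, cycle(vector)))

def recover_vector(ciphertext: bytes, known_plaintext: bytes, vector_len: int) -> bytes:
    """Recover the vector from known plaintext at the start of the file."""
    assert len(known_plaintext) >= vector_len
    return bytes(c ^ p for c, p in zip(ciphertext[:vector_len], known_plaintext))

# Hypothetical 8-byte vector, standing in for whatever the cipher produced.
vector = bytes.fromhex("5ac3910f7e2264b8")
document = b"Dear Sir or Madam, the launch code is 0000."
ciphertext = xor_with_vector(document, vector)

# An attacker who can guess the first 8 bytes of the document gets the
# vector back, and with it the whole file. The strength of the cipher
# that produced the vector never comes into play.
guess = b"Dear Sir"
recovered = recover_vector(ciphertext, guess, len(vector))
plaintext = xor_with_vector(ciphertext, recovered)
```

The point of the sketch is that the password-to-vector step can be as strong as you like; once the vector is reused as a plain XOR pad, known plaintext undoes it.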
- It was so close.
- Just not quite.
- Yeah. There were ones like
Microsoft Access 97,
I think was one, where they
had RC4 encryption,
but it was a fixed key.
And so they would RC4-encrypt
the file with this fixed key.
And then you'd go to
this offset in the file
and look up the password.
And it was just sitting
there in plaintext.
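
A rough Python sketch of why a fixed key gives no protection. RC4 itself is implemented as commonly specified; the key value, the header layout, and the password offset are hypothetical stand-ins, not the real Access 97 format.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Plain RC4: key scheduling, then XOR the data with the keystream."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)

# Hypothetical values: the same key ships inside every copy of the product,
# and the user's password sits at a known offset in the header.
FIXED_KEY = b"same-key-in-every-copy"
PASSWORD_OFFSET = 16

header = bytes(PASSWORD_OFFSET) + b"hunter2\x00"
encrypted_header = rc4(FIXED_KEY, header)

# The attacker has the product, so the attacker has the key too.
decrypted = rc4(FIXED_KEY, encrypted_header)
password = decrypted[PASSWORD_OFFSET:].split(b"\x00")[0]
```

RC4 encryption and decryption are the same operation, so anyone holding the fixed key can run it once and read the password straight out of the header.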
- (indistinct)
- Oops.
- Some of the details might be off,
it's been 20 years.
Yeah (indistinct).
- So go ahead.
- I want to ask a really,
while we wait for people
to come up with some really
good technical questions
to throw at you,
I'm gonna do one.
Alright. So let's say that
I don't know everything
there is to know about
encryption out there.
Let's say I ask you
to do a similar thing
to what you did in your talk.
And I know that my
password starts with a word
and has some unknown thing after that.
Are there things that I can provide you,
that I might know about the password
that will help you get through this
or this, the encryption-
- Yes (indistinct)
- Dictionary attack, right?
- If the product you're
using has strong crypto,
and the guys that built it
knew what they were doing,
then pretty much a dictionary attack
is the only option you've got left.
And so there are
specialized attack software
that you can get,
One of them is called Hashcat.
It's built for running on GPU farms.
That was what we were
originally looking at
as maybe writing a
Hashcat module for this.
But it's really designed
for processing a key space.
And so you can give it a dictionary.
You can then say, take this,
and then do all alphanumeric strings
up to length six after it,
or take this and try all
different capitalizations,
replace vowels with numbers,
say, an I goes to a one
and an O goes to a zero and so on,
E to a three.
Whenever you do that sort of thing,
there are these rule sets
where you can say, okay,
Hashcat, this is what
you're going to start with.
And these are the rule sets
that I want you to use when processing it,
because this is, to the best of my memory,
what the password looked like.
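
The idea of "a dictionary plus rule sets" can be sketched in plain Python. This is not Hashcat's rule syntax or engine, just a small illustration of the same idea; the substitution table, the function names, and the default suffix alphabet are assumptions chosen for the example.

```python
from itertools import product
import string

# Illustrative substitutions, as mentioned above: I -> 1, O -> 0, E -> 3.
LEET = {"i": "1", "o": "0", "e": "3"}

def case_variants(word):
    """Lowercase, Capitalized, and UPPERCASE forms of a word."""
    yield word.lower()
    yield word.capitalize()
    yield word.upper()

def leet_variants(word):
    """Every combination of substituting or keeping each eligible letter."""
    choices = [
        (ch, LEET[ch.lower()]) if ch.lower() in LEET else (ch,)
        for ch in word
    ]
    for combo in product(*choices):
        yield "".join(combo)

def suffixes(alphabet, max_len):
    """All strings over alphabet of length 0..max_len."""
    for n in range(max_len + 1):
        for tail in product(alphabet, repeat=n):
            yield "".join(tail)

def candidates(words, alphabet=string.digits, max_suffix=2):
    """Expand each dictionary word through the rules, skipping duplicates."""
    seen = set()
    for word in words:
        for cased in case_variants(word):
            for leet in leet_variants(cased):
                for tail in suffixes(alphabet, max_suffix):
                    cand = leet + tail
                    if cand not in seen:
                        seen.add(cand)
                        yield cand

guesses = list(candidates(["bitcoin"]))
```

A single remembered word expands into a few thousand guesses here. With a full alphanumeric suffix up to length six, the same structure grows into billions, which is why this kind of work is normally done on GPUs.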
On the other hand,
if you're doing something
like correct horse
battery staple from XKCD,
you've got too much entropy.
And that's really the way
to protect your files.
If you're doing something,
just make it longer, right?
'Cause if you go from 26 characters,
which is all lowercase letters, to 97,
you've roughly tripled;
that's adding about two bits
per character to the entropy, right?
So if you've got a length-eight
password, 26 to the eighth
is all possible lowercase passwords there.
But if you go up to 97,
all printable characters,
that's only adding two bits per password.
I'm sorry, two bits per character.
So on a length-eight password
that's, let's see, 26
is about five bits, 'cause that's 32.
And two times eight is 16.
So it's like adding three characters
to the length of your password.
Adding printable characters to a password
of the same length
is just adding a few more.
But if you go and add a whole bunch more
characters to your password,
make it a long one.
That'll make it really secure.
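
The back-of-the-envelope arithmetic above can be checked directly. In this sketch the alphabet sizes 26 and 97 are the figures quoted in the answer; the 2048-word list used for the passphrase line is an assumption for illustration.

```python
import math

def bits_per_char(alphabet_size: int) -> float:
    """Entropy per character when each character is chosen uniformly."""
    return math.log2(alphabet_size)

def password_bits(alphabet_size: int, length: int) -> float:
    """Total entropy of a uniformly random password."""
    return length * math.log2(alphabet_size)

lower = bits_per_char(26)            # about 4.70 bits
printable = bits_per_char(97)        # about 6.60 bits
extra_per_char = printable - lower   # about 1.90 bits, "roughly two"

# For a length-8 password, the bigger alphabet buys about 15 extra bits ...
extra_for_8 = 8 * extra_per_char
# ... which is what about three more lowercase letters would have bought.
equivalent_extra_lowercase = extra_for_8 / lower

# A passphrase of four words drawn from a 2048-word list (assumed size).
four_words = password_bits(2048, 4)
# Twenty random lowercase letters.
twenty_lowercase = password_bits(26, 20)
```

Length multiplies the per-character entropy, while a larger alphabet only adds a fixed amount per character, so making the password longer is the cheaper way to gain bits.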
- And so-
- A passphrase instead
of a short, random string.
- Yeah.
Even if you're using English words,
right, if you make it a
passphrase rather than a password,
that'll make it really not so
vulnerable to a dictionary attack.
There may be other attacks
if the crypto's bad,
but if the crypto is good,
then it'll protect you.
- So just, this is entirely
for my own curiosity.
So after you broke through the Zip file,
you got the password that you could use
to decrypt the Zip file.
- No, we didn't recover the password.
- Ooh you didn't recover the password.
Okay.
- Yeah.
So the way Zip works
is it derives a 96-bit key
from the password-
- Okay.
- It was the 96-bit key that we recovered.
Now, if we wanted the password,
we could take those 96 bits
and then go launch a Hashcat
attack using a dictionary
and some other stuff
that others have
worked out to get a few
of the initial characters.
- That's where it fits
into the type of password
cracking that many of us are
familiar with (indistinct)
on the river, okay.
- (indistinct)
- Once you've got 96 bits, then
there's something you can do
with the dictionary attack:
you'll see whether the
initialization process
gives you those bits or not.
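
A sketch of that check in Python, following the key initialization of the traditional Zip cipher as described in PKWARE's APPNOTE: three 32-bit keys, 96 bits in total, updated once per password byte. The helper names, the example password, and the tiny wordlist are illustrative, not the tooling used in the project.

```python
import zlib  # only used for the self-check of the CRC table

def _make_crc_table():
    """Standard reflected CRC-32 table (polynomial 0xEDB88320)."""
    table = []
    for n in range(256):
        c = n
        for _ in range(8):
            c = (c >> 1) ^ 0xEDB88320 if c & 1 else c >> 1
        table.append(c)
    return table

CRC_TABLE = _make_crc_table()

def crc32_step(crc: int, byte: int) -> int:
    """One raw CRC-32 table step, with no pre- or post-inversion."""
    return (crc >> 8) ^ CRC_TABLE[(crc ^ byte) & 0xFF]

def update_keys(keys, byte):
    """Advance the three 32-bit keys by one input byte."""
    k0, k1, k2 = keys
    k0 = crc32_step(k0, byte)
    k1 = (k1 + (k0 & 0xFF)) & 0xFFFFFFFF
    k1 = (k1 * 134775813 + 1) & 0xFFFFFFFF
    k2 = crc32_step(k2, k1 >> 24)
    return (k0, k1, k2)

def keys_from_password(password: bytes):
    """Run the initialization process: 96 bits of state from the password."""
    keys = (0x12345678, 0x23456789, 0x34567890)
    for b in password:
        keys = update_keys(keys, b)
    return keys

def find_password(target_keys, wordlist):
    """Dictionary check: which candidate initializes to the recovered keys?"""
    for word in wordlist:
        if keys_from_password(word) == target_keys:
            return word
    return None

def crc_table_matches_zlib(data: bytes = b"zip") -> bool:
    """Self-check: our raw steps reproduce zlib's CRC-32."""
    raw = 0xFFFFFFFF
    for b in data:
        raw = crc32_step(raw, b)
    return (raw ^ 0xFFFFFFFF) == zlib.crc32(data)

recovered_keys = keys_from_password(b"hunter2")  # pretend these came from the attack
found = find_password(recovered_keys, [b"password", b"letmein", b"hunter2"])
```

Because the recovered keys already decrypt the archive, this last step only matters if the password itself is wanted, for example because it was reused elsewhere.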
- That was great.
Yeah. What I was going to
ask is if you got far enough
to see if, like, a dictionary
attack would actually crack
the Zip file in less time
than you spent
brute forcing it.
But if you didn't-
- He suspected it was on the order of
20-something characters-
So probably not
- Or more, making it take quite a
while to brute force.
- Would this technique
that you went through work
for any encrypted Zip file or...
- Yeah.
- Yeah.
- This will work on any Zip file.
So my original attack
back in the late 90's required five bytes,
five files in the archive
with the same password.
This one we were able
to get away with two,
because we also knew the timestamp.
So if you've got the timestamp
and you've got two files,
then this will work on any of them.
- So how does the number of
files affect the crackability
of a Zip file?
- Well, suppose you don't
have the timestamp.
- Okay.
- Okay
- Info-ZIP was meant
to run on Unix machines
as well as Windows machines.
So in PKZIP, they just
allocated some memory
and used whatever bytes were there.
- Yep.
- Those were the random bytes.
In Info-ZIP, on many Unix machines,
it would initialize the bytes to zero.
And so there would be no randomness there.
So they used the process
ID and the timestamp
to get a little bit of entropy,
and then fed the XOR of those
two into C's rand function
and generated a bunch of bytes.
But they thought maybe that's too weak.
Right?
There were some known plain text attacks
and they're like, well, if
they brute force the timestamp
and the process ID,
then they can derive
the rest of these bytes
And so they took the password
and encrypted those bytes once.
And that's what they
used as the random bytes
when they encrypted them
along with the rest of the file.
But when they encrypted it twice,
because of the way the Zip cipher works,
it produced the same stream
byte twice at the beginning.
So it encrypted it once
and then it decrypted it for
the first byte of each file.
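
The cancellation on the first byte can be reproduced with a small Python model of the traditional Zip stream cipher (key update and stream byte as described in PKWARE's APPNOTE). The class name, the password, and the stand-in random bytes are made up for the example, and the two-pass header encryption is a simplified model of the behavior described above, not a copy of Info-ZIP's source.

```python
import random

def _make_crc_table():
    table = []
    for n in range(256):
        c = n
        for _ in range(8):
            c = (c >> 1) ^ 0xEDB88320 if c & 1 else c >> 1
        table.append(c)
    return table

_CRC = _make_crc_table()

class ZipCipher:
    """Traditional Zip stream cipher state, initialized from a password."""

    def __init__(self, password: bytes):
        self.k0, self.k1, self.k2 = 0x12345678, 0x23456789, 0x34567890
        for b in password:
            self._update(b)

    def _update(self, byte: int) -> None:
        self.k0 = (self.k0 >> 8) ^ _CRC[(self.k0 ^ byte) & 0xFF]
        self.k1 = ((self.k1 + (self.k0 & 0xFF)) * 134775813 + 1) & 0xFFFFFFFF
        self.k2 = (self.k2 >> 8) ^ _CRC[(self.k2 ^ (self.k1 >> 24)) & 0xFF]

    def _stream_byte(self) -> int:
        t = (self.k2 | 2) & 0xFFFF
        return ((t * (t ^ 1)) >> 8) & 0xFF

    def encrypt(self, data: bytes) -> bytes:
        out = bytearray()
        for p in data:
            out.append(p ^ self._stream_byte())
            self._update(p)  # the state advances with the byte fed in
        return bytes(out)

password = b"correct horse"
rng = random.Random(1234)                              # stand-in for rand()
raw = bytes(rng.randrange(256) for _ in range(10))     # "random" header bytes

once = ZipCipher(password).encrypt(raw)     # first pass, fresh state
twice = ZipCipher(password).encrypt(once)   # second pass, state reset again

# Both passes start from the same state, so the first stream byte is the
# same in each and cancels out: the first raw byte is exposed.
first_byte_leaks = twice[0] == raw[0]
```

Only the first byte cancels. After it, the first pass updates its state with the raw byte and the second pass with the once-encrypted byte, so the two keystreams diverge.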
- So when you say that is it-
- When you (indistinct)
files in the archive,
I have every 10th output
of C's rand function,
and 40 bits were enough
to figure out the 31-bit internal
state of C's rand function.
So once I knew the internal
state of C's rand function,
I could generate those
first 10 bytes of each file.
And then I would do a
bunch of bit guesses.
And because of the way
the cipher was designed,
not all 96 bits were used
when producing each
output byte of the stream.
So I'd guess like 40-something
bits up front,
and then because I had five files there
and I knew what those bytes had to be,
I could filter all of those bits.
I could say, I've got to know
which of these 40-bit guesses
are correct before moving on
to the next stage.
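
The first half of that, recovering the generator state from every 10th output, can be sketched in Python. Everything concrete here is an assumption for illustration: the generator is the sample rand() from the C standard (real C libraries differ, and the one in question had a 31-bit state), the header byte is taken as `(rand() >> 7) & 0xFF`, and the seed space is kept tiny so the search finishes quickly.

```python
# Sample rand() constants from the C standard; an assumption, not the
# implementation used on the machine that made the archive.
A, C, MASK = 1103515245, 12345, 0xFFFFFFFF

def step(state: int) -> int:
    """Advance the linear congruential generator by one call."""
    return (state * A + C) & MASK

def header_byte(state: int) -> int:
    """Byte kept from one rand() call: (rand() >> 7) & 0xFF."""
    rand_value = (state >> 16) & 0x7FFF
    return (rand_value >> 7) & 0xFF

def leaked_first_bytes(seed: int, n_files: int, per_file: int = 10):
    """First header byte of each file: every per_file-th generator output."""
    state, leak = seed & MASK, []
    for _ in range(n_files):
        for i in range(per_file):
            state = step(state)
            if i == 0:
                leak.append(header_byte(state))
    return leak

def seed_matches(seed: int, leak, per_file: int = 10) -> bool:
    """Replay the generator and reject as soon as a leaked byte disagrees."""
    state = seed & MASK
    for want in leak:
        state = step(state)
        if header_byte(state) != want:
            return False
        for _ in range(per_file - 1):
            state = step(state)
    return True

def recover_seeds(leak, seed_space):
    """Brute-force the seed space against the leaked bytes."""
    return [s for s in seed_space if seed_matches(s, leak)]

secret_seed = 54_321                      # e.g. timestamp XOR process ID
leak = leaked_first_bytes(secret_seed, n_files=5)   # 5 bytes = 40 bits
hits = recover_seeds(leak, range(1 << 16))
```

Five leaked bytes give 40 bits of constraint, more than the size of the state, so wrong seeds are expected to be rejected almost immediately. Once the seed is known, the remaining header bytes of every file can be regenerated.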
- Got it.
- And so by having five
files, I could both derive
the internal state of C's rand function
and filter my guesses
and finish one stage before
moving on to the next one.
And so it was a parallel
divide-and-conquer attack.
In this case, I only had two files.
So even though I was making a
40-something-bit guess, I only had
two bytes to filter it with.
So, like, that meant two to
the 24th wrong key guesses-
- (laughing)
- Went to the next stage
and I had to guess more.
And so it just kept
getting bigger, bigger,
and bigger up to two to the 60-something
before I could
start paring it down
at the other end.
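
The filtering arithmetic is simple enough to write down. In this sketch the guess size is rounded to 40 bits (the answer says "40-something"), and the function name is made up.

```python
def expected_survivors(guess_bits: int, known_bytes: int) -> int:
    """Rough count of guesses that pass the filter by chance.

    Each known byte is 8 bits of constraint, so it lets through about
    1 in 256 wrong guesses.
    """
    return 2 ** max(guess_bits - 8 * known_bytes, 0)

five_files = expected_survivors(40, 5)   # 40 bits guessed, 40 bits known
two_files = expected_survivors(40, 2)    # 40 bits guessed, 16 bits known
```

With five files the filter is as wide as the guess, so essentially only the right candidate moves on. With two files about 2**24 candidates survive each stage, and every one of them has to be extended in the next stage, which is how the work grows toward 2**60-something.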
- So just for clarification,
it's resetting the stream cipher
every single time that
it encrypts a separate file,
and that's why you're able to do this?
- Yes, yeah
- Okay
- So it starts over again
with the same password,
resets it to the original state,
and then starts from there.
'Cause you want to be able
to extract a single file
from the Zip file
without having to extract all of them.
- That makes sense.
- On to one of the questions
we got. With the new attack,
"Is there an acceleration
in having more files
in the archive?"
- Absolutely.
In the original attack,
the more files you had, the faster it went.
And so this is just a refinement
of the original attack.
But certainly having more
files means more bits to filter
with and getting rid of
false positives earlier
- And sort of closely
associated with that.
Would you know if this kind of attack
works with the other encrypted archives,
like 7 Zip and RAR?
- Most archive software now
uses AES. Like, RAR5 switched
to AES-256.
So this isn't gonna
work against anything
except Zip files.
- Going for best standards
I like to see that.
(laughing).
- Even WinZip switched to AES a while ago.
- So fair enough.
We had another question.
"Do you know if your client was the legit owner
of the Bitcoin?"
- I can't be certain,
but we looked him up online.
We knew his real name
and we looked him up online
and found that he had
reason to be owning Bitcoin.
- Didn't seem too shady.
It wasn't someone reaching
out across the dark web from-
- It was part of his employment,
that he would be dealing with Bitcoins.
- Fair enough.
- Now that makes sense.
So now this is really interesting.
Do you expect with putting this out
here and providing this talk,
do you expect to get
more of these requests
to crack more things?
If you do get more of these requests,
do you have an answer prebuilt
for how you might respond?
- As far as breaking
into Bitcoin wallets?
Yeah.
When I first wrote this up on my blog,
we got a whole bunch of requests
and for most of them, I had to say, "Nope,
sorry, the best you can
do is a dictionary attack."
Many of them said, "I bought Bitcoin
with a credit card ages ago,
but now I can't find my wallet.
Can you help me?
I've got my credit card records."
So no, we need a little more than that.
The one that was most interesting
was the guy who claimed that
his hard drive had crashed
that had Bitcoin on it.
And so we were working with
him to get some data recovery,
but after a while, it became clear
that he was perhaps
schizophrenic or delusional,
that he believed that
someone was cheating him out
of his Bitcoin and it was stolen.
Anyway, it was...
- Interesting-
- (indistinct)
- There are about four situations
where we could potentially
recover software.
One of them is if you printed out
or wrote down the seed phrase
for generating the 128 bit key, right?
When you generate it,
the wallet software always says,
"Keep this in a secure place," right?
And it's this 30-odd-word phrase
that'll generate the 128-bit key.
So if you've got that,
you can recover the key,
you can recover all your Bitcoin.
The next case is
if you have had damage
to your hard drive, right?
If the hard drive crashes,
then the data in the
sector is probably okay.
And even if the data in the sector is bad,
we only need eight bytes to be okay.
That has the encrypted key in it, right?
So if we can recover that data,
then we can probably recover your wallet.
If you have the wallet software,
you don't have the original phrase,
but you used a weak password,
then we can try and do the
dictionary attack approach.
- Right.
- (indistinct)
And then, the least probable:
there have been wallets
with security flaws
that make them susceptible
to breaking more easily.
And if you happened to use one of those
back when they were being used,
most of them have been fixed since then.
But if you happened to
use one that had a flaw,
then we could try to exploit that flaw.
- (indistinct)
- This was an attack on a Zip file,
but you're talking directly
about Bitcoin wallets.
Do they also-
- Yes, Bitcoin wallets.
- Do they also use some Zip-like structure?
Have you attacked the
Bitcoin wallets themselves-
- A Bitcoin wallet takes the key,
the private key information
that you sign your transactions with
and the password and
generates a symmetric key
from the password and some salts
and then encrypts the private key.
So that private key is
really what gets you access
to the Bitcoin.
What we can do is try to
either recover the private key
by means of that really long phrase,
regenerate that same private key,
or we can attack the password
if you've got the wallet
so that we can decrypt the
private key that you had stored,
or we can attack some
flaw in the cipher
where, for instance,
when they were coming up with
the symmetric key, they didn't
use the entropy properly.
And so there's a much smaller key space
that we would have to brute force.
There are very few of
those that were out there,
but there were some.
So there's a possibility we could do that.
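
A toy Python model of the "password plus salt gives a symmetric key, which encrypts the private key" structure, and of the dictionary attack against it. Nothing here is a real wallet format: the iteration count, the XOR stand-in for a real cipher, the check value, and all names are assumptions made so the sketch runs with only the standard library.

```python
import hashlib
import os

def derive_key(password: bytes, salt: bytes, iterations: int = 10_000) -> bytes:
    """Symmetric key from password and salt (PBKDF2-HMAC-SHA256)."""
    return hashlib.pbkdf2_hmac("sha256", password, salt, iterations, dklen=32)

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Toy stand-in for a real cipher such as AES; do not use for real data."""
    return bytes(x ^ y for x, y in zip(a, b))

def lock_wallet(private_key: bytes, password: bytes) -> dict:
    """Encrypt a 32-byte private key under a password-derived key."""
    salt = os.urandom(16)
    key = derive_key(password, salt)
    return {
        "salt": salt,
        "blob": xor_bytes(private_key, key),
        # Hypothetical check value. A real attacker would instead test
        # whether the decrypted key matches the wallet's public data.
        "check": hashlib.sha256(private_key).digest(),
    }

def dictionary_attack(wallet: dict, wordlist):
    """Try each candidate password; return (password, private_key) or None."""
    for word in wordlist:
        candidate = xor_bytes(wallet["blob"], derive_key(word, wallet["salt"]))
        if hashlib.sha256(candidate).digest() == wallet["check"]:
            return word, candidate
    return None

private_key = os.urandom(32)
wallet = lock_wallet(private_key, b"letmein")
result = dictionary_attack(wallet, [b"123456", b"password", b"letmein"])
```

The attacker never touches the private key's own entropy. The cost of the attack is the number of candidate passwords times the cost of one key derivation, which is why a weak password, or a flaw that shrinks the key space, is what makes recovery feasible.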
- So I like asking this question
of people who know this
technology really well.
Feel free to tell me, no,
you're not gonna answer this question,
but do you yourself hold any
value in any cryptocurrencies?
Cause you seem to understand how it works.
- I don't because I have,
I mean, there's no inherent
when you pay taxes,
you pay taxes in dollars
because the government says
you have to pay taxes in dollars.
So there is this built in
necessity to own dollars,
at some point.
There is no built in
necessity to own Bitcoin
or any other cryptocurrency, right?
And for Bitcoin and Ethereum,
I think that proof of
work has shown itself
to be susceptible to
attacks like Sybil attacks,
51% attacks. Like, Bitcoin Cash
and Ethereum Classic have
both suffered 51% attacks.
They were rebuffed eventually,
but if Google wanted to deploy
their whole infrastructure,
they could completely own Bitcoin.
- (laughing)
There are existing companies
that could do that,
not to mention Nation States, right?
If the U.S. wanted to take it down,
they've got this thing
in Twilla here in Utah
that they could deploy
against taking down Bitcoin.
So my personal take, and
we designed a system
to do this, is that you need
to use a consensus algorithm
with true finality,
that's proof of stake and bandwidth.
And then after a certain point,
when you have enough witnesses,
you say this block is finalized
and it can't ever change.
Ethereum is trying to
move in that direction
with their proof of stake algorithms.
- (indistinct)
- I've heard of proof of stake,
but the finality piece is new to me.
- You've definitely given
me a few pieces already
that I'm going to need to go Google.
- (laughing)
- (indistinct)
- So we got another question,
it's sort of a meta question.
One of the people that watched
your talk had a little bit of
struggle following your math,
they understood all the
aspects individually
that you talked about.
But zooming back out,
they seemed to, like, lose
pieces in their head.
And they want to know,
like, how do you juggle this?
And, like, are you aware that
some people that
follow your talk
might have difficulty
zooming in and out like that?
- Yeah.
So.
I had some options when doing this talk
one was to go really deep and really hard
on the technical stuff.
And the other one was to
give enough background
and the basic idea of how
this attack played out
and the challenges we faced.
And so I chose to be less detailed
for the sake of the story, rather than-
- Yeah
- Go deep into it.
- (indistinct)
- If anyone has any technical questions,
take them offline.
I'll be happy to talk through them
with you, pointing at lines of
code and that sort of thing.
- That's great.
That's awesome.
Is there-
- As far as keeping it in my head
I would have to wake up
and then come down and reload everything.
I had stuff on whiteboards
all over my office, pictures.
It was a process.
I would even have to remind myself
about what was going on
because I couldn't keep
it all going at once.
And it was a month long process
of trying to think through
over and over again,
how things are going wrong
and what I might be able to do to fix it.
So if you don't get it from one short 45
minute talk, I certainly don't blame you.
- (laughing)
- Makes sense.
- Did you discuss this at the end?
I'm sorry, I missed this point.
Did you actually get any
compensation for this work?
- We did. Yeah.
So when he first talked
to us, we said we'd like so much upfront.
We estimate that the total cost
will be about this much.
We took longer than we said we would.
We expected it to be done in three months.
That was October,
So November, December, January,
it was April before we
actually got the key back.
But because of all of the
extra cryptanalytic work
that I did, it took a 10th
of the time on the hardware.
So the hardware
cost ended up being
only roughly 10, 15 grand,
as opposed to the a hundred grand
that we thought it would
take at the beginning.
So he gave us a big bonus
afterwards which was nice.
- I actually missed this.
Another one of the speaker goons
mentioned that it was on AWS.
And they want to know
if the 10 to 15 was about
what you were expecting
from compute cost.
- No, we were expecting it
to take far more, right?
The original estimate was
around two to the 64th work,
which is comparable to finding
a collision in SHA-1.
Sorry, SHA-1 was 160. MD5.
- (indistinct)
- And there was some recent work
where, to find a collision,
they had to deploy an enormous
amount of work to do it.
I guess MD5 they were
able to do because of flaws
in the hash function.
SHA-1 took roughly 100K
of GPU time to break.
And so we were estimating
it would be comparable to do this.
- And is this what your
company does, like, data recovery?
Or is it specific to-
- Not originally.
Originally we were working on
a distributed operating system.
We could get clients interested
if we could get in the door,
but it was right at the time
when cryptocurrency was taking off
and we didn't have to talk to anybody-
- (laughing)
- To get in.
Then we started doing some consulting work
there, built up a team of
about 20 Scala developers
that were
the cream of the crop,
top of their field.
And then built the RChain
cryptocurrency platform.
RChain started having
some financial troubles.
So we allowed them to hire the Devs early.
We had a contract
that we'd hold onto them for a while,
and then they could hire them after
they'd worked for us for a year,
but we let them hire them early.
So they've taken over the Dev team.
And then we started working
on some other things
and this particular consulting
job came up at a nice time
and it was a whole lot of fun,
so it's okay.
But right now we're looking for
any interest in consulting work
that people have.
- So that was exactly what
I wanted to segue into.
Now that you've done this talk,
do you have another research item
on your to-do list that
you're trying to aim at?
- Sure.
At the moment I'm doing
some consulting work
for the Ethereum foundation.
I've got some consensus algorithm research
that I'm working on.
We're working...
We got access to GPT-3,
so we're building-
- How cool!
- An adventure game,
kinda like AI Dungeon
but with more structure,
using GPT-3-
- That's awesome.
- We've got various ideas
for voice assistants
that we'll be able to carry out,
call somebody at a restaurant
rather than figure out
every different restaurant's
online ordering system.
You just have your assistant call them up
and have a conversation.
GPT-3 seems to be able
to have conversations.
So maybe we can use that.
- This is probably like
a really small piece
of what you're doing,
but I used to be, like,
incredibly into MUDs.
So an adventure game
that's generated by GPT-3
sounds interesting.
- Yeah. So we're working on
the room generation for quests.
We'll have things like you
have to convince this character
to give it to you.
He's got desires and needs.
So, you know you'll
have to be role playing
while you're doing this game,
interacting with these characters.
- Yeah.
Well, there is a person
that goes by the handle
EvilMog who is running a DEF CON
CTF MUD right now, just a shout-out to him.
- We are really close to being at time.
There are-
- (laughing)
- Lot of questions over
here about specific pieces,
specific technologies.
I think I'd like for people
to bring those to you
on a less moderated basis.
So we'll let those go for now.
Before I let you go,
I want to know what is the
thing that you would like
us to take away from this?
If there is a final idea
that we should walk away from your talk.
- That attacks on cryptography
only get better, right?
At the time MD5 was proposed,
128 bits, so a 64-bit
attack, was inconceivable.
And yet within five to 10 years,
they were able to attack that one.
And then SHA. There are attacks
on AES with the biclique attack.
They've now broken,
I think, seven or eight of the 10 rounds.
If there is something that you need
to keep secure, choose the best software
and have a plan for upgrading your crypto
in any products that you put crypto into,
because the attacks are gonna get better.
You'll need at some point to transition
from the broken system to a new one.
And so that will come
up during the lifetime
of your product.
So be thinking about it.
- That's definitely good advice
(laughing)
Fallible you got any more questions
you want to sneak in under the hood?
- No, I think I'm good.
I really appreciate the work
that you've done here.
And thank you for coming to present
and giving your time-
- Thank you so much for your time.
- To do this Q&A session.
There are some more people
that have some more questions coming in.
If you're willing to do so,
if you would put your contact
information in the Track 1 channel
on Discord here, we
will get that out there.
Folks can look you
up if you're willing
to be available for that.
I also recommend you put all
of your company information
if you're willing to do so,
because that's a good way
for people to find you
for those contracts
you were talking about.
- Great, thank you very much.
- Alright.
Take care.
