The following content is
provided under a Creative
Commons license.
Your support will help
MIT OpenCourseWare
continue to offer high quality
educational resources for free.
To make a donation or to
view additional materials
from hundreds of MIT courses,
visit MIT OpenCourseWare
at ocw.mit.edu.
PROFESSOR: Sorry.
I have a bit of a sore
throat, hoarse voice.
I was talking a
lot this weekend.
OK.
Also, today we're going to
do transaction malleability,
segregated witness,
and I'm endorsing
an ICO for the first
time ever publicly,
and it's Anne's intermittent
cookie offering.
So if you guys want cookies.
It's an airdrop and
you just get them.
So it's my first
ICO I'm endorsing.
OK.
So malleability.
So malleability is
the ability to deform
under pressure.
And so bitcoin is modeled
off of gold, which
is the most malleable metal.
You can make gold
leaf and stuff.
So clearly we need to design
bitcoin to be malleable.
No, I'm joking.
OK.
Actually in the context
of cryptography,
it's not a super hard
definition, but it
started with ciphertext, where
you can modify a ciphertext
and that modification will
carry over into the plaintext
when it's decrypted.
It also applies to sort of
messages and signatures.
In our case, signatures
can be malleable,
which means you can
change the signature
and it's still a
valid signature.
So given a signature
S1 on message M1,
you modify the signature
to S2 or S prime
and it still signs
the same message.
It still validates as true.
So when we were
defining signatures
this wasn't really an
attack we'd considered.
There's still a signature and
you can't forge a signature,
but you can dot
an I or something
and the signature is
slightly different.
You can't create one yourself,
but given a valid signature
you can make a slightly
different valid signature.
And that's how it works in
the bitcoin signing algorithm.
But there's all sorts
of different contexts
where malleability
exists in cryptography.
And then part of
it is things still
have to work for
whatever definition.
So if you malleate a signature
and it no longer validates,
well, that's sort
of a trivial, like--
yeah, sure, you can do
that to any bunch of bytes.
You can just flip some of
them and the whole thing
doesn't work anymore.
That's easy.
The interesting properties
are where it still works.
So this leads to some
weird stuff in bitcoin
where you can
change a transaction
and it's still valid.
And that's generally
not what you want.
If you've got some
kind of contract
or some kind of payment
and you write a check,
you don't want someone to
be able to modify the check
and still be able to cash it.
And I don't really use checks
much, but they draw the line.
Like $100 and then
they draw a line
so someone doesn't write and
$0.99 or something after.
Not that that's like the
greatest attack ever.
Or $100 and then someone
puts like $1,000.
I don't know.
But it's sort of like
that where someone
can change the
thing you sign, can
change the thing you are
agreeing to after the fact,
and it's still valid and it
does something you don't expect.
OK.
So a review of the
transaction format,
which should be probably
in people's minds
if you were looking at the
homework, the problem set.
And one thing to focus
on is that the outputs
don't look like the inputs.
These are fundamentally
different things.
The inputs specify a
transaction ID and an index,
and then the outputs specify
a script and an amount.
There's another 4 byte field
here that doesn't matter.
So basically you're spending
from a transaction and a sort
of row, and you're spending to
just this arbitrary pub key,
but you're not
spending from a pub key
and you're not spending
to a transaction.
You don't identify your
transaction itself here.
Almost every website that
shows blockchains
gets this wrong.
So if you look at
like, I don't know,
blockchain.info is
probably the most popular
and you just look
at a transaction.
They don't have it anymore.
OK.
So you look at a block and
then you look at a transaction
in the block.
We're going.
No.
No.
No, not yet.
There we go OK.
So you look at a transaction.
Yeah, it shows.
This address is sending
to these two addresses.
And blockchain.info is
particularly egregious
in that there may actually
be more than one input and two
outputs in this transaction.
They hide change transactions.
So it looks like, hey, this
address had some money in it
and it sent it to
these two addresses.
And if I click, oh, where
did this money come from?
It came from
18eecz, and it shows
here's the bitcoin
address, and, oh, it's
got multiple transactions this
address has been involved in.
This is not how bitcoin works.
They are running their
own database and sort
of making up this view of the
network, which is incorrect.
Transactions don't
send from an address,
they send from another
transaction's previous output.
And this is very confusing
because in the case,
let's say in this
transaction, there is a--
what is it?
767 something.
So it says, yeah, it's
coming from 18ee whatever,
and if I click on it I get
three different transactions.
There is a specific
transaction that 18ee
was involved in that is being
spent from in this transaction.
I can look it up because I
have an actual full node.
So if I say get raw transaction
and I put it in here
and I can see, OK, it
was spending from c838.
It was spending from this
transaction, not just
an address.
So I mean if you're coming at
this sort of new it's like, OK,
fine, why do you keep
talking about this?
But if you've been working on
these things, a lot of people,
myself included, for
like six months or a year
I looked at these
websites and I'm like,
oh, this is how it works, and
then six months in or something
I'm looking at the code, and
I'm like, wait, huh?
This code is wrong, but no,
this is the bitcoin code
that actually is running.
And so it's a weird thing
to sort of get used to.
Like no, you're not
spending from an address.
You don't show the address at
all when you spend from it,
you spend from a
specific output.
OK.
So that leads to
some weird issues.
Specifically, what gets signed?
So to some extent you're
signing the whole transaction.
You sign.
You want to sign everything.
When you're saying I'm
sending money from here,
I'm sending it to here,
you want to make sure
that your signature covers
the entire transaction so
that people can't add stuff
after or remove stuff.
So you want to sign
the inputs and outputs.
But the inputs
contain signatures
and you can't sign
the signature.
That doesn't make any sense.
The signature is the thing
you're putting on at the end.
So it's sort of weird.
You've got this document
and you have a little line
at the bottom for the signature.
But should the signatures be
maybe a separate page that
refers to the previous page?
It's actually kind of a
weird confusing problem.
So in practice, in bitcoin,
what Satoshi did in 2009,
you take the whole
transaction, but you
remove the signature fields.
You basically zero them out.
Just put a zero
there and then you
sign that sort of
signatureless transaction.
And then you put that
signature in after the fact.
And so that means if you change
any bit of the stuff that gets
signed other than the signature,
the signature will break.
So does that make sense?
You have these
empty lines kind of,
and the idea is you empty them
out, you make them blank lines,
and then you take that whole
message, hash it, including
the empty parts, and then
paste in those signatures
after you've signed.
You don't empty out every
line. The line that you're
specifically putting
the signature in,
you actually put a
different few bytes
in there, which leads
to other problems
that I can maybe
mention if there's time.
So this seems OK.
It's like well, look, I can't
sign the signatures, sure,
but if you change any bit
of the stuff I'm signing,
the signatures now break.
So this seems perfectly safe.
No one can change the
amounts I'm sending.
No one can change where
I'm sending it to.
No one can change
where I'm sending from.
All these things get covered
in my signature, so I'm good.
Problem.
The transaction ID, the way
you refer to transactions
is the hash of the whole thing.
The txid does not zero
out the signature fields.
So the identity of the
transaction itself, the way
to point to and indicate
where you're spending from,
that includes the signatures.
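To make that mismatch concrete, here is a minimal Python sketch. It is a toy model, not Bitcoin's real serialization: the `ToyTx` class, its field names, and the `repr`-based encoding are all assumptions invented for illustration. The only thing it tries to show is that the digest that gets signed blanks the signature field, while the txid hashes it.

```python
import hashlib
from dataclasses import dataclass, replace


def double_sha256(data: bytes) -> str:
    """SHA-256 applied twice, returned as hex for readability."""
    return hashlib.sha256(hashlib.sha256(data).digest()).hexdigest()


@dataclass(frozen=True)
class ToyTx:
    """Hypothetical one-input transaction; not the real wire format."""
    prev_txid: str      # transaction being spent from
    prev_index: int     # which output of that transaction
    signature: bytes    # lives inside the input
    outputs: tuple      # ((script, amount), ...)

    def _encode(self, signature: bytes) -> bytes:
        fields = (self.prev_txid, self.prev_index, signature, self.outputs)
        return repr(fields).encode()

    def signing_digest(self) -> str:
        # The signature field is blanked before hashing,
        # so the signature covers everything except itself.
        return double_sha256(self._encode(b""))

    def txid(self) -> str:
        # The identifier hashes the whole thing, signature included.
        return double_sha256(self._encode(self.signature))


original = ToyTx("c838", 0, b"\x2a\x7f", (("pubkey-script", 5),))
tweaked = replace(original, signature=b"\x00\x2a\x7f")

print(original.signing_digest() == tweaked.signing_digest())  # compares the signed data
print(original.txid() == tweaked.txid())                      # compares the identifiers
```

In this toy the first comparison holds and the second does not, which is the gap the rest of the lecture is about.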
So that also seems
like, well, that's OK.
When I point to
something I'm indicating
the whole transaction, the
whole signed transaction.
But the problem is
that can change.
The signature itself
may be malleable,
and in bitcoin it is.
There's third party
malleability problems.
So the simplest one
was leading zeros
where all these
things are numbers.
You could say, OK,
someone's got a signature.
It's this big, long
string of bytes.
I'm just going to put
zeros in the front.
I'm going to put a zero
byte in the front of it,
and that doesn't
change the meaning.
If someone says, I'm
sending you $1,000
and I put a 0 in front
of the 1 in 1,000,
well, it's still 1,000.
However, for the purpose
of a hash function,
if you have a zero
byte in front,
that changes the whole hash.
And so they got rid
of this pretty early.
They sort of made
it so that you had
to have the exact right number.
You can't have
any leading zeros.
But the first one was just,
oh, I put a 0 in the front.
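A small sketch of the leading-zero trick in Python. The byte strings are placeholders rather than a real transaction, and the `has_needless_leading_zero` check is only an assumption about what a strict-encoding rule could look like (a zero byte is tolerated only when it is needed to keep the next byte's high bit from reading as a sign bit).

```python
import hashlib


def double_sha256(data: bytes) -> str:
    """SHA-256 applied twice, returned as hex."""
    return hashlib.sha256(hashlib.sha256(data).digest()).hexdigest()


def has_needless_leading_zero(number_bytes: bytes) -> bool:
    """True if a zero byte was prepended without being needed for the sign bit."""
    return len(number_bytes) > 1 and number_bytes[0] == 0 and number_bytes[1] < 0x80


tx_body = b"toy-inputs-and-outputs"     # placeholder, not a real encoding
sig_number = bytes.fromhex("2a7f")      # pretend signature value
padded = b"\x00" + sig_number           # same value, one extra zero byte

# As numbers the two are identical; as hashed bytes they are not.
same_value = int.from_bytes(sig_number, "big") == int.from_bytes(padded, "big")
same_hash = double_sha256(tx_body + sig_number) == double_sha256(tx_body + padded)

print(same_value, same_hash, has_needless_leading_zero(padded))
```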
The harder one is
called low s, which is part
of the ECDSA signature scheme.
I showed before that
it's this curve that's
symmetric about the x-axis.
Whether the thing
you're indicating
is on top of the curve or
it's sort of reflection
on the bottom, it's
valid either way.
So for any given
signature, there's
another signature
that will be valid.
You just sort of flip it,
make it negative or positive.
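The flip can be checked with a small pure-Python ECDSA over secp256k1. This is a teaching sketch under assumptions: the private key, nonce, and message are made-up values, the code is not constant time, and it should not be used to sign anything real. It only tries to show that replacing s with N - s gives a second signature that still verifies.

```python
import hashlib

# secp256k1 parameters (the curve Bitcoin uses).
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    if a[0] == b[0] and (a[1] + b[1]) % P == 0:
        return None
    if a == b:
        lam = 3 * a[0] * a[0] * pow(2 * a[1] % P, -1, P) % P
    else:
        lam = (b[1] - a[1]) * pow((b[0] - a[0]) % P, -1, P) % P
    x = (lam * lam - a[0] - b[0]) % P
    return (x, (lam * (a[0] - x) - a[1]) % P)


def mul(k, point):
    """Double-and-add scalar multiplication."""
    acc = None
    while k:
        if k & 1:
            acc = add(acc, point)
        point = add(point, point)
        k >>= 1
    return acc


def sign(d, z, k):
    """ECDSA signature (r, s) for private key d, message hash z, nonce k."""
    r = mul(k, G)[0] % N
    s = pow(k, -1, N) * (z + r * d) % N
    return r, s


def verify(pub, z, sig):
    r, s = sig
    if not (0 < r < N and 0 < s < N):
        return False
    w = pow(s, -1, N)
    point = add(mul(z * w % N, G), mul(r * w % N, pub))
    return point is not None and point[0] % N == r


d = 0xC0FFEE                                   # made-up private key
pub = mul(d, G)
z = int.from_bytes(hashlib.sha256(b"toy transaction").digest(), "big")
sig = sign(d, z, k=0xDEADBEEF)                 # made-up nonce
flipped = (sig[0], N - sig[1])                 # the third-party tweak

print(verify(pub, z, sig), verify(pub, z, flipped), sig != flipped)
```

Nobody needs the private key to compute `flipped`, which is why anyone listening on the network can do it.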
So now there's a standard,
OK, you need to have low s.
High s signatures
should be rejected.
Both of these are really
tricky because they're
third party malleability.
Anyone can just listen on the
network, see a transaction,
change the signature.
And in doing so they
will change the txid,
which is how all the software
refers to transactions.
So it looks like
a new transaction
to most of the software.
And the transactions
are still valid.
The signatures are still
valid and you're not
sure which one will get in.
There's also first party
malleability, or in some cases
second party if you're
doing transactions
with multiple people signing.
So I'm not going to go back into
the elliptic curve signature
algorithms, but
there is a nonce.
There's randomness
that you inject
into the signing process.
It's not deterministic.
It's not that given a
message and my private key,
I always compute
the same signature.
No, that's not how it works.
There are signature
schemes like that,
but in the case of the elliptic
curve stuff that bitcoin uses,
you have to make up a random
number to make each signature,
and no one knows what
that random number is.
So you could make up
different random numbers.
You can make as many
signatures as you want.
So given a message
and your private key,
you can make arbitrary
number of signatures
that are all
different signatures,
but they're all
valid signatures.
There is a sort of standard
for how to make them
the right way not randomly.
It's basically take the
hash of your private key
and the message being signed.
Put them together,
hash that, and use that
as your random number because
then the idea is well,
it's got secret
information in it
as well as the message
specific information in it.
So no one's going to be
able to guess what it is,
and it's still kind
of message dependent
so it'll change each time.
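A rough Python sketch of that recipe, assuming plain SHA-256 over the private key and the message hash as described here. The function name and the example key are made up. The widely used standard for deterministic nonces, RFC 6979, uses an HMAC-based construction instead, so treat this as the idea rather than the standard.

```python
import hashlib

# Order of the secp256k1 group; nonces must fall in 1..N-1.
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def deterministic_nonce(private_key: int, message: bytes) -> int:
    """Derive k from secret key material plus the message being signed."""
    message_hash = hashlib.sha256(message).digest()
    seed = private_key.to_bytes(32, "big") + message_hash
    digest = hashlib.sha256(seed).digest()
    # Map into 1..N-1 so the nonce is never zero.
    return int.from_bytes(digest, "big") % (N - 1) + 1


private_key = 0x1234                      # made-up key
k1 = deterministic_nonce(private_key, b"pay alice 5")
k2 = deterministic_nonce(private_key, b"pay alice 5")
k3 = deterministic_nonce(private_key, b"pay bob 3")

print(k1 == k2, k1 == k3)   # same message repeats the nonce, a new message changes it
```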
So there is that, but
that's something you can do.
It's a really good
idea because if you
use a predictable k,
if someone can guess k
or if your random number
generator's broken,
they can steal all your money.
They can figure out
your private key.
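The algebra behind that is short enough to sketch in Python. The ECDSA signing equation is s = k^-1 * (z + r*d) mod N, so anyone who learns k can solve for the private key d. All numbers below are made up, and `r` is only a stand-in for the x-coordinate of k*G that a real signature would carry; the recovery step uses it purely as a number.

```python
# Order of the secp256k1 group.
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def s_value(d: int, z: int, k: int, r: int) -> int:
    """The s half of an ECDSA signature: s = k^-1 * (z + r*d) mod N."""
    return pow(k, -1, N) * (z + r * d) % N


def recover_private_key(r: int, s: int, z: int, k: int) -> int:
    """Solve the signing equation for d once the nonce k is known."""
    return (s * k - z) * pow(r, -1, N) % N


d = 0xC0FFEE              # made-up private key
z = 0x1234567890ABCDEF    # made-up message hash
k = 0xDEADBEEF            # the nonce the attacker managed to guess
r = 0x5EED5EED            # stand-in for the public r value in the signature

s = s_value(d, z, k, r)
recovered = recover_private_key(r, s, z, k)

print(recovered == d)
```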
So you don't want to be
dependent on generating
randomness.
A nice way to model it is, OK,
have some initial event where
you're putting in random
numbers and you're storing them
and then from then on you want
everything to be deterministic,
then you don't have to rely
on random number generators.
So it's really a good
idea to use this.
And I use it in my software.
Most things use this
kind of standard.
However, you can't verify that
anyone's actually using it.
It's purely internal.
It's an internal way for you
to make your own signatures,
but no one can actually--
can you prove?
No.
I'm not going to say you
can't prove you're using it,
but not in any
reasonable fashion.
Yeah.
So no one knows if
you're doing it.
So this is an attack
where you can say, OK,
I'm going to make a whole
bunch of different signatures.
They're all going to
be valid, but that
will mean I've got a whole
bunch of different transactions
that are doing the same thing.
So maybe the question is,
what does this really do?
Does this really hurt?
If someone dots an i on your
check, it's the same amount.
It's going to the same place.
Who cares.
Outputs are the same.
Inputs it's pointing
to are the same.
It's just this sort
of annoying thing.
OK, I tweaked it and
I changed the txid.
No big deal.
Well, in some ways,
yeah, it's no big deal,
but a lot of wallets
didn't deal with this well.
So let's say you're running a
wallet, you make a transaction
and you sign and you
broadcast transaction 2d5cac
and it never gets confirmed, and
instead someone out there flips
a bit, changes your
transaction to 9cba3e
and that gets confirmed,
and your wallet just
says, yeah, this transaction
you sent never got confirmed.
There's wallets that did that.
Most of them have
fixed it by now.
But if you're thinking
of transaction IDs
as the identity txid, this is
the name of the transaction.
I create it, I'm watching it
to see when it gets confirmed,
and I'm not looking for
some malleated version.
I'm just watching this
thing that I created, never
gets in a block.
Weird, and it's just
stuck in the wallet.
So there are definitely
wallets, and I think
everyone's fixed it by now.
But a couple years
ago, definitely wallets
that would have these problems.
It's a wallet problem.
Your money got to
the right place.
You just need to sort of either
delete stuff in your wallet
or upgrade the software or
tell it to somehow forget
about this transaction
and actually
look on the blockchain
for everything
similar to your transaction.
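One way a wallet could do that lookup, sketched in Python. The dictionary layout and the helper names are hypothetical; the txids `2d5cac` and `9cba3e` are the ones from the example above, and the rest of the data is invented. The idea is to match on what the transaction spends and pays rather than on its txid.

```python
def same_payment(tx_a: dict, tx_b: dict) -> bool:
    """Two transactions move the same coins to the same places, whatever their txids."""
    return tx_a["inputs"] == tx_b["inputs"] and tx_a["outputs"] == tx_b["outputs"]


def find_confirmed_version(sent_tx: dict, confirmed_txs: list):
    """Return the confirmed transaction that does what sent_tx does, or None."""
    for tx in confirmed_txs:
        if same_payment(sent_tx, tx):
            return tx
    return None


# What the wallet broadcast.
sent = {"txid": "2d5cac", "inputs": [("c838", 0)],
        "outputs": [("alice", 5), ("change", 3)]}

# What actually ended up in a block: a malleated copy with a different txid.
block = [
    {"txid": "77aa01", "inputs": [("f00d", 1)], "outputs": [("carol", 1)]},
    {"txid": "9cba3e", "inputs": [("c838", 0)],
     "outputs": [("alice", 5), ("change", 3)]},
]

naive_hit = any(tx["txid"] == sent["txid"] for tx in block)   # txid-only lookup
robust_hit = find_confirmed_version(sent, block)              # content-based lookup

print(naive_hit, robust_hit["txid"])
```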
But it did do some damage.
So, I don't know, 2014 the Mt.
Gox thing where Mt.
Gox got hacked supposedly
and lost all the money,
they blamed transaction
malleability,
which was kind of interesting.
There may have been an attack
on Mt. Gox that used transaction
malleability.
The attack probably was
this, which was log into Mt.
Gox, withdraw some coins,
modify the txid to this,
and then it gets confirmed,
you get your coins,
and then log into Mt.
Gox and say, hey,
this never happened.
My withdrawal
didn't work and then
their system would automatically
issue a new withdrawal
transaction.
And so you could just start
taking all the money out
and your balance on
the system's like,
well, we keep trying
to send you money
and it keeps never
getting confirmed.
And so we keep making new ones.
I don't know to what extent
that actually happened.
It couldn't have been
the whole thing for Mt.
Gox definitely.
There's still a lot of
uncertainty about that.
But it's indicative.
If you write your own
software and it's not
accounting for these things,
you may be losing money
once people say, hey, this
didn't work, make a new one,
and then you keep doing that
and losing a ton of money.
But that's pretty
sloppy practice.
Another issue.
If you're spending from
an unconfirmed change
output or an
unconfirmed output--
so you make a transaction, you
send to two different outputs,
you've got a txid.
And you're sending five
coins to this person
and three coins
back to yourself.
That three coins back
to yourself output,
you might want to use
it again pretty quickly.
Sometimes this happens.
And so you've got a
change output that's
from transaction 1, 7feec1.
So you're going to now spend
that change output, however
the txid of transaction
one changes.
So you're saying
where you're spending
from is no longer valid.
And this is a big problem
because now you've
signed a transaction that you
think is going to be valid,
but the money you thought
you were spending sort of
moves out from under you.
And so that transaction's
no longer valid.
tx2 is now invalid.
It refers to something
which can never
be confirmed because there's
a different transaction that's
almost the same, but they're
mutually exclusive that
did get confirmed.
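Here is that failure as a toy UTXO set in Python. The data layout, the helper names, and the malleated txid `b04e9d` are invented for illustration; `7feec1` is the txid from the example above.

```python
def can_confirm(tx: dict, utxo_set: set) -> bool:
    """A transaction is only spendable if every outpoint it names exists."""
    return all(outpoint in utxo_set for outpoint in tx["inputs"])


def confirm(txid: str, tx: dict, utxo_set: set) -> None:
    """Consume the inputs and create new outpoints named after this txid."""
    for outpoint in tx["inputs"]:
        utxo_set.remove(outpoint)
    for index in range(len(tx["outputs"])):
        utxo_set.add((txid, index))


utxo_set = {("c838", 0)}
tx1 = {"inputs": [("c838", 0)], "outputs": [("recipient", 5), ("change", 3)]}

# tx2 was signed while tx1 was unconfirmed, pointing at tx1's original txid.
tx2 = {"inputs": [("7feec1", 1)], "outputs": [("shop", 3)]}

# A malleated copy of tx1 gets mined under a different (made-up) txid.
confirm("b04e9d", tx1, utxo_set)

tx2_still_valid = can_confirm(tx2, utxo_set)

# With a single key you can just re-sign against the txid that confirmed.
tx2_resigned = {"inputs": [("b04e9d", 1)], "outputs": [("shop", 3)]}
resigned_valid = can_confirm(tx2_resigned, utxo_set)

print(tx2_still_valid, resigned_valid)
```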
OK.
So this is bad.
It doesn't seem that bad.
And so for years in
bitcoin this was a problem
that wallets dealt
with. Software and people
would be like, oh yeah, you
have to back up your keys,
delete your whole database,
and resync the blockchain
and then it'll find
the right transactions.
Kind of hacky
workarounds like that, and
it didn't happen too much.
It wasn't a great attack.
You can annoy people.
You don't get any money.
So wasn't a huge deal,
but it was annoying.
But the idea is you
can always re-sign.
You've got your private keys.
If the money you are receiving
sort of shifts around
and changes its location,
well it's still yours.
You just need to re-sign.
But what if you can't re-sign?
So the case of multisig where
in most cases when you're
doing transactions if you just
have one key, it's just you,
that's fine.
In the case of
multisig usually you're
all friends to some
extent and you're
all in the same organization or
multiple devices that you own.
But you can have sort
of adversarial multisig
where you're
signing transactions
with people who you're
sort of cooperating with,
but you may not
really trust them,
or they might be potentially
attackers, things like that.
You can definitely sort of
extend your multisig model
into that.
And there could be multisig
pre-signed transactions
where, OK, we've got
this two-of-three output
address, this
output that exists,
and one of the two of three
has pre-signed a transaction
and hands it over to me
and then they disappear.
And I say, oh OK, well I'm
going to now sign my side.
But if malleability occurs and
the transaction ID changes,
that signature is no longer
valid, signing something
that's not there anymore.
So this is very important in
payment channels, Lightning
Network stuff that I'll
get to in a few days.
And so it wasn't so
much that malleability
was like a showstopper bug
and everyone was losing tons
of money, it was that
it was preventing
kind of new, cool
things from happening
because there were a
lot of problems with,
OK, let's make this
construction where we put money
into a multisig account and
then I sign like a refund
transaction that's got a lock
time before I actually fund it
and things like that where
you couldn't reliably do it
because if either party
modified their signature,
they could break the whole thing
and there could be attacks where
it's like, OK, we're
doing something together.
Oh look, it's got stuck.
Well both of our coins
got stuck in this place.
Hmm.
Now it's sort of a hostage
situation and you can say,
well, I think I should get 1.5
and you should get 0.5.
It's like wait,
we both wanted 1.
So there are attacks
that could happen.
And so this malleability
was a problem
for people trying to
do new, cool things.
So how do you fix this?
Any ideas?
Non-malleable signatures?
So the one we did for
the first homework.
Does anyone have an idea about
why the Lamport signatures were
non-malleable, like
from problem set one?
Yes, that's right.
But yeah, they weren't malleable.
There's no randomness for one.
I'm pretty sure if you flip
any of the bits it's not
going to work.
So Lamport signatures are
an example where, yeah, it's
non-malleable.
You can't produce multiple
different signatures
on the same message.
So that's good.
The thing is many
useful signature schemes
are malleable.
So to just say no, you have to
use a non-malleable signature
scheme, it's not a great
answer to the question.
I'm pretty sure there's
some weird malleability
stuff in RSA.
A lot of the systems
have randomness in there
and they're malleable.
So an idea that I
had in like 2014 and I
was sort of going for was just
don't sign your inputs at all,
only sign your outputs.
So you don't actually specify
where you're sending money
from in your signature.
You do have to still
specify in your transaction
because people need
to know, but you
say I'm only going to
sign off on the outputs.
The endorsement of
my inputs is implicit
because the keys match.
So I don't actually
sign off on which
key I'm sending from,
since that's redundant.
You know I'm sending
from these inputs
because the keys match
up and the signature's
valid for this key.
I really like this idea still.
I think it's really fun.
You can do a lot of
cool stuff with it.
It's also dangerous.
It allows signatures
to be replayed,
which is sort of one of
the big points of having
UTXOs because if you
send two outputs--
I have address one.
I send two outputs.
So I've got output one,
output two, and this one
has five coins, this
one has three coins,
and they're both
the same address,
both the same public key, and
then I want to spend them.
And if I use this sort
of signature scheme
where I don't actually sign
which input I'm spending from,
it can be used on either.
So maybe I'm not
aware of this five-coin one yet,
or it just hasn't happened
yet, or I haven't seen it,
and I say, yeah, I'm going
to make a signature sending
three coins over
here and then someone
can malleate the transaction
without touching the signature,
pointing it over
here, and the signature
would apply to either.
And then this is a really
good deal for the miners
because now I'm spending five
coins and only outputting three
and the miners get the
two coins difference.
And so that's pretty dangerous.
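The replay can be sketched in Python with a toy model. Everything here is an assumption for illustration: an HMAC stands in for a real public-key signature, and the `KEYS` table, outpoint names, and helper functions are invented. The point is only that a signature covering nothing but the outputs validates against any input locked to the same key.

```python
import hashlib
import hmac

KEYS = {"address-one": b"toy-secret"}   # made-up key material


def sign_outputs_only(secret: bytes, outputs: list) -> bytes:
    """Toy signature that commits to the outputs and nothing else."""
    return hmac.new(secret, repr(outputs).encode(), hashlib.sha256).digest()


def spend_is_valid(utxos: dict, outpoint: tuple, outputs: list, signature: bytes) -> bool:
    """Valid if the signature matches the key that owns the outpoint being spent."""
    owner, amount = utxos[outpoint]
    expected = sign_outputs_only(KEYS[owner], outputs)
    paid_out = sum(value for _, value in outputs)
    return hmac.compare_digest(signature, expected) and paid_out <= amount


# Two outputs locked to the same address: five coins and three coins.
utxos = {("tx-a", 0): ("address-one", 5), ("tx-b", 0): ("address-one", 3)}

outputs = [("friend", 3)]                                   # "send three coins over here"
signature = sign_outputs_only(KEYS["address-one"], outputs)

intended = spend_is_valid(utxos, ("tx-b", 0), outputs, signature)   # the three-coin input
replayed = spend_is_valid(utxos, ("tx-a", 0), outputs, signature)   # re-pointed at five coins
fee_if_replayed = utxos[("tx-a", 0)][1] - sum(value for _, value in outputs)

print(intended, replayed, fee_if_replayed)
```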
And also, even if they're
the same I just say, hey,
I'm sending three coins
to you and then as
soon as the receiver
sees this output,
oh, it'll also work here.
I'm going to take
another three coins.
So this is mitigated by
not reusing addresses,
but people reuse addresses.
So it is dangerous.
I think in the context of
multisig you can reliably
say like, OK, we're
not reusing addresses
because these addresses are the
combination of multiple people
working together.
But it would allow
really cool things
where you could sort of work
backwards, compute a public key
that you could prove no one
knew the private key to,
but you could
still sign with it.
Like really weird crazy stuff.
Anyway, people were
still talking about it
a week or two ago.
Like, oh, we could do
these cool things with it.
But it's dangerous and so
it's like we're not sure
if it's worth it.
OK.
Any questions about this
transaction malleability
so far?
OK.
So any ideas of
what you actually
do to fix malleability?
Nobody?
OK.
We'll find out in one minute.
Segregated Witness.
I don't think it's a good name.
Separate signatures would be a
much easier to understand name.
So Pieter Wuille, who is
really good at bitcoin
and makes all these
cool things, he's
not the best at naming things.
Makes lots of cool
stuff, but just makes
whatever weird technical name.
So it's a pretty
straightforward idea.
The idea is when you're
signing a transaction you hash
a bunch of data to sign, but you
don't include the signatures
in the data you're
hashing to sign
because that wouldn't
make any sense.
You can't.
Do the same thing when you're
referring to transactions
themselves as txids.
So in the exact same way
that when you're signing,
you hash the data
without the signatures.
When you're pointing
to a transaction
to say I'm spending
from there, also
don't include the
signature data.
Just take the hash of the
data without the signatures.
Yeah.
You just sort of
have this pointer.
You've got a pointer
to the previous output
and you've got the outputs, but
the signatures aren't in there.
So the idea is the signature can
change and the transaction ID
doesn't.
But what about
backwards compatibility?
So this is a great idea.
Why not go for it?
But how do you make it backwards
compatible so that old software
can still work with it?
A soft fork is where I'm
adding new rules to
the system, I'm putting
further restrictions on it.
But this seems like just a change.
It seems like, look,
I'm now defining
something in a different way.
I'm removing the
signatures from the txid.
How do we make this not
appear to be a hard fork?
Hard fork's easy.
You just say, look, we're
changing the entire system.
From now on txids
don't have signatures.
So any ideas how you do this
in a backwards compatible way
or just give up and hard fork?
AUDIENCE: Adding restrictions
that screw with [INAUDIBLE].
PROFESSOR: So you can't
change old transactions,
but having both at the
same time is tricky.
So the idea is it
would have been easier
to start off this way.
If Satoshi had just started this
way, it would have gone great.
He didn't think of it.
It wasn't a super
obvious thing that--
so you can do it as a soft fork.
The way you do it is
you make new outputs
which don't require
any signatures at all
and then you just don't
have any signatures.
This seems kind of silly.
Signatures are pretty important,
otherwise any arbitrary person
could just take all the money.
But you redefine things in a
way that new people know about
and old people don't.
So this is actually what a
segwit output looks like.
The output script is just a
zero and then a pubkey hash,
and then the sig script,
the field for a signature
is just nothing.
You just don't put
a signature there.
And then when you're
running the stack
you end up with a pubkey hash
on the top of the stack, which
is some number and
the interpreter
looks at a number
that's non-zero as true.
Like in C or things like that.
And the bitcoins move.
It's great.
Someone was joking that you
could potentially make this
into a hard fork, because what
if the pubkey hash was zero,
and you found a pubkey that
hashed to zero and then you
signed with it and then
segwit would accept it but old
nodes wouldn't.
Actually, that doesn't
work, but it's sort of--
Anyway, if you're running
the regular bitcoin software,
you see this and no signature,
and you're like yeah,
this doesn't need a signature.
It's just a hash.
I don't know what this is.
Fine, the coins move.
It evaluates to true.
But the new version
of the software
sort of adds a restriction
to this kind of output.
It says, look.
If you see this,
this is a template.
This doesn't actually mean
put a zero on the stack
and put a pubkey
hash on the stack.
It means something else.
Now, it means this is a pubkey
hash and look for a signature.
But look for a signature
in a different place.
Don't actually put
it in the place
you're supposed
to put signatures,
put it in a new place.
And don't tell the old
software about this place.
We add a new field to
the transaction inputs.
It's sort of in the inputs,
but they put it at the end.
It's kind of weird.
Logically, it's in the input.
It's the same, but physically
it's not, which is silly.
I don't like that aspect of it.
But the idea is there's this
new field in the inputs called
the witness field,
and in cryptography,
witness sort of means
signature in this case anyway.
It's a little bit more general.
But the old software
never sees it.
So here's the
old transaction format.
You've got your txid and
index, 36 bytes, sort of a pointer
to what you're
spending, and then
a signature which is 100 bytes,
and then this stays the same.
And the new tx format.
The idea is, yeah,
the signatures field
is still there.
You just leave it empty.
So you're not putting
any signature.
It doesn't look like you
need to put a signature
to the old software.
And then you have
this third thing,
which is witness, which is the
same as signature basically.
Slightly different format.
And technically, they
put them all together
and put it at the end,
which is kind of annoying.
But anyway, logically
this is how they do it.
They make a new version
of the transaction format.
So the old version looks
like empty signatures.
The new version
looks like here's
this useless empty
signature field,
and here's where the
real signatures are.
And you omit this
to the old nodes.
So when people ask for
witness transactions,
when people know
about this new system,
yeah, you give it to them.
So they say hey, yeah, I'm
hip to this new segwit thing.
Give me a segwit transaction.
And you're like, OK, here's
the witnesses at the end.
But when they don't
seem to know about this
and they're running
older software
and they say, hey, just
give me the transaction,
you give it to them
without the witness at all.
It still looks valid to--
either one looks valid.
However, the new people, they
know that if you see this,
it does not mean push a
zero, push a pubkey hash.
There is a new rule that
no, this is a template.
This is segwit.
I need a signature, and I need
it to be in the witness field.
So if a new node gets a
transaction without a witness
that they know needs a witness,
they will declare it invalid.
But the old nodes won't
be able to distinguish.
They'll say, well, it looks like
no signature is needed here.
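The two views can be put side by side in a toy Python model. It is not the real script interpreter: `toy_hash160` is a truncated SHA-256 standing in for the real 20-byte pubkey hash, `toy_signature_ok` is a placeholder for actual signature verification, and scripts are modeled as plain lists of pushed byte strings.

```python
import hashlib


def toy_hash160(pubkey: bytes) -> bytes:
    """Stand-in for the real 20-byte pubkey hash (truncated SHA-256 here)."""
    return hashlib.sha256(pubkey).digest()[:20]


def toy_signature_ok(signature: bytes, pubkey: bytes) -> bool:
    """Placeholder for real signature verification."""
    return signature == b"signed-by:" + pubkey


def old_node_accepts(output_script: list, script_sig: list) -> bool:
    # Old rules: run the pushes, succeed if the top stack item is non-zero.
    stack = list(script_sig) + list(output_script)
    return bool(stack) and any(stack[-1])


def new_node_accepts(output_script: list, script_sig: list, witness) -> bool:
    # New rules: "zero, then a 20-byte hash" is a template, not just two pushes.
    is_template = (len(output_script) == 2 and output_script[0] == b""
                   and len(output_script[1]) == 20)
    if not is_template:
        return old_node_accepts(output_script, script_sig)
    if script_sig or not witness or len(witness) != 2:
        return False
    signature, pubkey = witness
    return toy_hash160(pubkey) == output_script[1] and toy_signature_ok(signature, pubkey)


pubkey = b"toy-public-key"
output_script = [b"", toy_hash160(pubkey)]
good_witness = [b"signed-by:" + pubkey, pubkey]

print(old_node_accepts(output_script, []))                 # old node, no signature anywhere
print(new_node_accepts(output_script, [], None))           # new node, witness missing
print(new_node_accepts(output_script, [], good_witness))   # new node, witness present
```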
OK.
So you're sort of
tricking the old software
into accepting things that
they shouldn't actually
accept in some cases.
There may not be a
valid signature that
goes into the
segwit transaction,
but old software will
still think it's OK.
So this is how you make
it into a soft fork.
It's kind of ugly.
But, yeah.
Do you have a question?
AUDIENCE: Yeah.
Is this still implicitly?
So when the signature
is zero, [INAUDIBLE].
PROFESSOR: No, because it's
based on the output script.
You could make a
different output script
that would have
no signature requirement,
and it would still work
even with this new system.
So it's just based on--
we changed the definition
of an output script.
So we have this sort of template.
You can still do weird--
like you could put
without a zero in front.
You could put just
a pubkey hash,
and that's not
defined in segwit.
That's not defined anywhere.
It would just be, OK,
yeah, it evaluates
to true without a signature.
Anyone can spend it.
And you could do that--
that would have to be a
non-segwit transaction.
The only way to use
a segwit transaction
is to have the special
format for the output script.
Any other questions
about network stuff?
Yeah, and this
solves malleability
in a pretty good way.
For the old software,
the old nodes,
well, they can't
change the signature
because there isn't one.
There's nothing to malleate.
And from the new
node's perspective,
yes, the signature can
change, but that doesn't
affect the transaction ID.
Both old and new
nodes still agree
on the exact same
transaction ID.
The transaction ID does not
include the witness field.
So when you're
calculating a transaction ID,
you include all
this for backwards compatibility.
And if there's an
actual signature there,
that gets into the txid.
But if you're using
empty signature
and only using witness, then
it doesn't get into the txid
at all.
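A toy Python version of that rule, with an invented dictionary layout rather than the real serialization: the txid hashes the inputs (including the old signature field) and the outputs, and skips the witness.

```python
import hashlib


def toy_txid(tx: dict) -> str:
    """Hash inputs (with the old signature field) and outputs; skip the witness."""
    committed = repr((tx["inputs"], tx["outputs"])).encode()
    return hashlib.sha256(hashlib.sha256(committed).digest()).hexdigest()


# Old-style spend: the signature sits in the input, so it is part of the txid.
legacy = {"inputs": [("c838", 0, b"sig-v1")], "outputs": [("alice", 5)], "witness": []}
legacy_malleated = dict(legacy, inputs=[("c838", 0, b"sig-v2")])

# Witness spend: the signature field is empty and the signature lives in the witness.
segwit = {"inputs": [("c838", 0, b"")], "outputs": [("alice", 5)], "witness": [b"sig-v1"]}
segwit_malleated = dict(segwit, witness=[b"sig-v2"])

legacy_id_changed = toy_txid(legacy) != toy_txid(legacy_malleated)
segwit_id_changed = toy_txid(segwit) != toy_txid(segwit_malleated)

print(legacy_id_changed, segwit_id_changed)
```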
So both old and
new software agree,
and that's important, because
if they didn't the Merkle roots
would look wrong.
You take all the txids, put
them into a Merkle tree,
put that root in the header.
And that's really important
that everyone agrees on that.
So they do work
together, so that's cool.
So this is kind of interesting.
You've got two
different versions, old and
new, operating at the
same time on the network.
And they agree on
a lot of stuff,
but they also sort of
disagree on some things.
So they agree on outputs
of the transactions,
and they agree on
which inputs there are.
But they have a
slightly different view
of what these inputs are.
Some of them think,
no signatures here.
Some of them think, yeah,
there's a signature here.
That's weird.
They don't agree on
how things got spent.
What are some other things that
these two different classes
of nodes would not agree on?
Any ideas?
So you understand how they
see different transactions.
What are some other
aspects that may
be sort of interesting
for this consensus system
that we have different views on?
I forget what I put.
I put two things.
Any?
Hint.
Biggest argument
since 2010 in bitcoin.
What do these two different
classes of nodes not agree on?
Yeah.
AUDIENCE: Size?
PROFESSOR: Well, the
transaction size.
Yeah.
So they both see two
different transactions.
One of them sees it with
these signatures, one of them
sees it without.
They don't agree on how
big the transaction is.
They agree on the txid.
They agree on where the money is
going, where it's coming from,
but they have completely
different views
of how big this transaction is
in terms of number of bytes.
So this is really interesting.
For many, many years
since 2010, everyone's
been arguing.
And one of the big
aspects of, oh, if we
want to increase the block
size, that's a hard fork.
Everyone up to now,
we're enforcing.
The block size must be
one million bytes or less.
There's no way
around that, right?
You can't just increase it.
We've got this rule.
You're breaking that rule.
This is a sneaky way to break
the rule but still not tell
people you're breaking the rule.
Say, OK, I'm enforcing a rule
that there's one million bytes.
As far as I'm concerned, there
are less than one million bytes
in this set of transactions.
The new nodes know, yeah,
there's more than one million.
There's like two
million bytes in here.
We just didn't tell
the old software
about all these extra bytes.
So this is kind of an
interesting thing you can do.
So you can increase
the transaction size
without telling the old nodes.
So yeah, the old nodes don't
see the hundred something bytes
with the pubkey and signature.
So they see transactions
that are much smaller.
Around half the size--
depends, but half the size ish.
So those bytes, they
won't count those bytes
towards the one million
byte block size limit.
So this ends up being
a soft fork that allows
you to increase the block size.
In a kind of sneaky way, right?
The old nodes don't think
the block size is increased.
They think it's less than a
megabyte, and they also think,
this is weird.
I haven't seen any
signatures for a while.
Everyone seems to be using these
transactions that don't require
signatures, and
somehow everyone's
getting along and not
stealing each other's money
despite the lack of a
need for signatures.
But these are not
intelligent people.
These are software programs,
and it'll just run.
And it'll, OK, yup, yup, yup.
This evaluates to true.
So it's kind of cool.
Block size increase softfork.
However, you institute
a new rule with segwit.
You don't want to just
say for the new rules,
we don't count signatures
towards the one megabyte limit,
right?
You could do that, but then
people might spam signatures.
Let me make a giant signature
or some kind of like 50
out of a million pubkeys
thing and spam the network,
and then it will still be under
a megabyte of non witness data.
So yes, so now I've got
two classes of data.
You've got all the data
that everyone sees,
and all the witness data
that only the new nodes see.
So what they did is they said,
OK, the witness data still
counts towards that limit.
But each witness byte counts
as 1/4 of a regular byte.
OK, kind of weird, but yeah.
So in practice in the software,
what they do is they say, OK.
We multiply the non
witness bytes by four.
So every byte in the outputs
and every byte in the txid input
things counts as
like four bytes.
And then, the witnesses just
count as one regular byte.
And then we now say,
OK, the new block size
is four million. Not bytes,
but four million weight units,
because they're sort of, OK,
we've got different
weights for things.
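The weight rule just described can be sketched in a few lines. The function names are illustrative; the factor of four and the four million limit are the ones given above.

```python
MAX_BLOCK_WEIGHT = 4_000_000  # weight units

def block_weight(non_witness_bytes: int, witness_bytes: int) -> int:
    # Every non-witness byte counts four times, every witness byte once.
    return 4 * non_witness_bytes + witness_bytes

def fits_in_block(non_witness_bytes: int, witness_bytes: int) -> bool:
    return block_weight(non_witness_bytes, witness_bytes) <= MAX_BLOCK_WEIGHT

# A block with no witness data hits the limit at the old one million bytes.
print(block_weight(1_000_000, 0))         # 4000000
print(fits_in_block(1_000_000, 1))        # False
# Trading non-witness bytes for witness bytes lets the total grow.
print(fits_in_block(600_000, 1_600_000))  # True
```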
This actually makes sense,
because the utxo set
is what you really
want to minimize,
that database we keep
updating every block.
And the signatures don't
go into the utxo set.
So the signatures
you don't actually
have to store on a fast,
low latency storage.
So in a very real
sense, the signatures
are sort of OK to make bigger.
They don't really cost as
much to the network to store.
So having this
discount where you say,
OK, the signatures, you can
have a bunch of them that
doesn't really count as much.
But the outputs we
really need to minimize.
So this one fourth is
somewhat arbitrary,
but there are some calculations
and a little handwaving.
But it's like yeah, this
is about what it should
be to try to balance things.
So the end result. If
you have this discount,
you can put about 80% more
transactions in a block.
You get about 1.8 megs.
It depends, right?
It depends how big
your signatures are.
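To see where a figure like 1.8 megs can come from, here is a small worked calculation. It assumes a fraction f of a block's bytes is witness data, so the weight is S * (4 - 3f) for a block of S bytes, and solves for the largest S that stays under four million weight units. The 60% witness share is an illustrative assumption, not a measured number.

```python
def max_block_bytes(witness_fraction: float) -> float:
    """Largest total block size, in bytes, for a given witness share.

    weight = 4 * (1 - f) * S + f * S = S * (4 - 3 * f) <= 4,000,000
    """
    if not 0.0 <= witness_fraction <= 1.0:
        raise ValueError("witness_fraction must be between 0 and 1")
    return 4_000_000 / (4 - 3 * witness_fraction)

for f in (0.0, 0.6, 1.0):
    print(f, round(max_block_bytes(f)))
```

With no witness data this gives the old one million bytes, with only witness data it approaches four million, and a block that is roughly 60% witness data lands near 1.8 million.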
So the maximum would be
you have a block that
has one transaction with
just a giant signature that's
like almost four megabytes.
And the old software
would see this block
as being really tiny,
like 100 something bytes.
And the new software
would see, oh yeah,
this block is almost
four megabytes.
But that's sort of
the extreme case.
I remember generating some
like 3.7 meg transaction blocks
and testing that awhile
ago just to test it out.
It works, but in practice
you're seeing about this.
In practice today, as segwit
has been seeing more adoption,
you see like 1.3
megabyte blocks.
Not everyone's using it.
The idea is it's
backwards compatible,
but you can still use
your old software.
But it seems like more and
more software is using this.
You get a discount on your
fees because your transaction
seems to be smaller.
You can fit more
of them in a block.
So that's kind of
cool, and that's
sort of an incentive to use it.
OK, other thing you can do.
You can commit to signatures.
This is a little tricky.
If the signatures aren't
in the transaction ID,
then they aren't in the
merkle root, right?
So there's nothing really
committing the signatures
into the block chain.
And this would actually work.
You could say, no,
I have a signature.
I'll give it to you,
but it could change.
It could be malleated, so
it could be weird, though.
You could agree on a utxo
set, but you could disagree
on how exactly you got there.
So one example
would be multisig,
where there's two of three
multisig, Alice, Bob and Carol.
Two of them need to sign.
And then on my computer, it
says that Alice and Bob signed,
and on your computer, it says
that Alice and Carol signed.
That's weird, right?
For accountability.
If we want to know who
exactly endorsed these things,
we might disagree on it.
There would be no canonical
here's the blockchain,
here's who signed.
The transactions
themselves would all still
be the same but
not the signatures.
So that's kind of weird,
but it also seems like well,
maybe that's part of the price
you pay for fixing malleability
in this way.
If we're not putting the
signatures into the thing that
gets committed to in the
block chain, then yeah,
signatures can change.
So any way around this?
It sort of seems like
yeah, that's the trade off.
Sneaky way around it?
Sneaky fun?
No?
You know.
OK, so what you do actually,
you commit the signatures
but in a weird way.
OK, so here's the regular
old merkle tree, right?
This is the merkle root
that you put in the header.
Here's all the transaction
IDs, and so you
make these intermediate hashes.
This is the hash of these two
things concatenated together,
this is the hash of these two
things concatenated together.
Now, if the txids
don't have signatures,
there's no commitment to the
signatures in the top hash.
What you do is this.
You say, OK.
I'm going to make
these new witness
txids, hashes of transactions
that do include the signatures.
In practice, you could just make
a hash of just the signatures.
That would also work.
They just take the whole thing.
And now I've got
this other reflected
merkle tree kind of
thing, where OK, I
take the hash of these two
witness transaction IDs,
put it here, and this
one just drops down.
It's another merkle
tree, and then you
get a root for all those
things called the witness root.
And then what you do is
you put the witness root
in the coinbase transaction.
Put it in an op return.
And the idea is the
coinbase transaction
doesn't have any
signatures anyway, right?
So you can put it in there.
You don't need to
include the transaction
zero in this witness tree.
Wait, they do though, right?
But maybe this is slightly
inaccurate in that I think
they actually do
make a witness txid
for the coinbase transaction,
but they define it
as being zero or something.
I think-- I don't remember.
So it's weird, right?
But you could do that.
They define a zero, or they
let you pick anything you want.
I would have to
look at the code.
But anyway, the basic
idea is for these anyway,
you take the hash of the whole
thing including the signatures,
put it in the witness
root, put the witness root
in the coinbase transaction, and
the coinbase transaction
gets into the merkle root.
So you are committing
to all the signatures
but on the block level,
not the transaction level.
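A sketch of how that commitment can be built, following the rules in BIP 141 as best understood here: the coinbase's slot in the witness tree is 32 zero bytes, the witness root is hashed together with a 32-byte reserved value, and the result goes into an op return output after the four header bytes aa21a9ed. The function names and the toy witness txids are assumptions for illustration.

```python
import hashlib

COMMITMENT_HEADER = bytes.fromhex("aa21a9ed")

def dsha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def merkle_root(hashes: list) -> bytes:
    level = list(hashes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [dsha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]

def witness_commitment_script(wtxids: list,
                              reserved_value: bytes = bytes(32)) -> bytes:
    """Build the coinbase output script that commits to all witnesses.

    wtxids[0] belongs to the coinbase and is replaced by 32 zero bytes,
    because the coinbase cannot commit to a hash of itself.
    """
    leaves = [bytes(32)] + list(wtxids[1:])
    witness_root = merkle_root(leaves)
    commitment = dsha256(witness_root + reserved_value)
    # 0x6a is OP_RETURN, 0x24 pushes the next 36 bytes.
    return b"\x6a\x24" + COMMITMENT_HEADER + commitment

# Toy witness txids standing in for real ones.
wtxids = [dsha256(bytes([i])) for i in range(4)]
print(witness_commitment_script(wtxids).hex())
```

Because this script sits inside the coinbase transaction, and the coinbase txid is a leaf of the regular merkle tree, the header ends up committing to every signature in the block.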
So in the case where I
think Alice and Bob signed.
Oh, I think Alice
and Carol signed.
You can have those two
transactions floating around
on the network, and
they have the same txid.
And so who knows which
one's getting into a block?
They look almost the same.
Some of the software won't
be able to pick between them.
However, once it gets
into a block, one of them
will be committed to.
It's like, oh, ended up
being Alice and Carol.
Those two signatures actually
got into the blockchain.
However, you could
prove, hey, no I
had this Alice Bob
signature, but then it
never got into the
blockchain, and maybe
you made it after the fact.
It never gets committed to.
Yeah.
AUDIENCE: Also, a
bunch of pool software
just doesn't always do this.
PROFESSOR: A bunch of pool
software doesn't do this?
What do you mean?
AUDIENCE: It's
the responsibility
of the pool software to
make this construction,
but [INAUDIBLE]
PROFESSOR: Have it
implemented as in they just
don't support segwit?
AUDIENCE: No, so they
do the first part,
but [INAUDIBLE] segwit support.
PROFESSOR: OK, but wouldn't
that just not work?
How--
AUDIENCE: It works, because--
[INAUDIBLE]
PROFESSOR: But to the new
software, if you don't have--
so segwit is the
software, right?
You say, OK, we define
these new transaction types.
We define this template where
if you have a zero and then
this pubkey hash.
It also says, I require
the coinbase transaction
to have this output that
says, op return aa9c
whatever this little
four random bytes,
and then I'd require it to
have the witness root in here.
AUDIENCE: I'm guessing
they just don't
include segwit transactions?
PROFESSOR: So I've
seen that a lot.
Yeah, so a lot of--
AUDIENCE: [INAUDIBLE]
PROFESSOR: Yeah, a
lot of the software
says, I'm not going to do this.
So the other thing that's nice--
segwit transactions to old
software look non-standard.
So I mentioned before that
there's standardness rules
where, this looks weird.
I'm not going to mine it.
I'm not going to relay it to my
peers, but if I see it in a block,
well, OK, fine.
So segwit transactions
look very non-standard.
It looks like
there's no signature.
That's weird.
There's this zero.
What's going on?
So yeah, you can
can still run a miner
and just not even
know about segwit.
It's a little
dangerous, because you
might see a block that
is segwit invalid,
but you wouldn't
know it and so you
might try to mine on top of it.
So there are some
risks, but in general
if most people are
doing the right thing,
you could still mine without
knowing about this stuff.
So any questions about
committing to the signatures?
What else?
Oh yeah, so you've
got this upgrade path.
That's kind of cool.
So it defined zero
pubkey hash as hey,
this is now pay to
pubkey hash, right?
Interpret this weird
template as the regular hey,
verify this signature.
It also, when segwit
softfork happened,
redefined a whole bunch of
other templates like this.
So one and then some data,
or two and then some data.
Just put a number, and
then put a bunch of data.
All of these are defined
as future upgrades.
So if you see a three block of
data, you now say, yeah, OK.
I know that's segwit
version three.
My software will maybe pop
up something saying hey,
people are using
segwit version three.
You're only aware of
segwit version zero.
But you'll consider
it non-standard.
You won't relay it.
But if it's in a
block, yeah, sure.
And you don't require
anything about the signature.
You'll just say, yeah,
whatever weird witness data
you provide for these
outputs, I don't
know how to interpret them.
I'm just going to let
it all go through.
What that means is--
there's no witness needed.
If a witness is
provided, you just
ignore it and you think
everything's fine.
This allows easier upgrades.
You have 16 new
versions to upgrade to.
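A hedged sketch of how software might recognize these templates: a version number opcode followed by a single data push. The byte values for the version opcodes (0x00 for zero, 0x51 through 0x60 for one through sixteen) and the 2 to 40 byte program length follow BIP 141; the function name and return shape are illustrative.

```python
def parse_witness_program(script_pubkey: bytes):
    """Return (version, program) for a segwit-style output script, else None."""
    if not 4 <= len(script_pubkey) <= 42:
        return None
    opcode = script_pubkey[0]
    if opcode == 0x00:
        version = 0
    elif 0x51 <= opcode <= 0x60:   # OP_1 .. OP_16
        version = opcode - 0x50
    else:
        return None
    push_length = script_pubkey[1]
    if not 2 <= push_length <= 40:
        return None
    if push_length != len(script_pubkey) - 2:
        return None                # the push must cover the rest of the script
    return version, script_pubkey[2:]

# Version zero with a 20-byte hash: the pay to pubkey hash template.
print(parse_witness_program(b"\x00\x14" + bytes(20)))
# Version three: recognized as a future version, no rules enforced yet.
print(parse_witness_program(b"\x53\x14" + bytes(20)))
# A regular old-style script is not a witness program at all.
print(parse_witness_program(bytes.fromhex("76a914") + bytes(20)
                            + bytes.fromhex("88ac")))
```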
Yeah, you don't require any
specific things about this,
so you can make new scripts, you
can make a completely different
script interpreter.
You could say, OK,
we're going to port EVM
to bitcoin and disable
some of the op codes
that don't apply, and
have that kind of thing.
Have new smart contracts.
So it's kind of a fun,
like yeah, we will--
and it's a nice,
easy upgrade path.
You could have multiple
different things,
things like that.
The code will be easier.
Don't do it today.
You could construct
an output that's
a two byte and then your
pubkey and send it out there.
It will probably be stolen by
miners, because everyone else's
node will say, yeah, I don't
know how to interpret this yet.
There's no rules about this yet.
OK, let me show you some
segwit stuff I looked for.
OK, so there's
actually nested segwit.
There's an ugly--
I didn't like it, but--
this is like somewhat designed
by committee. There's also--
this is 2016, right?
AUDIENCE: People lost so much
money on segwit two years ago.
PROFESSOR: So the other
thing I would say with this,
I was like, OK.
You've got this witness txid.
And I remember people
working on segwit
and I said, hey, why don't
you make the transaction IDs
a merkle tree of the
inputs and outputs instead
of just the hash of
everything all together?
Then, if you had a
really big transaction,
you could prove that an input
had been spent without sending
the whole transaction over.
And I thought that
was a cool idea.
And then when I talked to
people, they're like yeah,
Peter Todd already said
that like three weeks ago.
And whatever, we're
not going to do it.
It's too late.
We already coded stuff.
Oh well.
And that's the fundamental
aspect of segwit.
You can't really upgrade that
in the new script versions,
so whatever.
There's also still a hard rule
on transactions themselves
being less than a
megabyte, I think.
So it's not a huge deal,
but it would have been cool.
Another thing is
there's actually a way--
so there's no address
defined for this, right?
Addresses map to output
scripts in all the software.
And so when you
say, OK, I'm sending
into 1aeecc or
whatever, it knows
how to interpret that address,
build the 20 byte pubkey hash
script, and send to it.
And vice versa, right?
So from the address, you
can get this output script,
and from the output script
you can get an address.
So when an old
software sees this,
it's just like,
there's no address.
I don't even know what that is.
I've never seen that.
And so people worried that
oh, it's going to be weird.
People are going to have to
upgrade to even send to people
using segwit.
So it's backwards compatible,
but if you want to say, hey,
send me some money at the segwit
address and then they can't.
And so you say, OK, fine.
Send me the money with
a regular address,
and then we still have
this malleability problem.
And then I have a wallet
that supports both,
and I can move money
to my own addresses,
and it's kind of ugly.
So they made this nested address
thing, which I don't like,
because then it
actually has both.
So you've got a
signature and a witness.
And the signature is
not a real signature.
It's just pointing
to the witness.
It's really ugly.
There's a bunch of weird
stuff in the segwit code
that I'm not super into.
I don't have to use
it though, right?
That's the beauty of these
permission-less innovation
kind of systems.
Like ew, I don't like that code.
OK, I'm not supporting it.
OK, so here's one that's nested.
So I was just randomly
looking through a block.
Here's one, and you
can see it's like, OK.
The outputs are probably
also nested segwit,
and the input has got both a
script sig and a tx witness,
right?
A tx input witness.
A pure one is this one f7.
OK, so you can see--
oh wait, am I not running--
what version am I running?
I think I'm running
0.15.1 still.
So I'm not seeing the address.
There's a new address format
called bech32,
which will turn--
so it's zero and
then a script hash.
Zero and then a pubkey hash.
It says, witness,
version zero, key hash.
There's also an address
associated with these.
I think this version of
bitcoin CLI does not show it,
but the new version does.
So I think if you guys have
version 0.16.0, it will show,
here's the address.
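For reference, here is a compact sketch of how a version zero output script maps to a bech32 address, adapted from the reference algorithm in BIP 173. The example pubkey hash and the expected address are the test vector from that document. Later witness versions use a revised checksum that this sketch does not cover.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"

def bech32_polymod(values):
    generator = [0x3b6a57b2, 0x26508e6d, 0x1ea119fa, 0x3d4233dd, 0x2a1462b3]
    chk = 1
    for value in values:
        top = chk >> 25
        chk = ((chk & 0x1ffffff) << 5) ^ value
        for i in range(5):
            if (top >> i) & 1:
                chk ^= generator[i]
    return chk

def hrp_expand(hrp):
    return [ord(c) >> 5 for c in hrp] + [0] + [ord(c) & 31 for c in hrp]

def create_checksum(hrp, data):
    polymod = bech32_polymod(hrp_expand(hrp) + data + [0] * 6) ^ 1
    return [(polymod >> 5 * (5 - i)) & 31 for i in range(6)]

def to_5bit_groups(data: bytes):
    """Regroup 8-bit bytes into 5-bit values, padding the tail with zeros."""
    acc, bits, out = 0, 0, []
    for byte in data:
        acc = (acc << 8) | byte
        bits += 8
        while bits >= 5:
            bits -= 5
            out.append((acc >> bits) & 31)
    if bits:
        out.append((acc << (5 - bits)) & 31)
    return out

def segwit_v0_address(hrp: str, program: bytes) -> str:
    data = [0] + to_5bit_groups(program)   # leading 0 is the witness version
    checksum = create_checksum(hrp, data)
    return hrp + "1" + "".join(CHARSET[d] for d in data + checksum)

pubkey_hash = bytes.fromhex("751e76e8199196d454941c45d1b3a323f1433bd6")
print(segwit_v0_address("bc", pubkey_hash))
```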
And then you can see
in the single input
for this transaction,
there is a tx in witness.
And there's no scripts.
There's a script sig
field, and it's just empty.
There's no actual
signature traditionally.
There's instead this big thing.
Here's the signature, and here's
the pubkey being revealed.
And then it also says,
OK, here's the txid
without the signature, and
then here's the hash or witness
transaction ID.
The hash of the whole thing
including the signature,
and they're different, right?
Also you've got size so this
is actually 235 bytes, right?
Because you're
including the witnesses.
And then, v size,
which is virtual size.
This is how big it looks
to old software that
doesn't know about segwit.
So the new software,
this knows about both.
The actual size or witness
size is 235, v size is 153.
So yeah, it's not quite
50%, because this one has
two outputs, and the outputs
don't get any smaller,
and the input just gets smaller.
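The two numbers on screen can be tied together with the weight rule. Under BIP 141, virtual size is the weight divided by four and rounded up, so it comes out slightly larger than the stripped size that a node unaware of segwit would count. The stripped size of 125 bytes below is not shown in the lecture; it is inferred as the only value consistent with a size of 235 and a v size of 153.

```python
def tx_weight(stripped_size: int, total_size: int) -> int:
    # Same rule as 4 * non-witness + 1 * witness, written in terms of
    # the two sizes a node can measure directly.
    return 3 * stripped_size + total_size

def tx_vsize(stripped_size: int, total_size: int) -> int:
    # Weight divided by four, rounded up.
    return (tx_weight(stripped_size, total_size) + 3) // 4

stripped, total = 125, 235   # 125 is an inferred value, see above
print(tx_weight(stripped, total))  # 610
print(tx_vsize(stripped, total))   # 153
```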
And then, size, v
size, and then you
can see what block
it's in when we
get the coinbase transaction.
OK, so the first
transaction in the list
is going to be the
coinbase transaction.
And I can get that one.
And yeah, the
coinbase transaction
has a different txid and hash.
Its size is 259, its v size 232.
Coinbase has whatever
random data they want,
and there's the
actual output, which
is sending to this address,
and it's sending 12.79 coins.
And then, there's this
zero value output.
So you can have an
output that's got
an amount of coins set to zero.
It's still OK, and it's
got this op return.
And the op return
starts with aa21a9ed,
and those four bytes mean
here's the segwit commitment.
Here's the witness commitment
to the segwit transaction
hashes, the root of all those.
And you have to
have that in order
to have a valid segwit block.
And so then we can--
this is segwit in action.
I think most blocks
now will have that.
So there is size
and v size, right?
And that makes sense.
But then you have strip--
no, v size is not size.
It's really confusing.
And size, weight,
height, like what?
So size is--
I don't actually know.
I think size is
interpreted the same.
This is the actual number
of bytes for this block.
Weight is you multiply all
non witness bytes by four,
and you leave all witness
bytes as weight one,
and that has to be
less than four million.
And you can see here, it's
just under four million.
And then stripped size is
the size that old nodes see.
Yeah.
Pretty sure.
Anyway, so it's
kind of confusing.
One of the biggest
problems in bitcoin
is names, where it's like, wait.
Script pubkey, and
script sig script?
Like, what?
All these terms and names
are really confusing,
and it's sort of getting worse.
So yeah.
Also, there's no v size here.
I think this is actually v size.
Anyway, so that's how segwit
works in the actual thing.
But it's nice, because now you
can reliably spend from things
before they're confirmed.
So segwit is cool.
Fixes malleability.
Increases the block size.
Oh, it does a whole bunch
of other stuff, too.
OK, so one of the
aspects that it fixes.
When you're signing
a transaction,
let's say you have five inputs.
Each time you sign, you need
to hash the whole transaction,
because it's slightly
different, right?
You zero out all the
signature fields,
but in the signature
field for the one
you're actually signing,
you don't zero it out.
You put the previous
script there.
So it's slightly different.
It's totally redundant.
There's no reason
to put it there
because it's
already in the txid,
but you change things
around a little bit.
So the idea is, I'm going
to put a signature here,
I'm going to put
a signature here,
put a signature--
all five of them.
Each time I put
a signature here,
I hash the transaction to get
a slightly different thing
to sign for each one.
It might not jump out at
you, but this is actually
O of n squared, which is bad.
Because the idea is, as I
extend the number of signatures
required in a transaction,
the number of inputs
in a transaction, the amount of
data that needs to be processed
goes up with the square
of the number of inputs.
Because I add an input.
Now, the total size of the
transaction gets bigger,
so each time I sign, I need to
take a bigger amount of data
through my hash function.
Also, the number of
signatures gets bigger.
Or the number of inputs.
So this is n squared.
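A back-of-the-envelope sketch of that growth. It assumes each input adds about 41 bytes to the data being hashed and that the rest of the transaction is about 44 bytes; both figures are rough illustrative assumptions, and the only point is the shape of the curve.

```python
def legacy_bytes_hashed(n_inputs: int,
                        bytes_per_input: int = 41,
                        other_bytes: int = 44) -> int:
    """Total bytes fed to the hash function to sign every input.

    Each of the n signatures hashes (roughly) the whole transaction,
    and the transaction itself grows linearly with n.
    """
    tx_size = other_bytes + n_inputs * bytes_per_input
    return n_inputs * tx_size

for n in (100, 1_000, 2_000):
    print(n, legacy_bytes_hashed(n))
```

Doubling the number of inputs roughly quadruples the total amount of data hashed.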
It seems fine, right?
You never notice,
except when you do.
So there's a pathological block.
There was one like 2015 early
in the year where some miner was
like, I'm going to
make this block that's
one giant transaction
with thousands
and thousands of inputs.
And a lot of software choked
on it, and it took gigs of RAM
to process the transaction,
and things like that.
So that was bad.
Just in general, if you have
a lot of little dust outputs,
if you're trying to
aggregate them into one big--
I'm going to have 100
inputs and one output,
it takes forever to sign.
And it also takes
forever to verify.
So it's pretty bad.
I remember sort
of a silly story.
Tim Draper's coins.
He had all this dust.
And it was nerve wracking,
because it was way more money
than I'm going to
make in my life.
And moving Tim Draper's
coins to somewhere else.
And the software by default
just swept all the inputs
that that wallet controlled.
And they were looking at me
like, why doesn't this work?
Is it frozen?
I'm like, no, I'm not trying
to steal the money, guys.
Because everyone was sending
all these little outputs to Tim
Draper's 30,000 coins or
whatever, because he's--
and then when he
tried to spend it,
it took five minutes to sign.
AUDIENCE: When
people use P2Pool,
the software really
struggles with this.
PROFESSOR: Yeah, so it's bad.
Any O of n squared,
this is sort of a bug.
Segwit actually fixed this.
The way they do it
is they say, OK,
we sort of pre compute these
three intermediate hashes.
Take the whole transaction.
This is sort of the
global transaction data,
and pre compute
these three things.
And then for each of the
inputs, add another thing.
Here's this input specific.
So this is global.
It's the hash of all the
tx ins, the hash of all
the outputs, the hash of this.
And then here is
the input specific.
Input specific.
And then hash all these
things into one thing
and then sign that.
So the idea is it's O of n in
that you compute these three
and then you sort
of go down and keep
changing this for each one.
So that saves a lot of time.
It's a much nicer--
oh, you also put in the amount
being spent in your signature
hash, which is also
redundant, because that's
committed to in the txid
that you're sending.
But it's really nice
for hardware wallets,
because a lot of times hardware
wallets are essentially
presented with here's a hash.
Sign it.
And it's a very small system.
It's a little chip somewhere,
and it doesn't really
know too much about bitcoin.
It's just, here's a hash.
Sign it.
OK, and they don't know
how much they're spending,
so there could be attacks
on hardware wallets,
where they get a hardware wallet
to sign something where it's
actually moving too much money.
So it's nice to be able
to have the actual amount.
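A sketch of the signing scheme just described, following the field order in BIP 143 for the common sign-everything case. The three global hashes are computed once, then each input adds its own outpoint, script, amount, and sequence. The input dictionaries and placeholder bytes are assumptions for illustration; a real signer would serialize an actual transaction.

```python
import hashlib

def dsha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def segwit_sighashes(version: bytes, inputs: list, outputs: bytes,
                     locktime: bytes, sighash_type: int = 1) -> list:
    """Return one digest to sign per input.

    inputs:  dicts with 'outpoint' (36 bytes), 'sequence' (4 bytes),
             'script_code' (length-prefixed script), 'amount' (satoshis).
    outputs: the serialized outputs, concatenated.
    """
    # Global part: computed once, reused for every input.
    hash_prevouts = dsha256(b"".join(i["outpoint"] for i in inputs))
    hash_sequence = dsha256(b"".join(i["sequence"] for i in inputs))
    hash_outputs = dsha256(outputs)

    digests = []
    for i in inputs:
        # Input-specific part, including the amount being spent.
        preimage = (version + hash_prevouts + hash_sequence
                    + i["outpoint"] + i["script_code"]
                    + i["amount"].to_bytes(8, "little") + i["sequence"]
                    + hash_outputs + locktime
                    + sighash_type.to_bytes(4, "little"))
        digests.append(dsha256(preimage))
    return digests

def make_input(n: int, amount: int) -> dict:
    # Placeholder input: fake outpoint and a pay to pubkey hash script code.
    script = bytes.fromhex("76a914") + bytes(20) + bytes.fromhex("88ac")
    return {"outpoint": bytes([n]) * 32 + bytes(4),
            "sequence": b"\xff" * 4,
            "script_code": bytes([len(script)]) + script,
            "amount": amount}

version, locktime = (1).to_bytes(4, "little"), bytes(4)
outputs = (90_000).to_bytes(8, "little") + b"\x16\x00\x14" + bytes(20)
inputs = [make_input(1, 50_000), make_input(2, 50_000)]
for digest in segwit_sighashes(version, inputs, outputs, locktime):
    print(digest.hex())
```

Because the amount is part of what gets signed, a signature made over a wrong amount simply fails to validate, which is what protects a small signing device that cannot look up the previous transaction.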
So there's a bunch
of stuff like that.
It was a giant grab bag of all
these different little fixes,
things like that.
Fixes malleability.
It increases the block size.
Does all these
other cool things.
People didn't like it.
I never really understood why.
AUDIENCE: For the reasons you've
been telling everyone about?
PROFESSOR: All these reasons?
Wait, these seem like
good things, right?
AUDIENCE: Well,
yeah, but [INAUDIBLE]
PROFESSOR: Oh.
No, that wasn't what--
it wasn't like
people were like, oh,
here's some little things
I don't like about it.
Because that was what I said.
That was like what everyone
working on Bitcoin was like.
No one thinks it's perfect.
Everyone was like, oh,
but this thing is weird.
Why did you do that?
Or why didn't you put
this in kind of things.
But no, the people who didn't
like it really didn't like it.
There's still a bounty on
[INAUDIBLE] head, right?
There's death threats.
Someone's like, I'll pay
someone to kill this guy.
It's all, this is going
to destroy Bitcoin,
that segwit isn't
bitcoin anymore,
because there aren't
any signatures.
It's like no, signatures
are still committed to, just
in a different way.
You have to build
this other tree.
So lots of weird conspiracies.
I don't know.
It became this real
sticking point,
and so that sort of
led to Bitcoin Cash.
The whole idea is segwit is bad.
We're making Bitcoin Cash.
And Bitcoin Cash forked
off before segwit
activated in the main network.
Interestingly, Bitcoin
Cash uses this.
So they took a bunch of
the code from segwit,
because this is a
really good improvement
to signing that
Bitcoin Cash used,
but they didn't like segwit.
Yeah, I'm still not like--
I don't know.
There's problems I have with
it, too, but it's an upgrade,
and it's cool.
I think a lot of it was
people wanted a hard fork,
and this was a softfork.
And so there's
backwards compatibility,
and they wanted to
show that people
have more control over
bitcoin than they maybe do.
It might never be possible
to have a hard fork
to get everyone on
board to really switch.
So who knows.
So yeah, it was interesting.
It took forever, and
that was the last change
to the bitcoin code in
terms of consensus code.
And it was initially
announced late 2015
in Hong Kong, and
then all of 2016
it never-- so it activated
in August of last year.
And now you can use it.
AUDIENCE: People had big
interest in stopping it,
though.
At one point they were
spending hundreds of thousands
of dollars a day to
stop it from activating.
PROFESSOR: Yeah, so on your
Vertcoin, you're like,
I'll just take the segwit code
and activate it, and like cool.
And then people tried to stop
it and spend a lot of money
to stop it.
OK, I want to say unclear
why, because I don't know.
It's sort of weird.
There's a whole lot of opinions.
One theory is that
this breaks some mining
chip optimizations.
One of the optimizations--
it doesn't work with
a tree of height two.
But if you have a really tall
tree, you can swap txids,
or you can swap intermediate
nodes of the tree
and you'll get a
different merkle root.
So you can see--
so it doesn't work here, because
this has to stay in place.
But in many cases, the order of
the transactions is arbitrary.
So I could flip these two.
It's still valid.
So what I might do is say, OK.
I have this merkle
root I'm mining,
and then I want to flip
these two, calculate
a different merkle
root, and mine.
And there were some chips
that maybe did this and had
these kinds of optimizations.
There was also a patent on
it and all this weird stuff
going on.
It doesn't break,
but it essentially
loses the optimization
if you have this.
Because you're saying, OK, I'm
going to have this big tree.
I'm going to swap
something near the top,
and it only has to
recompute two hashes
to get a new merkle root.
However, if I now have this
mirror image witness merkle
tree underneath, if I say,
OK, I'm going to swap this,
I'm also swapping all these.
And I have to recompute this.
Maybe I can swap some of it, but
I have to recompute this.
This is going to
change just as well.
And then I have to
put that in here,
and this is going
to be at the bottom.
And then, I'm going to have
to recompute everything
all the way up to
the merkle root.
So this was called AsicBoost,
and then there was a post--
Greg Maxwell posted
this sort of like,
you guys, like accusatory mail
on the mailing list last spring
saying, look.
We were trying to figure out
a way to break AsicBoost,
because we think miners have
this patented algorithm that
optimizes and it gives
a 20%, 30% speed up.
And we're worried that the
patents will make one miner
have a monopoly, and everyone
else won't be competitive.
So we're trying
to think, is there
a way we can change the software
to prevent
this optimization?
And then once they
tried to look at it,
they were like, oh, wait.
Segwit does that.
We want to make it costly
to swap things in the tree,
and segwit does that.
Oh, so basically, we're good.
And then he was like, oh, wait.
Maybe that's why all
these people hate segwit.
Maybe this is these miners
who have billions of dollars
worth of equipment with
these optimizations in it,
which would be rendered unusable
by this new software change,
maybe they're trying to
prevent it from activating.
It's a theory, and the
mining companies said,
oh, no that's a
bunch of nonsense.
Although, the way they said
it was sort of suspicious.
They were like, yeah, we
put circuitry in our chips
to do this, but
we never used it.
That's strange.
So who knows.
But that's one theory.
I'm not sure how much I
believe that's the real reason,
but yeah.
AUDIENCE: But if they want
to calculate Merkle roots
in bitcoin, don't they just--
order all of the transactions
by transaction ID?
PROFESSOR: You can't,
because the order matters.
Because when you
validate, this transaction
might create an output that
this transaction spends.
And so if you swap them, so
if you didn't have intra block
dependencies, then it
would all be arbitrary
and you could put in any ordering.
But there are intra
block dependencies,
and so the order does matter.
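A toy illustration of why the order matters, assuming a simplified model where a transaction is just an id, the outpoints it spends, and a count of outputs it creates. Validation walks the block in order, so a transaction spending an output created later in the same block fails.

```python
def order_is_valid(block_txs: list, utxo_set: set) -> bool:
    """block_txs: list of (txid, spent_outpoints, n_outputs) in block order.

    An outpoint is a (txid, output_index) pair.
    """
    available = set(utxo_set)            # copy, so the caller's set is untouched
    for txid, spends, n_outputs in block_txs:
        for outpoint in spends:
            if outpoint not in available:
                return False             # spends something not yet created
            available.remove(outpoint)
        available.update((txid, i) for i in range(n_outputs))
    return True

utxos = {("old1", 0), ("old2", 0)}
parent = ("p", [("old1", 0)], 1)
child = ("c", [("p", 0)], 1)         # spends the parent's output
unrelated = ("u", [("old2", 0)], 1)

print(order_is_valid([parent, child, unrelated], utxos))  # True
print(order_is_valid([child, parent, unrelated], utxos))  # False
print(order_is_valid([unrelated, parent, child], utxos))  # True
```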
In many cases, it doesn't.
In many cases, these are
two separate transactions.
You can swap them.
But the software does
use the ordering.
And there's all sorts of other
things that would be better.
What I would want is
prepend or append the height
at each stage of
the merkle tree.
That would have helped
me out for some things.
Because then, it's
like you know,
since you're at
the bottom just put
a zero at the end of each hash.
And then when you get
up here, put a one
at the end of each hash.
Doesn't really change anything.
But one problem is
what if I request--
so what I want to
do in my software.
I want to request all the
transaction IDs in a block.
I don't actually care
about the transactions.
I just want to
see all the txids.
Like this.
If I get rid of the head 20,
I get a giant list of txids.
The thing is, what this lets me
do is to look for transactions.
If I have a txid I know
I'm looking for, I can say,
oh, I can look for it in here.
The problem is, what if
the person I'm asking
is giving me this
instead of this?
I won't know.
They all look like
random numbers.
If I do the merkle tree algo,
I'll get to the merkle root.
That's good.
But I don't really know
that I'm at the bottom.
It's OK if I'm
running a full node
and I actually download all
the transactions and look,
and OK, it works.
But to have a way to say, hey,
give me a list of all the txids
and I can verify
that it's correct,
I can't do that right now.
There's ways around it.
But it would have
been nice if then they
appended a zero or something.
Or even, all you
have to do is just
append something
at the bottom row
or just append
the height or something.
Then, it would've
been kind of cool.
It would've been easier for me.
But oh well.
And there's people who've
written about this.
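A sketch of both the problem and the suggested fix. The first part shows that a row of intermediate hashes passes as a list of txids under the regular tree. The second part is a hypothetical variant, not how bitcoin works, where the tree height is appended to each hash before combining, as suggested above.

```python
import hashlib

def dsha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def merkle_root(hashes: list, tag_levels: bool = False) -> bytes:
    """Regular merkle root, or a hypothetical height-tagged variant."""
    level, height = list(hashes), 0
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        # Hypothetical fix: append the current height to each hash.
        tag = bytes([height]) if tag_levels else b""
        level = [dsha256(level[i] + tag + level[i + 1] + tag)
                 for i in range(0, len(level), 2)]
        height += 1
    return level[0]

txids = [dsha256(bytes([i])) for i in range(4)]

# Regular tree: the second row, passed off as "the txids", gives the same root.
second_row = [dsha256(txids[0] + txids[1]), dsha256(txids[2] + txids[3])]
print(merkle_root(second_row) == merkle_root(txids))

# Tagged tree: the same trick no longer reproduces the root.
tagged_row = [dsha256(txids[0] + b"\x00" + txids[1] + b"\x00"),
              dsha256(txids[2] + b"\x00" + txids[3] + b"\x00")]
print(merkle_root(tagged_row, tag_levels=True)
      == merkle_root(txids, tag_levels=True))
```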
Yeah.
AUDIENCE: Did James say that
pool operators are leaving off
the [? whipper? ?]
And if so, does it
weaken the whole system?
PROFESSOR: I think
what they really
do is they just
don't support segwit.
So I've seen, especially--
AUDIENCE: [INAUDIBLE] it's
expensive but then they--
PROFESSOR: Yeah, they say
they're going to support it,
and then they don't.
So they sort of flag their
transactions, yeah, segwit,
and then they haven't actually
upgraded their software,
so they can't use it.
They can't mine it.
So you see this a lot
on TestNet as well.
If you're making TestNet
segwit transactions,
sometimes they just
don't get confirmed
for a few hours, because
all the blocks that come out
don't support it, and
so they won't use it.
AUDIENCE: The badly
written pool software,
if they use a segwit-
supporting full node with it,
it will give them
segwit transactions,
and they'll try to include
it but it won't do this, so--
PROFESSOR: So it's invalid.
Yes, so it's invalid.
AUDIENCE: I guess my
question is does it
weaken the security in the
system if for six months
they're not supporting this?
PROFESSOR: No, no.
It hurts the usability.
If I want to use a
segwit transact--
but as me running a
segwit compatible node,
I require signatures.
I require all this
whole construction.
If you make something
that looks like it
spends the segwit transaction
without this, I just reject it.
So security wise, it's fine.
Yes.
AUDIENCE: I think it
might be important to note
that the way that these things
are designed, and in particular
that softforks are designed, is
that anyone who doesn't update
to the new functionality
can't hurt the security
of the new functionality.
That's sort of part
of the design process.
PROFESSOR: Although, their
security might get hurt.
Not a ton, but yeah.
If you haven't
upgraded, you might
see these segwit
transactions, and--
AUDIENCE: [INAUDIBLE]
PROFESSOR: Yeah,
they look weird,
but you're like, OK, fine.
But you can't actually
verify the whole thing.
Given an invalid and a
valid segwit transaction,
the old software
can't distinguish
but the new software can.
AUDIENCE: That's even though the
pool operators, whether there's
six or eight key pool
operators, might not
be supporting the witrootsub
PROFESSOR: If they
don't support it,
you have to wait until
someone that does
support it mines the block.
So if they try to support
it and support it wrong,
you ignore them.
You don't use their data.
You don't use their block.
AUDIENCE: So segwit
transactions just stay
in the mempool a bit longer.
PROFESSOR: Yeah.
So I think in
bitcoin now, it's OK.
TestNet is kind of weird,
but there's segwit.party,
and you can see what people
are doing with segwit.
So yeah, it's about 30.
This is by transaction, it's
somewhere around 30 something
percent of the transactions
are using segwit,
and then you can see witness
size percentage, block size.
OK, so sometimes you got--
oh wow, I had no idea.
Blocks are way under
a megabyte now.
Oh, OK, well free
transactions for everyone.
If you want to use
bitcoin, now's the time.
You don't have to pay anything.
That looked very different
a month ago, when
you had a solid red line.
You had to sort of--
nothing went below a
million, and then you
had a little bit of segwit
stuff going on here.
But now most
things are below a million.
So interesting.
OK, so yeah.
So that's the basic
idea of segwit.
And if people have
any questions,
stick around and ask.
There's office hours tomorrow
at 4:00 to 6:00 over there.
Look at the homework,
and next time
I'll talk about lightning
network payment--
I'll try to get into
payment channels
and see how far we get into
lightning network stuff.
