We shall continue with the topic of cryptographic
hash functions, which we started in the last class.
We were essentially discussing the relative
order of hardness of the security criteria for
hash functions; we will continue with that.
Then we will discuss a particular type of construction
which is very popular, called the
Merkle-Damgård construction. It is an iterated
construction technique for hash functions,
so we will discuss these topics.
To start with, we have discussed these
two reductions. We discussed the collision
to second preimage reduction in the last class;
now we will discuss the collision to preimage
reduction. One thing to note is that when we were
considering these types of reductions, there were some
queries about why we put this check. One reason
is that the oracle is also a probabilistic algorithm,
and therefore it does not always give you the
answer; sometimes it gives you the answer and
sometimes it does not. For example, if it is a
Las Vegas randomized algorithm, it can also fail.
When the oracle fails, the collision
to second preimage reduction also fails.
Therefore, the check is only to verify whether the
oracle actually gives you an answer or not.
In the case of a Las Vegas algorithm there is a
probability of it terminating with an answer,
and that probability is denoted by epsilon.
If that is the probability, then the probability
with which the collision to second preimage reduction
gives you a correct answer is also epsilon.
That is the precise reason why this 'if' check
was kept. But again, as we discussed in the last
class, we do not need to check whether x and x'
are equal, because since the oracle correctly
answers the second preimage problem, this is
automatically taken care of.
This is something we have already discussed; now we
come to this particular reduction, the collision
to preimage reduction. We know what the preimage
problem means: given a hash value h(x), we have to
return a value whose hash equals it; that is
the preimage problem.
We assume that there is a solver for the
preimage problem; from there we show that
we can also solve the collision problem.
The reduction works as follows: you choose x
uniformly at random, compute the hash value h(x),
and then invoke the algorithm for computing the
preimage. So there is an oracle to which you pose
a question, and it gives you back some x'.
One thing you now have to check is whether
x and x' are the same or not, because the oracle
just gives you some preimage. So you perform
this check: if x' and x are not the same, then
you can return the pair (x, x'), because both
of them have the same hash value and they
are not equal. Therefore, this is a
solution to your collision problem as well.
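The reduction just described can be sketched in a few lines. This is a minimal illustration, not an implementation from the lecture: the hash h, the domain, and the preimage_oracle are all hypothetical stand-ins for the objects in the proof.

```python
import secrets

def collision_to_preimage(h, domain, preimage_oracle):
    """Try to find a collision for h using a (hypothetical) preimage oracle.

    h               -- the hash function
    domain          -- the finite set X to sample x from
    preimage_oracle -- on input y, returns some x' with h(x') == y

    Returns (x, x') with x != x' and h(x) == h(x'), or None on failure.
    """
    x = secrets.choice(domain)          # choose x uniformly at random
    x_prime = preimage_oracle(h(x))     # one extra query on top of the oracle's Q
    if x_prime != x:                    # the oracle may happen to return x itself
        return (x, x_prime)             # same hash value, different inputs
    return None                         # Las Vegas behaviour: report failure
```

Each call succeeds only when the oracle returns a preimage other than x itself, which is exactly where the (|X| - |Y|)/|X| success probability comes from.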
Now, the question is: if I assume that
the preimage oracle is a (1, Q) Las Vegas
algorithm — meaning the probability of it giving
you the preimage solution is 1, and Q queries
are required — then what are the corresponding
epsilon value and number of queries for this
collision to preimage reduction?
The number of queries is easy to see: since
there is one extra query on top of the oracle's
Q queries, it is Q + 1. About the probability
we will make an assumption, which is as follows:
the cardinality of X is at least twice the
cardinality of Y. Then we can show that you
actually get a (1/2, Q + 1) Las Vegas algorithm.
In order to understand that, let us consider a proof.
The proof works like this. Consider
the space X. What we start doing is
partitioning this space; we start inducing
partitions, and you know that we can induce
partitions by equivalence relations.
So you partition the set such that
all the values which lie in one partition
have the same hash value. That means
if two values x1 and x2 lie in one partition,
then h(x1) and h(x2) are the same. That is the
definition of a partition in this case.

Now, how many such partitions are there?
Suppose we label the partitions C1, C2, C3
and so on. Then the number of such classes
is equal to the cardinality of Y, because each
partition corresponds to one particular hash
value. Therefore, the number of classes is
exactly |Y|.
Now, suppose you have been provided with a
particular value x; what is the probability of
success with which you solve the collision
problem? Let [x] denote the equivalence class
of x, that is, the partition in which x lies.
The cardinality of [x] is the total number of
possible values the oracle can return. Except
one, all of them are correct answers for you,
because one of them will be x itself. Do you
understand what I am saying?

To see this, consider one particular partition,
and suppose x lies in it. When you ask the
preimage oracle for a result, it answers with
probability 1, since we assumed it is a Las Vegas
algorithm with success probability 1. Therefore,
it will definitely solve the preimage problem
and return you some value.

What are the possible values it can return?
Exactly the values lying in this partition,
that is, the set [x] with cardinality |[x]|.
Among them, except for the value x itself,
all are correct answers for the collision
problem. Therefore, your probability of success,
given this value of x, is (|[x]| - 1)/|[x]|;
any doubts?
Now we need to consider the average-case
probability, since x is chosen uniformly at
random. If I am interested in the average
probability, I take the sum of this quantity
over the entire set X, weighted by 1/|X|.

We can rewrite this sum as follows. Since there
are |Y| classes, let us break the single sum
over x into two sums: an inner sum in which x
ranges over one particular class C, and an outer
sum in which C ranges over the collection of all
classes; call that collection script C, comprising
C1, C2 and so on, with |Y| classes in total.

What does the inner sum compute to? For every
x in C the summand is (|C| - 1)/|C|, which is a
constant, and there are |C| values of x in C;
so the inner sum is |C| - 1. Therefore the
average probability is (1/|X|) times the sum of
(|C| - 1) over all classes C. We can break this
into two terms: the sum of |C| over all classes,
minus the sum of 1 over all classes.

What does the first term compute to? |X|, because
the classes partition X. What does the second
term compute to? |Y|, because you get a 1 for
each class. Therefore, the average probability
is nothing but (|X| - |Y|)/|X|.
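Collecting the steps above, the averaging argument can be written compactly (using [x] for the class of x and script C for the set of classes):

```latex
\Pr[\text{success}]
  = \frac{1}{|X|}\sum_{x \in X}\frac{|[x]|-1}{|[x]|}
  = \frac{1}{|X|}\sum_{C \in \mathcal{C}}\sum_{x \in C}\frac{|C|-1}{|C|}
  = \frac{1}{|X|}\sum_{C \in \mathcal{C}}\bigl(|C|-1\bigr)
  = \frac{|X|-|Y|}{|X|}.
```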
Now, recall that we made one assumption in the
theorem, which says that |X| is at least twice
|Y|; this is quite a practical assumption to
make. If |X| >= 2|Y|, then (|X| - |Y|)/|X|
equals 1 - |Y|/|X|, which is at least 1 - 1/2.
Therefore, your probability of success is at
least half.
Note one thing: whenever we do these kinds of
reductions, including the ones we have discussed
in class, we have assumed an ideal hash function —
we have assumed that the hash functions are ideal.
Basically, all these proofs are in the random
oracle model. So if the random oracle model is
violated, then these proofs or reductions may not
hold; we have to be careful.

What was the ideal hash function model? The main
assumption was that if you need to compute the
hash of a new input, then no matter how many
previous hash values you have computed, they
should not help you; you have to compute the new
value afresh. Very informally, that was the
meaning.

Now, you can ponder a point: if the preimage
oracle has a success probability epsilon which is
less than 1, then what is the minimum probability
of success of the collision to preimage algorithm?
You can think about this.
Now we go into the construction of iterated
hash functions. We will take up a particular
type of construction known as the
Merkle-Damgård construction. The idea is that
we have a compression function. All these
iterated hash functions have an underlying
concept: there is a compression function, which
takes a large number of input bits and compresses
them to a small number of output bits. By
iterating this, the domain effectively becomes
infinite while the output stays a constant size.

If you see this slide: extending a compression
function to a hash function with an infinite
domain is the objective of the iterated
algorithm. A hash function created in this
fashion is called an iterated hash function.
We will consider hash functions whose inputs and
outputs are bit strings, that is, strings over
{0, 1}.

Again, we know that |x| denotes the length of a
bit string x, and this particular symbol is the
concatenation symbol: x concatenated with y is
written x || y, where x is a string and y is a
string. So these are some notations.
Now let us start with the algorithm. The basic
concept behind this construction is that you have
a compress function, that is, a compress block
which takes in m + t bits and gives you an m-bit
output. So this is a compression function, with
t typically greater than or equal to 1.

Your input x can be quite large, and from it you
are computing the hash value. The first thing you
do with the input string x is break it into
sub-blocks, such that each block has length
exactly t. You understand that the last block may
fall short — it may not be exactly of length t.
Therefore, you do padding: you extend the last
block so that it is also of length t. This
particular step is called the preprocessing step.
So you take x and make the output of the
preprocessing step a bit string whose length is a
multiple of t.
So you see what you do: you take an input string
x where |x| >= m + t + 1. The output string y is
such that |y| is congruent to 0 mod t, which
means it is divisible by t, and you break y up as
y_1, y_2 and so on, up to y_r. Each block is of
length exactly t, so r times t is |y|, where |y|
means the number of bits in the string y.

Then you have certain steps; you perform a sort
of iterated algorithm. You start computing with
y; you now have this y, where each block has
size t.
You have the compress function, which takes
m + t bits and gives you an m-bit output. What
you do is feed to this compress function t bits
from y together with m bits, starting with
something called an IV, or initialization vector.
This results in an m-bit output, which I pass on
as a parameter; suppose I call it z_1. Next, I
take this m-bit value z_1 and the next t bits
from y, pass them to the compress function again,
and obtain another m bits, which I call z_2. So
at each step, the compress function receives
m + t bits of input and again produces an m-bit
output; this is the way you keep iterating the
process.

The final compress step therefore also results in
an m-bit output. Let us write this output as z_r,
because there are r blocks of y. This z_r is
sometimes taken directly as the hash output, or
z_r is given an optional transformation g, so
that g(z_r) is h(x). So g is optional: you can
either make it the identity function or apply
some transformation g. This is the basic idea
behind iterated hash functions.
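The generic scheme just described can be sketched as follows. This is an illustrative sketch, not the lecture's exact algorithm: bit strings are modeled as Python strings of '0'/'1', and the naive zero-padding used here is not injective, so a real preprocessing step (as discussed next) must encode the length as well.

```python
def iterated_hash(x_bits, compress, iv, t, g=lambda z: z):
    """Generic iterated hash, a sketch of the scheme described above.

    compress -- maps an (m + t)-bit string to an m-bit string
    iv       -- public m-bit initialization vector
    t        -- block size; x_bits is padded up to a multiple of t
    g        -- optional output transformation (identity by default)

    All bit strings are Python strings of '0'/'1' characters.
    """
    # Preprocessing (NOT injective as written -- illustration only):
    # pad with zeros so the length is a multiple of t.
    pad_len = (-len(x_bits)) % t
    y = x_bits + '0' * pad_len
    blocks = [y[i:i + t] for i in range(0, len(y), t)]

    z = iv                          # z_0 = IV
    for block in blocks:            # z_i = compress(z_{i-1} || y_i)
        z = compress(z + block)
    return g(z)                     # h(x) = g(z_r)
```

A toy compress function (XOR of the two halves, with m = t = 4) is enough to exercise the chaining; a real construction would of course use a cryptographic compression function.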
This is what I have already told you, and you can
go back to the slide, but let us consider one
typical preprocessing step. You have y and you
have x; what you do is concatenate x with pad(x).
So pad(x) is a simple padding function: it
generally encodes the value |x|, padded to the
left with additional zeros so that the total
length is a multiple of t. So this is a very
simple step, as I have already told you.

Now, note one thing: the main point to be noted
here is that the preprocessing step has to be an
injective function. What does injective mean? It
has to be a one-to-one transformation. Why?
Because if it is not one-to-one, then an
adversary can concentrate on the preprocessing
step itself and create a collision there;
creating a collision would be quite easy in that
case. Therefore, you have to ensure that the
preprocessing step is a one-to-one mapping.

Also note that since the preprocessing step is
injective, |y| = rt must be greater than or equal
to |x|; it cannot be smaller than |x|, as that
would rule out a one-to-one transformation. These
are some minor points.
Now, coming to the original Merkle-Damgård
construction, let us consider it. It is basically
the same iterated hash function technique, but
with some small changes, so let us concentrate.
Again you use a compress function, which maps
m + t bits to m bits. This compress function is
collision resistant, and it is used to construct
a collision resistant hash function. The domain
of your hash function is the set of
arbitrary-size 0/1 strings, from which you obtain
a 0/1 m-bit output. The main advantage, or the
main elegance, of the Merkle-Damgård construction
is that it gives you a wonderful proof.

Before I go into the proof, I will discuss the
construction, but let me first give you the idea
behind what I mean by the term proof. Your
problem is to create a hash function which is
collision resistant. Now, this is quite a big
problem, because you are saying that you will
take a very large string and compress it into a
small m-bit string which still has to be
collision resistant.

The beauty of the Merkle-Damgård construction is
that it reduces this problem to the collision
resistance of the compress function. The compress
function is a much smaller problem: it is just an
m + t bit input which you are compressing into an
m-bit output. The construction gives you a proof
that if you are able to make a compress function
which is collision resistant, then your hash
function is also collision resistant. That is the
basic elegance of the Merkle-Damgård
construction.
Now let us go through the steps of the
construction before going into the proof. The
input x is broken into k blocks x_1, x_2 and so
on. In this case the size of each block is
actually t - 1, not t as we saw in the general
overview of the construction; and the last block
is actually of size t - 1 - d. In order to make
it of size t - 1, what do I have to do? I have to
append d zeros, where d is at most t - 2. That is
the way you break it up. What is the value of k?
Your input has size n, but after padding with d
bits it becomes n + d; therefore (n + d)/(t - 1)
is the value of k, which you can write as the
ceiling of n/(t - 1).
Now let us see the construction; you have to go
through this algorithm a little bit. The compress
function we have taken maps 0/1 strings of m + t
bits to 0/1 strings of m bits. Here we assume
that t is greater than or equal to 2. Ideally t
could be greater than or equal to 1, but we will
exclude the case t = 1 for now. Why consider only
t >= 2? Because if t = 1, then k = ceil(n/(t - 1))
becomes undefined; therefore we will consider
t >= 2 for this case.

You see that n is the size of x, and k equals the
ceiling of n/(t - 1). Then, for each of these
steps, you keep doing certain operations. For
i = 1 to k - 1, you take x_i and assign it to
y_i; that is the preprocessing step. Only for the
k-th block, you take x_k and pad it to the right
with d zeros; if you do that, this also becomes
of size t - 1. Then there is an additional step,
which is called the Merkle-Damgård strengthening
step, so this is called MD strengthening.

What is that MD strengthening step? You take
y_{k+1}, and in y_{k+1} you write the binary
representation of d (d is some value, say 4, 5 or
6), padded to the left with zeros, because you
have to make this also of size t - 1. So all the
y blocks are of size t - 1; that is how the y
blocks are formed.
Then you start your operation. You take the first
block y_1, which has t - 1 bits, and you prepend
m + 1 zeros; the size therefore becomes m + t,
because there are m + 1 bits here and t - 1 bits
there. That means you can feed it to the compress
function, and it gives you back an m-bit value;
call that g_1.

Now you continue the iterative process: you take
g_i, which you obtained in the previous step, and
y_{i+1}, and in between you put a 1 bit. What is
the size of this string? m + t, so again you can
feed it to the compress function and obtain the
m-bit output. Iterating this, you obtain the
corresponding values g_1, g_2, g_3 and so on, up
to g_{k+1}. This final value g_{k+1} is actually
the hash value, that is, the value h(x).
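Putting the steps together, here is a sketch of the construction in Python. This is an illustration under my own toy conventions (bit strings as '0'/'1' strings), not code from the lecture:

```python
def merkle_damgard(x_bits, compress, m, t):
    """Merkle-Damgård construction as described above (requires t >= 2).

    compress maps an (m + t)-bit string to an m-bit string; bit strings
    are Python strings of '0'/'1'. Returns the m-bit hash h(x).
    """
    assert t >= 2
    n = len(x_bits)
    k = -(-n // (t - 1))                       # k = ceil(n / (t - 1))
    d = k * (t - 1) - n                        # number of padding zeros, d <= t - 2

    # Preprocessing: blocks y_1..y_k of size t - 1, last block zero-padded.
    y = [x_bits[i:i + (t - 1)] for i in range(0, n, t - 1)]
    y[-1] = y[-1] + '0' * d
    # MD strengthening: y_{k+1} encodes d, left-padded with zeros to t - 1 bits.
    y.append(format(d, 'b').zfill(t - 1))

    g = compress('0' * (m + 1) + y[0])         # g_1 = compress(0^{m+1} || y_1)
    for block in y[1:]:                        # g_{i+1} = compress(g_i || 1 || y_{i+1})
        g = compress(g + '1' + block)
    return g                                   # h(x) = g_{k+1}
```

The fixed 0^{m+1} prefix and the interleaved 1 bits look arbitrary here, but they are exactly what makes the collision-resistance proof below go through.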
Note that this is a sort of pseudocode way of
presenting the story, but diagrammatically it
looks like this. You take the message, do the
additional padding to the k-th block, and finally
add the length block, which is the y_{k+1}-th
block in the pseudocode. Then you take each of
m_1, m_2, m_3 and so on; this is the IV value,
which in my pseudocode I have taken to be
0^{m+1}, but it is a public value anyway.

Then, you feed this to the compress function and
obtain an m-bit output, take the next t - 1 bits
— with the 1 bit added in between here also — and
continue to obtain the corresponding hash value.
All the common hash functions like SHA, MD5 and
MD4 follow these principles.
Now we come to the slightly more complicated
part, though it is quite a simple proof. It says
that if compress, mapping 0/1 strings of m + t
bits to 0/1 strings of m bits, is a collision
resistant compress function, then this hash
function is a collision resistant hash function.
This is the statement of the theorem: your input
0/1 string can have any length from m + t + 1 up
to infinity — that is the domain — and you are
compressing it to an m-bit output.

What you have to prove is that if the compress
function is collision resistant, then your hash
function is also collision resistant. Like in the
previous class, what you do is take the
contrapositive on both sides: if we assume that
the hash function is not collision resistant,
that is, if we have obtained a collision for the
hash function, then we can show that the compress
function also has a collision.

In other words, if you can find a collision in
the hash function efficiently — where efficiency
is again under my definition of efficiency — then
you can find a collision in the compression
function efficiently as well. How do we prove
this?
In order to prove this, let us go back to my
algorithm and concentrate on certain cases.

You see that in my algorithm I have got y_1 up to
y_{k+1}. Now assume that we have been able to
find a collision for the hash function. What does
that mean? It means you have found two values x
and x' which are not the same but have h(x) equal
to h(x').

Now concentrate on the preprocessing steps of x
and x'. Assume that x results in y and x' results
in y'. Suppose y is y_1 up to y_{k+1}, and y' is
y'_1 up to y'_{l+1}. These can have two arbitrary
lengths, depending on the inputs on which we
performed the operation.
Now consider two cases. Case 1 is when |x| and
|x'| are not congruent modulo t - 1. If you take
|x| mod (t - 1) and |x'| mod (t - 1) and they are
not the same, it means the numbers of padding
bits differ: the value d for x and the value d'
for x' are not the same.

Therefore, y_{k+1} and y'_{l+1} are different.
Now consider the hash output. What was the
corresponding hash value? h(x) was equal to
g_{k+1}, and h(x') was equal to g'_{l+1}. If h(x)
and h(x') are the same, then g_{k+1} equals
g'_{l+1}.

What does that mean? It means that
compress(g_k || 1 || y_{k+1}) equals
compress(g'_l || 1 || y'_{l+1}). But we know
y_{k+1} and y'_{l+1} are different. So what does
it mean? It means we have been able to find two
distinct inputs which give the same compress
output. Therefore, what you have shown is that
the compression function is not collision
resistant; this settles one case.
What about the other case now? Is this part
clear? The other case is when |x| and |x'| are
congruent modulo t - 1. This will actually be
easier if I break it into two parts. Case 2a is
when |x| equals |x'|; what does that imply? It
implies that k and l are the same.

Now observe: you are saying that g_{k+1} equals
g'_{l+1}, and since l equals k in this case,
g_{k+1} equals g'_{k+1}. What does that mean?
Writing comp for the compress function, it means
that comp(g_k || 1 || y_{k+1}) equals
comp(g'_k || 1 || y'_{k+1}).

Here, note that y_{k+1} and y'_{k+1} are the same
values. So if you say this is not a collision for
the compress function, then definitely g_k and
g'_k must be equal. Given that, we can continue
in the same way: comp(g_{k-1} || 1 || y_k) equals
comp(g'_{k-1} || 1 || y'_k); this is the idea.

Now, see what you have. If you still do not want
to concede a collision of the compress function,
then two things follow: y_k and y'_k must be the
same, and g_{k-1} must equal g'_{k-1}. That is,
if this is not a collision, then y_k = y'_k and
g_{k-1} = g'_{k-1}.
Continuing in this fashion — since the lengths in
both cases are the same, k and l are equal — I
will continue until g_1 equals g'_1. At each step
we obtain y_k = y'_k, then y_{k-1} = y'_{k-1},
and finally y_1 = y'_1.

What we are saying now is that y and y' are the
same, because each sub-block is the same in both
cases. If y and y' are the same, then because my
preprocessing step was an injective
transformation, this implies that x and x' are
the same values. But that contradicts my
assumption, because I assumed that x and x' are
different; so you see a contradiction, you
follow? This part is also settled: there must be
a collision in the compress function, and you are
able to show this reduction.
What about the remaining case? The other part,
case 2b, is when |x| is not equal to |x'|, though
they are congruent modulo t - 1. This proof is
actually quite similar to case 2a, except for the
fact that starting from g_{k+1} equal to
g'_{l+1}, if I assume without loss of generality
that l is greater than k, then continuing
backwards in the same fashion, what I will obtain
is that g_1 equals g'_{l-k+1}.

What is g_1, and what is g'_{l-k+1}? If you look
at your algorithm and see how it starts: the
first step is that you take 0^{m+1} and
concatenate y_1 to it, and that is what you
compress to obtain g_1. So g_1 is the compression
of 0^{m+1} || y_1. What about the other one? That
is simply compress(g'_{l-k} || 1 || y'_{l-k+1}).

Now compare these two compress inputs. In the
first, the initial m + 1 bits are all 0, followed
by the t - 1 bits of y_1. In the second, there
are the m bits of g'_{l-k}, then a 1 bit, then
the t - 1 bits of y'_{l-k+1}.
Now observe the (m + 1)-th bit: the (m + 1)-th
bit in the first string is 0, but the (m + 1)-th
bit in the second string is 1; therefore, the two
inputs are definitely different. So you have
again found a collision in the compress function.
Now you see why the 1 bit was used, and why
0^{m+1} was used here; you understand that from
the proof.

This gives you the result that the collision
resistance of your compress function implies that
your iterated hash function is also collision
resistant.

This is again under the ideal assumptions: we are
again assuming that each hash computation is an
independent computation. If you find that there
are dependencies in the hash function, then all
these proofs break down.
Now we will conclude with the case t = 1. I think
the earlier part is clear to us, so let us
consider the case when t equals 1. When t is 1,
you take a compress function from 0/1 strings of
m + 1 bits to 0/1 strings of m bits. The
preprocessing step is slightly different. What
you do is take the two bits 1 1, and after that
you apply a particular function f to each bit:
f(x_1), f(x_2) and so on. Here |x| = n means
there are n bits, and for each bit you compute
f(x_1), f(x_2) and so on. The encoding is: if the
input bit is 0, then the output is 0, but if the
input bit is 1, then the output is 0 1.

Do you understand the encoding? Suppose your
input is a string like 0 1 0 1 1. If you apply
this encoding f, then for your 0 you get 0, for
your 1 you get 0 1, for your 0 you again get 0,
for your 1 you get 0 1, and for the last 1 you
get 0 1; this is the way you encode the input x.
So in this case you take the value of x and
encode it, with the only addition being that the
first two bits are 1 1.
So why do you keep these two leading bits? Two
things are important. First, the encoding is an
injective encoding. Second, there do not exist
two different strings x and x' such that y(x) and
y(x') satisfy a relation of the form
y(x) = z || y(x'), where z is any arbitrary
string. You see that this can never happen; that
is, no encoding is a postfix of another encoding.
Why? Because y(x) obviously starts with two ones,
and y(x') also starts with two ones, but within
the body of y(x), two successive ones can never
occur. Therefore, this relation can never be
satisfied; do you follow?

What I am saying is that if you take two values x
and x', it can never be the case that y(x) and
y(x') satisfy this relation — that y(x') is a
postfix of the other encoding — for the reason
that two consecutive ones can never occur after
the preprocessing step. It is quite simple. So
you obtain this y, and each of its blocks y_i is
now of 1 bit.
So, what you do is that you take 0 to the power
m, that is, m zero bits, and concatenate y 1,
which is 1 bit, so that is m plus 1 bits; you
compress and obtain g 1, and you continue this
process. You take g i and concatenate it with
y i plus 1; g i is m bits and y i plus 1 is
1 bit, so this is m plus 1 bits. You engage the
compress function, which takes an m plus 1 bit
input and gives an m bit output, and obtain
g i plus 1. Finally, you return g k as the
corresponding hash value.
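Putting the pieces together, here is a toy sketch of this iteration in Python. The compress function is only a stand-in built from SHA-256 truncated to m bits, an assumption made purely for illustration; the construction assumes any collision resistant compress mapping m plus 1 bits to m bits:

```python
import hashlib

M = 16  # the chaining length m, a toy value for illustration

def compress(bits):
    """Stand-in compress: maps an (m+1)-bit string to an m-bit string."""
    assert len(bits) == M + 1
    digest = hashlib.sha256(bits.encode()).digest()
    return ''.join(format(b, '08b') for b in digest)[:M]

def encode(x):
    # y(x) = 11 || f(x), with f(0) = 0 and f(1) = 01
    f = {'0': '0', '1': '01'}
    return '11' + ''.join(f[bit] for bit in x)

def md_hash(x):
    y = encode(x)
    g = compress('0' * M + y[0])   # g_1 = compress(0^m || y_1)
    for bit in y[1:]:              # g_{i+1} = compress(g_i || y_{i+1})
        g = compress(g + bit)
    return g                       # the hash value is g_k
```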
Now, this hash function we can actually show to
be collision resistant. The proof is quite simple;
I will not go into the details of the proof, you
can find the details in Stinson's book. The idea
is that you again divide into two cases. Assume
that there is a collision for the hash function,
which means that there are two distinct values
x and x dash which result in the same hashed
output. Now, either the sizes of x and x dash
are the same or they are different. If the sizes
are the same, then the proof is exactly the same
as before: you start from g k equal to g k dash,
and working backwards you can show that y and
y dash will be the same, and you continue in that
fashion. But what about the other case, when the
sizes are not the same?
If the sizes are not the same, suppose the size
of the encoding in one case is k and in the other
case is l, with k less than l. What you do is
start with g k and g l dash. Since you are
assuming that the hash function collides for the
two values x and x dash, g k is equal to g l
dash. What does g k equal to g l dash mean? It
means that, unless we have found a collision for
compress, correspondingly y k will be equal to
y l dash and g k minus 1 will be equal to g l
minus 1 dash. Similarly, you can continue, and
finally you will have y 1 equal to y of l minus
k plus 1 dash. Now, this means that you are
violating the postfix property, because here you
have got one string which is actually the postfix
of the other string. You see that one is a
shorter y string and the other is a longer y
string, and the y string in one case forms the
last part of the y string in the other case.
Therefore, if you consider y dash, you can write
y dash as some z followed by the y string. This
violates the property of the encoding function;
it cannot happen, therefore you are again showing
a contradiction. This establishes the fact that
if the compress function is collision resistant,
then your hash function is also collision
resistant.
This is what the theorem says. If you put the
two theorems together, then now you have actually
covered t greater than or equal to one, because
the previous construction was for t greater than
or equal to two and this construction is for
t equal to one. So, this is what you have shown:
if the compress function is collision resistant,
then so is the hash function. You can actually
compute the number of times compress is invoked
in each of the cases; these are quite simple
calculations, which we can just check, counting
how many times the compress function is engaged.
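For this t equal to one construction, the count is easy to see: there is one compress call per bit of y of x, and y of x has 2 bits of prefix plus one bit for each 0 and two bits for each 1 of x. A small sketch with a hypothetical helper name:

```python
def compress_calls(x):
    """Number of compress invocations for an input bit string x:
    one per bit of the encoding y(x) = 11 || f(x)."""
    ones = x.count('1')
    # between n + 2 (all zeros) and 2n + 2 (all ones) for an n-bit x
    return 2 + len(x) + ones
```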
This is an assignment for you again. Consider a
collision resistant function g of x, which takes
a string of arbitrary length and results in an
n bit output. Consider a hash function h of x,
which is 1 concatenated with x if x is of n bits;
otherwise it is 0 concatenated with g of x. Now,
the question is that you have to discuss the
collision resistance and the preimage resistance
of h of x. You see that since g of x is collision
resistant, h of x is also collision resistant.
But what about the preimage problem? You see that
in one case the output starts with a 1. Therefore,
if I assume that 50 percent of the time you get
an output starting with 1 and 50 percent of the
time one starting with 0, then whenever you get
1 concatenated with x, the inverse is trivially
clear: with probability at least 50 percent you
are able to invert the function h of x.
So, what does it mean? This is not preimage
resistant. But what have we proved? We have
proved that collision resistance implies preimage
resistance. So, there is an apparent anomaly in
this particular example with respect to what we
have proved in the class, and you are supposed
to justify that.
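To make the assignment concrete, here is a toy sketch in Python. The inner function g is only a stand-in (SHA-256 truncated to n bits, an assumption for illustration), and the point is just that any hash value beginning with 1 inverts for free:

```python
import hashlib

N = 8  # the block size n, a toy value

def g(x):
    """Stand-in for the collision resistant g: arbitrary-length
    input -> n-bit output (assumption: truncated SHA-256)."""
    digest = hashlib.sha256(x.encode()).digest()
    return ''.join(format(b, '08b') for b in digest)[:N]

def h(x):
    # h(x) = 1 || x    if x is of n bits
    #      = 0 || g(x) otherwise
    return '1' + x if len(x) == N else '0' + g(x)

def try_invert(y):
    """Whenever the hash value starts with 1, stripping that bit
    recovers a preimage immediately."""
    return y[1:] if y.startswith('1') else None
```

Note that a collision in h still forces a collision in g: the two branches can never collide with each other, since their outputs differ in the first bit.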
Next time, we will again continue with hash
functions. This is the textbook that we have
followed, and this is some paper that I have
followed. The main book is the textbook; you can
refer to it and you will find these details
there. So, we will again continue with
cryptographic hash functions next time.
