The answer is the attacker can learn both the length of m and which blocks
in the message are equal. The attacker learns the length because the length
of this ciphertext is actually equal to the length of m.
The output for encryption functions is the same length as its input
so we know that these lengths are equal. This assumes that the
file contents m are a whole number of blocks, since we're dividing m into
blocks. If they're not, we'll have to do something about that.
We'll get to that soon, but the solution is to add padding.
The other issue, which is a more serious one in this case,
is that the attacker learns which blocks in m are equal.
This may not sound like such a big problem. If we're thinking of
the blocks as 128 random bits, maybe the probability that two blocks are equal
is actually very low. We'll see later in this class that it's not quite as low
as you might guess, but the message is not random bits. The message, or the file,
if we're thinking of these as 8-bit characters, well,
128 bits is only 16 characters. If they're Unicode characters, well then we have
more than one byte per character. This could get down to
a pretty small number of characters. There are certainly lots of sequences
in many files that would be 16 characters long
and that could be repeated. So this is a pretty large amount of information to leak.
We want to find a way to encrypt our file that doesn't do that.
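
To make the leak concrete, here is a minimal sketch in Python of encrypting a message one block at a time. It does not use AES: toy_block_encrypt is a hypothetical stand-in built from HMAC-SHA256, chosen only because it is in the standard library and, like a block cipher under a fixed key, always maps the same 16-byte input to the same 16-byte output. The key and the message are made up for the example.

```python
import hashlib
import hmac

BLOCK = 16  # 128-bit blocks, as in the lecture


def toy_block_encrypt(key, block):
    # Stand-in for AES (assumption, not a real cipher and not invertible):
    # a deterministic keyed function from 16 bytes to 16 bytes.
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


def encrypt_blockwise(key, m):
    # Encrypt each block independently with the same key.
    assert len(m) % BLOCK == 0, "message must be a whole number of blocks"
    return [toy_block_encrypt(key, m[i:i + BLOCK])
            for i in range(0, len(m), BLOCK)]


def repeated_block_positions(blocks):
    # What an eavesdropper can compute from the ciphertext alone:
    # groups of positions whose ciphertext blocks are identical.
    seen = {}
    for i, b in enumerate(blocks):
        seen.setdefault(b, []).append(i)
    return [pos for pos in seen.values() if len(pos) > 1]


key = b"an example key!!"
m = b"ATTACK AT DAWN!!" + b"RETREAT AT NOON!" + b"ATTACK AT DAWN!!"
c = encrypt_blockwise(key, m)
leak = repeated_block_positions(c)
print(leak)  # prints [[0, 2]]
```

Without knowing the key, the attacker sees that blocks 0 and 2 of the message are equal, and the total ciphertext length (48 bytes) matches the length of m.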
Ways of using ciphers like this are called Modes of Operation.
The one that I just described is known as Electronic Codebook Mode.
And the reason for that is that you can think of having a book
in which, for each input message of 128 bits (there are 2^128 possible inputs),
you would be able to look up the value of encrypting that message.
So you can have a codebook that gives you for each input the 128-bit
output that corresponds to that input.
And this would be a really big book. If we assume we could get about 10,000
entries on a page, and each page weighs five grams,
well, this would be about 1.7 * 10^32 kg to carry around.
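
The arithmetic behind that figure can be checked in a few lines of Python. The variable names are just for illustration, and the count of entries is taken as 2^128, one per possible 128-bit input block.

```python
entries = 2 ** 128            # one codebook entry per possible 128-bit input
entries_per_page = 10_000     # figure assumed in the lecture
grams_per_page = 5            # figure assumed in the lecture

pages = entries / entries_per_page
mass_kg = pages * grams_per_page / 1000

print(f"{mass_kg:.1e} kg")  # prints 1.7e+32 kg
```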
And that's just for one key. So it's really not practical to think of this as
being a physical codebook today, but this is really the same thing
that early codebooks did. They just provided a one-to-one mapping
between inputs and outputs. And that's exactly how we're using AES here:
to provide that mapping. We're using the same mapping every time, so whenever
we need to encrypt the same value, we're getting the same output.
This is the problem that we mentioned earlier,
that it doesn't hide repetition in the message.
Another problem it has is that an attacker could scramble the ciphertext:
move blocks around, replace blocks, or otherwise change things,
and it would still decrypt to a perfectly valid message,
just with the blocks in a different order.
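
Here is a small Python sketch of that reordering attack. Since the standard library has no AES, it uses a toy 4-round Feistel cipher built from HMAC-SHA256 as a stand-in; the cipher, the function names, the key, and the message are all assumptions made for illustration, and the toy cipher should not be used for real encryption. The point is only that each block is encrypted independently, so blocks can be rearranged without the key.

```python
import hashlib
import hmac

BLOCK = 16          # 128-bit blocks
HALF = BLOCK // 2
ROUNDS = 4


def _round(key, rnd, half):
    # Feistel round function: keyed hash of (round number, half block).
    return hmac.new(key, bytes([rnd]) + half, hashlib.sha256).digest()[:HALF]


def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt_block(key, block):
    # Toy invertible block cipher (stand-in for AES, not secure).
    left, right = block[:HALF], block[HALF:]
    for rnd in range(ROUNDS):
        left, right = right, _xor(left, _round(key, rnd, right))
    return left + right


def toy_decrypt_block(key, block):
    # Undo the rounds in reverse order.
    left, right = block[:HALF], block[HALF:]
    for rnd in reversed(range(ROUNDS)):
        left, right = _xor(right, _round(key, rnd, left)), left
    return left + right


def split_blocks(data):
    assert len(data) % BLOCK == 0, "data must be a whole number of blocks"
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]


def ecb_encrypt(key, m):
    return b"".join(toy_encrypt_block(key, b) for b in split_blocks(m))


def ecb_decrypt(key, c):
    return b"".join(toy_decrypt_block(key, b) for b in split_blocks(c))


key = b"an example key!!"
m = b"PAY ALICE $0100." + b"PAY BOB   $9999."
c = ecb_encrypt(key, m)

# The attacker swaps the two ciphertext blocks without knowing the key.
blocks = split_blocks(c)
tampered = blocks[1] + blocks[0]

print(ecb_decrypt(key, tampered))  # prints b'PAY BOB   $9999.PAY ALICE $0100.'
```

The receiver has no way to tell from decryption alone that the blocks were rearranged, because every block still decrypts to a well-formed block of the original message.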
So those are problems with the Electronic Codebook Mode.
We're going to look at some alternatives that avoid some of those problems.
