Extracting uniform randomness from noisy source

Tue Aug 6 01:00:29 EDT 2002

Amir Herzberg  wrote:
>Oops, sorry, I wasn't clear above; AES_k is of course using the CBC_MAC 
>using AES.

I don't believe this is secure, either.

Notice that you never gave a security theorem, or a proof of security.
If the goal is a construction with a rigorously justified security
analysis, this deficiency should be a warning sign.  And in fact, there
are some attacks, as I show below.

For instance, suppose all the entropy of the input is contained in just
one block: say, the first block varies, and all other blocks are fixed
at all-zeros.  Then we can mount the same attack I described before to
distinguish a pseudorandom output from a true-random value.  Simply take
the output, decrypt it backwards as far as you can go, assuming all-zeros
in the trailing blocks, and the result will be a candidate value for
the first block of input; if it looks like a plausible input sample,
then you probably received an output from the pseudorandom source,
else you probably received true-random bits.

This same attack works even if the varying block is in the middle.
So the first lesson is that it does not suffice just to have enough
entropy in the input; it must be well-spread across blocks.

But being well-spread across blocks is no guarantee of security, either.
Imagine: you have a semi-random source, where each 16-bit word has 1
bit of entropy, and sampled words are independent.  To accumulate enough
entropy, you sample 80 words from this source, and then compress it using
AES-CBC-MAC.  Well, this succumbs to a 2^40 meet-in-the-middle attack.
Suppose you're given an output and you want to guess whether it came
from the pseudorandom process or from a true-random process.  First,
enumerate all 2^40 possibilities for the last 40 words, and decrypt
backwards through 5 blocks to obtain a list of 2^40 candidate values for
the intermediate value halfway through the CBC-MAC.  Next, enumerate
all 2^40 possibilities for the first 40 words, and encrypt forwards
through 5 blocks to obtain another list of 2^40 candidate values for
the same intermediate value.  If there is any value that appears in both
lists, conclude that you were probably given bits from the pseudorandom
AES-based process.  If not, conclude that you were probably given bits
from a true-random source.

This second attack shows that not only must the entropy be well-spread
out, but you must have more entropy than you need, by a factor of two.
In other words, the AES-CBC-MAC construction is wasteful of entropy.

SHA1 hashing does not have any of these problems.  And moreover, the lack
of a security theorem (or proof) for the AES-CBC-MAC construction raises
the question of whether there are any more attacks we haven't noticed yet.

In short, despite all the theoretical problems with rigorously justifying
SHA1-hashing, I believe in practice it is by far the best technology
available.  If asked by any practitioner, I would gladly recommend
SHA1-hashing over the alternatives.

---------------------------------------------------------------------
The Cryptography Mailing List
Unsubscribe by sending "unsubscribe cryptography" to majordomo at wasabisystems.com