building a true RNG
John S. Denker
jsd at monmouth.com
Mon Jul 29 13:37:49 EDT 2002
Barney Wolff asked:
> >Do we even know that the popular hash functions can actually generate
> >all 2^N values of their outputs?
David Wagner replied:
>
> It seems very unlikely that they can generate all 2^N outputs
> (under current knowledge).
I was temporarily astonished, but he clarified as follows:
> Q2: If we cycle through all messages (possibly very long
> or very short), are all 2^N output values possible?
...
> I'd guess that the answer
> to Q2 is probably "Yes", or close to it.
1) Consider the following function H0: Divide the input into
chunks N bits long. Calculate the XOR of all the chunks, and
use that as the output.
This meets the definition of hash function, although it would
not be a one-way hash function. And it would most certainly
be capable of generating all 2^N possible outputs.
2) I can't prove that a standard hash function such as SHA1
generates all possible codes, but I consider it likely. It would
be quite shocking if a strong hash function such as SHA1 generated
fewer codes than a weak function such as H0.
3) For a one-way hash function should not expect a _constructive_
proof that it generates all possible codes; such a construction
would violate the one-way property.
4) Here is a rough plausibility argument. Consider two hash
functions H_1 and H_2 that are independent. We define
H'(x) := H_1(x) XOR H_2(x) (1)
which implies
H_1(x) = H'(x) XOR H_2(x) (2)
Now let's look at the truth table for equation (2), where
the row-index is the H' code, the column-index is the H_2
code, and each entry represents the H_1 code required
to uphold the equation:
00 01 10 11
00 00 01 10 11
01 01 00 11 10
10 10 11 00 01
11 11 10 01 00
Now let's suppose H_2 is missing one code (say the 10 code) and
H' is missing one code (say the 11 code). Then H_1 must be missing
at least three codes! Otherwise there would be a way of combining
a non-missing H_1 code with a non-missing H_2 code to create the
missing H' code.
00 01 10m 11
00 00 01 10 11
01 01 00 11 10
10 10 11 00 01
11m 11m 10m 01 00m
We can extend this argument by combining lots and lots of
independent hash functions H_i. The combination has far
fewer missing codes than any of the ingredients. So either
you conclude that
a) there is a conspiracy that prevents us from constructing
"independent" hash functions to use as ingredients, or
b) we can produce hash functions with very few missing codes.
---------------------------------------------------------------------
The Cryptography Mailing List
Unsubscribe by sending "unsubscribe cryptography" to majordomo at wasabisystems.com
More information about the cryptography
mailing list