building a true RNG

John S. Denker jsd at monmouth.com
Mon Jul 29 13:37:49 EDT 2002


Barney Wolff  asked:
> >Do we even know that the popular hash functions can actually generate
> >all 2^N values of their outputs?

David Wagner replied:
> 
> It seems very unlikely that they can generate all 2^N outputs
> (under current knowledge).  

I was temporarily astonished, but he clarified as follows:

>   Q2: If we cycle through all messages (possibly very long
>       or very short), are all 2^N output values possible?
...
> I'd guess that the answer
> to Q2 is probably "Yes", or close to it.

1) Consider the following function H0:  Divide the input into
chunks N bits long.  Calculate the XOR of all the chunks, and
use that as the output.

This meets the definition of hash function, although it would
not be a one-way hash function.  And it would most certainly
be capable of generating all 2^N possible outputs.

2) I can't prove that a standard hash function such as SHA1
generates all possible codes, but I consider it likely.  It would 
be quite shocking if a strong hash function such as SHA1 generated
fewer codes than a weak function such as H0.

3) For a one-way hash function should not expect a _constructive_ 
proof that it generates all possible codes;  such a construction
would violate the one-way property.

4) Here is a rough plausibility argument.  Consider two hash
functions H_1 and H_2 that are independent.  We define
	H'(x) := H_1(x) XOR H_2(x)		(1)
which implies
	H_1(x) = H'(x) XOR H_2(x)		(2)

Now let's look at the truth table for equation (2), where
the row-index is the H' code, the column-index is the H_2
code, and each entry represents the H_1 code required
to uphold the equation:

	     00   01   10   11
	00   00   01   10   11
	01   01   00   11   10
	10   10   11   00   01
	11   11   10   01   00

Now let's suppose H_2 is missing one code (say the 10 code) and
H' is missing one code (say the 11 code).  Then H_1 must be missing
at least three codes!  Otherwise there would be a way of combining
a non-missing H_1 code with a non-missing H_2 code to create the
missing H' code.

	     00   01   10m  11
	00   00   01   10   11
	01   01   00   11   10
	10   10   11   00   01
	11m  11m  10m  01   00m

We can extend this argument by combining lots and lots of 
independent hash functions H_i.   The combination has far 
fewer missing codes than any of the ingredients.  So either
you conclude that 
  a) there is a conspiracy that prevents us from constructing
     "independent" hash functions to use as ingredients, or
  b) we can produce hash functions with very few missing codes.

---------------------------------------------------------------------
The Cryptography Mailing List
Unsubscribe by sending "unsubscribe cryptography" to majordomo at wasabisystems.com



More information about the cryptography mailing list