entropy depletion

John Denker jsd at av8n.com
Sat Jan 8 11:29:33 EST 2005


Zooko O'Whielacronx wrote:
> I would love to have an information-theoretic argument for the security 
> of my PRNG, but that's not what we have,

Yes, and I'd like my goldfish to ride a bicycle, but he can't.
The P in PRNG is for Pseudo, and means the PRNG is relying
on computational intractability, not entropy.

>  and I don't think reducing the 
> entropy_count by one bit per output bit gets us any closer to such an 
> argument.

For PRNGs that's true ... which is why I have introduced the
terminology of SRNG, meaning Stretched RNG, which is a
hermaphrodite, being partly PRNG and partly HESG (High-Entropy
Symbol Generator).  Linux /dev/urandom is an early and
less-than-satisfactory attempt at a SRNG.  Yarrow is vastly
better.
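To make the "stretching" idea concrete, here is a minimal toy sketch (not
how /dev/urandom or Yarrow actually work internally): a small amount of
real entropy seeds the state, and a hash function stretches it into an
arbitrarily long output stream.  The security of everything past the seed
rests on the hash being computationally hard to invert, not on entropy.

```python
import hashlib

class StretchedRNG:
    """Toy sketch of a 'stretched' RNG.  A short high-entropy seed is
    stretched into a long output stream by iterating SHA-256.
    Illustrative only; names and structure are my own, not any real
    generator's internals."""

    def __init__(self, seed: bytes):
        self.state = hashlib.sha256(seed).digest()

    def read(self, n: int) -> bytes:
        out = b""
        while len(out) < n:
            # Derive output from the state, then advance the state so
            # that earlier states cannot be recovered from later ones.
            out += hashlib.sha256(b"out" + self.state).digest()
            self.state = hashlib.sha256(b"next" + self.state).digest()
        return out[:n]
```

Note that the output is a deterministic function of the seed, which is
exactly why the entropy accounting below matters.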

> For starters, the entropy_count value before you output the bit is 
> obviously not a measure of the real information-theoretic entropy in the 
> PRNG's state.  It is a guess implemented by a simple algorithm (the 
> "entropy estimator").  So if we set the resulting entropy_count after 
> outputting one bit to be equal to the previous entropy_count - 1, we 
> really have no justification for thinking that the resulting 
> entropy_count is any closer to the true information-theoretic entropy 
> than if you had set it equal to the previous entropy_count - 2.
> 
> On the other hand, I haven't heard an information-theoretic argument 
> that the output bit contains a whole bit of entropy. 

Well, that depends on details.  Suppose you seed your PRNG with
1000 bits of entropy, and then put out 50,000 bits of output.
For one typical class of PRNGs, which includes linux /dev/urandom
and excludes yarrow, the first 1000 bits of output will use up
all the entropy, in the sense that the mapping from seeds to
output-strings will be very nearly one-to-one.  To say the
same thing another way, the adversaries are trying to find the
seed by brute-force search of seed-space; if they've guessed
the wrong seed, they'll know it with very high probability after
seeing the first 1000 bits of output.  (They could also wait
and know the same thing from the second 1000 bits of output,
so we can have metaphysical arguments about just where the
entropy resides ... but since the adversaries get to see the
output stream in order, our minimax strategy is to assume
they are not stupid and have used the first 1000 bits to
reduce *their* entropy.  The number that I'm decrementing by
one bit per bit is my minimax estimate of the adversaries'
entropy.  If you have some other number that you would like
to decrement by some other rule, that's fine ... but I think
I'm doing the right thing with my number.)
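The brute-force argument above can be sketched in a few lines.  This is a
toy generator of my own construction (hypothetical, not the real
/dev/urandom): the adversary enumerates the seed-space and eliminates any
seed whose output disagrees with the observed prefix, because the
seed-to-output mapping is very nearly one-to-one.

```python
import hashlib

def toy_prng(seed: int, nbytes: int) -> bytes:
    # Deterministic toy generator: the output depends only on the seed.
    out, state = b"", seed.to_bytes(4, "big")
    while len(out) < nbytes:
        state = hashlib.sha256(state).digest()
        out += state
    return out[:nbytes]

def surviving_seeds(observed: bytes, seed_bits: int):
    # The adversary tries every seed in the seed-space.  A wrong seed
    # survives only if its output happens to match the observed prefix,
    # which for a well-mixed generator is exponentially unlikely.
    return [s for s in range(2 ** seed_bits)
            if toy_prng(s, len(observed)) == observed]

# With 10 bits of seed entropy, observing 2 output bytes (16 bits)
# typically suffices to pin down the seed.
secret = 679
observed = toy_prng(secret, 2)
print(surviving_seeds(observed, 10))
```

Once the surviving-seed list shrinks to one entry, the adversary's
uncertainty about the seed is zero, no matter how many more output bits
follow; that is the sense in which the first outputs "use up" the entropy.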

Yarrow, in contrast, might well be configured to use only
100 bits of entropy for the first 50,000 bits of output,
and then reseed itself with another 100 bits for the
second 50,000 bits of output.  This leads to a notion of
average entropy density of the output, which is greater than
zero but less than 100%.
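The entropy-density figure is just a ratio; using the numbers above:

```python
# Average entropy density of a Yarrow-style generator that injects
# 100 bits of fresh entropy per 50,000 bits of output (the figures
# from the text): greater than zero, far less than 100%.
entropy_in = 100            # bits of fresh entropy per reseed
output_per_reseed = 50_000  # bits of output between reseeds
density = entropy_in / output_per_reseed
print(f"{density:.2%}")     # 0.20%
```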

> There is a nice 
> practical-cryptography argument that an observer gains a whole bit of 
> information from seeing that output bit, but in pure 
> information-theoretic terms an observer gains less than one bit of 
> information from seeing that output.  So perhaps when you output a bit 
> from /dev/random you ought to decrement entropy_count by 0.9 instead?

That is not minimax, as explained above.

> In general, I've heard no persuasive information-theoretic argument to 
> justify the practice of decrementing the entropy_count by 1 bit per 
> bit.  

See above.

> If that practice does indeed ever protect someone from a
> cryptanalytic attack on their PRNG, it will not be because the practice 
> is Information Theoretically Correct, but because the 
> entropy_count-bits-added-per-input-bit minus the 
> entropy_count-bits-subtracted-per-output-bit were an engineering fudge 
> factor that was turned up high enough to drown out the cryptanalytic 
> weakness in the PRNG.

I'm not sure I would have said it that way, but I might agree
with the sentiment.  If the PRNG were secure against
cryptanalysis, including "practical cryptanalysis" such as
peeking at the state vector, then you wouldn't need more
than an infinitesimal entropy density.

> Of course using such a fudge factor has some other costs, including the 
> cost of introducing new security risks.  I estimate that the chance of a 
> successful attack due to timing attacks, induced failure, taking 
> advantage of accidental failure, social engineering, etc. outweighs the 
> chance of a successful attack due to cryptanalysis of the PRNG, which is 
> why I use /dev/urandom exclusively [*, **].  You may weigh those 
> trade-offs differently, but you shouldn't think that by decrementing 
> entropy_count you are achieving information-theoretic security.

> [*] Of course I have to be aware of the regrettable possibility that 
> /dev/urandom has *never* been properly seeded and protect against that 
> in user space.
> [**] ... and the possibility that the operating system is re-using 
> stored random state which it already used just before an unclean shutdown.

Meanwhile, implementing a High-Entropy Symbol Generator (HESG)
would relieve you of worrying about those things ... and of
other things that you ought to be worried about besides.


---------------------------------------------------------------------
The Cryptography Mailing List
Unsubscribe by sending "unsubscribe cryptography" to majordomo at metzdowd.com


