/dev/random is probably not

John Denker jsd at av8n.com
Mon Jul 4 11:22:39 EDT 2005


On 07/03/05 15:19, Dan Kaminsky wrote:

 > So the funny thing about, say, SHA-1, is if you give it less than 160
 > bits of data, you end up expanding into 160 bits of data, but if you
 > give it more than 160 bits of data, you end up contracting into 160 bits
 > of data.  This works of course for any input data, entropic or not.
 > Hash saturation? ....

I don't know what it means to talk about "data, entropic
or not".  That's because for the purpose of analyzing
randomness generators, nobody cares whether the input has
a large amount of "data"; what matters is _entropy_.

If you feed the hash function less than 160 bits of
entropy, it will not "end up expanding" it to 160 bits
of entropy.  No function can expand the amount of entropy.
For a hash function with output width W=160 bits, if the
input has 5 bits of entropy the output will have very nearly
5 bits of entropy.  The input/output relationship is very
nearly linear, with unit slope, until we get close to
saturation.
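A quick numerical sketch (mine, not from the original post) makes the point concrete: feed SHA-1 exactly 2**5 equiprobable inputs, i.e. 5 bits of entropy, and the output distribution can have at most 5 bits of entropy, because there are at most 32 distinct digests no matter how wide the 160-bit output is.

```python
# Sketch: hashing cannot increase entropy.  With 2**K equiprobable
# inputs (K bits of input entropy), the number of distinct SHA-1
# digests is at most 2**K, so output entropy <= K bits.
import hashlib
import math

K = 5  # input entropy in bits: 2**K equiprobable messages
digests = {hashlib.sha1(i.to_bytes(4, "big")).digest() for i in range(2 ** K)}

# With overwhelming probability there are no collisions among 32
# inputs, so the output entropy is log2(len(digests)) = K bits.
output_entropy = math.log2(len(digests))
print(output_entropy)  # 5.0
```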

 > Hash saturation?  Is not every modern hash saturated with as much
 > entropy it can assume came from the input data   (i.e. all input bits
 > have a 50% likelihood of changing all output bits)?

That's not what "saturation" means.  I introduced and defined
the term "hash saturation", so I ought to know.  Saturation
is what happens when the input entropy is large, such that
the output entropy smashes up against the horizontal asymptote.
This is discussed in detail at
  http://www.av8n.com/turbid/paper/turbid.htm#sec-saturation
along with a tabulated numerical example.
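The saturation curve is easy to reproduce on a toy scale. The following sketch (my construction, not the turbid code) uses an 8-bit-wide hash, namely the first byte of SHA-256, standing in for W=160: as the input entropy n grows past W, the output entropy flattens against the horizontal asymptote at W bits.

```python
# Sketch: hash saturation with a toy W = 8 bit hash.  Output entropy
# tracks input entropy with unit slope until n approaches W, then
# smashes up against the asymptote at W bits.
import hashlib
from collections import Counter
from math import log2

W = 8  # toy output width in bits

def toy_hash(i: int) -> int:
    # First byte of SHA-256: a hash with an 8-bit-wide output.
    return hashlib.sha256(i.to_bytes(8, "big")).digest()[0]

results = {}
for n in (2, 4, 6, 8, 10, 12):
    counts = Counter(toy_hash(i) for i in range(2 ** n))
    total = 2 ** n
    # Shannon entropy of the output distribution, in bits.
    h = -sum(c / total * log2(c / total) for c in counts.values())
    results[n] = h
    print(f"input entropy {n:2d} bits -> output entropy {h:.3f} bits")
```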

 > Incidentally, that's a more than mild assumptoin that it's pure noise
 > coming off the sound card.

I have no idea what is meant by "pure noise", but I'm
pretty sure I didn't assume any such thing.

 >  It's not, necessarily, not even at the high
 > frequencies.  Consider for a moment the Sound Blaster Live's E10K chip,

I recommend against using any Creative Labs (aka SoundBlaster)
products for any serious or even halfway-serious purpose.  Far
too many of their products have deceptive specifications, and/or
don't meet specifications at all.

 > internally hard-clocked to 48khz.  This chip uses a fairly simple
 > algorithm to upsample or downsample all audio streams to 48,000 samples
 > per second.  It's well known that scaling algorithms exhibit noticable
 > properties -- this fact has been used to detect photoshopped works, for
 > instance.  Take a look how noise centered around 15khz gets represented
 > in a 48khz averaged domain.  Would your system detect this fault?

I'm not sure exactly what fault is being described, and I
don't know what "simple algorithm" is alluded to.  However,
my guess is that the "simple algorithm" is linear, and that
the gain contour is a fairly smooth function of frequency,
with no strong singularities at 15kHz.  And since my method
calls for _measuring_ the gain as part of the calibration
process, this so-called "fault" should not AFAICT be classified
as a fault;  most likely it is just one of the many factors
that affect the calibration constants.

 > Of course not.

Proof by bold assertion.  Unsubstantiated opinion.

 > No extant system can yet detect the difference between a
 > quantum entropy generator and an AES or 3DES stream.

First of all, there is nothing special about "quantum" entropy
that makes it better than other types of entropy (e.g. thermal
entropy), in any practical sense.  This point is discussed in
detail at
   http://www.av8n.com/turbid/paper/turbid.htm#sec-hesg-attack

Secondly, it simply is not correct to say that there is no
difference between a genuinely entropic randomness generator
and a pseudo-randomness generator.  Proof by construction:

a) On a system that relies on /dev/urandom in the absence of
sources of real entropy, capture a backup tape containing
the /var/lib/.../random-seed file.  Then provoke a crash and
restart, perhaps by interrupting the power.  Then you know
the output of /dev/urandom for all time thereafter.
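The attack in (a) can be sketched in a few lines. This is a hypothetical seed-file-driven PRNG of my own construction (hash of seed plus counter), not the actual Linux /dev/urandom code, but it shows the essential property: anyone holding a copy of the seed file, e.g. from a backup tape, reproduces the entire output stream.

```python
# Sketch: a deterministic PRNG whose whole future output is fixed by
# its seed file.  Capturing the seed (from a backup tape) and forcing
# a restart from it hands the attacker the victim's stream.
import hashlib

def prng_stream(seed: bytes, nblocks: int) -> bytes:
    # Hypothetical construction: block i = SHA-256(seed || counter).
    out = b""
    for ctr in range(nblocks):
        out += hashlib.sha256(seed + ctr.to_bytes(8, "big")).digest()
    return out

seed_on_disk = b"random-seed file contents"  # written at shutdown
victim = prng_stream(seed_on_disk, 4)        # victim, after crash/restart
attacker = prng_stream(seed_on_disk, 4)      # attacker, with the backup tape
print(victim == attacker)  # True: the seed determines everything
```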

To repeat:  One difference between a genuinely entropic
randomness generator and a pseudo-randomness generator is
whether I have to be paranoid about crash/restart scenarios,
and whether I have to be paranoid about protecting my backup
tapes.

b) A related point:  Suppose we are running a high-stakes
lottery.  Who provided the _original_ seed for the randomness
generator we will use?  Even assuming (!) we can protect the
seed for all times greater than t=0, where did the t=0
seed come from?  If you provided it, how do I know you didn't
keep a copy?

This is important, because the historical record shows that
randomness generators are not usually broken by attacking
their cryptographic primitives, i.e. by direct cryptanalysis
of the PRNG output;  more commonly they are broken by attacking
the seed generation and seed storage.


---------------------------------------------------------------------
The Cryptography Mailing List
Unsubscribe by sending "unsubscribe cryptography" to majordomo at metzdowd.com


