[Cryptography] Entropy is forever ...

Thierry Moreau thierry.moreau at connotech.com
Fri Apr 17 13:26:11 EDT 2015

Dear all:

Quoting the basic definition of entropy from Wikipedia, "In information 
theory, entropy is the average amount of information contained in each 
message received. Here, message stands for an event, sample or character 
drawn from a distribution or data stream." In applied cryptography, the 
entropy of a truly random source of "messages" is an important 
characteristic to ascertain. There are significant challenges when 
applying the information theory, probability, and statistics concepts to 
applied cryptography. The truly random source (and computations using 
random message data) must be kept secret. Also, the following dilemma
should be noted: the truly random source is needed in a digital 
processor system typically engineered with determinism as a design goal 
derived from the basic reliability requirement. Quantitatively, the 
entropy measure for applied cryptography, in the order of hundreds of 
bits, is way beyond any workable statistical analysis processes. In 
practice, a truly random source usable for applied cryptography is a 
system procurement issue that can seldom be blindly delegated as an 
ordinary operating system service. Thus, one wants a reliable source of 
uncertainty, a trustworthy one that can barely be tested, as a universal
operating system service totally dependent on hardware configuration.

Applying information theory to actual situations is error-prone. Is 
there a lower entropy in "Smith-725" than in "gg1jXWXh" as a password 
string? This question makes no sense as the entropy assessment applies 
to the message source. A password management policy that rejects 
"Smith-725" as a message originating from the end-user population 
actually constrains this message source in the hope of a higher 
average amount of information in user-selected passwords. From the 
perspective of a single end-user having to deal with an ever-growing 
number of passwords, the entropy concept appears as a formalization of 
the impossible task he or she faces.
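
The distinction can be made concrete with a small computation of Shannon 
entropy, which is defined on the source's probability distribution and 
not on any single message (a hypothetical illustration with made-up 
distributions, not real password data):

```python
import math

def shannon_entropy(probabilities):
    """Shannon entropy, in bits, of a message source's distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# A source drawing one of 2^13 equally likely passwords has 13 bits of
# entropy -- regardless of what any individual password looks like.
uniform = [1 / 8192] * 8192
print(shannon_entropy(uniform))

# A biased source: half the users pick "Smith-725", the rest spread
# uniformly over 8191 alternatives. The average information per message
# drops to about 7.5 bits, although "Smith-725" itself is unchanged.
biased = [0.5] + [0.5 / 8191] * 8191
print(round(shannon_entropy(biased), 2))
```

The entropy assessment changes because the distribution changed, not 
because any particular password string changed.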

Significant conceptual deviation may occur from the common (and correct) 
system arrangement where a software-based pseudo-random number generator 
(PRNG) of a suitable type for cryptography is initially seeded from a 
secret true random source and then used for drawing secret random 
numbers. It is often inconvenient for statistical testing to apply 
directly to the true random source messages, but statistical testing of 
the PRNG output gives no clue about the true random source. The design 
of PRNG seeding logic is an involved task, dependent on a true random 
source that may be hard to model in the first place. In actual 
system operations, inadequate seeding may have catastrophic indirect 
consequences yet be difficult to detect, and it is certainly a 
challenging error condition for service continuity (programmers may be 
inclined to revert to insecure PRNG seeding when the proper true random 
source breaks down).
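
A minimal sketch of the seed-then-draw arrangement follows (a toy 
hash-ratchet construction for illustration only, not any standardized 
DRBG design; os.urandom merely stands in for the secret true random 
source):

```python
import hashlib
import os

class HashDRBG:
    """Toy hash-based PRNG seeded once from a true random source.
    For illustration only -- not a vetted or standardized design."""

    def __init__(self, seed: bytes):
        self.state = hashlib.sha256(b"seed|" + seed).digest()

    def random_bytes(self, n: int) -> bytes:
        out = b""
        while len(out) < n:
            # Output is derived from the state through a one-way
            # function...
            out += hashlib.sha256(b"out|" + self.state).digest()
            # ...and the state is ratcheted forward, so a disclosed
            # output chunk gives no feasible path back to the internal
            # state that produced it (preimage resistance).
            self.state = hashlib.sha256(b"next|" + self.state).digest()
        return out[:n]

# Seed once from the (stand-in) true random source, then draw at will.
prng = HashDRBG(os.urandom(32))
chunk = prng.random_bytes(64)
```

Note that statistical tests run on `chunk` would only exercise SHA-256, 
telling us nothing about the quality of the seed itself.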

Despite these pitfalls, I assume my reader to share my endorsement of 
the true random seeding of a cryptographic PRNG as the main source of 
random secrets for a digital processor system dedicated to cryptographic 
processing. As this PRNG output is being used in various ways, chunks of 
the output sequence may be disclosed to remote parties. It is an 
essential requirement for a cryptographic PRNG that no output chunk may 
allow the recovery of its internal state (i.e. some data equivalent to 
PRNG seed data leading to the same PRNG output sequence as the secret PRNG).

In this note, I challenge the view that an entropy pool maintained by an 
operating system ought to be depleted as it is used. I am referring here 
to the Linux "entropy pool." My challenge does not come through a review 
of the theory applied to the implementation. Instead, I propose a 
counter-example in the form of the above arrangement and a very specific 
example of its use.

The central question is posed by the following problem. A system is 
booted and receives 2000 bits of true randomness (i.e. a 2000-bit 
message from a source with 2000 bits of entropy) that are used to seed a 
cryptographic PRNG 
having an internal state of 2000 bits. This PRNG is used to generate 4 
RSA key pairs with moduli sizes of 2400 bits. The private keys are kept 
secret until their use in their respective usage contexts. No data leak 
occurred during the system operation. After the key generation, the 
system memory is erased. What is the proper entropy assessment for each 
of the RSA key pairs (assume there are 2^2000 valid RSA moduli for a 
modulus size of 2400 bits, a number-theoretic assumption orthogonal to 
the entropy question)?
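
To make the arrangement concrete, here is one hypothetical sketch (not 
the author's actual scheme) of how a single 2000-bit seed can 
deterministically drive four independent RSA key generation processes, 
using an extendable-output function with domain separation:

```python
import hashlib

def key_stream(seed: bytes, key_index: int, nbytes: int) -> bytes:
    """Derive an independent pseudorandom byte stream for key pair
    number key_index. Each stream is a deterministic function of the
    full seed, so each resulting key pair is backed by the full entropy
    of the source (assuming an unbiased key generation process)."""
    xof = hashlib.shake_256(seed + b"|rsa-keypair|" + bytes([key_index]))
    return xof.digest(nbytes)

seed = bytes(250)  # placeholder for the secret 2000-bit random message
# A 2400-bit modulus needs two 1200-bit prime candidates: 2 * 150 bytes.
streams = [key_stream(seed, i, 2 * 150) for i in range(4)]
```

Disclosing one key pair would of course compromise the others here, 
since all four are functions of the same seed; but absent any leak, no 
information about the seed is lost by drawing from it four times.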

My answer is that each of the 4 RSA key pairs is independently backed 
by 2000 bits of entropy assurance. The entropy characterization 
(assessment) of a data element is a meta-data element indicating the 
entropy of a data source at the origin of the data, plus the implicit 
statement that no information loss occurred in the transformation of the 
original message into the assessed data element. Accordingly, my answer 
should be made more precise by referring to an unbiased RSA key 
generation process (which should not be considered a reasonable 
assumption for the endorsement of lower ranges of entropy assessments).

To summarize, the entropy assessment is a characterization of the data 
source being used as a secret true random source. It also refers to the 
probability distribution of messages from the data source and the 
quantitative measure of information contents derived from the 
probability distribution according to the information theory. This 
mathematical formalism is difficult to apply to actual arrangements 
useful for cryptography, notably because the probability distribution is 
not reflected in any message. Information theory is silent about the 
secrecy requirement essential for cryptographic applications. Perhaps 
confusion arises from assuming that entropy is lost when part of the 
random message is disclosed, while only (!) the data's suitability for 
cryptographic usage is being lost. In applying information theory to 
the solution of actual difficulties in applied cryptography, we should 
address secrecy requirements independently. The probability distribution 
preservation through random message transformations is an important 
lesson from the theory that might have been overlooked (at least as an 
explicit requirement).

A note about the genesis of the ideas put forward. In my efforts to 
design applied cryptography key management schemes without taking 
anything for granted and paying attention to the lessons from academia 
and its theories, I came upon a situation very similar to the above 
problem statement. The 2000-bit random message from a truly random 
source with 2000 bits of entropy is a simplification of the actual 
situation, in which a first message transformation preserves the 
probability distribution of random dice shuffling. In the above problem 
statement, 
the PRNG seeding is another distribution preserving transformation. The 
actual PRNG is based on the Blum-Blum-Shub x^2 mod N generator, which 
comes with two bits of entropy loss upon seeding. The above problem 
statement is thus concrete.
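
For reference, the x^2 mod N generator mentioned above can be sketched 
as follows (with toy parameters; the two-bit loss on seeding arises 
because the initial squaring maps roughly four seeds onto each quadratic 
residue of a two-prime modulus):

```python
def bbs_bits(seed: int, n: int, count: int) -> list:
    """Blum-Blum-Shub generator: iterate x -> x^2 mod n and emit the
    least-significant bit of each state. n = p*q with p and q primes
    congruent to 3 mod 4; the demo values below are far too small for
    real use, where n must be of cryptographic size with a secret
    factorization."""
    # The initial squaring maps the seed into the quadratic residues
    # mod n, a roughly 4-to-1 mapping: the ~2 bits of entropy loss.
    x = pow(seed, 2, n)
    bits = []
    for _ in range(count):
        x = pow(x, 2, n)
        bits.append(x & 1)
    return bits

# Demo modulus: p = 11, q = 23 (both congruent to 3 mod 4), n = 253.
demo = bbs_bits(seed=29, n=253, count=8)
```

The seeding transformation is thus not quite distribution-preserving, 
which is why the entropy assessment must account for the two-bit loss.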

Maybe the term entropy is used, more or less by consensus, with a 
definition departing from the information theory. Indeed, NIST documents 
covering the topic of secret random numbers for cryptography use 
conflicting definitions surrounding the notion of entropy.

Although my own answer to the stated problem puts into question the 
Linux "entropy pool" depletion on usage, I do not feel competent to make 
suggestions. For instance, my note hints that a PRNG algorithm selection 
should be part of the operating system service definition for 
/dev/?random offered for cryptographic purposes, but I have just a vague 
idea of whether and how the open source community might move in this 
direction.

Entropy is forever ... until a data leak occurs.
A diamond is forever ... until burglars break in.


- Thierry Moreau
