<div dir="auto"><div dir="auto">Sorry in advance for making a new thread. But I thought it might  be simpler this way - a lot of important information is scattered all over, and some things I think is important is missing. </div><div dir="auto"><br></div><div dir="auto">Note: this is fairly long. I've tried to keep the wording and formatting readable. </div><div dir="auto"><br><div dir="auto">Feel free to add sources to the statements herein.</div><div dir="auto">Feel free to point out mistakes (but please keep it short!). </div><div dir="auto">And feel free to add to anything *important* that is missing. <div dir="auto"><br></div><div dir="auto">------</div><div dir="auto"><br></div><div dir="auto">* What an RNG is needed for; </div><div dir="auto"><br></div><div dir="auto">Anything that requires unpredictable elements.</div><div dir="auto">TLS key exchanges needs unpredictable ephemeral secrets even on the client side, tons of IoT devices needs to generate keypairs, and that's just the start. </div><div dir="auto"><br></div><div dir="auto"><br></div><div dir="auto"><br></div><div dir="auto">* How should we define entropy? </div><div dir="auto"><br></div><div dir="auto">The computational definition is the most practical one. </div><div dir="auto"><br></div><div dir="auto">Given our sources of randomness and our pool of collected bits, and given all of the information and resources accessible to our adversaries, the computational entropy of the pool is defined as that which is the minimal amount of computational effort ("work factor", "bruteforce", etc...) that the adversaries would collectively need to recover the state of our pool, or to otherwise predict its outputs (expect the adversaries to collaborate, voluntarily or not!). </div><div dir="auto"><br></div><div dir="auto">The key is that the outputs are unpredictable = out of reach from being cracked for as long as deemed necessary. </div><div dir="auto"><br></div><div dir="auto"><br></div><div dir="auto"><br></div><div dir="auto"><div dir="auto" style="font-family:sans-serif">* How the RNG interface should behave;</div><div dir="auto" style="font-family:sans-serif"><br></div><div dir="auto" style="font-family:sans-serif">You ask the system for X bits. You get X bits. Quickly. Always. </div><div dir="auto" style="font-family:sans-serif"><br></div><div dir="auto" style="font-family:sans-serif">It should be easy to implement this request in all relevant software environments. </div><div dir="auto" style="font-family:sans-serif"><br></div><div dir="auto" style="font-family:sans-serif">Those bits shall be indistinguishable from random and can't be guessed in advance with a chance better than random - those X bits shall represent X bits of *computational* entropy (see "defining entropy"). They can't be used to guess future bits, or to derive previous bits. Those bits are given to you exclusively. </div></div><div dir="auto"><br></div><div dir="auto"><br></div><div dir="auto"><br></div><div dir="auto">* Who needs to provide the RNG, and what they need to do and consider;</div><div dir="auto"><br></div><div dir="auto">The system designer (for various definitions of "system"). </div><div dir="auto">The platform should offer a secure source to all the applications running on it. </div><div dir="auto">If it can't, chances are you already lost. </div><div dir="auto"><br></div><div dir="auto">The hardware should always have a trustworthy TRNG / CSPRNG that comes with accurate data on its characteristics, for use by the firmware / software. The hardware should be auditable, so that the behavior can be validated. </div><div dir="auto">The goal is to allow you to be certain of how much "genuine" entropy that has been produced at minimum. </div><div dir="auto"><br></div><div dir="auto">Beware that some hardware sources hold large states, but accumulate random changes (entropy) slowly (lava lamps? Lavarand is already a thing). Asking for 64 bits four times may give you 64 bits of entropy first, then just 5 new bits on each invocation seconds later. </div><div dir="auto">Metrics like "state flush time" (or "entropy displacement time"?), "initialization time", "entropy generation rate" and similar must be part of the available characteristics of hardware sources. Otherwise the software estimating the entropy can't provide any guarantees. The ability to detect faulty behavior is also extremely important. </div><div dir="auto"><span style="font-family:sans-serif">The hardware sources should all be resistant to tampering and fault injection and similar attacks. </span></div><div dir="auto"><br></div><div dir="auto">The opinions on using multiple hardware sources are divided. I'm personally in favor, for the sake of redundancy and defense in depth (increasing the cost of compromising all sources), perhaps ~4 hardware sources of different designs. Others believe multiple hardware sources increases complexity and attack surface. </div><div dir="auto"><br></div><div dir="auto">The OS / firmware MUST be configured to use these hardware sources, and MUST know their characteristics. If you have a perfectly trustworthy hardware TRNG that produces 1 bit of entropy per 100 bits out, then you don't get to consider the pool as filled after feeding it 256 bits. You need to know when you can be certain to have enough bits, and knowing the characteristics is the key to it.  </div><div dir="auto"><br></div><div dir="auto">If you have multiple sources, you may only add up their individual entropy estimations if you can guarantee they're all sufficiently independent. Breaking one should not let anybody break another, and thereby completely breaking your RNG through simply modeling and bruteforcing the majority of your sources (such as if your 10 hardware sources all produces just 32 bits of entropy, all predictably correlated). </div><div dir="auto"><br></div><div dir="auto">If you have multiple machines all belonging to the same entity then entropy may perfectly well be "provisioned" in the form of seed files, through a central server, by temporarily plugging in a TRNG, or with similar solutions. </div><div dir="auto">You don't care where the hardware source is, only that adversaries can't guess the outputs. You probably don't consider yourself your own adversary. Remember that this provisioning too must happen securely (hard to spy on). </div><div dir="auto">In the case of provisioning entropy to tiny embedded devices, it isn't entirely unreasonable to use private keys as one of the entropy sources for initial seeding. If doing that breaks something, you've got worse problems (such as weak keys or algorithms). </div><div dir="auto">Remember that you still need stored state for a CSPRNG to not repeat. </div><div dir="auto"><br></div><div dir="auto"><br></div><div dir="auto"><br></div><div dir="auto">* How the RNG should behave internally;</div><div dir="auto"><br></div><div dir="auto"><span style="font-family:sans-serif">It shall verify that the hardware RNG is behaving according to the characteristics, and adjust entropy estimates thereafter. Faulty sources shall be detected and estimated to 0 (zero) bits of entropy. </span><br></div><div dir="auto"><span style="font-family:sans-serif"><br></span></div><div dir="auto"><span style="font-family:sans-serif">The entropy pool must hold a large enough internal state that you can not possibly enumerate all possible secret states, i.e. at least something like 256 bits or more. It shall at minimum match (or exceed) your defined lowest entropy threshold. </span></div><div dir="auto"><span style="font-family:sans-serif"><br></span></div><div dir="auto"><span style="font-family:sans-serif">Mix in fresh entropy securely - personally I favor information theoretic approaches like "multisource entropy extractors", but hashing and other such common existing solutions are likely sufficient. Mixing shall never cause a loss of entropy within your pool. </span></div><div dir="auto"><br></div><div dir="auto">Don't ever generate any outputs before being sufficiently seeded! </div><div dir="auto">Why? Because you'll effectively leak the internal state, making it possible to bruteforce the pool contents and the permanently break the security of the CSPRNG (that is, if it is never reseeded in between outputs with enough bits to prevent further pool recovery through bruteforce). </div><div dir="auto"><br></div><div dir="auto"><span style="font-family:sans-serif">The threshold for "sufficiently seeded" (having enough entropy to generate outputs)</span><span style="font-family:sans-serif"> should be somewhere around 256 bits of entropy for a good security margin, which includes defending against Quantum Grover's algorithm. See the argument in "defining entropy" above. </span><br></div><div dir="auto"><br></div><div dir="auto"><span style="font-family:sans-serif">Reseeding may be done occasionally if you believe your system might leak secret bits from the pool through sidechannels. When you reseed, attacks like the above suggest that you should NOT inject the new entropy into the main pool until you've accumulated enough new entropy for it to be unpredictable by itself (reaching above your entropy threshold). </span></div><div dir="auto"><span style="font-family:sans-serif"><br></span></div><div dir="auto"><div dir="auto" style="font-family:sans-serif">Beware of complicated constructions;</div><div dir="auto" style="font-family:sans-serif">Linux recently had a flaw where new injected entropy wasn't spread across all the bits in the pool, and not all bits in the pool was used simultaneously for deriving outputs - with their particular design, this meant that even seeding 256 bits after a state compromise would (for a while) still allow for bruteforcing all the new entropy piece-wise, yet again recovering the full internal state! </div></div><div dir="auto"><span style="font-family:sans-serif"><br></span></div><div dir="auto">Seed the entropy pool quickly! And o<span style="font-family:sans-serif">nce sufficiently seeded, </span><span style="font-family:sans-serif"> just quickly generate outputs as requested and never block again.</span></div><div dir="auto">Nobody likes to wait. Your hardware sources must (collectively) be quick enough to meet your time requirements. </div><div dir="auto"><br></div><div dir="auto">Only use strong cryptographic algorithms to generate the outputs (CSPRNG - some platforms use modern stream ciphers, which should be sufficient). Beware of output limits on algorithms, given that many are capped at something like 2^64 or even 2^32 outputs before security no longer can be guaranteed, and then you MUST reseed - on long lived public servers, this is a genuine problem. </div><div dir="auto">It must be rollback resistant (equivalent to the term Perfect Forward Secrecy in TLS and Signal). </div><div dir="auto"><br></div><div dir="auto">VM:s and similar platforms / runtimes must ALWAYS be reseeded on cloning / initialization. This extends to all their key material! Otherwise you may leak key material or plaintext through for example stream cipher key reuse. </div><div dir="auto"><br></div><div dir="auto">On the software side, it must be impossible for an output to be duplicated and given to multiple applications. The RNG needs to perform well, and if it is threaded then it must be written to handle that safely.</div><div dir="auto"><br></div></div></div></div>