[Cryptography] Whitening Algorithm

Thu Jul 23 10:07:48 EDT 2015

>   Hi,
> Iâm trying to whiten a random noise source (a reverse biased transistor)
> with a low-powered microprocessor. I figured out a technique that seems to
> work well, and I want to know if there is anything insecure or subpar
> about
> it.

Yes. There is.

>
> (Earlier, I had heard that XORing a random stream with alternating 0s and
> 1s could remove bias. However, this strikes me as very insecure, because
> an
> attacker could reverse the mask by XORing the mixed stream with the same
> 01
> mask. Also, I had used the Von Neumann algorithm
> <https://en.wikipedia.org/wiki/Hardware_random_number_generator#Software_whitening>,
> but this slows output significantly.)

You are posting in a cryptography oriented mailing list. So I assume you
want the output not just to be spectrally white, but to meet cryptographic
needs and so be indistinguishable from random and unpredictable.

You need to understand that any process that meets your needs will
necessarily slow the output significantly. You have partially entropic
data. You are trying to make it more entropic. You would like to make it
fully entropic but you cannot. However you can get close enough to get
random numbers useful for cryptography.

What you want it an entropy extractor. Extractor theory is well developed.
You have choices. You have a single source, so multiple input extractors
are not an option. Seeded extractors are not an option (chicken and egg
problem - where does the seed come from?). So you want a single input
extractor. You could use for example the algorithms (called conditioners)
in SP800-90B. They involve hashing or MACing X data bits down to Y data
bits, where Y < X and the min-entropy in X > 2Y. So the reduction in data
volume is greater than a factor of 2, since you are starting from a lower
than perfect min-entropy.

A Von Neumann whitener is not an entropy extractor, or at least not one
that is useful given the data that you get from physical sources. If you
put significantly serially correlated data into a Von Neumann whitener,
you will get unbiased data with lower min-entropy than the data you put
in. It is an unbiaser and only an unbiaser.

Don't feel bad - you are in good company. The notion that Von Neumann
whiteners are entropy extractors persists, even amongst RNG designing
academics who should know better. This:
http://csrc.nist.gov/groups/ST/lwc-workshop2015/papers/session7-yang-paper.pdf
was presented on Tuesday at the NIST Lightweight Crypto Workshop and it
recommends applying a Von Neumann whitener over SRAM start up values. This
is wrong. It will not achieve the intended effect.

>
> The algorithm mixes new bytes derived from the noise source with previous
> ones in an overlapping manner. Below is the source code, and a link to an
> illustration of the process.
>
> Iâve done some testing, and it appears to transform data that shows as
> much
> as 5% bias to ~0% while passing ent chi-square. Some less rigorous testing
> with NIST also had positive results.
>

This algorithm does not compress data. Therefore it is not an entropy
extractor. You are muddling the data, but in a deterministic process that
is reversible. It looks more random on the output, but has no additional
entropy.

So you go ahead an implement say the CBC conditioner from SP800-90B :
AES-CBC-MAC(k=aes(1,0), bytes[0:48]) -> output_bytes[0:16] and find that
you have fully entropic data (this assume > 50% entropy in the source
data. Put more data in if you have less source entropy per bit.

Now you have a slower source. You want it faster. So you use the
close-to-full-entropy value out of the conditioner to seed a
cryptographically secure pseudo random number generator, which will
generate useable random number as fast as your software can run. There are
many to choose from. Try the SP800-90A CTR-DRBG, or cha-cha-cha or one of
many others out there.

Regards
DJ