[Cryptography] Throwing dice for "random" numbers

John Denker jsd at av8n.com
Mon Aug 13 12:48:44 EDT 2018


On 08/12/2018 04:01 AM, Dave Horsfall wrote:

> I picked up a pack of 12 dice from one of those discount stores for the 
> princely sum of AU$2 (so we can safely assume that they are not exactly 
> casino-quality[*], but at least the pattern of dots is correct).
> 
> So, the obvious question is: how unpredictable (not "random") would their 
> sum be, modulo N, given that each individual source is numbered 1-6? 

*) Is this meant to be a conceptual question?  And
are we supposed to take the word "sum" literally?

If so, then we proceed as follows:

Twelve is big enough that we can treat the distribution
of sums as nearly Gaussian.  For a single die, the distro
is nowhere near Gaussian, but we know the mean is 3.5 and
the variance is 2.92, as you can easily verify directly
from the definition.  When you sum multiple dice, the mean
and variance grow linearly, so for 12 dice we get a mean 
of 42 and a variance of 35.  So the stdev is σ=5.92.
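
As a sanity check (not part of the argument above), a minimal
Python sketch that reproduces those numbers:

    from statistics import mean

    faces = [1, 2, 3, 4, 5, 6]
    mu  = mean(faces)                             # 3.5
    var = mean([(x - mu)**2 for x in faces])      # 35/12 = 2.92
    n = 12
    print(n * mu, n * var, (n * var) ** 0.5)      # 42.0  35.0  5.916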

It is well known that a Gaussian can be approximated well
by a triangle whose half width at half maximum (HWHM) equals
the stdev.  So the full width at baseline (FWBL) is 4σ.  It
can be approximated
less well but still decently by a rectangle with the same
FWBL.  That is to say, all values within ±2σ of the middle
are equally likely and all others are irrelevant.  So we
estimate the entropy as log(4σ), taking logs base 2
throughout since we are counting bits.

If you want to do a bunch of calculus, the fancy "official"
result is the Gaussian entropy (1/2) log(2πe σ²) = log(√(2πe) σ)
... where the prefactor is √(2πe) = 4.13 ... so our intuitive
value of 4 was not off by much.

Altogether you get log(4.13) + log(5.92) = 4.61 bits
per roll of all twelve dice.
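
As a cross-check (again, not part of the argument), you can
grind out the exact distribution of the sum by convolution
instead of trusting the Gaussian approximation.  A minimal
Python sketch:

    from math import log2

    die = {k: 1/6 for k in range(1, 7)}      # one ideal die

    dist = {0: 1.0}                          # convolve 12 copies
    for _ in range(12):
        new = {}
        for s, p in dist.items():
            for face, q in die.items():
                new[s + face] = new.get(s + face, 0.0) + p * q
        dist = new

    H = -sum(p * log2(p) for p in dist.values())
    print(round(H, 2))                       # ~4.61 bits, as estimated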



*) Or is this supposed to be crypto engineering, as suggested
by the Subject: line?

In that case, you shouldn't be computing the sum at all.
You can get log(6) = 2.58 bits of entropy per die per roll,
and if you record all twelve outcomes separately you get
12*log(6) = 31 bits per roll.  That's a huge improvement in
throughput.  We assume each die is statistically independent
of the others, so there's no need to distinctively mark them.
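
As a rough sketch of what "record them all separately" might
look like in code (pack() is a hypothetical helper, and the
secrets module here is only a stand-in for the physical rolls):

    from math import log2
    import secrets                  # stand-in; real entropy comes from the dice

    def pack(rolls):
        # Pack 12 independent outcomes (1..6) into one integer in
        # [0, 6**12), preserving all 12*log2(6) = 31.02 bits.
        x = 0
        for r in rolls:
            assert 1 <= r <= 6
            x = x * 6 + (r - 1)
        return x

    rolls = [secrets.randbelow(6) + 1 for _ in range(12)]   # placeholder rolls
    print(pack(rolls), 12 * log2(6))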


*) Or was the question meant to ask how to handle the
systematic imperfections in the dice?

In that case, I recommend using the appropriate Rényi
functional.
	H_0 is sometimes called the Hartley functional.
	H_1 is the plain old entropy
	H_∞ is what I call the adamance;
	    it is sometimes called the min-entropy, but
	    IMHO that is a misleading misnomer

H_∞ tells us the worst-case resistance to attack, which
is appropriate for engineering a defensive cryptosystem.
So we look at the probabilities.  The ideal case is
	1/6  1/6  1/6  1/6  1/6  1/6
which gives you -log(1/6) = 2.58 bits of adamance per
die per roll.

Now suppose that on the basis of careful measurements
you know that the following is a *bound* on how badly
non-ideal things can be:
	0.8/6  1/6  1/6  1/6  1/6  1.2/6
which gives you -log(1.2/6) = 2.32 bits of adamance per
die per roll.
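
For concreteness, a minimal sketch of those functionals (the
function name renyi() is just illustrative; the numbers match
the two cases above):

    from math import log2

    def renyi(probs, alpha):
        # H_0 (Hartley), H_1 (Shannon), H_inf (adamance / min-entropy)
        if alpha == 1:
            return -sum(p * log2(p) for p in probs if p > 0)
        if alpha == float('inf'):
            return -log2(max(probs))
        return log2(sum(p**alpha for p in probs)) / (1 - alpha)

    ideal = [1/6] * 6
    bound = [0.8/6, 1/6, 1/6, 1/6, 1/6, 1.2/6]   # bound on non-ideality
    print(renyi(ideal, float('inf')))            # 2.585 bits per die per roll
    print(renyi(bound, float('inf')))            # 2.322 bits per die per roll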

This is what I would recommend (unless I have guessed
wrong about the intended application).  This is what
Turbid does.

=======

Engineering philosophy:  It is neither necessary nor
possible to measure the exact amount of "randomness"
(whatever that means) in the dice.  It suffices to
obtain an estimate that is
 -- small enough to be a provable lower bound, and
 -- large enough to be useful.

The bound must be calculated from the physics of the
source.  It cannot be measured a_posteriori by looking
at the statistics of the downstream output.  As Dijkstra
famously said:
  Testing can be used to show the presence of bugs,
  but never to show their absence.

Here the critical "bug" is hidden nonrandomness.
