# [Cryptography] Mapping numbers to permutations

Viktor Dukhovni cryptography at dukhovni.org
Thu Dec 4 02:07:48 EST 2014

```On Thu, Dec 04, 2014 at 07:28:25AM +0100, Andreas Briese wrote:

> Don't get it.
> If the output of the generator is random then its binary notation is random too.
> Why don't you use this directly, if you need a binary stream?

A uniformly random stream of integers selected from [0, n-1] does
not directly yield an unbiased stream of bits.

Sampling the first \$k\$ elements of the stream results in any of
\$n^k\$ possible states with uniform probability.  Suppose such a
sample is used to generate the first \$m\$ bits of a binary output
stream.  This is accomplished via some function

f: n^k -> 2^m

The only way that \$f\$ can result in a uniform distribution on \$2^m\$
is if \$2^m\$ divides \$n^k\$.  So in particular with \$n = 5\$ the
claimed uniformity is provably impossible.  What is possible is to
discard some subset of \$n^k\$ so that \$2^m\$ divides the number of
allowed states.  Whenever a forbidden state is reached, you throw
away your \$k\$ inputs and try another \$k\$.

This is reasonably efficient when \$n^k\$ is just larger than a
power of \$2\$.  For \$n = 5\$, and given the continued fraction
for \$log_2(5) = [2; 3, 9, 2, 2, 4, 6, 2, 1, 1, 3, 1, 18, ...]\$

we get the rational lower bound approximations:

2/1
65/28
339/146
...

So for example 2^65 is just less than 5^28:

\$ echo "10k 2 65 ^ 5 28 ^ / p" | dc
.9903520314

Therefore, after generating 28 base 5 digits, the resulting number
fits into 66 bits, but 99% of the time only 65 bits are required
and the top bit is zero.  If the top bit is non-zero discard the
state, and generate another 28 base 5 digits.  Otherwise, output
the bottom 65 bits.  This is uniformly random, and reasonably
efficient (discards are rare).  If we take the next approximation:

\$ echo "10k 2 339 ^ 5 146 ^ / p" | dc
.9989595361

The discard rate drops to ~0.1%, but the output arrives in clumps
of 339 bits at a time (of course this bit stream can be buffered).

The simplest thing of course is to use [0, 3] and discard all the
4's base 5.  The discard rate is 20%, and a run of "4" will
occasionally stall the generator for a few cycles until the bad
luck (Chinese numerology anyone) runs out.

--
Viktor.
```