[Cryptography] new PRNG family

Thu Dec 4 04:27:29 EST 2014

First of all, thanks for looking. 

Am 04.12.2014 um 07:25 schrieb Ray Dillinger <bear at sonic.net>:

> 
> 
> On 12/03/2014 01:46 AM, Andreas Briese wrote:
>> 
>> Maybe someone on the list likes to investigate and publish with me about a new (chaotic) family of PRNG based on logistic maps? 
>> 
>> https://github.com/AndreasBriese/breeze
>> 
> 
> It is interesting, but you're relying on the exact format of
> floating point numbers, correctness of the hardware implementation
> of operations on them (*cough* FDIV bug *cough*), and rounding
> modes which you must control exactly in order for your code to
> work the way you think it will. Honestly, these are things
> that vary much *less* from system to system and machine to
> machine than they used to, but they are still a cause for
> concern.  I would worry about using an FP-based generator on
> any platform other than the one it had been developed and
> tested on, or even on any platform where the default rounding
> mode can be set with a system call.

There is a number of chaotic maps, but the logistic map was in favor 
because it uses multiplication and subtraction only.

basically it's (for chaotic state): 
x = kx(1-x) with k between 3.86 and 4 (inclusive) and x between 0 and 1 (exclusive)
output will always be in range >=0 and <1 
(talking about the FP representation not about the underlying math where you never get 0 here)

A problem is FMA, which isn't supported by all FPU/CPU and might even be 
implemented with different of the intermediate precision of the FP number.   
I tried to tackle this by explicitly cutting of the calculation into two steps:
nx = 1-x
nx = nx * k * x
to prevent the compiler of calling a FMA operation and to make sure, properly rounding occurs
before calculating the multiply.

The only other FP operation is: 
x=1 - nx 
before the next cycle, which is again a subtraction with rounding. 

regarding rounding: IEEE754 has 5 rounding modes of which  
(1) round to nearest, ties to even
(2) round to nearest, ties away from zero -> here: to the nearest value above
(3) rounding to zero
are relevant in the context.

(3) occurs (even if mathematically incorrect; see above) but should be performed on all FPU/CPU equally if the calculation result is converted into the FP representation. If rounding to zero happens, the algorithm makes sure the PRNG reseeds (deterministic from bits of mantissa of previous states). 

AFAIK mode (1) is the default. 
What happens if mode (1) is changed to mode (2) (i.e. to attack the PRNG): Less results are rounded to zero, more are rounded to 1 -->  x=kx(x-1) = 0 --> reseed. But the output stream is nevertheless random. Because the random number streams are different, if breeze is in example used as a cipher, decoding on a machine with a different mode(1/2) fails. But nothing else happens, right?  

> 
> Anyway, I don't want to get sufficiently elbows-deep in float
> format and representation to do that analysis.  But if you want
> to analyze this, Here are some things to check for.
> 
> Given your assumptions about IEEE float representation and
> the default rounding mode, you already know that each state
> has a unique successor.  Now try to prove that each state also
> has a unique predecessor -- ie, for each total state your
> generator can be in, there is only one possible prior state
> which could have produced it. If the unique-predecessor
> property is not true, then one of the predecessor states
> (along with all of its predecessors, etc) is not on any cycle,
> and that is bad.  You want as many states as possible - in
> fact, all of them - to be on cycles.

unfortunately this is different here: because of the rounding in FP you get 
at each map (there is a number of maps) different previous states producing 
the same output and the UNIQUE-predecessor property FOR EACH INDIVIDUAL 
MAP is not true. FP is finite representation of infinite R thus it can't even have this property.
Furthermore i cannot say how many states there are exactly: theoretically it should be the mantissa 
length -> should be about 2^51 ?
maybe you can give me the exact number of different mantissa in double float representations 
of values between 0 and 1 ? 

Nonetheless it's deterministic and thus can be reproduced. I mean, the sequence of 
numbers (period) from each map falls exactly again, given the same x as starting point. 
And therefore this is true for the combined algorithm including the different logistic maps and 
the occasional reseeds (about 10^-10 from my tests).

> 
> A cycle is a set of states, each leading to the next in sequence.
> Because you have a finite number of states, and each state has a
> unique successor, every traversable path through the states of
> your generator must eventually comes back to whatever state you
> started in if your starting state is even *ON* a cycle - which
> it probably won't be if your states don't have unique predecessors.
> That is why the unique-predecessor property is important.  States
> that aren't on cycles are unreachable, and don't count toward
> security. Ideally, every possible state is on the same huge cycle.
> If your generator has several equal-length cycles, then no keys
> are weaker than others - but none of them is as strong as it would
> be if all the states were on the same cycle. If it turns out that
> some cycles are much shorter than others, your PRNG will have a
> class of weak keys for every such short cycle.

i guess that cycle is equivalent to period length here, right?
computed logistic maps have a known shortcoming at period length (because 
of the FP representation; see Persohn & Povinelli (2012) 
http://povinelli.eece.mu.edu/publications/papers/chaos2012.pdf ). 

There are seeds producing short periods for each 'k' in kx(1-x). 
Thats why breeze is interchanging the states for the next round between 
the different logistic maps - and that's basically what makes breeze a 
different approach of using chaotic maps for a PRNG. 
Because of this interchange, short periods are very rare as far as i can compute it.
What happens even with verschränkten log maps is, that they produce zeros - 
that is tackled with this code for reseeding here: 

(from breeze128)

switch newstate1 * newstate2 * newstate3 * newstate4 * newstate5 * newstate6 {
	case 0:
		s1 := (uint64)((*(*uint64)(unsafe.Pointer(&l.state1)))<<11>>(12+l.bitshift%7)) + (uint64)((*(*uint64)(unsafe.Pointer(&l.state2)))<<11>>(12+l.bitshift%7))
		s1 += (uint64)((*(*uint64)(unsafe.Pointer(&l.state5))) << 11 >> (12 + l.bitshift%7))
		s2 := (uint64)((*(*uint64)(unsafe.Pointer(&l.state3)))<<11>>(12+l.bitshift%7)) + (uint64)((*(*uint64)(unsafe.Pointer(&l.state4)))<<11>>(12+l.bitshift%7))
		s2 += (uint64)((*(*uint64)(unsafe.Pointer(&l.state6))) << 11 >> (12 + l.bitshift%7))
		seed := [2]uint64{s1, s2}
		l.bitshift++
		l.seedr(seed)

if one of the six maps result in zero, then produce four uint64 out of the mantissa bits of the six states of the previous round and use this to reseed all maps. 

> 
> How many bits of state does your generator have?  That is, let
> S be the number of valid, reachable states your generator can
> be in.  Log2(S) is the number of bits of state your generator
> has.  IE, if you map all the valid states to numbers, you
> would need this many bits to represent a number that size.

As said above, i do not exactly know. but each map has 'mantissa length' entropy = 2^51
breeze128 has 6 maps
breeze256 has 12 maps
breeze512 has 24 maps
This is the inner state. random number output is a mixup of parts of these mantissas by XOR and ADD regulated 
by another inner state called bitShift, that's used to obfuscate inner states against output by continuously changing the number of overlapping bits that are xored&added.

> 
> If your adversary knows your PRNG's state, the game is all over
> because it's deterministic.  You're producing pseudoentropy with
> no input of real entropy.  And each output reveals state to
> the adversary.
> 

i think it be a hard problem to get the inner state from output. the output is a XOR & ADD result from at minimum 3 actual inner state mantissa parts XORed with the previous output of at least 2 different logistic maps. Furthermore the portion of mantissa that is used for the actual calculation differs because bitShift will have changed in between.

As i wrote on the github site, even if an attacker has insight into the computers memory registers at one time, he will not be able to recalculate previous states because of the second 1-x subtraction, where information is inevitably lost through the rounding process of FP. 

I know of the paradigm of people, that think, their code can't be broken, because they don't see how to do it, but …. :-)  - i like to learn.

> Your objective is that your generator should change its
> state faster than the bits it reveals allow an adversary to
> determine the state.  By the time an opponent sees a bit,
> the bit should be a function of every part of your state.
> And every part of your state should depend on that bit
> (or on the transition that produced it) by the time he
> has seen enough further bits to reveal that state.
> 

see above
like salsa20 (and some other PRNGs) breeze produces 128 / 256 / 512 bytes with one roundtrip and stores them in a outer state array for return at call. keyspace = seedspace is 128 / 256 / 512 bit for breeze128 / breeze256 / breeze512

> For a generator with N bits of state, it takes N+1 bits of
> output for an analyst with Godly computer power to determine
> the total state.  But most analysts have merely mortal compute
> power at their disposal, and if they can explain the most
> recently observed N-1 bits of output in terms of any of
> 2^128 or more starting states which they would have to check
> through one at a time, the N+1 bit isn't going to do them
> any good.
> 

see above

> The security of a PRNG fails when an adversary doesn't have
> to check all those possible prior states one at a time. For
> example, in an LFSG, each output tells an adversary a whole
> bit about the future state of the generator which won't
> change until another output reveals what it's being
> replaced with, so after he's seen N+1 bits of output he
> knows all of the current and therefore all future states.
> 

as stated above, i think, you need rainbow tables to predict next output of breeze. 
And even then, because each k for the maps can be chosen out of the unique double FP representation of values 3.82843 < k <= 4.0
the programmer can even make this volatile (i.e. calculating from dev/urandom or natural randomness).

> Bear
> 
> 
> _______________________________________________
> The cryptography mailing list
> cryptography at metzdowd.com
> http://www.metzdowd.com/mailman/listinfo/cryptography