[Cryptography] How do I tell if a byte sequence repeats?
Sampo Syreeni
decoy at iki.fi
Tue Aug 26 20:22:11 EDT 2025
On 2025-08-25, Pierre Abbat wrote:
> The state is the last several ciphertext bytes, together with the
> accumulator, which, if the plaintext is all zeros except for one byte
> at the beginning, is that byte, a constant.
So remember those when you sample. All of them. That is the state of
your algorithm, and what needs to be compared in order to apply any
cycle finding algorithm which is dependentent on equality comparisons.
But you know, you will do something with those preceding bytes of
ciphertext, in order to arrive at the outcome that is the next byte
output. Typically in CRNG's that will lead to some update of a finite
state machine which is *not* the same as the whole prefix thus far
produced by your generator. You can *always* include the preceding
prefix in your state and compare over it, but here, you typically don't
have to. You only need enough information to determine what you're
outputting, over a step of your overall algorithm. All of the preceding
state, the prefix, is usually propagated via an intermediate,
well-organized representation of your algorithm's state, which you can
compare over.
Not that you'd have to optimize like this. If your underlying algorithm
already looks at its past in an unbounded fashion, it will probably be a
bad one. What's called for in crypto work is low, linear cost in bytes
consumed and produced, using bounded space in the process. That implies
you can do state recognition the way I said. If you can't, your crypto
primitive is broken to begin with, and needs to be changed.
> I keep the last 8 bytes in a UInt64 as the state, and use a Fibonacci
> version of Brent's method.
You can use the last eight bytes as a sieve, a proxy, for the state, and
you can use whatever exponential search strategy you want, binomial,
Fibonacci, whatever. But the thing you need to remember is that whatever
your algorithm put out, is not the same as its state. Whatever it put
out, does not determine what it will put out in the future. At the same
time, if you gather up all of the stuff your algorithm depends on to do
its next step, that will always determine its future in full. That is
its "state". The sum total of its internal variables, on which it takes
decisions. Compare on *that*.
> With a four-byte key, the periods are usually the same and close to
> 65536 in order of magnitude, which is what I'd expect if the state
> transition is a random function, not a permutation: 26413, 45336,
> 24703, 70546.
What I'm talking about is once removed from the sequence level of yours.
If you think about different sequences generated in a probabilistic
fashion, yes, you can have these things where different "key lengths"
lead to different cycle lenghts. But in the deterministic, algorithmic
framework I'm coming from, you either have a cycle, or you don't have
one, there can only be one, and detecting one is generally equivalent to
the halting problem, so any attempt at it will not be, cannot be, a
recursive decision problem. It will have to be somewhere in the
recursively enumerable class, so not guaranteed to terminate.
Getting arbitrarily strong stochastic bounds, especially given such a
strong stochastic model as we have with cryptosystems and their
associated random number generators, then random oracle model, and
whatnot, is very different. But then I don't think you're really talking
about this, since you only asked about your system falling into a cycle.
That is not a statistical thing, but a deterministic, algorithmic one.
It can be best quantified and analysed at the level of algorithmic
state and its equivalence, instead of looking at the many distributions
in input and output sequences your algorithms might produce.
> One run, though, resulted in one of the Daphnes falling into a cycle
> of length 24905 and the other in a cycle of length 45718, which are
> relatively prime. I plotted the heat map, and that's the one that
> looks like tartan.
If we're going this way, divulge the code. All of it, including your
test framework. It's very difficult to follow what you're going at,
otherwise.
--
Sampo Syreeni, aka decoy - decoy at iki.fi, http://decoy.iki.fi/front
+358-40-3648785, 025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2
More information about the cryptography
mailing list