[Cryptography] One-time pads in modern crypto software?

Wed Feb 17 09:21:29 EST 2021

Phillip Hallam-Baker wrote on 2/16/21 10:01 AM:

> The big problem marketing wise would be 'Power One Time Pad' and its
> thousand copies. For most of us, 'one time pad' is a sure fire sign of
> crypto snakeoil. Instead of a one time pad it is invariably an untested
> stream cipher.

I am thinking out loud here about the ways in which even a "perfect"
stream cipher can be misused.  This is primarily to clarify my own
thoughts, as I'm sure this is nothing new to the people on this list.
Please forgive me in advance if it sounds like the musings of a school boy.

Let's say I have chosen a specific function that maps a 256-bit key into
a "highly unguessable" infinite sequence of output bits.  There are many
such functions already identified.  For a given secret key, the output
of that function serves as a one-time pad.

The use of that function is vulnerable to all the same problems as the
use of a one-time pad.  For example, you should never reuse the same
bits to mask more than one message.  So, you use a random nonce to
choose a starting point for each new message.  You might even chop a
single long message into several smaller messages (blocks) and choose a
new nonce for each block.

The point is to avoid masking both "attack at dawn" and "meet me at the
park" with the same sequence of bits.

The principal advantage of the stream cipher is that it's easier to
distribute a single 256-bit key than a one terabyte pad.  Another
advantage is that the 256-bit key can yield a pad far larger than one
terabyte.

The advantage of the one terabyte pad is that it can be "more random,"
or even "truly random."  However, a 256-bit key is all the randomness
you could ever need -- assuming that the stream cipher function is not
vulnerable to analysis, which can be difficult to prove.

Also, it seems that the output of the stream function is bound to repeat
at some point, assuming that the function runs efficiently in time and
space.  You might devise a function which produces an unguessable
transcendental number for a given input, but the "random access" into
that sequence for a given nonce would no doubt require a lot of time and
memory.  In any case, the nonce itself has a fixed size, which imposes a
limit on where you can point into the output stream anyway.

Consequently I would avoid using the word "infinite" to describe the
length of the output.  I would instead pin it down to a specific number.
 For example, in the case of ChaCha20, I would say that the stream
cipher is a deterministic function which maps a 256 bit number to a
2^393 bit number.  (That may not be the exact length but you get the idea.)

To characterize SAFE usage modes in a deterministic way, I would think
of the cipher function as a function of TWO inputs, the key and a seed.
 For example, the function might take a 256-bit key and a 256-bit seed,
and map that into a number with 2^393 bits.  The key is a secret already
shared by both parties, and the seed is a randomly chosen number sent in
clear text to the receiving party ahead of the message.

For each such function, I should like to see a set of clear reproducible
test vectors, each specifying both input and the corresponding output.
Clearly you cannot specify an entire 2^393 bit output, but probably the
first 256 or 512 bits would suffice.  If there are issues regarding key
rotation, sequence numbers, overlapping nonces, etc., then the test
vectors should be characterized so that they capture all such nuances.
Each test vector shall be denoted as a tuple of fixed finite numbers.

In that way, a developer such as myself can be virtually certain that I
am deploying the stream cipher in the secure way intended by experts.
If a particular cipher stream is VERY strong but also VERY "difficult to
get right," well then I shouldn't worry at all since I can reproduce the
known test vectors.  If there's something else I *should* worry about,
that nuance should be incorporated into the test vectors.

In that way I can avoid the pitfalls of Donald Knuth's "Beware of bugs
in the above code; I have only proved it correct, not tried it."  I want
the probability of getting it wrong to be astronomically low.

-- Patrick