[Cryptography] Solving Matt's hash problem

Weger, Benne de b.m.m.d.weger at TUE.nl
Wed Nov 16 09:56:58 EST 2022


> >In Matt Levine's long article about cryptocurrencies, he describes in
> >accurate detail how hashing works, and mentions in an aside that he
> >cannot include the hash of the article in the article itself, you (the
> >reader) figure out why.
> >
> >I know this is a preimage problem, but is there anything about making
> >the hash itself part of the input that would make it harder or easier
> >to solve?

In a mathematical sense this is trivial: for an n-bit hash function,
produce a file containing the 2^n different possible hash values in your 
favourite representation. There it is.

The file will contain at least n.2^n bits. Can we do better? Yes, if
we allow the hashes to be represented by bit strings of length n: the
optimal solution of a bitstring that contains all possible bitstrings
of length n as substrings is a binary "De Bruijn" sequence of
order n, see https://en.wikipedia.org/wiki/De_Bruijn_sequence.
The length is 2^n + n-1, the n-1 coming from repeating the first n-1
bits to deal with the original sequence being defined as cyclic, which
we don't want here. Way better than n.2^n, but it still won't fit on
your hard disk if n gets too far above 40. 

Can we do better? Probably yes. Starting with an empty file and 
iteratively appending the file hash to the file, until a previous hash 
occurs, will produce, with high probability (if I'm not mistaken), a 
file with approximately 2^{n/2} hashes, so approximately n.2^{n/2} bits, 
and this does the job. Probably with a De Bruijn-type idea (i.e. taking 
for the hashes bitstrings of length n starting at any offset) we again 
can remove the factor n.

So I would say that the question should be rephrased in asking for
a solution with small enough space and time complexity, such as their
product being essentially better than 2^n.

Benne de Weger




More information about the cryptography mailing list