[Cryptography] Are Momentum and Cuckoo Cycle PoW algorithms broken?

John Tromp john.tromp at gmail.com
Wed Jul 8 16:04:00 EDT 2015


> "Percival introduced the notion of sequential memory-hardness (SMH), which
> is satisfied by his introduced password scrambler scrypt. Bases on this
> notion, an algorithm is sequential memory-hard, if an adversary has no
> computational advantage in using multiple CPUs, i.e., using b cores requires
> b times the effort used for one core."

Ok; but leaving out that crucial word "sequential" in your original
post threw me off...

> I read it, but I am focusing primarily on Momentum first, which is far
> simpler to analyze.  I see you have a mistake in your analysis of Momentum's
> memory-hardness:
>
> "Since the extreme case of L = 2 is so special, there is likely to be a
> greater variety of algorithms that are more efficient than for the general
> case. While we haven’t found (and don’t know of) a improved main algorithm,
> we did find an improved BFS(L/2) TMTO algorithm (implemented in
> momentomatum.cpp) that cuts the memory usage in half, resulting in a
> slowdown of only 1.75—a lack of memory-hardness."

Is there a mistake other than my notion of memory-hardness differing from
Percival's SMH?

> Can we focus on Momentum first, and see if there is something special about
> Cuckoo Cycle that avoids Momentum's flaws?

Ok. Your Momentum hypercube architecture may use 2 orders of magnitude
more memory compared with a Cuckoo-inspired one using 2^25 bits,
but worse, you cannot discard the cost of the communication
network. And you cannot have constant length connections.
Even a 3D implementation requires connections of length n^{1/3}.

> My parallel sort algorithm uses the same amount of memory, and I already
> took into account the latency bottleneck when the number of parallel
> processing nodes runs into the thousands.

When you a're ready to shift focus back to Cuckoo Cycle, I would like to see a
detailed description of your implementation avoiding latencies for
1000s of threads. I expect it will be very expensive compared to a
multicore-cpu+dram
commodity solution that I hope will be economically near-optimal.

> A major flaw I have not yet found any good solution for is the slowness of
> cryptographic hashing algorithms.  I checked the runtime of SHA256 using
> Intel's latest hardware instructions, and they just aren't fast enough
> (multiple cycles per byte rather than the other way around).  I think an
> ASIC attacker will gain a substantial advantage with far faster, lower-power
> implementations.  I have not seen a PoW system with fast verification yet
> that avoids this problem.

Cuckoo Cycle uses the much cheaper-to-compute siphash24 hash function,
which contributes only 1/3 of the total running time (the other 2/3
being memory latency). With native hardware support in the form of a
hypothetical cpu-instruction for a single siphash round (far cheaper
to implement than sha256) this 1/3 would reduce to mere percents.

-John


More information about the cryptography mailing list