<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Wed, Jul 8, 2015 at 12:13 PM, John Tromp <span dir="ltr"><<a href="mailto:john.tromp@gmail.com" target="_blank">john.tromp@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><span class="">> On the downside, to even achieve<br>

> memory parity with the main algorithm, it already incurs a big slowdown. To<br>

> the extent that this slowdown is unavoidable, it can be called the memory<br>

> hardness of the proof-of-work."<br>

><br>

> This is an incorrect notion of memory-hardness.  Specifically, the goal is<br>

> to defeat attackers who have a huge number of cheap parallel cores, whether<br>

> they are using a GPU, FPGA, or ASIC attack.  Scrypt's notiion of<br>

> memory-hardness where multiple computing cores provides no more than a<br>

> constant advantage in memory-use * runtime is the proper definition.<br>

<br>

</span>I would prefer to call that parallelization hardness.  Memory hardness should ideally be about the suffering caused by having less<br>

than the designated amount of memory available.</blockquote><div><br></div><div>For consistency with existing terminology as used to describe Scrypt and other "sequential-memory-hard" password hashing schemes, why don't we call this a "memory-intensive TMTO-resistant PoW with fast verification"?</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><span class=""><br>

> If I just implement an actual Cuckoo hash table using a parallel sorting<br>

> machine, and do the on-average constant time insertions mostly in parallel,<br>

> wont I detect all the Cuckoo cycles?  How much faster is your implementation<br>

> than this basic algorithm?<br>

<br>

</span>This is already implemented and results are shown in the section<br>

called "Parallelization". I don't understand why you need sorting though.<br></blockquote><div><br></div><div>I was trying to show that an ASIC based miner doing Momentum computations would have essentially an unfair advantage vs CPUs, leading to an inevitable switch to ASIC based mining if Momentum were used as the PoW for a PoW mining based crypto-currency.  Parallel sorting is enough to work out well enough for the proof, I think.</div><div><br></div><div>Now that I've been pointed to <a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.96.4820">faster the parallel Pollard rho-algorithm</a> for finding hash collisions (thanks for the link, Jonathan!), I think Momentum can be sped up linearly with the number of processors.  Momentum does not seem parallelization-hardened</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

Unfortunately, I have no access to a working Xeon Phi<br>

multicore cpu that would allow me to try more than 40 threads.<br>

I do expect the parallel speedup to flatten out at around a hundred threads.<br>

For AMD Opteron based servers, it already flattened at 30.<br>

<br>

regards,<br>

-John<br>

_______________________________________________<br>

The cryptography mailing list<br>

<a href="mailto:cryptography@metzdowd.com">cryptography@metzdowd.com</a><br>

<a href="http://www.metzdowd.com/mailman/listinfo/cryptography" rel="noreferrer" target="_blank">http://www.metzdowd.com/mailman/listinfo/cryptography</a></blockquote></div><br></div><div class="gmail_extra">I am more interested in massively parallel attacks.  If I'm not mistaken, using lots of memory had providing good defense against parallel attacks is what we need to defend against ASIC based mining.</div><div class="gmail_extra"><br></div><div class="gmail_extra">Bill</div></div>