[Cryptography] RAM needed for zero knowledge proofs

Jan Carlsson jancarlsson at Safe-mail.net
Sat Feb 14 18:02:25 EST 2015


I have some intuition about practical computing requirements for zero
knowledge proofs from working on snarklib/snarkfront.

For example, take a Merkle tree of depth 64 using the SHA-256 compression
function and a 128 bit Barreto-Naehrig elliptic curve pairing. The zero
knowledge proof is for an authentication path from a commitment leaf back
up to the root of the tree. How big are the cryptographic structures?

number of rank-1 constraints: 6654237

group G1 element is 96 bytes
group G2 element is 192 bytes
pairing<G1, G1> element is 192 bytes
pairing<G2, G1> element is 288 bytes

(the proving key)
query vector A size: 6569312 of pairing<G1, G1> is 1.3 GB
query vector B size: 6569312 of pairing<G2, G1> is 1.9 GB
query vector C size: 6569312 of pairing<G1, G1> is 1.3 GB
query vector H size: 8388609 of G1 is 0.8 GB
query vector K size: 6569312 of G1 is 0.6 GB

(look up tables needed to calculate proving key)
G1 exponent count is 38647176
windowed exponentiation LUT of 12 x pow(2, 22) is 4.8 GB
G2 exponent count is 4831575
windowed exponentiation LUT of 15 x pow(2, 17) is 0.4 GB

The proving key (query vectors A, B, C, H, K) is about 6 GB.
The look up tables used to calculate the key are over 5 GB.

The Zerocash pour transaction uses two Merkle tree authentication paths
corresponding to the two spent coins. The proving key for Zerocash coin
pouring is more than twice as large (if a single ZKP is used for the
entire transaction).

This is an issue as most consumer laptops and desktops have 4 GB to 8 GB
of RAM. Even if 16 GB is available, that is really a minimum requirement.
It is a problem for me as I do not have any computer with so much RAM.

I was forced to use map-reduce techniques to trade time for space. The
calculation is partitioned into blocks in files on disk instead of in RAM.

For this example, RAM use remains between 500 MB and 2 GB if the query
vectors are partitioned into 16 blocks. The G1 windowed exponentiation
table is partitioned into 8 blocks (some are larger than others). Note
this partitioning is just arbitrary numbers I picked. My past experience
is that compute kernels often have convexity in time complexity. There
will be tuning parameters which give optimal performance. However, you must
search to find it... and it will vary for each computer configuration.

Just a little about time.

My laptop with 4 GB RAM and the CPU locked at 1.2 GHz (so it does not
become hot) took about 7 hours to build the constraint system, generate the
key pair, generate a proof, and verify it. This is on one core. If multiple
cores were used (or the clock rate increased), time would be much less.

Most of the time is spent generating the proving key and verification key.
This cost is less significant than the proof generation which affects all
users. Proof generation took about 25 minutes. If the CPU were running at a
faster clock rate and the witness values were calculated concurrently, the
time required would be around 5 minutes.


More information about the cryptography mailing list