Objectivity and factoring analysis

Anonymous nobody at remailer.privacy.at
Fri Apr 26 17:32:08 EDT 2002


Nicko van Someren writes:
> I used the number 10^9 for the factor base size (compared to about
> 6*10^6 for the break of the 512 bit challenge) and 10^11 for the
> weight of the matrix (compared to about 4*10^8 for RSA512).  Again
> these were guesses and they certainly could be out by an order of
> magnitude.

In his paper Bernstein uses a relatively larger factor base than is
typical in current parameter choices.  It's likely that the factor bases
used in past factorizations have been "too small", in the sense that
the linear algebra step has been limited by machine size rather than by
runtime, because it is so difficult to parallelize.  For example, at
http://www.loria.fr/~zimmerma/records/RSA155 we find that the sieving
took 8000 MIPS-years while the linear algebra took 224 CPU-hours on a
Cray with 2 GB of memory.
If there were a larger machine to do the matrix solution, the whole
process could be accelerated, and that's what Bernstein's figures assume.

Specifically, he uses a factor base size of L^0.7904, where L for
1024-bit keys is approximately 2^45.  This gives a matrix dimension of
about 50 billion, 50 times larger than your estimate.  So a closer
order-of-magnitude estimate would be 10^11 for the factor base size and
10^13 for the weight of the matrix.
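
To sanity-check those figures, here is a quick Python sketch of the
usual NFS parameter L and the L^0.7904 factor base size quoted above
(nothing here beyond the numbers already cited; the rounding is mine):

from math import exp, log

# L = exp((ln n)^(1/3) * (ln ln n)^(2/3)), the standard NFS parameter,
# evaluated for a 1024-bit modulus n.
ln_n = 1024 * log(2)                       # ln(2^1024), about 709.8
ln_L = ln_n ** (1/3) * log(ln_n) ** (2/3)
L = exp(ln_L)                              # about 4e13, i.e. roughly 2^45
fb = L ** 0.7904                           # about 5e10, "50 billion"

print("L        ~ 2^%.1f" % log(L, 2))
print("L^0.7904 ~ %.1e" % fb)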

> The matrix reduction cells are pretty simple and my guess was
> that we could build the cells plus inter-cell communication
> in about 1000 transistors.  I felt that, for a first order guess,
> we could ignore the transistors in the edge drivers since for a
> chip with N cells there are only order N^(1/2) edge drivers.
> Thus I guessed 10^14 transistors which might fit onto about 10^7
> chips which in volume (if you own the fabrication facility) cost
> about $10 each, or about $10^8 for the chips.  Based on past work
> in estimating the cost of large systems I then multiplied this
> by three or four to get a build cost.

The larger factor base needed to achieve the big asymptotic speedups
would increase the cost estimate by a factor of about 50.
Instead of several hundred million dollars, it would be perhaps 10-50
billion dollars.  Of course at this level of discussion it's just as
easy to assume that the adversary spends $50 billion as $500 million;
it's all completely hypothetical.
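
Scaling your cost figures by that factor of roughly 50, and assuming
as you did that cost is roughly linear in the number of cells (edge
drivers ignored), the arithmetic looks like this:

cells       = 50 * 1e11          # ~50x more matrix cells
transistors = cells * 1000       # ~1000 transistors per cell -> 5e15
chips       = transistors / 1e7  # ~10^7 transistors per chip, as above
chip_cost   = chips * 10         # ~$10 per chip in volume -> ~$5e9
build_cost  = (3 * chip_cost, 4 * chip_cost)    # your 3-4x multiplier

print("chips ~ %.0e, chip cost ~ $%.0e" % (chips, chip_cost))
print("build cost ~ $%.0e to $%.0e" % build_cost)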

> As far at the speed goes, this machine can compute a dot product
> in about 10^6 cycles.

Actually the sort algorithm described takes 8*sqrt(10^11) or about 2.5 *
10^6 cycles, and there are three sorts per dot product, so 10^7 cycles
would be a better estimate.

Using the larger factor base with 10^13 entries would imply a sort
time of 10^8 cycles, by this reasoning.
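
The cycle counts work out as follows, taking 8*sqrt(m) cycles to sort
m cells and three sorts per dot product:

from math import sqrt

# First your matrix weight of 10^11, then the larger 10^13 one.
for weight in (1e11, 1e13):
    sort_cycles = 8 * sqrt(weight)
    per_product = 3 * sort_cycles
    print("weight %.0e: sort ~ %.1e cycles, dot product ~ %.1e cycles"
          % (weight, sort_cycles, per_product))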

> Initially I thought that the board to
> board communication would be slow and we might only have a 1MHz
> clock for the long haul communication, but I messed up the total
> time and got that out as a 1 second matrix reduction.  In fact to
> compute a kernel takes about 10^11 times longer.  Fortunately it
> turns out that you can drive from board to board probably at a
> few GHz or better (using GMII type interfaces from back planes
> of network switches).  If we can get this up to 10GHz (we do have
> lots to spend on R&D here) we should be able to find a kernel in
> somewhere around 10^7 seconds, which is 16 weeks or 4 months.

Taking into consideration that the sort algorithm takes about 8 times
longer than you assumed, and that "a few" minimal polynomials have to
be calculated to get the actual one, this adds about a factor of 20
over your estimate.  Instead of 4 months it would be more like 7 years.
This is pretty clearly impractical.
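
For what it's worth, here is the arithmetic behind that factor of 20;
the 2.5 for "a few" minimal polynomial computations is my own guess:

base_months  = 4     # your 10^7-second estimate, about 4 months
sort_penalty = 8     # corrected sort time vs. the 10^6-cycle assumption
minpoly_runs = 2.5   # assumed "a few" minimal polynomial computations
months = base_months * sort_penalty * minpoly_runs    # about 80 months
print("roughly %.0f years" % (months / 12))           # about 7 years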

Apparently Ian Goldberg expressed concerns about the interconnections
when the machine was going to run at 1 MHz.  Now it is projected to run
10,000 times faster?  That's an aggressive design.  Obviously if this
speed cannot be achieved, the run time goes up still more.  If only
1 GHz can be achieved rack to rack, the machine would take 70 years for
one factorization.  Needless to say, a bit error anywhere would destroy
a result that may have taken years to produce, so error correction would
be required, adding cost and possibly slowing the effective clock rate.

Using the larger factor base from the Bernstein paper would increase
the time to something like 10^11 seconds, thousands of years, which is
out of the question.
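
Again as a rough check on the scaling, starting from the 7-year figure
above, with a 10x penalty if only 1 GHz can be driven rack to rack, and
roughly 50x more dot products plus 10x longer sorts for the larger
matrix:

base_years = 7                      # from the factor-of-20 estimate above
at_1ghz    = base_years * 10        # ~70 years at a 1 GHz interconnect
larger_fb  = base_years * 50 * 10   # ~3500 years, on the order of 1e11 s
print("1 GHz interconnect: ~%d years" % at_1ghz)
print("larger factor base: ~%d years" % larger_fb)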

> Lastly, I want to reiterate that these are just estimates.  I
> give them here because you ask.  I don't expect them to be used
> for the design of any real machines; much more research is
> needed before that.  I do however think that they are rather
> more accurate than my first estimates.

These estimates are very helpful.  Thanks for providing them.  It seems
that, based on the factor base size derived from Bernstein's asymptotic
estimates, the machine is not feasible and would take thousands of years
to solve a matrix.  If the 50 times smaller factor base can be used,
the machine is on the edge of feasibility but it appears that it would
still take years to factor a single value.
