objectivity and factoring analysis

Nicko van Someren nicko at ncipher.com
Mon Apr 22 04:50:32 EDT 2002

Anonymous wrote:

> Nicko van Someren writes:
>>The estimate
>>of the cost of construction I gave was "some hundreds of
>>millions of dollars", a figure by which I still stand.
> But what does that mean, to specify (and stand by) the cost of
> construction of a factoring machine, without saying anything about how
> fast it runs?  Heck, we could factor 1024 bit numbers with a large abacus,
> if we don't care about speed.  A cost figure is meaningless unless in the
> context of a specific performance goal.

You'd need a large abacus and a vary large stack of paper...

If you had read the Bernstein proposal in detail you would
understand that (among other things) it details the conceptual
design of a machine for computing kernels in a large, sparse
matrix.  The design talks of the number of functional units and
the nature of the communication between these units.  What I
set out to do was look at how complex those units are and how
hard it would be to connect them and to place a cost on this.

>>I was then asked how fast this machine would run and I tried
>>to do the calculation on the spot without a copy of the
>>proposal to hand, and came up with a figure on the order
>>of a second based on very conservative hardware design.
>>This figure is *wildly* erroneous as a result of both not
>>having the paper to hand and also not even having an
>>envelope on the back of which I could scratch notes.
> And yet here you say that it took you completely by surprise when someone
> asked how fast the machine would run.  In all of your calculations on the
> design of the machine, you had apparently never calculated how fast it
> would be.

I'm sorry, but I don't think at infinite speed.  I started
this process after lunch and the panel session started at
1:30pm.  I did say that this was an impromptu session.

> How could this be?  Surely in creating your hundreds of millions
> of dollars estimate you must have based that on some kind of speed
> consideration.  How else could you create the design?  This seems very
> confusing.

See my comments above.  The costing was based on transistor count
and engineering costs.  The design suggested in the Bernstein
proposal does not have a simple size/time trade-off since the size
of the system is proscribed by the algorithm.

> And, could you clarify just a few more details, like what was the size
> you were assuming for the factor base upper bounds, and equivalently for
> the size of the matrix?

I used the number 10^9 for the factor base size (compared to about
6*10^6 for the break of the 512 bit challenge) and 10^11 for the
weight of the matrix (compared to about 4*10^8 for RSA512).  Again
these were guesses and they certainly could be out by an order of

>  This would give us a better understanding of the
> requirements you were trying to meet.  And then, could you even go so far
> as to discuss clock speeds and numbers of processing and memory elements?
> Just at a back of the envelope level of detail?

OK, here are the numbers I used.  Again I preface this all with
it being order of magnitude estimates, not engineering results.
It's based on a proposal, not a results paper, and there are
doubtless numerous engineering details that will make the whole
thing more interesting.

The matrix reduction cells are pretty simple and my guess was
that we could build the cells plus inter-cell communication
in about 1000 transistors.  I felt that, for a first order guess,
we could ignore the transistors in the edge drivers since for a
chip with N cells there are only order N^(1/2) edge drivers.
Thus I guessed 10^14 transistors which might fit onto about 10^7
chips which in volume (if you own the fabrication facility) cost
about $10 each, or about $10^8 for the chips.  Based on past work
in estimating the cost of large systems I then multiplied this
by three or four to get a build cost.

As far at the speed goes, this machine can compute a dot product
in about 10^6 cycles.  Initially I thought that the board to
board communication would be slow and we might only have a 1MHz
clock for the long haul communication, but I messed up the total
time and got that out as a 1 second matrix reduction.  In fact to
compute a kernel takes about 10^11 times longer.  Fortunately it
turns out that you can drive from board to board probably at a
few GHz or better (using GMII type interfaces from back planes
of network switches).  If we can get this up to 10GHz (we do have
lots to spend on R&D here) we should be able to find a kernel in
somewhere around 10^7 seconds, which is 16 weeks or 4 months.

There are, of course, a number of other engineering issues here.
One would want to try to work out how to build this machine to
be tolerant of hardware faults since getting 10^7 chips to all
run faultlessly for months at a time is tough to say the least.
Secondly, we are going to need some neat power reduction
techniques in these chips to dissipate the huge power that it
needs, which will likely run to 10^8 watts or so so the facility
will look a bit like a steel foundry.

Lastly, I want to reiterate that these are just estimates.  I
give them here because you ask.  I don't expect them to be used
for the design of any real machines; much more research is
needed before that.  I do however think that they are rather
more accurate than my first estimates.

I hope this helps.


The Cryptography Mailing List
Unsubscribe by sending "unsubscribe cryptography" to majordomo at wasabisystems.com

More information about the cryptography mailing list