economics of spam (Re: A Trial Balloon to Ban Email?)

Tue May 13 15:50:17 EDT 2003

To respond on the comments on costs of spamming and costs of CPU, the
figures one can draw from various papers and articles are highly
variable, one suspects they are variously including operator time,
electricity, spam software purchase, and email address list purchase.

To bring it back to just the raw computational costs (equipment
amortized plus electricity) lets do some rough estimates for this.

To take Tim's estimate $500 machine amortized over 2 years seems
entirely reasonable, say this machine has a 1Ghz CPU.  I'll add ADSL
line $500/year for a 1Mbit uplink, and say $200/year in electricity
for a total of $950/year.  For spamming without hashcash let's say
that it can send customized mail messages of size 1KB each, and by
pipelining it manages to max the link and send 64 messages/second.
(Divide by 2 to account for unreachable addresses, etc).

I make that 0.00005c / message.  Presuming the same machine is mostly
unloaded, and the spammer wants to send the same number of mails he
needs a bank of 63 additional CPUs each at a cost of 450/year
(amortized cost+electricity) for a total of $29300/year, so now his
spamming costs 0.0015 / message, and the purely computational costs
have increased by a factor of 30.  On could imagine this would reduce
the amount of untargetted spam a lot.  Clearly you will still receive
spam, just less of it, or more targetted to be likely to interest you
etc.

Other issues include that perhaps the spammer can get bandwidth
cheaper per Mbit if he needs more than 1Mbit, which would tend to
reduce purely computational cost of spamming (without tokens).

A 1 second CPU cost on a 1Ghz machine should be negligible and
acceptable to an email user even if the computation happens while he
waits after he clicks the send button.  If he is on a DSL or similar
it could be backgrounded.  On dialup delivery is slow anyway and a
second probably wouldn't be noticed.  Dialup users also often batch
their mail sending (deliver later from a local MUA maintained queue).

An additional cost for spammers is acquiring the email lists.  However
this cost can be amortized across multiple spamming campaigns on
behalf of different spam clients, and mostly seems to consist of
emails gathered from a web spider if one takes the claims of the CDT
spam report, so is itself just a bandwidth cost.

We could probably as was previously noted get away with a marginally
larger delay if tokens are only required to recipients who have never
replied to us in the past.

If one accepts these figures, at 1 second CPU per sent mail for new
recipients, perhaps it may even be economical for ISPs to do the
computation as part of mail service.

If we could think of a distributed way to precompute the token and yet
still have distributed verification without infrastructure, we could
increase the cost to 5 mins without normal users noticing.  It is not
obvious how one would do this however as unless the entire computation
is tailored to the recipient, parts of the computation could be
re-used across multiple recipients.  

As Tim notes' Moore's law requires that we increase the collision cost
over time.  (But this is not so hard to do -- I can think of a simple
fully scalable mechanisms to achieve this slowly increasing
distribution of a minimum bit collision).

The possibility for accelerator hardware is definitely a limiting
factor.  Counter-measures to this which have been suggested include
(a) changing the algorithm over time with an authenticated code update
mechanism; (b) defining a cost function which makes use of features of
general purpose computers -- eg. IEEE floating point hardware, memory,
cache, larger code footprint algorithm etc.  This could in theory mean
that absent sufficient market general purpose CPUs remain the most
cost effective approach; (c) memory bound functions such as [1] which
are limited by memory latency rather than CPU speed.  Memory bound
functions have their own economic arguments (see conclusions section
of the paper): perhaps accelerator hardware is also a problem because
all you need is a memory chip, plus a really cheap CPU; they mean the
most cost effective hardware to buy is the cheapest CPU and so perhaps
2 or 3 times cheaper than best Mhz/$; plus they intentionally consume
memory data footprint which can interfere with applications.

Another possibility with accelerator hardware; if ISPs were the
primary deployers, then they are better positioned to buy accelerator
hardware to compete head on with spammers.

Adam

[1] http://research.microsoft.com/research/sv/PennyBlack/demo/lbdgn.pdf

C. Dwork, A. Goldberg, and M. Naor, "On Memory-Bound Functions for
Fighting Spam", Proceedings of CRYPTO 2003, to appear.

On Mon, May 12, 2003 at 09:18:25PM -0400, Tim Dierks wrote:
> At 04:45 PM 5/12/2003, Adam Back wrote:
> >Whether you think a few seconds is sufficient depends on your views of
> >the economics of spamming.  Ie how close to losing break-even the
> >spammers are, and whether a few seconds of CPU per message is enough
> >to significantly increase the cost.  This article for example
> >discusses the economics of spam:
> >
> >http://www.eprivacygroup.com/article/articlestatic/58/1/6
> >
> >they give an example of a spam campaign with a 0.0023% response rate,
> >and a yeild of $19 per response.  They estimate the cost of sending
> >the spam was less than 0.01c per message.  I've seen significantly
> >lower estimates for the sending costs.  To deter a given spam campaign
> >we just have to increase the cost to the point of making it
> >unprofitable given the response rate and profit per responder.  The
> >other side of this equation is what a second of CPU costs in monetary
> >terms to a spammer.
> 
> Assuming that a CPU costs $500 and that its value can be amortized over 2 
> years, CPU costs .0016 cents/second.
> 
> Based on the numbers enough, the revenue/spam sent is .044 cents. Thus, the 
> breakeven point is 27.6 seconds/message: assuming other costs are minimal, 
> you have to require > 27.6 seconds of CPU calculation from an email 
> submittant to ruin the spamming business model.
> 
> A few thoughts on this:
>   - You have to adjust the size of the calculation frequently to keep up 
> with Moore's law (although the time/$500 CPU is constant, assuming constant 
> profitability for spam)
>   - If spammers have new technology or economies of scale available to 
> them, it's going to adversely affect everyone else. (That is, if you're 
> using an 18-month-old CPU and CPU-seconds cost you twice what they cost in 
> the volume it costs spammers, your $500 computer will have to spend 2 
> minutes of time to calculate a token it takes a spammer 30 seconds to 
> calculate).
>   - This is going to dramatically increase the costs of sending bulk e-mail 
> for non-spammers: for example, I get airline specials a few times a week; 
> they must send millions of these.
>   - The CPU time required here is several orders of magnitude larger than 
> the cryptographic costs associated with SSL, and SSL is not broadly 
> accepted at least in part due to the CPU cost associated with with it; this 
> implies to me that there will be substantial resistance.
>   - The CPU costs associated with SSL engendered a substantial market in 
> cryptographic accelerators intended to reduce the cost to do an RSA private 
> key operation. Presumably, a system like this will create such a market for 
> e-mail token accelerators: unfortunately, this is exactly the kind of new 
> tech / economy of scale envisioned above: we may end up with a situation 
> where a calculation which costs a spammer .044 cents will take the average 
> user's CPU 10 minutes or more to calculate.

---------------------------------------------------------------------
The Cryptography Mailing List
Unsubscribe by sending "unsubscribe cryptography" to majordomo at metzdowd.com