Logging of Web Usage

Bill Stewart bill.stewart at pobox.com
Fri Apr 4 20:10:02 EST 2003

At 11:32 AM 04/03/2003 -0800, Bill Frantz wrote:
>Ah yes, I haven't updated my timings for the new machines that are faster
>than my 550MHz.  :-)
>The only other item of importance is that the exhaustive search time isn't
>the time to reverse one IP, but the time to reverse all the IPs that have
>been recorded.

Also, until recently, there was the problem that storing a hash value
for every IP address took 8-10 bytes * 2**32, and the resulting 32-40GB
was an annoyingly large amount of storage, requiring a deck of Exabyte tapes
or corporate-budget quantities of disk drives, which also made
sorting the results awkward.  These days, disk drive prices
are $1/GB at Fry's for 3.5" IDE drives, so there's no reason not to have
120GB on your desktop.
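
As a quick check of that arithmetic, here is a minimal sketch in Python.
The function name table_size_gib is made up for illustration, and the
sizes are counted in binary gigabytes (2**30 bytes).

```python
# Storage needed to keep one hash value per IPv4 address.
ADDRESSES = 2 ** 32  # size of the IPv4 address space


def table_size_gib(bytes_per_entry: int) -> float:
    """Size of a full table in GiB (2**30 bytes) at the given entry size."""
    return bytes_per_entry * ADDRESSES / 2 ** 30


for entry_size in (8, 10):
    print(f"{entry_size} bytes/entry -> {table_size_gib(entry_size):.0f} GiB")
```

At 8 and 10 bytes per entry this gives 32 and 40 binary gigabytes,
which matches the 32-40GB figure above.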

This does mean that if you're keeping hashed logs you should probably
use some sort of keyed hash - even if you don't change the keys often,
you've at least prevented pre-computed dictionary attacks over the
entire IPv4 address space, and the key should be long enough (e.g. 128 bits)
so that dictionary attacks on the "IP addresses of Usual Suspects"
also can't be precomputed.
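
A minimal sketch of what a keyed hash for log entries might look like,
using Python's standard hmac module.  HMAC-SHA256 is my choice for the
example, not something prescribed above, and the names LOG_KEY and
hash_ip are hypothetical.

```python
import hashlib
import hmac
import secrets

# Assumption: a 128-bit secret key, generated once and stored apart from the logs.
LOG_KEY = secrets.token_bytes(16)


def hash_ip(ip: str, key: bytes = LOG_KEY) -> str:
    """Return a keyed hash of an IP address, suitable for writing to a log."""
    return hmac.new(key, ip.encode("ascii"), hashlib.sha256).hexdigest()


# The same address under the same key always maps to the same value, so
# log entries can still be correlated; without the key, a table built in
# advance over all 2**32 addresses is of no use.
print(hash_ip("192.0.2.1"))
```

Anyone who obtains the key can still run the exhaustive search, so the
key needs at least as much protection as the raw logs would.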

A related question is keeping lists of public information,
e.g. don't-spam lists, in some form that isn't readily abusable,
such as hashed addresses.  The possible namespace there is much larger,
but the actual namespace isn't likely to be more than a couple of billion,
in spite of the number of spammers selling their lists of 9 billion names.
There's also the question of how exact a match you need -
if mail is for alice+tag1@example.com, you'd ideally like to be able to check
alice+tag1@example.com, alice@example.com, and @example.com,
which makes the lookup process more complex.
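
One way the multi-level lookup could work, as a rough sketch: generate
each form of the address, hash it, and test it against the hashed list.
The names candidate_keys, digest and is_listed are hypothetical, the
"+tag" convention and lowercasing are assumptions about how addresses
are normalized, and plain SHA-256 stands in for whatever hash the list
actually uses.

```python
import hashlib


def candidate_keys(address: str) -> list:
    """List the forms of an address to check, most specific first."""
    local, _, domain = address.lower().rpartition("@")
    candidates = [f"{local}@{domain}"]
    base, plus, _tag = local.partition("+")
    if plus:  # address carries a +tag, so also try the untagged form
        candidates.append(f"{base}@{domain}")
    candidates.append(f"@{domain}")  # whole-domain entry
    return candidates


def digest(entry: str) -> str:
    """Hash one list entry (assumption: unkeyed SHA-256, hex encoded)."""
    return hashlib.sha256(entry.encode("utf-8")).hexdigest()


def is_listed(address: str, hashed_list: set) -> bool:
    """True if any form of the address appears in the hashed list."""
    return any(digest(c) in hashed_list for c in candidate_keys(address))


# A list that holds only a whole-domain entry still matches tagged mail.
hashed_list = {digest("@example.com")}
print(is_listed("alice+tag1@example.com", hashed_list))
```

Each incoming address costs up to three hashes and three lookups
instead of one, which is the extra complexity mentioned above.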

The Cryptography Mailing List
Unsubscribe by sending "unsubscribe cryptography" to majordomo at wasabisystems.com
