"Approximate" hashes

Wed Sep 1 15:27:43 EDT 2004

> -----Original Message-----
> From: owner-cryptography at metzdowd.com 
> [mailto:owner-cryptography at metzdowd.com] On Behalf Of Marcel Popescu
> Sent: Wednesday, September 01, 2004 9:56 AM
> To: cryptography at metzdowd.com
> Subject: "Approximate" hashes
> 
> I am trying to build a Windows anti-spam thingy; it's supposed to "sit" in
> between the mail client and the outer world, and indicate through mail
> headers whether the incoming mail has a valid hashcash
> http://www.hashcash.org/ "coin" (and, of course, to automatically add
> hashcash to outgoing emails).
> 
> My problem is that I don't know what happens with the email in transit
> (this, I believe, is an observation in the hashcash FAQ). I am worried that
> some mail server might dislike ASCII characters with the high bit set, or
> that a client uses some encoding which for some reason doesn't make it to
> the destination unchanged.
> 
> Hence my question: is there some "approximate" hash function (which I could
> use instead of SHA-1) which can verify that a text hashes "very close" to a
> value? So that if I change, say, tabs into spaces, I won't get exactly the
> same value, but I would get a "good enough"?
> 
> I don't know if this is possible. But if it is, I thought this would be a
> good place to find out about it.

nilsimsa

It computes nilsimsa codes of messages, compares the codes, and finds
clusters of similar messages so as to trash spam.

What's a nilsimsa code?

A nilsimsa code is something like a hash, but unlike hashes, a small change
in the message results in a small change in the nilsimsa code.

http://lexx.shinn.net/cmeclax/nilsimsa.html
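
To make the "small change in, small change out" property concrete, here is
a minimal Python sketch. It is NOT the actual nilsimsa algorithm, only a toy
digest built on the same general idea: count which of 256 buckets the
message's byte trigrams fall into, then set one bit per bucket that is hit
more often than average. The names approx_digest and distance, and the
choice of CRC-32 as the bucket function, are assumptions made up for this
example.

```python
import zlib


def approx_digest(text: str) -> int:
    """Toy 256-bit locality-sensitive digest (illustrative, not nilsimsa)."""
    data = text.encode("utf-8")
    counts = [0] * 256
    # Map every contiguous 3-byte window to one of 256 buckets.
    for i in range(len(data) - 2):
        bucket = zlib.crc32(data[i:i + 3]) & 0xFF
        counts[bucket] += 1
    total = sum(counts)
    digest = 0
    # Set bit k when bucket k is hit more often than the mean (total / 256).
    for bucket, count in enumerate(counts):
        if count * 256 > total:
            digest |= 1 << bucket
    return digest


def distance(a: int, b: int) -> int:
    """Number of differing bits (Hamming distance) between two digests."""
    return bin(a ^ b).count("1")


original = "Buy cheap pills now!\tLimited offer, click here to order today."
altered = original.replace("\t", " ")  # the tab-to-space change from the question
unrelated = "Minutes of the quarterly budget meeting are attached for review."

print("tab vs. space:", distance(approx_digest(original), approx_digest(altered)))
print("unrelated    :", distance(approx_digest(original), approx_digest(unrelated)))
```

Replacing a single byte touches at most three trigrams, so at most six
buckets change their counts and at most six of the 256 bits can flip. A
verifier would accept a message whose digest is within some small distance
of the expected one; picking that threshold is a tuning decision that this
sketch leaves open.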

 --
Keith Ray <keith at nullify.org> -- OpenPGP Key: 0x79269A12

---------------------------------------------------------------------
The Cryptography Mailing List
Unsubscribe by sending "unsubscribe cryptography" to majordomo at metzdowd.com