[Cryptography] Just in case it isn't obvious...

Patrick Chkoreff patrick at rayservers.net
Mon Feb 27 17:20:20 EST 2017


Jerry Leichter wrote on 02/27/2017 01:44 PM:
> problem back when MD5 failed!
> 
> De-duplication engines may well have the same issues.  These run in
> backup programs, in cloud services (e.g., Dropbox) - even (very
> slightly different use case) in rsync.  In some cases, these are very
> high performance hardware boxes which likely do their hashing in
> dedicated hardware.
> 
> Compared to these, fixing git is child's play.

Yes, I once considered doing some hash-based de-duping of my own, and I
immediately thought I had better treat it like an ordinary old-school
1970s-style hash table with the possibility of collisions, maintaining a
linked list of collisions, and testing new content literally
byte-for-byte against the entries in the list.
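
A minimal sketch of that idea in Python, for illustration only. The names
DedupStore, key_func, add and get are hypothetical, SHA-256 is an assumed
default, and a plain Python list stands in for the linked list of collisions:

```python
import hashlib
from collections import defaultdict


class DedupStore:
    """Hash-keyed store that never trusts the hash alone (hypothetical name)."""

    def __init__(self, key_func=None):
        # key_func maps content bytes to a bucket key. SHA-256 is an assumed
        # default; any function from bytes to a hashable key would work.
        self._key_func = key_func or (lambda data: hashlib.sha256(data).digest())
        # Each bucket is the collision list: all distinct blobs sharing a key.
        self._buckets = defaultdict(list)

    def add(self, data: bytes):
        """Return (key, index, is_new); store data only if it is unseen."""
        key = self._key_func(data)
        bucket = self._buckets[key]
        for index, existing in enumerate(bucket):
            if existing == data:  # literal byte-for-byte comparison
                return key, index, False
        # Same key but different bytes (or an empty bucket): keep both.
        bucket.append(data)
        return key, len(bucket) - 1, True

    def get(self, key, index):
        """Fetch stored content by its key and position in the collision list."""
        return self._buckets[key][index]
```

Content is then identified by the pair (key, index) rather than by the hash
alone, so two different blobs with the same hash are never merged.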

The collision path would be hard to test with a strong hash function, so
I planned to use just the first byte of the hash value during
development, which makes collisions very easy to generate.
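
To illustrate how cheap collisions become with a one-byte key, here is a
rough Python sketch. The helper names first_byte_key and find_collision are
made up for this example, and SHA-256 is an assumed choice of hash. With only
256 possible keys, any 257 distinct inputs must contain a colliding pair:

```python
import hashlib


def first_byte_key(data: bytes) -> bytes:
    """Weakened key for testing: only the first byte of SHA-256 (assumed hash)."""
    return hashlib.sha256(data).digest()[:1]


def find_collision():
    """Return two distinct inputs whose one-byte keys match."""
    seen = {}
    # 257 distinct inputs into 256 possible keys: a repeat is guaranteed.
    for n in range(257):
        data = str(n).encode()
        key = first_byte_key(data)
        if key in seen:
            return seen[key], data
        seen[key] = data
    raise AssertionError("unreachable: 257 inputs cannot have 256 distinct keys")


a, b = find_collision()
print(a, b, first_byte_key(a).hex())
```

Feeding such a pair to the de-duplication code exercises the byte-for-byte
comparison, which would essentially never run with the full hash.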

I never got around to caring enough to do it though.


-- Patrick

