[Cryptography] Just in case it isn't obvious...

Jerry Leichter leichter at lrw.com
Mon Feb 27 13:44:54 EST 2017


> The fundamental problem with fixing git is not that it commits to SHA1 as the One True Hash, it is that it assumes that there is a One True Hash to begin with.  That assumption is woven deeply into the structure of git, even down to its data representations.  Git repos have no place to store information about which hash is being used, even on the repository level, let alone on the individual blob level.

I wonder how many more instances of this kind of design there are out there.  Many object store implementations (AKA "content-addressable storage") - some of which manage immense volumes of data - use SHA1 hashes to identify objects.  At least one of these - I forget the marketing name, EMC sells it - has been around long enough that it had to deal with this problem back when MD5 failed!
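
To make the design point concrete, here is a minimal sketch of an object store whose keys say which hash produced them, so that objects named under an old hash and a new one can coexist.  This is an illustration only, not how git or any shipping product does it: the TaggedStore class, its method names, and the "algo:hexdigest" key format are all made up for this example.

```python
import hashlib


class TaggedStore:
    """Toy content-addressable store (hypothetical, for illustration).

    Every key has the form "<algo>:<hexdigest>", so the store never has
    to assume there is a single One True Hash.
    """

    def __init__(self, default_algo="sha256"):
        self.default_algo = default_algo
        self.objects = {}  # key -> bytes

    def put(self, data, algo=None):
        # Hash with the requested algorithm, or the store's default.
        algo = algo or self.default_algo
        key = "{}:{}".format(algo, hashlib.new(algo, data).hexdigest())
        self.objects[key] = data
        return key

    def get(self, key):
        # The key itself tells us which algorithm to verify with.
        algo, _, digest = key.partition(":")
        data = self.objects[key]
        if hashlib.new(algo, data).hexdigest() != digest:
            raise ValueError("stored object does not match its key")
        return data


store = TaggedStore()
old_key = store.put(b"some blob", algo="sha1")  # legacy-style name
new_key = store.put(b"some blob")               # same content, new hash
```

The same content ends up under two different names, which is exactly the migration headache: anything that refers to old_key still has to be found and rewritten, or the old hash has to be trusted forever.  A store with untagged keys cannot even tell the two cases apart.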

De-duplication engines may well have the same issues.  These run in backup programs, in cloud services (e.g., Dropbox) - even (very slightly different use case) in rsync.  In some cases, these are very high performance hardware boxes which likely do their hashing in dedicated hardware.

Compared to these, fixing git is child's play.

                                                        -- Jerry