[Cryptography] SHA1 collisions make Git vulnerable to attacks by third-parties, not just repo maintainers

Jason Cooper cryptography at lakedaemon.net
Thu Feb 23 21:56:28 EST 2017

Hi Peter,

On Thu, Feb 23, 2017 at 01:14:09PM -0500, Peter Todd wrote:
> Worth noting: the impact of the SHA1 collision attack on Git is *not* limited
> only to maintainers making maliciously colliding Git commits, but also
> third parties submitting pull-reqs containing commits, trees, and especially
> files for which collisions have been found. This is likely to be exploitable in
> practice with binary files, as reviewers aren't going to necessarily notice
> garbage at the end of a file needed for the attack; if the attack can be
> extended to constricted character sets like unicode or ASCII, we're in trouble
> in general.
> Concretely, I could prepare a pair of files with the same SHA1 hash, taking
> into account the header that Git prepends when hashing files. I'd then submit
> that pull-req to a project with the "clean" version of that file. Once the
> maintainer merges my pull-req, possibly PGP signing the git commit, I then take
> that signature and distribute the same repo, but with the "clean" version
> replaced by the malicious version of the file.

As you mentioned in your follow-on, tree objects are ripe for this
abuse.  I'd suggest folks interested in this read over Git From the
Bottom Up[1], and the git-manual section on object construction[2].
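
For anyone who wants to see the header Peter refers to, here is a minimal
sketch of how git names a file (blob) object: SHA1 over "blob <size>\0"
followed by the raw contents.  The function name git_blob_sha1 is mine,
not anything from git itself, and this only covers loose blob objects,
not trees or commits.

```python
import hashlib


def git_blob_sha1(content: bytes) -> str:
    """Return the object ID git assigns to a blob with these contents.

    Hypothetical helper for illustration; git hashes a short header
    ("blob", a space, the decimal byte length, a NUL) plus the contents.
    """
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    # Should agree with `printf 'hello\n' | git hash-object --stdin`.
    print(git_blob_sha1(b"hello\n"))
```

Because the length is part of the header, a colliding pair has to be built
with that exact prefix in mind; a collision between two bare files does not
automatically carry over to the git object IDs.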

A few other points to mention here to avoid the "sky is falling" cries.

 * git is distributed by nature.  Once a sufficient number of people
   have cloned the repo, you have some defense/discovery method
   available.  Albeit non-cryptographic. "Hey Bob, Alice here.  I did a
   fresh clone today and my binaries are coming out differently.  Can
   you send me a sha2-384 hash of path/to/object/in/question?"

 * Adjoining the first point, the only poor sods who get the malicious
   commit are the ones cloning the tree as opposed to updating (those
   who don't have the hash of interest and need to download it).
   Ergo, targets of interest are the most frequently cloned from:
   kernel.org, github.com, gitlab, etc.
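
The Alice-and-Bob check in the first point could be sketched roughly as
below.  This assumes both sides hash the same checked-out file and swap
the hex digest over some channel they already trust; the names
sha384_of_file and matches_reported are made up for the example.

```python
import hashlib
import hmac


def sha384_of_file(path: str, chunk_size: int = 65536) -> str:
    """Hash a file with SHA-384, reading in chunks to handle large blobs."""
    digest = hashlib.sha384()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_reported(path: str, reported_hex: str) -> bool:
    """Compare the local digest with the one a peer sent out-of-band."""
    local = sha384_of_file(path)
    # Normalise whitespace and case before comparing the hex strings.
    return hmac.compare_digest(local, reported_hex.strip().lower())
```

A mismatch here, while the SHA1 object IDs still agree, is the signal that
one of the two copies has been swapped for a colliding object.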

Just to be clear, this is now a *real* problem.  How long it takes from
spotting an object of interest to creating a replacement object is the
critical variable here.  The longer it takes to create, the more
time people have to get a legit copy of the object before the malicious
one can be injected.  Large projects with a plethora of objects (Linux
Kernel) need to start the timer now.  Although, that's tempered by the
fact that the juiciest targets are the new objects that no one has.

/me grumbles because majordomo is ignoring my git ml subscribe[3]



[1] https://jwiegley.github.io/git-from-the-bottom-up/
[2] https://git-scm.com/book/en/v2/Git-Internals-Git-Objects
[3] Probably CCS, Crusty Config Syndrome.  e.g. a stale filter rule from
the last time I was subscribed.
