[Cryptography] Zero Knowledge: Have I Been Pwned?

Sun Sep 10 14:25:30 EDT 2017

FYI --

https://www.troyhunt.com/introducing-306-million-freely-downloadable-pwned-passwords/

Introducing 306 Million Freely Downloadable Pwned Passwords

03 August 2017

"The entire collection of 306 million hashed passwords can be directly downloaded from the Pwned Passwords page.  It's a single 7-Zip file that's 5.3GB which you can then download and extract into whatever data structure you want to work with (it's 11.9GB once expanded)."

https://haveibeenpwned.com/Passwords

--------

Ok, all you crypto wizards: here's a real-world problem that needs to be solved.

I don't think that it is safe to type a password into the HIBP (Have I Been Pwned) page in order to check it.  Why?  Because even if it was safe *before* I typed it in, it won't be *after* I typed it in.

I also don't think that it is safe to type a SHA1 hash of a password into the HIBP page either.  Why?  Because the database contains the complete list of pairs (password, SHA1(password)), so inverting these particular hashes is trivial, and submitting the hash is equivalent to simply typing in the unhashed password.
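To make the inversion point concrete, here is a minimal sketch, assuming the party receiving the hash also holds the plaintext passwords.  The three-entry list and the names sha1_hex, reverse_table and invert are stand-ins I made up for illustration; nothing here is HIBP's actual code.

```python
import hashlib
from typing import Optional


def sha1_hex(password: str) -> str:
    """Return the uppercase hex SHA1 digest of a UTF-8 encoded password."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest().upper()


# Hypothetical stand-in for the plaintext passwords held by whoever receives the hash.
known_passwords = ["123456", "password", "qwerty"]

# One pass over the list builds a hash -> password table.
reverse_table = {sha1_hex(p): p for p in known_passwords}


def invert(hash_hex: str) -> Optional[str]:
    """Recover the password behind a submitted hash, if it is in the list."""
    return reverse_table.get(hash_hex.upper())


if __name__ == "__main__":
    submitted = sha1_hex("qwerty")   # what a cautious user might send
    print(invert(submitted))         # the receiver recovers the password itself
```

For any password that is already in the list, sending the hash reveals exactly as much as sending the password.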

Yes, I could download 5.3GB of data & decompress it to 11.9GB & search it myself, and never reveal what password(s) I'd like to check.  But I'd rather not download 5.3GB of data.
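For reference, the local check could be as simple as the sketch below.  I'm assuming the extracted file is plain text with one hex SHA1 digest per line; the function name and the file path are hypothetical, and I haven't verified the real file layout.

```python
import hashlib


def is_pwned(password: str, hash_file_path: str) -> bool:
    """Scan a local file of hex SHA1 digests for the given password's hash.

    Assumes one digest per line; nothing leaves the local machine.
    """
    target = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    with open(hash_file_path, "r", encoding="ascii") as f:
        for line in f:
            # strip() drops the trailing newline (and any carriage return)
            if line.strip().upper() == target:
                return True
    return False


# Usage (the path is a placeholder for wherever the archive was extracted):
# is_pwned("hunter2", "/path/to/extracted-hashes.txt")
```

A linear scan like this reads the whole 11.9GB in the worst case; if the file happens to be sorted, a binary search over it would be far faster, but either way the full download is still required.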

Soooooo...

What would be a good protocol for the HIBP site itself, and a good protocol for anyone who wants to query it?

Some desiderata for the protocol:

* All I learn from my query is whether or not the password is in the database -- i.e., exactly 1 bit.
* All the HIBP database learns is that there *has* been a query; it can't determine what the query was, or whether it was successful.
* The total number of bits transmitted in both directions should be several orders of magnitude less than 5.3GB.

Any suggestions?