[Cryptography] Zero Knowledge: Have I Been Pwned?

Mon Sep 11 06:48:27 EDT 2017

On 09/10/2017 11:25 AM, Henry Baker wrote:

> https://www.troyhunt.com/introducing-306-million-freely-downloadable-pwned-passwords/

Thanks for calling attention to that.

> What would be a good protocol for the HIBP site itself, and a good protocol for anyone who wants to query it?
> 
> * All I learn from my query is whether or not the password is in the database -- i.e., exactly 1 bit.
> * All the HIBP database learns is that there *has* been a query, but can't determine what the query was, or whether it was successful.
> * The total number of bits transmitted in both directions should be a number of orders of magnitude less than 5.3GB.

That's a fascinating question.  I reckon Bill Frantz gave the right
answer:  Download a Bloom index rather than the whole corpus.  Also,
have one or more trusted third parties sign off that (a) the corpus
was constructed in a reasonable way, and (b) the Bloom index was
constructed correctly.
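
To make the Bloom-index idea concrete, here is a minimal Python sketch.
It is only an illustration under stated assumptions, not anything HIBP
actually publishes:  the class BloomFilter, the helpers optimal_size()
and is_probably_pwned(), the three-entry corpus, and the 0.1%
false-positive target are all hypothetical choices made for the example.

```python
import hashlib
import math


class BloomFilter:
    """Toy Bloom filter over SHA-1 password digests (illustrative only)."""

    def __init__(self, num_bits: int, num_hashes: int):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: bytes):
        # Derive k bit positions by double hashing one SHA-256 digest.
        digest = hashlib.sha256(item).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: bytes) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def optimal_size(n: int, p: float):
    """Standard sizing: m = -n ln p / (ln 2)^2 bits, k = (m/n) ln 2 hashes."""
    m = math.ceil(-n * math.log(p) / (math.log(2) ** 2))
    k = max(1, round(m / n * math.log(2)))
    return m, k


def sha1_of(password: str) -> bytes:
    return hashlib.sha1(password.encode("utf-8")).digest()


# Publisher side (hypothetical): build the index from the corpus.
corpus = ["123456", "password", "qwerty"]
m, k = optimal_size(len(corpus), 0.001)
index = BloomFilter(m, k)
for pw in corpus:
    index.add(sha1_of(pw))


# Client side: once the index is downloaded, the query never leaves the machine.
def is_probably_pwned(password: str) -> bool:
    return sha1_of(password) in index
```

A Bloom index never gives a false negative, but it does give occasional
false positives, so a hit means "probably in the corpus".  With the
standard sizing formula used above, 306 million entries at a 0.1%
false-positive rate need about 14.4 bits each, i.e. roughly 550 MB in
total;  the index shrinks if a higher false-positive rate is tolerable.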

As a tangentially related issue:  to avoid /future/ compromises, I
should insist that my password never be sent from my machine to
anywhere else.  Instead, as the Subject: line suggests, use a zero-knowledge
proof that I know the password (i.e. that I am using the /same/
password as the one previously set up and validated).
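
As one illustration of what such a proof can look like, here is a toy
sketch of Schnorr identification, a well-known protocol of this kind.
I am not claiming it is the best choice for password logins;  it merely
shows the shape of the exchange.  The group parameters P, Q, G are
deliberately tiny and offer no security, and the function names and the
salted SHA-256 derivation of the secret are assumptions made for the
example (a real system would use a large group and a slow
password-hashing function).

```python
import hashlib
import secrets

# Toy Schnorr group: P = 2*Q + 1 with P and Q prime; G = 4 generates the
# subgroup of prime order Q.  Illustration only, trivially breakable.
P, Q, G = 1019, 509, 4


def secret_from_password(password: str, salt: bytes) -> int:
    """Map the password to a secret exponent x in [1, Q-1] (hypothetical KDF)."""
    digest = hashlib.sha256(salt + password.encode("utf-8")).digest()
    return 1 + int.from_bytes(digest, "big") % (Q - 1)


def register(password: str, salt: bytes) -> int:
    """Setup: the server stores only y = G^x mod P, never the password."""
    return pow(G, secret_from_password(password, salt), P)


def commit():
    """Prover step 1: pick a random nonce r and send t = G^r mod P."""
    r = secrets.randbelow(Q)
    return r, pow(G, r, P)


def respond(password: str, salt: bytes, r: int, c: int) -> int:
    """Prover step 2: answer challenge c with s = r + c*x mod Q."""
    x = secret_from_password(password, salt)
    return (r + c * x) % Q


def verify(y: int, t: int, c: int, s: int) -> bool:
    """Verifier: accept if G^s == t * y^c (mod P)."""
    return pow(G, s, P) == (t * pow(y, c, P)) % P


def run_login(password: str, y: int, salt: bytes) -> bool:
    """One full round: commit, random challenge, response, verification."""
    r, t = commit()
    c = secrets.randbelow(Q)  # chosen by the server after it has seen t
    s = respond(password, salt, r, c)
    return verify(y, t, c, s)
```

Only y, t, c, and s cross the wire;  the password and the exponent x
stay on the user's machine.  Note that y still lets whoever holds it
test password guesses offline, so a weak password remains weak.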

The password issue is as much a user-interface problem as anything
else.  Passwords do not scale well when the user interacts with N
servers.  This can be alleviated by using a password manager, but
from the user's point of view a password manager is indistinguishable
from a zero-knowledge proof manager.

There is a range of possibilities:
  *) Entirely avoid all forms of online activity that require passwords.
  *) Use one password for everything.
  *) Try to remember N different passwords.
  *) Use a password manager.
  *) Use zero-knowledge proofs.

Of these, only the first and last offer reasonable security
without being Pareto-inferior to other items on the list.  In
other words, if you are going to authenticate at all, there
is AFAICT no excuse for not using zero-knowledge methods.

On 09/10/2017 04:06 PM, Barney Wolff wrote:

>> I don't understand your concern with typing the SHA1 hash.  If you
>> get a hit you are going to change the password and never use it
>> again.  If you don't get a hit what can an attacker do with the hash?

Well, here are some scenarios of concern.  These are off-the-cuff
thoughts;  I'm sure a determined adversary could come up with even
nastier attacks:

1) Suppose the bad guys
 a) own (or pwn) troyhunt.com
   or Cloudflare (where troyhunt.com is hosted);
 b) collect all the queries (SHA1 and otherwise); and
 c) test them against some corpus with more than 306 million entries.

If your password is in the big corpus but not the small one, you
get back a negative result, and you cannot detect any wrongdoing,
but the bad guys now have your IP address to go along with your
password, so you are now much more vulnerable than before.

2) Suppose you get back a positive result.  You are now in a race against
the owner (or pwner) of the testing site, to see whether you can change
the password faster than they can exploit it.

The bad guys can give themselves a head start by delaying the positive
result.

3) The bad guys can return a false negative result.  This could in
principle be detected by comparing online to offline results, but
I doubt anybody has checked for this.  And the bad guys could do
this selectively, making it even harder to check for.
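
For anyone who wants to do that comparison, the offline half can be
sketched in a few lines of Python.  This assumes the downloaded corpus
has been unpacked into a text file with one hex-encoded SHA1 hash per
line;  the file layout, the path argument, and the function names are
assumptions for illustration, so adjust them to whatever the download
actually contains.

```python
import hashlib


def sha1_hex(password: str) -> str:
    """Uppercase hex SHA1 of the password, computed locally."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest().upper()


def offline_lookup(path: str, password: str) -> bool:
    """Scan the local corpus file; nothing is sent over the network."""
    target = sha1_hex(password)
    with open(path, "r", encoding="ascii") as fh:
        for line in fh:
            # Normalize case and strip the newline before comparing.
            if line.strip().upper() == target:
                return True
    return False


def results_disagree(online_hit: bool, path: str, password: str) -> bool:
    """True if the online answer differs from the local one."""
    return online_hit != offline_lookup(path, password)
```

A linear scan is slow on a multi-gigabyte file but keeps the sketch
simple.  Obviously the online half of the comparison should only be
run with passwords you are prepared to burn.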