[Cryptography] Removal of spaces in NIST Draft SP-800-63B

Sun Apr 9 01:45:35 EDT 2017

Arnold, apologies for the lapse in time in my responding to your last reply.

On Tue, Apr 4, 2017 at 12:42 PM, Arnold Reinhold <agr at me.com> wrote:
>
>> On Apr 3, 2017, at 9:00 PM, Kevin W. Wall <kevin.w.wall at gmail.com> wrote:
>>
>> On Mon, Apr 3, 2017 at 11:10 AM, Arnold Reinhold <agr at me.com> wrote:
>>> …
<big snip>
>
>>> I’m not sure hashing answers helps much. If
>>> an attacker gets a hold of the hash value,
>>> the universe of possible answers to test
>>> could be quite small by cryptographic standards. (All
>>> movie, book and song titles, for example, all city
>>> names, all valid address in the U.S., all names in a
>>> phone book -- testing these against a known hash
>>> output would be quite easy.) Chained encryption with
>>> random initial padding might make more sense, since
>>> the data would be accessed infrequently. The account
>>> creation software could encrypt answers with a public
>>> key and the decryption could take place in a special
>>> server used only for answer verification. Having the
>>> decrypted plaintext answers available would let a
>>> human intervene if needed.
>>
>> True in general, but where I think it helps is in terms
>> of customized (i.e., user created) questions.  I always
>> tell people that if you have the opportunity to create
>> a custom question, use that option instead and then
>> pick a topic that only you know about. I personally
>> recommend something that an individual might find quite
>> embarrassing, because those are details that they have
>> generally NOT widely shared. An example might be (note,
>> I am making this completely up):
>>  Q: What did your father do to you when he found you
>>     with his Playboy magazine?
>>  A: He made me run up and down our driveway, naked.
>>
>> When a company allows user-created questions, then I
>> recommend that they encrypt the questions and hash the
>> answers. The reason is that way, it makes it harder for
>> insiders to read the questions and much, much more
>> difficult for them to discover the answers. (Also, requiring
>> user created questions shifts some of the liability back to
>> the user. They can no longer complain that you only had
>> lame questions to choose from that only had a small set
>> of possible answers.)
>>
>> But for the ordinary lame q's: "What's your favorite sports
>> team?" or "What's your favorite flavor of ice cream?",
>> etc., you are spot-on. Hashing does no good. (Of
>> course, that's the type of questions that the OWASP
>> "Choosing and Using Security Questions Cheat Sheet" is
>> meant to prevent in the first place.)
>
> Hashing does no good for simple answers and it isn’t suitable for complex answers.
>
>    Initial answer "He made me run up and down our driveway, naked.”
>
>    Challenge answer: "He made me run naked up and down our driveway.”
>
> Hashes would not match. Encryption would allow easy human intervention and I suspect current language understanding software could match up the two answers. Even a simple algorithm such as sorting the words alphabetically and calculating a correlation might work well enough in many cases. The goal would be to avoid human intervention in most reset requests. You can’t rely on humans remember the exact way they answered a complex question.  “I had to exercise in front of the house with no clothes on” might still take a human to verify.

In reality, the only people that are that conscientious are those who
write their security Q&A into a password manager. (And those who do
that are also the least likely to have to use the "Forgot Password"
flow in the first place.) So in practice, this seems to work, even
though it doesn't in theory.
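
For the word-order example above, here is a rough sketch of how
normalizing an answer before hashing could absorb that kind of
variation. It borrows the "sort the words alphabetically" idea and
applies it before the hash rather than to plaintext. The helper names
(normalize_answer, answer_digest) are made up for illustration, and the
unsalted SHA-256 is only there to make the comparison visible.

```python
import hashlib
import string


def normalize_answer(answer: str) -> str:
    """Lowercase, strip ASCII punctuation, and sort the words alphabetically."""
    table = str.maketrans("", "", string.punctuation)
    words = answer.lower().translate(table).split()
    return " ".join(sorted(words))


def answer_digest(answer: str) -> str:
    # Unsalted SHA-256, used here only to compare normalized answers.
    # Not suitable for storage as-is.
    return hashlib.sha256(normalize_answer(answer).encode("utf-8")).hexdigest()


initial = "He made me run up and down our driveway, naked."
challenge = "He made me run naked up and down our driveway."
paraphrase = "I had to exercise in front of the house with no clothes on"

# Same words in a different order: the digests match after normalization.
print(answer_digest(initial) == answer_digest(challenge))
# Different words entirely: the digests still differ.
print(answer_digest(initial) == answer_digest(paraphrase))
```

Normalization of this kind handles reordering, case, and punctuation,
but not a true paraphrase, so the third answer would still fail.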

>
> In every case, it seems to me, hashing is NEVER right for security Q/A. Maybe you could update your cheat sheet?

Nope; not going to update the cheat sheet and here's why. You're
thinking about this from purely a cryptographic or UX perspective.
This approach evolved after extensive discussions with those in our
legal department in my old company and was approved by them.

Soon after legal had corporate security update the company security
policy on "forgot password" processing to suggest that customers be
offered the ability to create at least one custom security question,
some of the development teams started doing this by storing both
security questions and answers in plaintext. We (corporate security)
were made aware and investigated.

A while later, an observation was made that some users were defining
questions like

    What is my social security number?
or
    What is my bank account number?

whose answers involved confidential customer data, but which to our
customers probably seemed like pretty secure questions with answers
they would remember.

At first, corporate security just suggested encrypting both Qs and As,
but we soon realized that someone with access to the DB and the
encryption key could decrypt said answers.

So we made our legal department aware of this situation. (At
the time, corporate security was under the legal department, so
reporting to them was a rather natural and expected occurrence.) One
of the reasons that legal endorsed customized questions rather than
only canned questions in the first place was to shift liability back
over to the users. They didn't want a user to sue on grounds that all
of the questions used for password reset were lame and that the
possible answers either had a limited answer space (I mean, how many
sports teams are there, really?) or were relatively easy to research
(e.g., what street someone grew up on or where someone was born).
That was true of many of the canned questions the development teams
were using.

When legal was made aware that customers were using personal questions
with answers involving confidential data, they also had developers
give "help" in the form of some advice and examples of bad questions
to avoid. However, they also wanted to make it impossible for any
rogue insiders to sneak a peek at the security questions and answers
by decrypting them. They were also obviously concerned about an
external data breach via SQLi, etc.

Thus legal wanted development teams to follow due diligence and best
security practice and treat customized security questions and answers
as restricted data, just like passwords.

Obviously we could not hash the (custom) security questions, so those
were encrypted, but the answers to the custom security questions had
to be hashed with a salt, just like passwords. (IIRC, the canned
answers didn't _have_ to be hashed, but all the dev teams just hashed
and salted them as well just so they could use uniform code to handle
them all.)
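
A minimal sketch of "hashed with a salt, just like passwords", using
only the Python standard library. This is an illustration, not the code
the dev teams actually wrote: the function names, the choice of
PBKDF2-HMAC-SHA256, the 16-byte salt, and the iteration count are all
assumptions.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 100_000  # assumed work factor; tune for your own hardware


def hash_answer(answer: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest) for a security answer, creating a salt if needed."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per stored answer
    digest = hashlib.pbkdf2_hmac("sha256", answer.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_answer(candidate: str, salt: bytes, stored: bytes) -> bool:
    """Re-hash the candidate with the stored salt and compare in constant time."""
    _, digest = hash_answer(candidate, salt)
    return hmac.compare_digest(digest, stored)


# At enrollment: store the salt and digest, never the plaintext answer.
salt, stored = hash_answer("He made me run up and down our driveway, naked.")

# At reset time: only an exact match verifies.
print(verify_answer("He made me run up and down our driveway, naked.", salt, stored))
print(verify_answer("He made me run naked up and down our driveway.", salt, stored))
```

The second check fails because the wording differs, which is exactly
the UX problem raised earlier in the thread.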

Anyhow, long story short, the UX problems that you mentioned about
hashing certainly still exist, but in practice don't seem to occur
very often, and it made legal happy. Sure, for Qs like "What's my
SSN?", an attacker could hash (for example) all conceivable SSNs along
with the salt, but at least it made legal happy.
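
To make the SSN point concrete, here is a small sketch of that
enumeration, assuming the attacker has stolen both the salt and the
hash (they are normally stored together). The names are hypothetical,
the range is cut down so the demo finishes quickly, and a single fast
SHA-256 stands in for whatever hash is really used. The full space is
at most 10^9 nine-digit values, and a slow KDF only raises the cost per
guess. The target value is the well-known "Woolworth wallet" number,
which was never a valid SSN for anyone but its original holder.

```python
import hashlib
import os
from typing import Iterable, Iterator, Optional


def salted_hash(answer: str, salt: bytes) -> bytes:
    # Single SHA-256 over salt + answer; an assumption chosen to keep the demo fast.
    return hashlib.sha256(salt + answer.encode("utf-8")).digest()


def ssn_candidates(start: int, stop: int) -> Iterator[str]:
    """Yield nine-digit numbers in [start, stop) formatted as AAA-GG-SSSS."""
    for n in range(start, stop):
        s = f"{n:09d}"
        yield f"{s[:3]}-{s[3:5]}-{s[5:]}"


def recover_answer(salt: bytes, target: bytes, candidates: Iterable[str]) -> Optional[str]:
    """Hash every candidate with the known salt until one matches the target."""
    for guess in candidates:
        if salted_hash(guess, salt) == target:
            return guess
    return None


# What the database would hold for Q: "What is my social security number?"
salt = os.urandom(16)
target = salted_hash("078-05-1120", salt)

# The attacker walks the candidate space; restricted here to 100,000 values.
found = recover_answer(salt, target, ssn_candidates(78_000_000, 78_100_000))
print(found)
```

The salt stops precomputed tables from being reused across users, but
it does nothing to enlarge a small answer space.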

Now IANAL, but the legal department that I worked for at least thought
that hashing the answers to _custom_ security questions would
limit liability. Maybe it does, maybe it doesn't.

-kevin
-- 
Blog: http://off-the-wall-security.blogspot.com/    | Twitter: @KevinWWall
NSA: All your crypto bit are belong to us.