[Cryptography] GOTO Considered Harmful

Fri Feb 28 17:22:13 EST 2014

Jerry Leichter wrote, On 02/28/2014 03:13 PM:

>> http://fexl.com/goto-considered-harmful
>> 
>> You're welcome.
> One of many, many ways to do this.

Right.

> The original code was more efficient as it didn't repeatedly test
> 'failed'.  Whether that makes any noticeable difference is something
> one can debate.  (Really, one should measure it.  ...

It's a good point about efficiency.  One should measure it, and see if
and where it matters -- and take the *utmost* precaution that the
pursuit of efficiency does not degrade the pursuits of reliability and
productivity.

> In any case, this is exactly why the goto-style is used by kernel
> hackers and others who feel they must be concerned with every
> instruction.

Right, and anyone who dogmatically opposes goto is clearly not an
assembly-language programmer.  :)

> This was a very unusual error, and the fact there was a 'goto' in the
> code was incidental.

That is true, and this may be more of a case study in the benefit of
*proper indentation*.  Along with test coverage, of course -- which,
*unbelievably* was absent.  The most perfunctory test case ought to
ensure that sslRawVerify is called!  How in the *world* could that be
missed?  That's the whole point of calling the routine in the first place!

>   A simple typo breaks yours, too.  Just
> accidentally change any '||=' to '='.  Or change the "if (!failed)"
> to "if (failed)".  Or change "err =" to "err ==".

Sure, any typo in code can be disastrous.  I've changed my code to avoid
||= and the "failed" flag, but of course it's still vulnerable to
swapping = with == in various ways.

> ... (dijkstra stuff, cool) ...

> As far as I'm concerned, the best recommendation here, which avoids
> changing code that's worked correctly for years prior to this bug, is
> to add a sanity check:
> 
> ... goto cleanup;
> 
> fail:
>   assert(err != 0);  // Whatever you want here to die *in
>          production code*!
> 
> cleanup: SSLFreeBuffer(...) ... return err;

That's an excellent recommendation.  Assertions are a great way to help
protect ourselves from our own mistakes, along with several other
redundancy techniques.

The thing is, I never use goto in my code.  It's not that I've gotten
religious about it, it just never crosses my mind.  I always develop
iteratively with well-formed structural elements, which I then fill in.
 So I've already *committed* to using the "while" or "if", and I never
look back.

If I were writing something like this from scratch, I'm sure I would
quickly chafe at the need to check the error status at every turn.  At
that point I would probably create a helper function which simply
bugged-out with a "return" upon hitting any error.

Then the parent function could look like this:

static OSStatus
SSLVerifySignedServerKeyExchange(...)
  {
  /* Set up. */
  OSStatus err;
  SSLBuffer hashOut, ...
  ...
  signedHashes.data = 0;
  hashCtx.data = 0;
  ...

  /* Call the function which does all the crypto stuff. */
  err = do_the_actual_crypto(...)

  /* Clean up. */
  SSLFreeBuffer(&signedHashes);
  SSLFreeBuffer(&hashCtx);
  return err;
  }

The child function could then do things like this:

  if ((err = ReadyHash(&SSLHashMD5, &hashCtx)) != 0)
      return err;
  if ((err = SSLHashMD5.update(&hashCtx, &clientRandom)) != 0)
      return err;

OK, it's an extra function I know, and instead of saying "goto fail"
we're saying "return err".  Big whoop.  But do we really have to use
"goto" just because we have a create-use-destroy pattern?  For *that* we
gotta drag out a goto?  This isn't kernel spinner code here.

I also don't often favor assignments in expressions, so the child
function might look more like this:

  err = ReadyHash(&SSLHashMD5, &hashCtx);
  if (err) return err;

  err = SSLHashMD5.update(&hashCtx, &clientRandom);
  if (err) return err;

In fact, with all the strict compiler flags I like to use, if I changed
a = to == the compiler would warn me like this:

   error: value computed is not used [-Werror=unused-value]

So I got that going for me.

> (You can also flow through into the cleanup code, then jump back to
> it from the fail code.  Choose your poison.  Both code organizations
> were common back in assembler days.)

Nifty.

> This turns the meaning of "fail" into something specific:  We go here
> when there's an error.  If we try to go here when there's no error,
> the code will die.  Then the code does what it says, and says what it
> does.  (In he original code, executing the "fail" statement doesn't
> mean there was a failure!)

Exactly.

Sheesh, is it really possibly that the producers of this code didn't
have a single test case that hit the sslRawVerify call?  Seriously?

-- Patrick