[Cryptography] how reliably do audits spot backdoors?

Mon Dec 30 19:33:25 EST 2013

Jerry Leichter <leichter at lrw.com> writes:
>
> C (and C++) have always defined unsigned arithmetic as arithmetic mod
> 2^n, but signed arithmetic used to be almost completely a "quality of
> implementation" issue.  It's been a while since I looked, but I
> believe the recent standards insist on a binary representation that's
> either 1's- or 2's-complement.

Close, but not quite...

http://c0x.coding-guidelines.com/6.2.6.2.html

C99 and C11 restrict signed integer representations to ones-complement,
twos-complement, and sign/magnitude. Also some of the bits can be
"padding" and ignored.

The same phrasing never made it into any C++ spec, including C++11. So
in C++, integer representations are considerably less constrained, at
least in theory (see e.g. http://stackoverflow.com/questions/13150449/).

However, none of this has anything to do with overflow of signed integer
arithmetic. Signed integer overflow is simply undefined behavior,
period. (What you call "quality of implementation" some would call
"non-portable vendor-specific extensions".)

Undefined behavior does not mean "something strange happens"; it means
the compiler can assume you don't do that and optimize accordingly.

My favorite example:

    int foo(int n)
    {
        return n < n+1;
    }

If you compile this with any modern compiler, enable optimization, and
call it with INT_MAX as argument, it will return *true* (i.e. 1). (Even
though if you print INT_MAX+1 you will probably see a large negative
power of two.)

The compiler's reasoning goes like this. Signed integer overflow is
undefined behavior, which I can assume never happens. If n equals
INT_MAX, this function will perform signed integer overflow. Therefore,
n cannot equal INT_MAX and this function may be optimized to return the
constant 1. (Check the assembly code if you doubt this.)

Whether such a language is suitable for writing secure code is certainly
a matter of opinion. But before having such an opinion it helps to know
what the language acutally *is*. :-)

[Aside: For what it's worth, I found the bug in
http://graphics.stanford.edu/~danielrh/vote/mzalewski.c in less than two
minutes. But then, I have read and written an awful lot of C and C++
code, and the fact that something "obfuscated" is hiding there gives a
huge hint to stare at the macro.

The two classic macro bugs are (1) variable capture (as illustrated) and
(2) this sort of thing:

    #define SQUARE(x) ((x)*(x))

    return SQUARE(random() % 16); // probably not what you intended

End aside.]

 - Nemo