[Cryptography] GCC bug 30475 (was Re: bounded pointers in C)

D. Hugh Redelmeier hugh at mimosa.com
Wed Apr 23 18:06:09 EDT 2014


| From: Arnold Reinhold <agr at me.com>

| On Tue, 22 Apr 2014 00:51 D. Hugh Redelmeier responded:
| > | From: Arnold Reinhold <agr at me.com>

| > But the problem actually lies with the success of the PDP-11.  That
| > machine uses a pun: signed and unsigned add of two's complement
| > numbers produce the exact same bits of result from the same bits of
| > operands.  The only difference is how to interpret the signedness and
| > overflow of the result.  So the PDP-11 had a single add instruction.
| > It set a lot of condition bits:  the program could test the bits that
| > were relevant to the representation intended.  But the computer could
| > not trap on overflow because it didn't know if there were an overflow.
| > 
| > Contrast this with the IBM/360.  It too used two's complement.  But it
| > had distinct signed and unsigned add operations (it called the
| > unsigned operations "Logical").  It could generate a trap on overflow.
| > 
| > Almost all important machines after the PDP-11 copied it in this 
| > respect (and much else).
| 
| The IBM 360 is 50 years old, the PDP-11 is 44. As you suggest, due to 
| their influence two's complement has become the de facto standard for 
| binary signed integer arithmetic. The C standards have been revised 
| several times since.

No, that's not what I said.  I said, roughly, that the pun of treating
unsigned and signed adds as the same operator precluded
trap-on-overflow being free.  The /360 did it right and the PDP-11 did
it wrong (in this regard) and most architectures followed the PDP-11.
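
To make the pun concrete, here is a minimal C sketch of my own (not
from the original exchange).  Both adds below produce the identical
bit pattern 0x80000000; the machine cannot know whether that means
2^31 (fine, unsigned) or an overflowed -2^31 (signed), so a single
add instruction cannot trap.  The "signed" add goes through unsigned
casts so the sketch itself stays within the C spec:

    #include <stdio.h>
    #include <inttypes.h>

    int main(void) {
        uint32_t ua = 0x7FFFFFFFu, ub = 1u;  /* unsigned: 2147483647 + 1 */
        int32_t  sa = INT32_MAX,   sb = 1;   /* signed:   2147483647 + 1 */

        uint32_t usum = ua + ub;             /* defined: wraps mod 2^32  */
        /* sa + sb written directly would be undefined behaviour in C,
           so perform the same-bits add in unsigned arithmetic:         */
        uint32_t ssum = (uint32_t)sa + (uint32_t)sb;

        printf("unsigned: %08" PRIX32 "  signed bits: %08" PRIX32 "\n",
               usum, ssum);
        return 0;
    }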

| The leap from "overflows may or may not be trapped" to "the compiler can 
| generate any evil code it wants whenever it thinks it sees a possible 
| signed integer overflow" is simply astounding. Criminal in my opinion.

That's not what the standard says.  It says: when you go outside the
specs, we don't specify what the result will be.

I want a compiler that says: when you go outside the specs, we'll
catch it and tell you (the language can help or hinder this).  The
market seems to disagree (John Gilmore has pointed this out).
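
(gcc does have opt-in machinery along these lines: -ftrapv makes
signed-overflow arithmetic abort at run time, and recent compilers
add -fsanitize=signed-integer-overflow.  A sketch, with a file name
of my own choosing:

    /* trap.c -- compile with:  gcc -ftrapv trap.c
       With -ftrapv the signed overflow below aborts the program at
       run time instead of silently producing an arbitrary result. */
    int main(void) {
        int a = 2147483647;   /* INT_MAX where int is 32 bits */
        return a + 1;         /* overflows; -ftrapv traps here */
    }

But opt-in is not the default, which is my point.)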

| > The assertion
| > 	assert(a + 100 > a);
| > is nonsense: it assumes a definition of arithmetic overflow that does
| > not apply.  I can see that as a human.  But compilers often see tests
| > that are redundant, and this just looks redundant.
| 
| As you say, C is a close-to-the-metal language. That is a major reason 
| for its popularity, particularly with embedded systems. If translated 
| into machine language in the naive way, the assert statement's test will 
| do just what its author intended on the vast majority of computers out 
| there.

Yes.  But people don't think that they want to use naive compilers.

I suspect (but am too lazy to check) that gcc without -O would
probably generate the naive code that you think you want (it does not
have to).

Anyone who's used one knows that an optimizing compiler will do things
that surprise you.

The language specification is a carefully crafted contract.  I don't
agree with everything in the C standard, but it isn't secret.  You no
longer have to pay an arm and a leg to read it.

You say "just what the author intended".  The author is either too
smart or too stupid.  We infer from the code that he knows enough
about overflow to try to catch it (good), but that he thinks he knows
what it does in his language (wrong, bad).  The rule in C is: don't
generate overflows, because all bets are then off.
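
(The overflow test the author presumably wanted is easy to write
without ever overflowing; a sketch:

    #include <assert.h>
    #include <limits.h>

    /* "a + 100 would not overflow", checked before any add: */
    assert(a <= INT_MAX - 100);

The test itself never goes outside the spec, so it means exactly
what it says.)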

|  If, on some architecture I am not familiar with, it generates a 
| false abort, the assert will still have done its job in alerting the 
| programmer to a potential problem.

What's a false abort?

If you don't overflow, the assertion is true.  If there is an
overflow, any result is legal.

| The only bad case I can see is if, on 
| our outlier architecture, it allows a bad operation to pass. But that is 
| exactly what removing the assert allows on ALL architectures.

You are not programming in machine language.  Stop thinking that you
are getting machine language semantics.

| > I like my compilers to warn me when I write nonsense.  But like
| > everyone else, I get on my high horse when there are false positives.
| > Redundant code that is intended is hard to separate from redundant
| > code that is a mistake.
| 
| That of course is a common problem with no universal solution, but one 
| should err on the side of safety, not blissful ignorance.

No, programmers really really don't like compilers crying wolf every
time they compile.  And they almost never run lint and its successors.

When programmers see warnings in C, they often throw casts around,
making things worse.

| > I write a lot of assertions that I hope are redundant.  I love it when the 
| > compiler can make them free!
| 
| Maybe your management, if you are writing mission-critical or
| life-safety code, might see it differently. Most of us write code expecting 
| to catch all bad conditions.  Assertions are backups in case we missed 
| something. They are helpful when the concern is accidental oversights. 
| They are vital when the concern is active attacks, where a clever 
| attacker might find a way to generate a condition that the compiler 
| considers impossible.

I don't understand how that's a response.  Perhaps I wasn't clear
enough.

I like assertions being free.  That means that the compiler has been a
theorem prover, letting me write clearer code.  Think of assertions as
being enforceable comments, the best kind.  That means that the reader
can believe them.  It also means that comment-rot will be corrected.

This is a special case of what optimizers are really good for.  They
are not for making old programs run more quickly, they are for letting
programmers program at a higher, more productive level, leaving the
detailed book-keeping to the computer. 
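
A toy example of my own:

    #include <assert.h>

    int clamp_nonnegative(int i) {
        if (i < 0)
            i = 0;
        assert(i >= 0);   /* provable from the test above */
        return i;
    }

An optimizing compiler can prove the assertion from the preceding
test and compile it to nothing: an enforceable comment, at no cost.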

War story about assertions: in about 1982 I bought my first UNIX
machine, an 8086-based NABU 1600.  I ported my file compressor.  An
assertion failed.

- first I blamed myself for writing code that wasn't portable.  Wrong.

- then I blamed the C compiler (a variant of Ritchie's original,
  ported by Microsoft to the 8086 and not well tested).  Wrong.

- it was a bug in the 8086!  All 8086 processors.  Already shipping
  for about four years.

It took a while to convince them, but Intel came through with a fix.
The architecture manuals were changed: no chips were recalled.  And I
changed the compiler (for me).

The only code that tripped this bug was the assertion itself.  Simply
deleting the assertion made the code work.

The bug?  The shift instruction was supposed to set the condition
code.  But a shift of 0 bits didn't do that on the 8086 or 8088.  It
did on the '186 or higher models.  The 8086 ALU could only shift by
one bit at a time so a shift of n bits went through the ALU n times,
setting the condition code, but when n was 0, ...

That was when chips were simple enough that the bugs could be
understood.

| > I know enough to not write overflow tests that create overflows.  I
| > write them to prevent overflows.
|
|  That can be pretty subtle. See e.g. 
| http://www.gnu.org/software/autoconf/manual/autoconf-2.64/html_node/Signed-Overflow-Examples.html#Signed-Overflow-Examples.

Autoconf is an abomination.  General principle that it violates: avoid
complexity; don't try to master it.

"If your code looks like these examples, it is probably safe even
though it does not strictly conform to the C standard."

Horrible advice.
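
Writing the conforming test is not that hard.  A sketch of my own of
a fully defined check for signed addition, generalizing the assert
rewrite earlier:

    #include <limits.h>

    /* True iff a + b would overflow int; never overflows itself. */
    int add_would_overflow(int a, int b) {
        return (b > 0) ? (a > INT_MAX - b)
                       : (a < INT_MIN - b);
    }

Both subtractions stay in range for every a and b, so the function
itself never goes outside the spec.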

| The revelation that NSA has been working to weaken publicly available 
| computer security has generated a hunt for possible instances where this 
| has happened. I prefer stupidity and arrogance to conspiracy as an 
| explanation for failures, but it is hard to imagine a more productive 
| win for the state security snoops than compilers that remove safety 
| tests. Most programming is done by mere mortals.  And program 
| maintenance is rarely assigned to top tier coders. Source code, publicly 
| available or purloined, for targeted programs can be fed through an 
| instrumented compiler to find instances where safety tests are removed 
| and these can then be analyzed further for exploitable weaknesses. This 
| bug/feature of C compilers is not just an exploit, it's an exploit 
| generator, a gift that keeps on giving. Know your tools, indeed.

I agree.

If you use substandard programmers, don't use C.

If you use standard programmers, avoid C.

If you have excellent programmers, you are wasting their effort with C.

So what's my excuse for using C?  I understand it pretty well.  It is
pretty stable (e.g. my compressor program from the 1970s still works).
It isn't clear what the winning alternative would be.  And there is a
large body of code that I value and support written in C.

(And NSA has used some of my C code!  FreeS/WAN, initiated by John
Gilmore.)

I've heard people who advocate Python.  I cannot imagine trying to
write secure code in a language without static type checking.  Typing
is again a kind of theorem proving about your program, one that the
compiler can check.

