[Cryptography] Ada vs Rust vs safer C
Jerry Leichter
leichter at lrw.com
Sat Sep 17 06:51:52 EDT 2016
> 4. If you really REALLY want to win big, define a new language that is just like C but where v[offset] and *(v+offset) are NOT equivalent operations, and deprecate the latter.
The VAX C compiler for VMS made this distinction. It wasn't documented or perhaps intended, but apparently just "kind of happened" when the developers fed v[offset] into the array reference code of a universal back end. It didn't do bounds checking, but it *did* forbid 2[x] - which in C is identical to x[2], a fact that surprises even many experienced C programmers (but brings a smile to the face of old assembler hackers). I forget exactly where else the assumption showed up; likely, while C allows exp1[exp2] for arbitrary expressions, VAX C would only allow a limited set of exp1's (variable name, another subscripted expression - x[1][2] - or a function call - f(x)[3] - but not (x+5)[3], would be my guess).
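For anyone who hasn't run into the equivalence, here is a minimal sketch in standard C. The function name third_element is made up for illustration; the last form is the kind of thing I'm guessing VAX C rejected, not something I've checked against that compiler.

```c
#include <assert.h>

/* Hypothetical helper: reads element 2 of x using several spellings
 * that standard C treats as the same operation. */
int third_element(const int *x)
{
    /* E1[E2] is defined as *((E1) + (E2)), and + commutes,
     * so all of these name the same object. */
    assert(&x[2] == &2[x]);
    assert(&x[2] == x + 2);
    assert(&x[2] == &(x + 1)[1]);   /* arbitrary expression as exp1 */
    return 2[x];
}
```

A conforming compiler accepts all of these, which is exactly why the subscript syntax alone tells it nothing about bounds.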
In practice VAX C's treatment of array references broke very little code. As far as I know, it was purely syntactic. The real problem isn't distinguishing an array reference from an arbitrary pointer dereference; it's defining a semantics for arrays (or all pointers) that would carry along bounds to check.
VAX C did break *other* code, but then this was in the period when the C language was more or less defined by pcc's quirks and there was code around that made all kinds of assumptions about the details of the implementation. A classic, long forgotten and unlamented, was that the first word in the address space had address 0 and always contained 0. So you'd see code like:
strcpy(dest, src)
char *dest;     /* Old-style declaration - strcpy(char *dest, char *src) */
char *src;      /* for you young un's */
{   while (*dest++ = *src++)
        ;
}
and then
strcpy(x, NULL);
which would copy a single \0 character into x. (And now you know why there remain vestiges of the assumption that a char* containing NULL acts like the empty string even in a few places in C++.)
C is deliberately designed these days to *allow* "fat pointers" (carrying bounds information), but other than some compilers specifically designed for debugging/testing purposes - one of the first ones, which I did some very minor work on, was named SaferC or maybe SafeC - this has never seen much uptake in the C community. (Of course, "everyone knows" that a pointer fits in an int - and if not, certainly in a long.)
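To make "fat pointer" concrete, here is a rough sketch of the idea done by hand in plain C. The names fat_int_ptr, fat_make, fat_load and fat_slice are all hypothetical; this is not how SaferC/SafeC or any other checking compiler represents things, just an illustration of a pointer that carries its bounds along.

```c
#include <assert.h>
#include <stddef.h>

/* A pointer plus the number of elements it may legally reach. */
typedef struct {
    int   *base;
    size_t len;
} fat_int_ptr;

fat_int_ptr fat_make(int *base, size_t len)
{
    assert(base != NULL || len == 0);   /* a null base may only be empty */
    fat_int_ptr p = { base, len };
    return p;
}

/* Checked read: returns 1 and stores the element in *out,
 * or returns 0 without touching memory if i is out of bounds. */
int fat_load(fat_int_ptr p, size_t i, int *out)
{
    if (p.base == NULL || i >= p.len)
        return 0;
    *out = p.base[i];
    return 1;
}

/* "Pointer arithmetic" that narrows the bounds instead of losing them.
 * Returns an empty fat pointer if the requested range does not fit. */
fat_int_ptr fat_slice(fat_int_ptr p, size_t start, size_t count)
{
    fat_int_ptr empty = { NULL, 0 };
    if (start > p.len || count > p.len - start)
        return empty;
    fat_int_ptr s = { p.base + start, count };
    return s;
}
```

Note that sizeof(fat_int_ptr) is necessarily larger than sizeof(int *), which is precisely what breaks code that assumes a pointer fits in an int or a long.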
Of course, languages with real bounds checking don't typically allow arrays to be treated like pointers to begin with, so arrays don't *usually* need to be "fat" in order to let the compiler generate bounds checks. Now that C since C99 has had variable-length arrays - which always have the size information available - it's possible to have a compiler actually generate proper bounds checking for them. Forbid any other kind of array and you have your wish. Unfortunately, (a) variable-length arrays can only be allocated on the stack (an easy thing to fix in the language); (b) few of the standard library calls that deal with arrays - or likely other common packages of libraries - deal with VLA's; (c) many common C practices that rely on letting an array reference decay to a pointer can't work properly with VLA's (though decent compilation analysis might allow many to be accepted).
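As a sketch of what "the size information is available" means: if the array is passed as a pointer to a VLA rather than being allowed to decay, sizeof recovers the element count at run time and a check can be written against it. Here the check is spelled out by hand; a checking compiler could in principle emit the same test itself. The names vla_load and vla_demo are hypothetical, and this assumes a compiler that supports VLAs (they became an optional feature in C11).

```c
#include <assert.h>
#include <stddef.h>

/* Checked read through a pointer to a VLA of n ints.
 * Returns 1 and stores the element in *out, or 0 if i is out of bounds. */
int vla_load(size_t n, int (*a)[n], size_t i, int *out)
{
    /* sizeof on a VLA type is evaluated at run time; count equals n. */
    size_t count = sizeof *a / sizeof (*a)[0];
    if (i >= count)
        return 0;
    *out = (*a)[i];
    return 1;
}

/* Builds a VLA on the stack and exercises one in-bounds and one
 * out-of-bounds access. Returns 1 if both behave as expected. */
int vla_demo(size_t n)
{
    assert(n > 0);
    int v[n];                           /* variable-length array */
    for (size_t i = 0; i < n; i++)
        v[i] = (int)(i * i);

    int got = -1;
    int ok_in  = vla_load(n, &v, n - 1, &got);  /* last valid index */
    int expect = (int)((n - 1) * (n - 1));
    int ok_out = vla_load(n, &v, n, &got);      /* one past the end */
    return ok_in && !ok_out && got == expect;
}
```

The catch is visible in the call: the caller has to pass &v, not v. The moment v is allowed to decay to int *, the bound is gone, which is point (c) above.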
-- Jerry