[Cryptography] Ada vs Rust vs safer C

Sat Sep 17 06:51:52 EDT 2016

> 4.  If you really REALLY want to win big, define a new language that is just like C but where v[offset] and *(v+offset) are NOT equivalent operations, and deprecate the latter.
The VAX C compiler for VMS made this distinction.  It wasn't documented or perhaps intended, but apparently just "kind of happened" when the developers fed v[offset] into the array reference code of a universal back end.  It didn't do bounds checking, but it *did* forbid 2[x] - which in C is identical to x[2], a fact that surprises even many experienced C programmers (but brings a smile to the face of old assembler hackers).  I forget exactly where else the assumption showed up; likely, while C allows exp1[exp2] for arbitrary expressions, VAX C would only allow a limited set of exp1's (variable name, another subscripted expression - x[1][2] - or a function call - f(x)[3] - but not (x+5)[3], would be my guess).
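
For anyone who hasn't seen the 2[x] equivalence before, here is a minimal sketch in standard C.  The helper name element_two is made up for illustration; the only thing it relies on is that the language defines E1[E2] as *((E1)+(E2)), and that addition commutes.

```c
#include <assert.h>

/* Hypothetical helper (not from any library): reads element 2 of x
 * and asserts that every spelling of the access agrees. */
int element_two(const int *x)
{
	assert(x[2] == *(x + 2));	/* subscript vs. explicit pointer arithmetic */
	assert(x[2] == 2[x]);		/* operands swapped: still *(2 + x) */
	assert(&x[2] == &2[x]);		/* same address, not just the same value */
	return 2[x];
}
```

Called on an array of at least three ints, this returns the same value as x[2].  A compiler that treated subscripting as its own operation, as VAX C apparently did, would reject the 2[x] lines.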

In practice VAX C's treatment of array references broke very little code.  As far as I know, it was purely syntactic.  The real problem isn't distinguishing an array reference from an arbitrary pointer dereference; it's defining a semantics for arrays (or all pointers) that would carry along bounds to check.

VAX C did break *other* code, but then this was in the period when the C language was more or less defined by pcc's quirks and there was code around that made all kinds of assumptions about the details of the implementation.  A classic, long forgotten and unlamented, was that the first word in the address space had address 0 and always contained 0.  So you'd see code like:

	strcpy(dest, src)
	char *dest;	/* Old-style declaration - strcpy(char *dest, char *src) */
	char *src;	/*  for you young un's */
	{	while (*dest++ = *src++)
			;
	}

and then

	strcpy(x, NULL);

which would copy a single \0 character into x.  (And now you know why there remain vestiges of the assumption that a char* containing NULL acts like the empty string even in a few places in C++.)

C is deliberately designed these days to *allow* "fat pointers" (carrying bounds information), but other than some compilers specifically designed for debugging/testing purposes - one of the first ones, which I did some very minor work on, was named SaferC or maybe SafeC - this has never seen much uptake in the C community.  (Of course, "everyone knows" that a pointer fits in an int - and if not, certainly in a long.)
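
To make "fat pointer" concrete, here is a rough sketch of the idea written out by hand.  Everything in it (struct fat_int_ptr, fat_make, fat_get, fat_advance) is a made-up name for illustration, not how SafeC or any real compiler represents pointers; a real implementation would do this invisibly and would probably track a lower bound as well.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fat pointer: an address plus the number of elements
 * that may legally be reached from it. */
struct fat_int_ptr {
	int *base;
	size_t len;
};

struct fat_int_ptr fat_make(int *base, size_t len)
{
	struct fat_int_ptr p = { base, len };
	assert(base != NULL || len == 0);	/* a null base has no elements */
	return p;
}

/* Checked read: 0 on success, -1 if i is out of bounds. */
int fat_get(struct fat_int_ptr p, size_t i, int *out)
{
	if (i >= p.len)
		return -1;
	*out = p.base[i];
	return 0;
}

/* Checked forward arithmetic: the bounds shrink as the pointer moves. */
int fat_advance(struct fat_int_ptr *p, size_t n)
{
	if (n > p->len)
		return -1;
	p->base += n;
	p->len -= n;
	return 0;
}
```

Note that sizeof(struct fat_int_ptr) is necessarily larger than sizeof(int *), which is exactly what breaks code that assumes a pointer fits in an int or a long.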

Of course, languages with real bounds checking don't typically allow arrays to be treated like pointers to begin with, so arrays don't *usually* need to be "fat" in order to let the compiler generate bounds checks.  Since C99, C has had variable-length arrays - which always have the size information available - so it's possible to have a compiler actually generate proper bounds checking for them.  Forbid any other kind of array and you have your wish.  Unfortunately, (a) variable-length arrays can only be allocated on the stack (an easy thing to fix in the language); (b) few of the standard library calls that deal with arrays - or, likely, those in other common library packages - deal with VLA's; (c) many common C practices that rely on letting an array reference decay to a pointer can't work properly with VLA's (though decent compile-time analysis might allow many to be accepted).
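
As a sketch of what "the size information is available" means in practice, here is a hand-written version of the kind of check a compiler could emit.  The function names vla_get and fill_and_read are invented for the example, and it assumes a compiler that supports VLAs (they are required in C99 but optional in C11).  The trick is to pass a pointer to the whole array rather than letting it decay, so that sizeof still sees the run-time length.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical checked read through a pointer to a VLA.
 * Returns 0 on success, -1 if i is out of bounds. */
int vla_get(size_t n, int (*a)[n], size_t i, int *out)
{
	/* sizeof on a VLA type is evaluated at run time. */
	size_t len = sizeof *a / sizeof (*a)[0];

	assert(len == n);
	if (i >= len)
		return -1;
	*out = (*a)[i];
	return 0;
}

/* Builds a VLA of n squares on the stack and reads element i, checked. */
int fill_and_read(size_t n, size_t i, int *out)
{
	if (n == 0)
		return -1;	/* a zero-length VLA is undefined behavior */

	int v[n];
	for (size_t k = 0; k < n; k++)
		v[k] = (int)(k * k);
	return vla_get(n, &v, i, out);
}
```

Passing v instead of &v would let the array decay to a plain int *, and the length would be lost, which is problem (c) above in miniature.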

                                                        -- Jerry