[Cryptography] It's all K&R's fault

Jerry Leichter leichter at lrw.com
Sun Apr 20 16:59:42 EDT 2014


On Apr 20, 2014, at 8:45 AM, Patrick Chkoreff <patrick at rayservers.net> wrote:
>> Don't do either. Get an intelligent compiler for an intelligent
>> language, which inserts the bound checks when needed, and optimizes them
>> away when possible. Win-win.
> 
> Well, the irony is that I'm using it only for the purpose of
> implementing an intelligent language (Fexl) which always checks memory
> bounds no matter what.
> 
> I even leave bounds checking enabled in extremely low-level code, such
> as in this fast buffering routine:
> 
> https://github.com/chkoreff/Fexl/blob/fresh/src/buf.c#L36
> 
> I benchmarked buffering up a 2.6 GB string, one character at a time,
> with and without that bounds checking assertion, and I saw no
> statistically significant difference in run time.
> 
> I could of course "prove" that the bounds check was unnecessary and thus
> remove it, but it would make the code less fault-tolerant.
If you look at my old RISKS article, linked to from another note in this thread, you'll see that I comment that the Vector class I implemented had a checked operator[] and an unchecked UnsafeAt() function, and that I used the latter in the innards of a hash table where a bounds violation is provably impossible.
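For readers who haven't seen the RISKS article, here is a minimal sketch of what such a Vector might look like. Only the names operator[] and UnsafeAt() come from the text; the implementation details (assert-based checking, a std::vector backing store) are assumptions for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch of a Vector with a checked operator[] and an
// unchecked UnsafeAt(), as described in the text.  The names follow the
// article; everything else here is invented for illustration.
template <typename T>
class Vector {
public:
    explicit Vector(std::size_t n) : data_(n) {}

    // Checked access: fails loudly on a bounds violation.
    T& operator[](std::size_t i) {
        assert(i < data_.size() && "Vector bounds violation");
        return data_[i];
    }

    // Unchecked access: only for call sites where a violation is
    // "provably" impossible -- which, as the anecdote below shows,
    // the proof may not actually establish.
    T& UnsafeAt(std::size_t i) { return data_[i]; }

    std::size_t size() const { return data_.size(); }

private:
    std::vector<T> data_;
};
```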

Well ... some time after I wrote that note, something led us to look at the hash table more closely.  (It may have been a run of Purify, the dynamic memory-access tester.)  It turned out the "proof" wasn't really right.  So I went through the code, took out the UnsafeAt() calls, and put in operator[].  A test run identified an oddball corner case that violated the bounds.

An easy fix, but I never took the checked operator[] calls out after that (though in many years of use in the field, it never triggered again).  And you know what:  Despite very heavy usage of the hash table code (and of the Vector code in general), the bounds checking never showed up as using enough time to even be visible in profiling.

The cost of bounds (and other sanity) checking, even without compiler assistance, is often completely irrelevant to program performance.  Too many programmers "know in their guts" that *their* algorithms are so good, and *their* code so tight, and what *they* are writing is so central to their project, that they just "can't afford" the time wasted on such checks.  (Not to mention that they are such good coders that there aren't any errors to check for anyway.)  The only answer to such claims is:  Prove it.  They will almost never succeed - and even if they do, there are often other ways to gain significantly more performance *safely* by reconsidering algorithms and data structures.

                                                        -- Jerry
