[Cryptography] VW/EPA tests as crypto protocols ?

Fri Sep 25 07:20:32 EDT 2015

> This topic touches on crypto only tangentially;  see last two
> paragraphs below.
> 
> In the VW situation, the correct regulatory protocol is simple:
> Stick a probe up the tailpipe and then go for a drive under 
> real-world conditions....
> 
> The general principle here is simple:  
>   *Measure the thing you care about.*
> 
> To say the same thing the other way:  Avoid measuring something
> that is only a proxy for the thing you care about.  As soon as
> you start rewarding and/or regulating the proxy, it ceases to
> be a reliable measure.  In this double-negative form it is 
> known as Goodhart's law:
>  https://en.wikipedia.org/wiki/Goodhart%27s_law
This idea has been around for many years in many forms.  One example I first saw some time in the early '70s was a response to one of those clever memes about management that made the rounds:  "You can't manage what you don't measure".  Response:  "You'll get exactly what you measure".

On the more general issue of detection:  This is an issue not so much of measurement as of testing.  There are multiple kinds of systems and system failures (deliberate or otherwise), and testing strategies have to be chosen to match the kind of system and failure that's possible.  Thus:

1.  You cannot distinguish correct from incorrect outputs.  Such systems are rare, but there is one cryptographic element that falls into this category:  Random number generators.  These simply cannot be tested by looking at the results.  (It's interesting that the only example I know of in this class takes no inputs.)

2.  You can construct an input set with the property that if the results are correct on the input set, they are correct everywhere.  There are systems that we assume have this property in the face of *random* failures, and we often implicitly assume it - e.g., with sets of test vectors for cryptographic algorithms.  Of course, in the face of deliberate wrongware, it's easy for the system to give the right answers exactly for the test vectors.

Almost all traditional testing really falls into this category, even if we don't explicitly write down the assumed "indicator set":  It's implicit in any fixed collection of deterministic tests.

3.  The set of inputs that induce failure makes up a substantial fraction of the input space, even though you don't know how to characterize the members of that fraction.  A sufficient number of random inputs (chosen independently each time you run the tests) can give you any desired probability of hitting at least one failure if failures exist - and so any desired confidence, after a clean run, that there are none.

Tests like this are done, but in my experience are relatively rare.  They shouldn't be.  In fact, the best tests consist of two parts:  Tests on a fixed set of typically sensitive areas (e.g., for any kind of numeric code, 0, -0 if your system has one, small, medium, large, and "at the limits" negative and positive); and as many random tests as you can manage.

4.  The set of inputs that induce failure is very small and unpredictable.  Unless you can afford to sample pretty much the entire input space, these cannot be detected by testing.  The famous Intel Pentium division (FDIV) bug of years back was in this class (though once the causes were understood, related bugs could probably be detected by carefully chosen tests).  Much wrongware falls into this class:  The Ethernet adapter that detects a particular random 16-byte header on a packet and writes the rest of the packet over sensitive code in the driver, for example.  In general, carefully targeted attacks against hardware fall into this class.
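
To put rough numbers on the difference between classes 3 and 4, here is a minimal sketch.  It assumes each random input independently hits a failure with probability p, so n clean tests miss every failure with probability (1 - p)^n.  The function name and the example fractions are mine, chosen only for illustration; in particular, treating the 16-byte trigger as a 2^-128 fraction assumes uniformly random headers.

```python
import math

def random_tests_needed(failure_fraction, miss_probability):
    """Smallest n such that (1 - failure_fraction)**n <= miss_probability.

    failure_fraction: assumed fraction of inputs that trigger a failure.
    miss_probability: acceptable chance that n random tests all pass anyway.
    """
    if not 0.0 < failure_fraction < 1.0:
        raise ValueError("failure_fraction must be strictly between 0 and 1")
    if not 0.0 < miss_probability < 1.0:
        raise ValueError("miss_probability must be strictly between 0 and 1")
    # log1p(-p) computes log(1 - p) accurately even when p is tiny.
    return math.ceil(math.log(miss_probability) / math.log1p(-failure_fraction))

# Class 3: failures on 1% of inputs, accept a 0.1% chance of missing them.
class3 = random_tests_needed(0.01, 0.001)

# Class 4: a trigger matching one specific 16-byte value (illustrative).
class4 = random_tests_needed(2.0 ** -128, 0.001)

print("class 3 tests needed:", class3)
print("class 4 tests needed: about %.1e" % class4)
```

Under these assumptions the class 3 count is a few hundred tests, while the class 4 count is astronomically beyond anything you could run.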

The VW wrongware is actually in class 3:  It does the wrong things *almost always*.  It was able to stay hidden only as long as the wrong kind of tests - tests appropriate to class 2 - were the only ones being applied.  The real lesson is:  Class 2 testing *is only appropriate for detecting random failures* (and not always even then, of course).
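
A toy illustration of that last point, not a model of the actual VW code:  `reference`, `wrongware`, and `TEST_VECTORS` are all hypothetical names I've made up for the sketch.  The cheating implementation recognizes the published test vectors and behaves only for those; a fixed test suite passes it, while a modest number of fresh random inputs catches it essentially every time.

```python
import random

def reference(x):
    """Hypothetical specification: a toy function standing in for correct behavior."""
    return (x * x + 1) % 2**32

# The fixed, publicly known inputs used by the class 2 test suite.
TEST_VECTORS = (0, 1, 2, 255, 2**31, 2**32 - 1)

def wrongware(x):
    """Cheating implementation: correct only when it recognizes a test vector."""
    if x in TEST_VECTORS:
        return reference(x)
    return reference(x) ^ 1  # wrong on every other input

def fixed_test(impl):
    """Class 2 style: check a fixed, deterministic set of inputs."""
    return all(impl(x) == reference(x) for x in TEST_VECTORS)

def random_test(impl, trials=100, rng=None):
    """Class 3 style: check inputs drawn fresh on every run."""
    rng = rng or random.SystemRandom()
    return all(
        impl(x) == reference(x)
        for x in (rng.randrange(2**32) for _ in range(trials))
    )

print("fixed test passes wrongware: ", fixed_test(wrongware))
print("random test passes wrongware:", random_test(wrongware))
```

The random inputs are drawn anew on each run precisely so that the system under test has no fixed set to recognize.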

There is a real conflict with traditional views of law here.  When we prohibit or require something in law, it's a general principle that it has to be clear *exactly what* is being prohibited or required.  VW could (very weakly, because of other aspects of the law that forbade special cases) argue:  The regulations require certain outputs when run on a defined set of inputs; we provide that, what's your problem?  If you think no one could hope to get away with this - just look at tons of agreements between telcos and governments, where the agreement "says" the telco will make (say) broadband available to 90% of its customers in return for a tax break; but the actual *test* is that broadband "passes" 90% of the customers - as in the fiber goes down the street, but you can't actually connect to it.

For an example of a law that goes the other way, the SEC has always refused to actually define "insider trading".  It has some broad and deliberately vague language about improper manipulation of markets and improper use of proprietary information, but that's it.  The argument - for which there is tons of evidence - is that the instant you explicitly write down what's prohibited, the market manipulators will find a way to do something "just the other side of the line".  So you need the flexibility to go after them.  This is one of the few areas of the law that's openly and deliberately vague, and the SEC's track record in going after people is mixed:  Sometimes the courts accept the argument, sometimes they don't.

                                                        -- Jerry