[Cryptography] VW/EPA tests as crypto protocols ?

Sat Sep 26 02:59:06 EDT 2015

>
> 1.  You cannot distinguish correct from incorrect outputs.  These are
> rare, but there is one cryptographic element that falls into this
> category:  Random number generators.  These simply cannot be tested by
> looking at the results.  (It's interesting that the only example I know of
> in this class takes no inputs.)
>

Indeed. I seem to deliver a talk on this topic about once a week. You have
to look at the entropy source process to have any hope of determining
whether it's actually entropic and whether it meets the input requirements
of the extractor.
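
To make that concrete, here is a minimal sketch (a made-up generator, not
any real product). The stream is completely deterministic, so it carries no
entropy for anyone who knows the seed, yet a simple monobit frequency check
on the output cannot tell. The names fake_rng and monobit_fraction are
illustrative only.

```python
import hashlib

def fake_rng(n_bytes, seed=b"fixed seed known to the attacker"):
    """Deterministic stream: SHA-256 over seed || counter. No entropy given the seed."""
    out = bytearray()
    counter = 0
    while len(out) < n_bytes:
        out += hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:n_bytes])

def monobit_fraction(data):
    """Fraction of 1 bits; close to 0.5 for output that merely looks random."""
    ones = sum(bin(b).count("1") for b in data)
    return ones / (8 * len(data))

stream = fake_rng(100_000)
frac = monobit_fraction(stream)

# The output-only check is satisfied...
looks_random = abs(frac - 0.5) < 0.01
# ...even though a second run reproduces the stream bit for bit.
reproducible = fake_rng(100_000) == stream
```

Only an examination of where the bits come from reveals the problem; more
elaborate statistical tests on the output would be fooled in the same way.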

> 2.  You can construct an input set with the property that if the results
> are correct on the input set, they are correct everywhere.  There are
> systems that we assume have this property in the face of *random*
> failures, and we often implicitly assume it - e.g., with sets of test
> vectors for cryptographic algorithms.  Of course, in the face of
> deliberate wrongware, it's easy for the system to give the right answers
> exactly for the test vectors.
>
> Almost all traditional testing really falls into this category, even if we
> don't explicitly write down the assumed "indicator set":  It's implicit in
> any fixed collection of deterministic tests.
>

On this I think the world is a little further along than you suggest. In
hardware, formal equivalence verification and formal assertion
verification are routine. If you push the process a little, the tools exist
to verify that a high level language implementation is formally
equivalent to a high level algorithm model. That implementation is then
synthesized down to gates, and formal equivalence checking is done over
that step too. So you can show your hardware does only what the model
describes and nothing more. The limits are really that this tends to be
done over algorithms, rather than bigger system compositions. SPIN has
been around for decades for making certain assertions over protocols, but
I've yet to see systems that can make formal correctness assertions about,
say, a general purpose CPU or a bus network outside of university research.
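
As a toy illustration of what an equivalence check establishes, the sketch
below compares a gate-level 4-bit ripple-carry adder against an arithmetic
model on every input. This is plain enumeration, which only works because
the example is tiny; as far as I know, production tools reason symbolically
(BDDs, SAT) rather than enumerating. All names here are made up for the
example.

```python
from itertools import product

WIDTH = 4

def full_adder(a, b, cin):
    """Gate-level full adder built from XOR/AND/OR only."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout

def ripple_adder(x, y, cin=0):
    """'Implementation': bit-by-bit ripple-carry adder, returns (sum, carry_out)."""
    total, carry = 0, cin
    for i in range(WIDTH):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        total |= s << i
    return total, carry

def model_adder(x, y, cin=0):
    """'Specification': the high-level arithmetic model."""
    full = x + y + cin
    return full & ((1 << WIDTH) - 1), full >> WIDTH

def equivalent(impl, spec):
    """Return the first counterexample, or None if impl == spec on every input."""
    for x, y, cin in product(range(1 << WIDTH), range(1 << WIDTH), (0, 1)):
        if impl(x, y, cin) != spec(x, y, cin):
            return (x, y, cin)
    return None

counterexample = equivalent(ripple_adder, model_adder)
```

A result of None is a statement about the whole input space, not about a
chosen set of test vectors, which is the property that matters against
deliberate wrongware.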

> 3.  The set of inputs that induce failure makes up a substantial fraction
> of the input space, even though you don't know how to characterize the
> members of that fraction.  A sufficient number of random inputs (chosen
> independently each time you run the tests) can give you any desired
> probability that there are no failures.
>
> Tests like this are done, but in my experience are relatively rare.  They
> shouldn't be.  In fact, the best tests consist of two parts:  Tests on a
> fixed set of typically sensitive areas (e.g., for any kind of numeric
> code, 0, -0 if your system has one, small, medium, large, and "at the
> limits" negative and positive); and as many random tests as you can
> manage.
>

I've seen this go too far, where randomized testing is all that is done
and so the corner cases you can reach with directed testing are never
handled. Randomized testing is good, especially in complex systems, but
it's no substitute for thinking about how you can establish the system
behavior over larger sets than can be tested in a brute-force manner. You
have to build a randomized test environment, but then every directed test
is more design work. It's easy to stop adding directed tests because it's
expensive and you might think the randomized testing is enough.
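
A minimal sketch of keeping both halves in one test run, using a
hypothetical signed 8-bit saturating add as the unit under test. The
directed table holds the corners someone had to think about; the random
part compares against an independently written reference with fresh inputs
on every run.

```python
import random

INT8_MIN, INT8_MAX = -128, 127

def sat_add8(a, b):
    """Unit under test (hypothetical): signed 8-bit saturating add."""
    return max(INT8_MIN, min(INT8_MAX, a + b))

def reference(a, b):
    """Independent reference model, written in a deliberately different style."""
    s = a + b
    if s > INT8_MAX:
        return INT8_MAX
    if s < INT8_MIN:
        return INT8_MIN
    return s

# Directed tests: (a, b, expected), chosen by thinking about the design.
DIRECTED = [
    (0, 0, 0),
    (127, 1, 127),       # saturate high by one
    (-128, -1, -128),    # saturate low by one
    (127, 127, 127),
    (-128, -128, -128),
    (127, -128, -1),     # extremes that cancel
    (126, 1, 127),       # lands exactly on the limit, no saturation
]

def run_directed():
    """Return the directed cases that fail."""
    return [(a, b) for a, b, want in DIRECTED if sat_add8(a, b) != want]

def run_random(n, seed=None):
    """Return random cases that disagree with the reference (seed=None: new inputs each run)."""
    rng = random.Random(seed)
    failures = []
    for _ in range(n):
        a = rng.randint(INT8_MIN, INT8_MAX)
        b = rng.randint(INT8_MIN, INT8_MAX)
        if sat_add8(a, b) != reference(a, b):
            failures.append((a, b))
    return failures

directed_failures = run_directed()
random_failures = run_random(10_000)
```

The random loop is cheap to scale up once it exists; each new row in
DIRECTED costs design thought, which is exactly why those rows tend to stop
being added.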

> 4.  The set of inputs that induce failure is very small and unpredictable.
>  Unless you can afford to sample pretty much the entire input space, these
> cannot be detected by testing.  The famous Intel division bug of years
> back was in this class (though once the causes were understood, related
> bugs could probably be detected by carefully chosen tests).  Much
> wrongware falls into this class:  The Ethernet adapter that detects a
> particular random 16-byte header on a packet and drops the rest of the
> packet over sensitive code in the driver, for example.  In general,
> carefully targeted  attacks against hardware fall into this class.
>

Like I said above, we've moved on in hardware. Tools are available to test
over the entire input space that don't require the logic designer to have
the specialist knowledge that used to be typical of formally proving
implementation properties. It would be good if they were more widely
deployed, but they typically are deployed over security sensitive things
like I/O and crypto. It's no big surprise to me that the nature of system
security failures in chips has moved to places where the composition of
disparate functions can lead to vulnerabilities. Subsystems which are
individually secure become insecure when brought together. I'm aware of a
lot of work directed at addressing those problems. I predict many PhDs
will be written on this topic.
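
To put a number on why sampling fails for class 4 while whole-space
coverage does not, here is a toy sketch over a 16-bit input space with a
single hypothetical trigger value. The space is kept small so that plain
enumeration can stand in for a formal tool; real input spaces are far too
large for this, and I am not claiming this is how any particular tool
works.

```python
SPACE = 1 << 16      # size of the toy input space
TRIGGER = 0xBEEF     # hypothetical magic value

def honest(x):
    """Reference behaviour."""
    return (x * 3 + 1) % SPACE

def wrongware(x):
    """Identical to honest() except on one input."""
    if x == TRIGGER:
        return 0
    return honest(x)

def miss_probability(n_samples, n_bad=1, space=SPACE):
    """Chance that n independent uniform samples all avoid the bad inputs."""
    return (1 - n_bad / space) ** n_samples

def exhaustive_diff(f, g, space=SPACE):
    """Every input on which f and g disagree."""
    return [x for x in range(space) if f(x) != g(x)]

# A thousand random tests miss the trigger more than 98% of the time...
p_miss_1000 = miss_probability(1000)
# ...while covering the whole space finds it with certainty.
bad_inputs = exhaustive_diff(wrongware, honest)
```

With a 16-byte trigger instead of a 16-bit one, the miss probability is
indistinguishable from 1 for any number of random tests you could run.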

None of this stops an organization cheating though. It can make life
difficult for a cheater in an honest organization, but a dishonest
organization is free to cheat.

> The VW wrongware is actually in class 3:  It does the wrong things *almost
> always*.  It was able to stay hidden only as long as the wrong kind of
> tests - tests appropriate to class 2 - were the only ones being applied.
> The real lesson is:  Class 2 testing *is only appropriate for detecting
> random failures* (and not always even then, of course).
>

I'm keeping my skeptic's hat on. Until someone shows us the code, I have
seen nothing that can reconcile the journalists' claims with what I know
about how the testing works. It's possible that the necessary testing
simply was not done, otherwise it would have found what the researchers
found. But I would be surprised if that's what happened. What did the
researchers do differently from the government tests? Nothing, I suspect.
The car doesn't know when there's a sensor inserted in the exhaust pipe. It
seems like the government outsourced this testing to the researchers, so
maybe it wasn't being done beforehand, or palms were being greased, or
incompetence led to the tests being invalid. An enterprising journalist
might want to take a look at the normal government testing procedures and
ask what differs from what the researchers published recently.

> There is a real conflict with traditional views of law here.  When we
> prohibit or require something in law, it's a general principle that it has
> to be clear *exactly what* is being prohibited or required.  VW could (very
> weakly, because of other aspects of the law that forbade special cases)
> argue:  The
> regulations require certain outputs when run on a defined set of inputs;

Given my past experience in vehicle technology, I would argue that this is
how car manufacturers have to meet the testing requirements. Cars are
designed to work over a set of continuous variables and in general the
test requirements are point tests. It's simply a fact that the car has to
do something at all the points between the tested points. It has to do
something different from what is done at the tested points and so the
difference between 'cheating' and 'normal behavior' is qualitative, not
quantitative. You might be able to formulate testing schemes to make
cheating harder, but that is not at all an aspect of the current testing
requirements.
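
A toy sketch of the point-test problem, with invented numbers that are not
modelled on any real engine, regulation, or manufacturer's code. Two
hypothetical calibrations both pass a test that samples only the defined
points, yet they behave very differently in between.

```python
TEST_POINTS = [1000, 2000, 3000, 4000]  # hypothetical test-cycle engine speeds (rpm)
LIMIT = 1.0                             # hypothetical emissions limit, arbitrary units

def calibration_a(rpm):
    """Toy model: sits at the limit everywhere."""
    return 1.0

def calibration_b(rpm):
    """Toy model: meets the limit at the test points, rises between them."""
    nearest = min(TEST_POINTS, key=lambda p: abs(p - rpm))
    return 1.0 + abs(rpm - nearest) / 250.0

def passes_point_test(cal):
    """The regulation as tested: check only the defined points."""
    return all(cal(p) <= LIMIT for p in TEST_POINTS)

def worst_case(cal, lo=1000, hi=4000, step=10):
    """What a denser sweep over the operating range would see."""
    return max(cal(r) for r in range(lo, hi + 1, step))

both_pass = passes_point_test(calibration_a) and passes_point_test(calibration_b)
worst_a = worst_case(calibration_a)
worst_b = worst_case(calibration_b)
```

The point test returns the same verdict for both calibrations; only the
sweep separates them, and nothing in a point-test requirement says what the
values between the points should be.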

> we provide that, what's your problem?  If you think no one could hope to
> get away with this - just look at tons of agreements between Telco's and
> governments, where the agreement "says" the Telco will make (say)
> broadband available to 90% of its customers in return for a tax break; but
> the actual *test* is that broadband "passes" 90% of the customers - as in
> the fiber goes down the street, but you can't actually connect to it.
>
> For an example of a law that goes the other way, the SEC has always
> refused to actually define "insider trading".  It has some broad and
> deliberately vague language about improper manipulation of markets and
> improper use of proprietary information, but that's it.  The argument -
> for which there is tons of evidence - is that the instant you explicitly
> write down what's prohibited, the market manipulators will find a way to
> do something "just the other side of the line".  So you need the
> flexibility to go after them.  This is one of the few areas of the law
> that's openly and deliberately vague, and the SEC's track record in going
> after people is mixed:  Sometimes the courts accept the argument,
> sometimes they don't.
>
>                                                         -- Jerry