Protocol implementation errors

Jerrold Leichter jerrold.leichter at smarts.com
Sun Oct 5 07:29:06 EDT 2003


| >This is the second significant problem I have seen in applications that use
| >ASN.1 data formats.  (The first was in a widely deployed implementation of
| >SNMP.)  Given that good, security-conscious programmers have difficulty
| >getting ASN.1 parsing right, we should favor protocols that use
| >easier-to-parse data formats.
| >
| >I think this leaves us with SSH.  Are there others?
|
| I would say the exact opposite: ASN.1 data, because of its TLV encoding, is
| self-describing (cf. RPC with XDR), which means that it can be submitted to a
| static checker that will guarantee that the ASN.1 is well-formed.  In other
| words it's possible to employ a simple firewall for ASN.1 that isn't possible
| for many other formats (PGP, SSL, ssh, etc etc).  This is exactly what
| cryptlib does, I'd be extremely surprised if anything could get past that.
| Conversely, of all the PDU-parsing code I've written, the stuff that I worry
| about most is that which handles the ad-hoc (a byte here, a uint32 there, a
| string there, ...) formats of PGP, SSH, and SSL.  We've already seen half the
| SSH implementations in existence taken out by the SSH malformed-packet
| vulnerabilities, I can trivially crash programs like pgpdump (my standard PGP
| analysis tool) with malformed PGP packets (I've also crashed quite a number of
| SSH clients with malformed packets while fiddling with my SSH server code),
| and I'm just waiting for someone to do the same thing with SSL packets.  In
| terms of safe PDU formats, ASN.1 is the best one to work with in terms of
| spotting problems.

I think there's a bit more to it.

Properly implementing demarshalling code - which is what we are really talking
about here - is an art.  It requires an obsessive devotion to detailed
checking of *everything*.  It also requires a level and style of testing that
few people want to deal with.

Both of these are helped by a well-specified low-level syntax.  TLV encoding
lets you cross-check all sorts of stuff automatically, once, in low-level
calls.  Ad hoc protocols scatter the validation all over the place - and
some of it will inevitably be overlooked.

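To make this concrete, here is a minimal sketch - not cryptlib's code, just an
illustration assuming DER-style definite lengths and single-byte tags - of the
kind of one-place well-formedness check that TLV encoding permits:

    #include <stddef.h>
    #include <stdint.h>

    /* Return 1 if buf[0..len) is a well-formed sequence of TLV records,
     * recursing into constructed types; 0 otherwise.  Sketch only: a real
     * checker would also enforce minimal-length encodings, legal tags, etc. */
    static int tlv_well_formed(const uint8_t *buf, size_t len, int depth)
    {
        if (depth > 32)                    /* cap nesting depth */
            return 0;
        while (len > 0) {
            if (len < 2)                   /* need at least tag + length */
                return 0;
            uint8_t tag = buf[0];
            size_t vlen, hdr;
            if (buf[1] < 0x80) {           /* short-form length */
                vlen = buf[1];
                hdr = 2;
            } else {                       /* long form, 1..4 length bytes */
                size_t n = buf[1] & 0x7F;
                if (n == 0 || n > 4 || len < 2 + n)
                    return 0;
                vlen = 0;
                for (size_t i = 0; i < n; i++)
                    vlen = (vlen << 8) | buf[2 + i];
                hdr = 2 + n;
            }
            if (vlen > len - hdr)          /* value must fit in the buffer */
                return 0;
            if ((tag & 0x20) &&            /* constructed: check contents too */
                !tlv_well_formed(buf + hdr, vlen, depth + 1))
                return 0;
            buf += hdr + vlen;             /* advance past this record */
            len -= hdr + vlen;
        }
        return 1;
    }

Every length comparison lives in this one routine, run before any application
code ever sees the data; an ad hoc format has no single place to put such a
check.
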
A couple of years back, the place I work decided to implement its own SNMP
library.  (SNMP is defined in ASN.1.)  We'd been using the same free library
done at, I think, UC San Diego many years before, and were unhappy with many
aspects of it.  The guy we had do it had the right approach.  Not only did he
structure the code to carefully track and test all "suspect" data, but he also
wrote a test suite that checked:

	- Many valid inputs.  Most people stop here:  it gets the
		right results on valid data; who cares about the rest?
	- Many foreseeable forms of invalid data.  A few people will write
		a few of these tests.  Usually, they write test cases that
		match the error paths they thought to include in their
		code.  Fine, but not nearly enough.
	- Many randomly-generated tests.  The trick here is to "shape"
		the randomness:  most completely random bit strings will
		be rejected at very low levels in the code.  What you want
		are strings that look valid enough to pass through multiple
		layers of testing but contain random junk - still with
		some bias; e.g., a length one too high or too low is a
		case you want to bias toward.  Hardly anyone does this.
		(A sketch of the idea follows this list.)

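Here is a hypothetical sketch of that third kind of test - not our actual
harness - assuming you start from a valid encoded PDU and know, from the
encoder that produced it, which byte offsets hold length fields:

    #include <stdint.h>
    #include <stdlib.h>

    /* Mutate one copy of a valid PDU in place, biased toward the
     * off-by-one length errors that parsers most often mishandle.
     * len_offs[] lists the byte offsets of the length fields. */
    static void shape_mutate(uint8_t *pdu, size_t len,
                             const size_t *len_offs, size_t n_len_offs)
    {
        if (n_len_offs > 0 && rand() % 100 < 70) {
            /* Most of the time: nudge a length field by +/- 1. */
            size_t off = len_offs[rand() % n_len_offs];
            pdu[off] += (rand() % 2) ? 1 : (uint8_t)-1;
        } else {
            /* Otherwise: random junk at a random position. */
            pdu[rand() % len] = (uint8_t)rand();
        }
    }

Each mutant goes to the decoder under test, which must either parse it or
reject it cleanly; a crash, a hang, or a wild read is a bug.
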
The proof of the pudding came a couple of years later, when a group at OUSPG
(the Oulu University Secure Programming Group) came up with a test that ripped
through just about everyone's SNMP suites, leading to a CERT advisory and a
grand panic.  Our stuff passed with no problems, and in fact when we looked at
OUSPG's test cases, we found that we had almost certainly run them all in some
form or another already, since they were either in our fixed test list or were
covered by the randomized testing.

OUSPG's efforts were actually directed toward something more on point to this
discussion, however:  Their interest is in protocol test generation, and their
test cases were generated automatically from the SNMP definitions.  This kind
of thing is only possible when you have a formal, machine-readable definition
of your protocol.  Using ASN.1 forces you to have that.  (Obviously, you can
do that without ASN.1, but all too often, the only definition of the protocol
is the code.  In that case, the only alternative is manual test generation -
but it can be so hard to do that it just doesn't get done.)

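As a toy illustration of definition-driven generation - invented field names
and ranges, nothing like OUSPG's actual tooling - describe each PDU field in
a table and emit the classic boundary cases mechanically:

    #include <stddef.h>
    #include <stdio.h>

    struct field { const char *name; long min, max; };

    /* A machine-readable "protocol definition" in miniature. */
    static const struct field pdu_def[] = {
        { "version",     0, 3   },
        { "request-id",  0, 255 },
        { "error-index", 0, 127 },
    };

    int main(void)
    {
        for (size_t i = 0; i < sizeof pdu_def / sizeof pdu_def[0]; i++) {
            const struct field *f = &pdu_def[i];
            long cases[] = { f->min - 1, f->min, f->max, f->max + 1 };
            for (size_t j = 0; j < 4; j++)
                printf("test: %s = %ld\n", f->name, cases[j]);
        }
        return 0;
    }

Add a field to the table and its boundary tests appear for free; when the
only definition of the protocol is the code, every one of those cases must be
written - and maintained - by hand.
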
BTW, the OUSPG saga is another demonstration of the danger of a monoculture.
There are, it turns out, very few independent implementations of SNMP around.
Just about everyone buys SNMP support from SNMP Research, which I believe
sells a commercialized version of the old UCSD code.  Find a bug in that, and
you've found a bug in just about every router and switch on the planet.

							-- Jerry
