[Cryptography] Is ASN.1 still the thing?

Nico Williams nico at cryptonector.com
Tue Nov 14 17:00:34 EST 2017


On Mon, Nov 13, 2017 at 04:26:47PM -0500, Jerry Leichter wrote:
> I haven't seen a direct comparison, but I very much doubt ASN.1 would

Can we be a bit more pedantically correct?  ASN.1 is *syntax* -- a
notation -- with several associated sets of encoding rules.

Those rules include:

 - tag-length-value (TLV) encoding rules that suck: BER, DER, CER

 - PER -- an XDR-like set of rules, but with 1-byte alignment (vs.
   4-byte alignment for XDR)

   XDR is also very inefficient at encoding booleans, and optional
   fields in general, since it packs neither booleans nor the implied
   presence booleans for optional fields -- each costs four octets.
   PER, by contrast, packs those booleans into bitfields, so PER is
   very efficient (see the C sketch after this list).

 - XER -- XML Encoding Rules.

   Yes, ASN.1 is just syntax, allowing any number of encodings, even
   XML.

   This is so much so that there's even an ASN.1/PER-based compression
   scheme for XML (FastInfoSet).
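
To make the TLV-vs-PER contrast concrete, here's a minimal sketch in
C -- hand-rolled byte emission, function names mine, not any real
ASN.1 toolkit's API -- of a SEQUENCE of three BOOLEANs encoded both
ways:

    #include <stdint.h>
    #include <string.h>

    /* Sketch only: SEQUENCE { a, b, c BOOLEAN } under DER vs. PER. */

    size_t enc_der(uint8_t *out, int a, int b, int c)
    {
        const uint8_t body[9] = {
            0x01, 0x01, (uint8_t)(a ? 0xFF : 0x00), /* each BOOLEAN is a full TLV */
            0x01, 0x01, (uint8_t)(b ? 0xFF : 0x00),
            0x01, 0x01, (uint8_t)(c ? 0xFF : 0x00),
        };

        out[0] = 0x30;            /* SEQUENCE tag */
        out[1] = sizeof(body);    /* definite length */
        memcpy(out + 2, body, sizeof(body));
        return 2 + sizeof(body);  /* 11 octets */
    }

    size_t enc_per(uint8_t *out, int a, int b, int c)
    {
        /* PER packs the booleans into the top bits of a single octet. */
        out[0] = (uint8_t)((!!a << 7) | (!!b << 6) | (!!c << 5));
        return 1;                 /* 1 octet */
    }

Eleven octets versus one, and the gap widens further with OPTIONAL
fields, whose presence bits PER packs into the same kind of bitfield.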

Confusing ASN.1 and DER is how we lead others to say "screw ASN.1, I'm
going to do a brand new thing", and then we all get burdened with that
new thing.  Hello Protocol Buffers (which, ironically, is
ASN.1/DER-like).

> be competitive from a CPU-usage point of view.  Maybe it was well

DER sucks because it's a definite-length encoding with variable-sized
lengths, which means that before you can encode a data structure you
must compute the encoded size of every value nested inside it, or
alternatively traverse the structure in post-order, encoding from the
end and reallocing buffers as you go.  Either way sucks, though if you
have a compiler and run-time then the only thing you'll observe is that
encoding is not online (though decoding is).
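
A minimal sketch of the shape this forces on an encoder (hypothetical
helpers, not from any real library): the size of a length field depends
on the content length, and a SEQUENCE's content length is the sum of
its members' whole TLV sizes, so you must recurse to the leaves before
you can emit the first byte.

    #include <stddef.h>

    /* Octets needed for a DER length field -- variable-sized. */
    size_t der_len_size(size_t len)
    {
        size_t n = 0;

        if (len <= 127)
            return 1;           /* short form: one octet */
        while (len > 0) {
            n++;
            len >>= 8;
        }
        return 1 + n;           /* long form: 0x80|n, then n octets */
    }

    /* Whole TLV size given the content length (single-octet tags
     * assumed).  For a SEQUENCE the content length is the sum of the
     * members' TLV sizes -- hence the full bottom-up size computation,
     * and hence encoding that is not online. */
    size_t der_tlv_size(size_t content_len)
    {
        return 1 + der_len_size(content_len) + content_len;
    }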

BER doesn't have this problem: you can choose indefinite-length
encoding, or you can choose not to minimize the encodings of lengths
(reserving, say, four length octets up front and patching them in
later).  And if you choose indefinite-length encoding then BER is
online for encoding.
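
By contrast with the DER sketch above, indefinite-length framing is
trivially streamable -- a sketch in the same hypothetical style:

    #include <stdio.h>

    /* Open and close an indefinite-length SEQUENCE.  Members are
     * streamed out in between, one TLV at a time, with no lengths
     * computed up front -- this is what makes encoding online. */

    void ber_begin_sequence(FILE *out)
    {
        fputc(0x30, out);   /* SEQUENCE tag (constructed) */
        fputc(0x80, out);   /* indefinite length */
    }

    void ber_end_sequence(FILE *out)
    {
        fputc(0x00, out);   /* end-of-contents: two zero octets */
        fputc(0x00, out);
    }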

PER is as good as it gets, even today, with the only tunable of interest
being its alignment (which is 1-octet, but it would probably be faster
with 4-octet alignment).

The only downside to PER -- the reason we don't use it universally --
is that you really do need tools that can compile a complete ASN.1
module, and for a long time these did not exist, at least not as open
source.  Of course, the situation is better now, but it's too late.
Though it's never too late to say NO to new encodings.

(I wouldn't say no to things like CBOR or JSONB, since those are
specifically tailored to JSON, which exists.  But please, no new
Protocol Buffers alikes.)

> matched to CPU's at the time it was designed, but CPU's have changed

BER/DER/CER were definitely NOT designed to match 1980s CPUs.

As for ASN.1 the *notation*, the faster CPUs get, the less reason there
is to not use it, since it's just a compile-time thing.

As for encoding rules, PER, or PER with a four-octet alignment, is still
probably the best (fastest, most compact, most efficient) encoding.

> significantly.  There are a couple of competing formats - Thrift (I
> think done by Facebook) and Avro (not sure who designed it) are two I
> know of - which do somewhat better on speed, compression, or maybe
> [...]

Certainly, if the data compresses well, and is full of octet or
character strings, then applying a compression function will help.  But
compression is still orthogonal to encoding rules, because you'll still
need those.

> As has been discussed here in the past, ASN.1 is rather difficult to
> implement correctly.  The protobuf wire format (which is what we're
> really discussing) is quite simple by comparison.  Granted, these days
> there are libraries that (allegedly) get both right.

Protocol buffers is a TLV encoding with definite-length encoding --
i.e., just a variation on ASN.1's DER.

I don't recall the details, but if it doesn't use variable-sized
length fields, that would be a win, though still not a huge win as long
as definite-length encoding is part of the system.
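
For what it's worth, the published wire format does use variable-sized
lengths: base-128 varints for both the field keys and the lengths.  A
sketch of a length-delimited field (helper names are mine, not
protobuf's):

    #include <stdint.h>
    #include <string.h>

    /* Protobuf's base-128 varint: 7 bits per octet, high bit set on
     * every octet but the last. */
    size_t varint_encode(uint8_t *out, uint64_t v)
    {
        size_t n = 0;

        do {
            uint8_t b = v & 0x7F;

            v >>= 7;
            out[n++] = (uint8_t)(v ? (b | 0x80) : b);
        } while (v > 0);
        return n;
    }

    /* A length-delimited field (wire type 2): varint key, varint
     * length, then the bytes -- tag-length-value, just like DER, with
     * varints where DER has its own variable-sized lengths. */
    size_t pb_bytes_field(uint8_t *out, uint32_t field_number,
                          const uint8_t *val, size_t len)
    {
        size_t n = varint_encode(out, ((uint64_t)field_number << 3) | 2);

        n += varint_encode(out + n, (uint64_t)len);
        memcpy(out + n, val, len);
        return n + len;
    }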

If you're looking at "ASN.1" (meaning DER) and think "wow that sucks",
what you want to do is look at PER and think of how you might improve it
(there isn't much room for improving PER, really).

> In the end ... the best answer for which is better seems to be "it
> depends" - which is exactly why we are still inventing new
> serialization formats.

It's perfectly fine to produce CBOR or JSONB when you're dealing with
JSON data.  Since JSON exists, you have to deal with it.

It's perfectly fine to look for alternate encodings of XML, and for the
same reason.  (And people have.  Again, see FastInfoSet.)

But when you don't have a pre-existing thing to deal with, and the only
problem you're facing is lack of tooling for your chosen programming
language, well, there's no excuse for building your own.  First do the
research, pick an existing specification, and write tools for that.

Alternatively, if you're lucky enough to find suitable tools, use them
and save yourself the burden of building them.

Nico
-- 

