[Cryptography] Is ASN.1 still the thing?

Mon Nov 13 17:19:52 EST 2017

Jerry Leichter wrote:
>> The subject of this message thread ought to be "why are people still 
>> inventing serialization formats?" ASN.1 works well from network and CPU 
>> efficiency perspective, *and* is reliable for security-oriented usage....
> I haven't seen a direct comparison, but I very much doubt ASN.1 would be 
> competitive from a CPU-usage point of view.  Maybe it was well matched to 
> CPUs at the time it was designed, but CPUs have changed significantly. 
>   There are a couple of competing formats - Thrift (I think done by Facebook) 
> and Avro (not sure who designed it) are two I know of - which do somewhat 
> better on speed, compression, or maybe even both, so people are still learning 
> new tricks - but the differences aren't very large any more.  Here's one 
> 5-year-old, somewhat out of date, comparison (e.g., for protobufs, Google 
> released an RPC implementation and added more supported languages.  I'm sure 
> Thrift and Avro have improved as well.): 
> https://www.slideshare.net/IgorAnishchenko/pb-vs-thrift-vs-avro. Performance 
> Comparison of Data Serialization Schemes ... - ThinkMind 
> <https://www.thinkmind.org/download.php?articleid=tele_v8_n12_2015_5> does 
> seem to indicate that ASN.1 does better than protobuf - but it's a long paper 
> that I don't feel like reading through and while it does show that in some 
> cases (but not others, with differences I don't understand) ASN.1 produces 
> significantly smaller encodings, I couldn't quickly tell how CPU times compare.

You should read anything you're going to reference. I just read through that 
paper, and on pretty much every metric (memory footprint, CPU consumption, and 
resulting message length) ASN.1 was superior to protobuf. I'm not familiar 
with the ASN.1 compiler they used, but it appeared to require messages to be 
decoded into objects all in one go, which would greatly affect its memory 
footprint. OpenLDAP's liblber allows messages to be decoded incrementally, 
in-place for zero-copy streaming. We can't do a direct comparison since 
they're using PER and we only support BER/DER, but on BER/DER, I suspect 
liblber would greatly outperform their chosen ASN.1 software.
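
To make "decoded incrementally, in-place" concrete, here is a minimal sketch of walking BER/DER elements one at a time over a single buffer without copying the values. It is not liblber's API; the function names `read_tlv` and `iter_tlvs` are made up for illustration, and the sketch assumes single-byte tags and definite lengths only.

```python
def read_tlv(buf, pos=0):
    """Parse one definite-length TLV starting at buf[pos].

    Returns (tag, value, next_pos). When buf is a memoryview, value is a
    slice of the same underlying buffer, so nothing is copied.
    """
    if pos + 2 > len(buf):
        raise ValueError("truncated header")
    tag = buf[pos]
    if tag & 0x1F == 0x1F:
        raise ValueError("multi-byte tags are not handled in this sketch")
    length = buf[pos + 1]
    pos += 2
    if length == 0x80:
        raise ValueError("indefinite lengths are not handled in this sketch")
    if length & 0x80:
        # Long form: the low 7 bits give the number of length octets.
        n = length & 0x7F
        if pos + n > len(buf):
            raise ValueError("truncated length")
        length = int.from_bytes(buf[pos:pos + n], "big")
        pos += n
    end = pos + length
    if end > len(buf):
        raise ValueError("truncated value")
    return tag, buf[pos:end], end


def iter_tlvs(buf):
    """Yield (tag, value) for each element in buf, one at a time."""
    pos = 0
    while pos < len(buf):
        tag, value, pos = read_tlv(buf, pos)
        yield tag, value


# SEQUENCE { INTEGER 5, OCTET STRING "hi" }
msg = bytes.fromhex("300702010504026869")
tag, body, end = read_tlv(memoryview(msg))
fields = [(t, bytes(v)) for t, v in iter_tlvs(body)]
```

The caller only descends into the elements it cares about; a nested SEQUENCE stays an unparsed view of the original buffer until someone asks for its contents.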

> At Google, protobufs are *the* medium of exchange for data.  Everything is 
> done through RPC - I/O calls that use the Google file system turn into RPCs - 
> and those RPCs are implemented as protobufs.  Protobuf values also define 
> a "record format" for saving data to files.  So there's really good reason to 
> optimize for some combination of CPU and size - though *what* combination is 
> really going to be application- and technology-dependent.

Google is not a paragon of efficiency; there are plenty of things they get 
wrong, left to their own devices. I'll note that Google contracted with my 
company back in 2007 or so when they needed help scaling their servers to 
handle thousands of concurrent connections.

Using your wire format for your storage format has been done before. It works 
when you have a simple data model and a simpler/nonexistent security model. It 
doesn't work for e.g. OpenLDAP where you have fine-grained access controls 
which require hiding some fields or values from some requestors, thus 
requiring field and record lengths and offsets to be recomputed on the fly.
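
A small sketch of why hiding a field forces lengths to be recomputed: in a length-prefixed encoding such as DER, dropping one element changes the length of every enclosing element, so the stored bytes cannot simply be sent as-is. This is an illustration only, not OpenLDAP's entry format or code; `encode_entry`, the field names, and the flat "SEQUENCE of OCTET STRINGs" layout are assumptions made for the example.

```python
def encode_length(n):
    """DER definite length: short form below 128, long form otherwise."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def encode_tlv(tag, value):
    """Encode one element with a single-byte tag."""
    return bytes([tag]) + encode_length(len(value)) + value


def encode_entry(fields, hidden=()):
    """Encode (name, tag, value) fields as one SEQUENCE, skipping hidden names.

    The outer length is derived from whatever fields remain, which is why
    a stored encoding of the full entry cannot be reused for a requestor
    who may not see every field.
    """
    body = b"".join(
        encode_tlv(tag, value)
        for name, tag, value in fields
        if name not in hidden
    )
    return encode_tlv(0x30, body)


# Hypothetical entry: two OCTET STRING fields.
entry = [("cn", 0x04, b"alice"), ("userPassword", 0x04, b"secret")]
full = encode_entry(entry)
redacted = encode_entry(entry, hidden={"userPassword"})
```

Here `full` carries an outer length of 15 and `redacted` an outer length of 7; with deeper nesting, every level above the hidden field has to be adjusted the same way.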

> As has been discussed here in the past, ASN.1 is rather difficult to implement 
> correctly.  The protobuf wire format (which is what we're really discussing) 
> is quite simple by comparison.  Granted, these days there are libraries that 
> (allegedly) get both right.

Like most things, difficult things get easier with practice. Good tools are 
just as easily misused as used; it all comes down to the person using them.
liblber has been around for a couple of decades and is carefully optimized, 
heavily tested, and widely deployed. It has never been broken by buffer 
overflows or other such nonsense. Comparatively, the protobuf guys are still 
just getting their feet wet.

> In the end ... the best answer for which is better seems to be "it depends" - 
> which is exactly why we are still inventing new serialization formats.
> 
>                                                          -- Jerry
> 

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/