[Cryptography] Is ASN.1 still the thing?
leichter at lrw.com
Mon Nov 13 16:26:47 EST 2017
>>> Do JSON, YAML, or protobuf allow representing data formats in ways that give a unique and well-defined checksum that will not be affected by endianness or compiler options?
>> For protobuf, I'm pretty sure the answer is yes. It would take a careful reading of the specs to be sure there are no corner cases, and it depends on proper implementation: protobuf representations of some datatypes are transferred in a compressed format. For example, integers use a varying-length representation that can drop leading zeroes. So you *could* represent an integer in multiple ways - though you're *supposed* to use the shortest representation (which is unique). Whether a receiver would reject a non-canonical representation, I don't know - probably not.
>> Then again, one could say the same thing about ASN.1.
> In ASN.1 DER you're required to use the shortest representation, and the decoder must reject the input if it's not in shortest form.
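To make the point about multiple representations concrete, here is a minimal sketch of base-128 varints in the style protobuf uses for integers. The function names and the `strict` flag are made up for illustration and are not any library's API; the sketch handles only non-negative integers and does not enforce a 64-bit length limit. The strict mode mimics the DER-style rule of rejecting anything that is not in shortest form.

```python
def encode_varint(n: int) -> bytes:
    """Return the shortest base-128 varint encoding of a non-negative integer."""
    if n < 0:
        raise ValueError("only non-negative integers are handled in this sketch")
    out = bytearray()
    while True:
        low7 = n & 0x7F
        n >>= 7
        if n:
            out.append(low7 | 0x80)  # continuation bit set: more bytes follow
        else:
            out.append(low7)         # final byte: continuation bit clear
            return bytes(out)


def decode_varint(data: bytes, strict: bool = False) -> int:
    """Decode a varint that occupies all of `data`.

    With strict=True (hypothetical flag), reject encodings that are longer
    than necessary, i.e. whose most significant 7-bit group is zero.
    """
    if not data:
        raise ValueError("empty input")
    value = 0
    for i, b in enumerate(data):
        value |= (b & 0x7F) << (7 * i)
        is_last = i == len(data) - 1
        # Non-final bytes must have the continuation bit; the final one must not.
        if bool(b & 0x80) == is_last:
            raise ValueError("continuation bits do not match input length")
    if strict and len(data) > 1 and data[-1] == 0x00:
        raise ValueError("non-canonical varint: redundant zero group")
    return value


if __name__ == "__main__":
    print(encode_varint(1).hex())            # shortest form of 1
    print(decode_varint(b"\x81\x00"))        # a longer encoding of the same value
    try:
        decode_varint(b"\x81\x00", strict=True)
    except ValueError as exc:
        print("strict decoder:", exc)
```

A lenient decoder accepts both byte strings as the value 1, so a checksum over the raw bytes is only well defined if every sender emits the shortest form or every receiver decodes strictly.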
> The subject of this message thread ought to be "why are people still inventing serialization formats?" ASN.1 works well from a network and CPU efficiency perspective, *and* is reliable for security-oriented usage....
I haven't seen a direct comparison, but I very much doubt ASN.1 would be competitive from a CPU-usage point of view. Maybe it was well matched to CPUs at the time it was designed, but CPUs have changed significantly. There are a couple of competing formats - Thrift (I think done by Facebook) and Avro (not sure who designed it) are two I know of - which do somewhat better on speed, compression, or maybe even both, so people are still learning new tricks - but the differences aren't very large any more. Here's one 5-year-old, somewhat out-of-date comparison (e.g., for protobufs, Google has since released an RPC implementation and added more supported languages; I'm sure Thrift and Avro have improved as well): https://www.slideshare.net/IgorAnishchenko/pb-vs-thrift-vs-avro
"Performance Comparison of Data Serialization Schemes ..." (ThinkMind, https://www.thinkmind.org/download.php?articleid=tele_v8_n12_2015_5) does seem to indicate that ASN.1 does better than protobuf - but it's a long paper that I don't feel like reading through. It does show that in some cases (but not others, with differences I don't understand) ASN.1 produces significantly smaller encodings, but I couldn't quickly tell how CPU times compare.
At Google, protobufs are *the* medium of exchange for data. Everything is done through RPC - I/O calls that use the Google file system turn into RPCs - and those RPCs are implemented as protobufs. Protobufs also define a "record format" for saving data to files. So there's really good reason to optimize for some combination of CPU and size - though *what* combination is really going to be application- and technology-dependent.
As has been discussed here in the past, ASN.1 is rather difficult to implement correctly. The protobuf wire format (which is what we're really discussing) is quite simple by comparison. Granted, these days there are libraries that (allegedly) get both right.
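As a rough illustration of how little there is to the protobuf wire format, here is a sketch that hand-encodes two fields: each field is a varint key, `(field_number << 3) | wire_type`, followed by the value. The helper names are invented for this example and are not part of any protobuf library; only wire type 0 (varint) and wire type 2 (length-delimited) are shown, and only non-negative integers are handled.

```python
def varint(n: int) -> bytes:
    """Shortest base-128 varint encoding of a non-negative integer."""
    out = bytearray()
    while n > 0x7F:
        out.append((n & 0x7F) | 0x80)  # low 7 bits, continuation bit set
        n >>= 7
    out.append(n)                      # final group, continuation bit clear
    return bytes(out)


def encode_uint_field(field_number: int, value: int) -> bytes:
    """Encode an unsigned integer field (wire type 0 = varint)."""
    key = (field_number << 3) | 0
    return varint(key) + varint(value)


def encode_bytes_field(field_number: int, payload: bytes) -> bytes:
    """Encode a length-delimited field (wire type 2): key, length, payload."""
    key = (field_number << 3) | 2
    return varint(key) + varint(len(payload)) + payload


if __name__ == "__main__":
    # Field 1 holding the integer 150, then field 2 holding a short string.
    message = encode_uint_field(1, 150) + encode_bytes_field(2, b"testing")
    print(message.hex(" "))
```

Note that this says nothing about canonical field ordering or about how an unknown field is treated on re-serialization, which is where the "unique checksum" question gets harder.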
In the end ... the best answer for which is better seems to be "it depends" - which is exactly why we are still inventing new serialization formats.