[Cryptography] DIME // Pending Questions // Seeking Your Input

Jerry Leichter leichter at lrw.com
Sun Mar 1 10:28:46 EST 2015


On Mar 1, 2015, at 8:47 AM, Phillip Hallam-Baker <phill at hallambaker.com> wrote:
> ...In particular, a decoder can verify the syntactic correctness of each token in the stream in a single pass using only the data previously read. Checking correctness of an ASN.1 file is a real horrorshow because an inner length encoding can be inconsistent with either an outer or an inner one.
Not to disagree that this is a good feature, but ... having written (actually, fixed) a parser for an encoding (not ASN.1, which has its own special complexities) that used nested TLV (Type/Length/Value) encoding, I'd say it's not particularly hard to get the bounds checking right.  But you have to design for it from the beginning and follow the design consistently.  I used something very much like a recursive descent parser, but the rule was that every call took a begin pointer and an end pointer (this was C).  Internally, you maintain a "current position" that starts at the begin pointer and may never move past the end pointer; when you've read the length of a subelement, you compute its end pointer (checking that it doesn't lie beyond your own), which you'll pass to the subelement parser and which will become your new current position on return.
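
A minimal sketch of that begin/end discipline, in C.  The wire format here is a toy assumed purely for illustration (one type byte, one length byte, then the value, with the high bit of the type marking an element that contains nested elements); it is not the encoding I worked on, and the names tlv_parse, TLV_CONSTRUCTED and TLV_MAX_DEPTH are made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy format: 1-byte type, 1-byte length, then the value.
 * A type with TLV_CONSTRUCTED set holds nested TLVs as its value. */
#define TLV_CONSTRUCTED 0x80u
#define TLV_MAX_DEPTH   16

/* Parse every element in [begin, end).
 * Returns the number of primitive (leaf) elements, or -1 if malformed. */
int tlv_parse(const uint8_t *begin, const uint8_t *end, int depth)
{
    const uint8_t *cur = begin;   /* invariant: begin <= cur <= end */
    int leaves = 0;

    if (depth > TLV_MAX_DEPTH)    /* bound the recursion as well */
        return -1;

    while (cur < end) {
        /* The two header bytes must lie inside OUR range. */
        if ((size_t)(end - cur) < 2)
            return -1;
        uint8_t type = cur[0];
        size_t  len  = cur[1];
        cur += 2;

        /* Compare sizes, not pointers: forming cur + len beyond the
         * buffer would already be undefined behavior. */
        if (len > (size_t)(end - cur))
            return -1;
        const uint8_t *sub_end = cur + len;

        if (type & TLV_CONSTRUCTED) {
            /* The child sees only its own range, never ours. */
            int n = tlv_parse(cur, sub_end, depth + 1);
            if (n < 0)
                return -1;
            leaves += n;
        } else {
            leaves += 1;
        }
        cur = sub_end;            /* new current position on return */
    }
    return leaves;
}
```

With the bytes 80 06 01 01 AA 02 01 BB (hex), tlv_parse(buf, buf + 8, 0) returns 2.  With 80 03 01 05 AA BB CC DD EE it returns -1: the inner element claims 5 bytes, the outer one has only 1 left, and the check fails even though the underlying buffer is long enough to satisfy the read.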

Note that C programmers will all too often ignore the "pass the end pointer" part (as an "optimization" since, e.g., if the field has a known length, the caller would have checked, right?).  Programmers of languages with native strings will all too often just pass the starting point - which guarantees that a sub-element parser can't read past the end of the top-most string, but doesn't prevent it from reading past the sub-element's own boundaries.

Perhaps the easiest language to get this right in is Java, at least where the substring operation is essentially free:  Before Java 7u6 a substring points into the parent string but has a different length; later releases copy the characters instead.  Of course, you do have to get your substring arithmetic right consistently - having a helper function that pulls it out is the way to go.  In C++, a substring operation will copy the data, so you generally need to compute and pass the end pointer yourself.
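
In C the same "helper function that pulls it out" idea might look roughly like the following.  The slice type and the slice_sub / slice_get names are hypothetical, invented for this sketch; the point is only that the range arithmetic lives in one place and is written so that it cannot overflow.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bounded view: a pointer plus the number of readable bytes. */
typedef struct {
    const uint8_t *ptr;
    size_t         len;
} slice;

/* Carve parent[off, off + n) into *out without copying.
 * Returns false if the range does not fit inside the parent.
 * The test is phrased to avoid computing off + n, which could wrap. */
bool slice_sub(slice parent, size_t off, size_t n, slice *out)
{
    if (off > parent.len || n > parent.len - off)
        return false;
    out->ptr = parent.ptr + off;
    out->len = n;
    return true;
}

/* Read the byte at index i, refusing to look outside the slice. */
bool slice_get(slice s, size_t i, uint8_t *out)
{
    if (i >= s.len)
        return false;
    *out = s.ptr[i];
    return true;
}
```

A sub-element parser that only ever receives a slice, and only ever reads through slice_get, has no way to wander into its sibling's bytes, which is the property the Java substring gives you for free.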

Still, I'll agree that people get this on-its-face trivial bit of coding wrong all the time.  A parser generator is really the way to go:  Get it right once and for all.
                                                        -- Jerry
