[Cryptography] Langsec & authentication

Phillip Hallam-Baker phill at hallambaker.com
Mon May 26 22:24:35 EDT 2014


On Mon, May 26, 2014 at 2:23 PM, Judson Lester <nyarly at gmail.com> wrote:
> I've been fascinated to discover and read about the langsec movement
> in the wake of heartbleed. The fundamental ideas seem sound, but
> there's at least one question I have but haven't seen addressed
> anywhere.
>
> As I understand it, the langsec position is that specifying your
> protocol language to be as easy to parse as possible, in Chomsky
> hierarchy terms, has direct security implications - if the uppermost
> surface of your networked application doesn't have to include a Turing
> machine, that severely limits an avenue of attack on that application.
>
> What confuses me is trying to align this with a principle of
> cryptography that you should only authenticate what you mean, as
> opposed to authenticating a particular series of bytes, especially in
> the face of langsec sites that recommend the use of JSON after having
> argued convincingly against ASN.1 DER.
>
> Here's what I mean: moving on immediately from JSON, it seems to me
> that any language that includes key-value pairs, to be safe to
> authenticate, has to guarantee that the keys in any mapping form a
> set. Otherwise I can produce two documents that *mean* the same thing
> even though they have different bytes - because in foo=bar,foo=baz,
> our interpretation has to choose a meaning - does foo == bar, baz or
> maybe [bar,baz]?
>
> But I think that requiring that the keys belong to a set pushes the
> language into context sensitivity i.e. as bad as ASN.1 DER.
>
> Conversely, I can't think of a system I use regularly whose language
> doesn't either use set-of-keys, need set-of-keys, or use
> repeat-implies-array, all of which imply context-sensitivity, I
> think. On the other hand, removing key/value
> from a protocol would make it comparatively easy to reduce to a
> regular language.
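
To make the quoted duplicate-key ambiguity concrete before I respond,
here is a minimal Python sketch (my illustration; Python's json module
just happens to make the parser's choice visible):

    import json

    doc = '{"foo": "bar", "foo": "baz"}'

    # Python's json module silently keeps the last occurrence of a
    # repeated key, so the duplicate simply vanishes:
    print(json.loads(doc))    # {'foo': 'baz'}

    # But a parser may just as legitimately keep the first value, or
    # collect every value into a list (repeat-implies-array):
    def repeat_implies_array(pairs):
        out = {}
        for key, value in pairs:
            if key in out:
                prior = out[key] if isinstance(out[key], list) else [out[key]]
                out[key] = prior + [value]
            else:
                out[key] = value
        return out

    print(json.loads(doc, object_pairs_hook=repeat_implies_array))
    # {'foo': ['bar', 'baz']}

Same bytes, different meanings depending on the receiver, which is
exactly how authenticating bytes and authenticating meaning come apart.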

I am not sure ASN.1 is the best choice for illustrating what they are
about. ASN.1 is, after all, a formalized grammar and should be the
sort of tool that belongs in the solution set they are looking for. It
is the fact that ASN.1 is horribly designed, and has got far worse
over time, that disqualifies it, rather than its being an ad-hoc
approach.

For the protocols I have developed with Protogen, every one is defined
using a formal grammar and the parser code is generated from that
grammar. Most of the applications use JSON because it is the simplest
of the commonly used encodings and none of the others offers an
advantage.
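
I can't paste a Protogen grammar here, but the shape of what the
generator emits is roughly the following (a hypothetical Python sketch
with an invented message type; the real generated code is of course
not Python). Unknown keys, missing keys and wrong types are all hard
errors, so the handler never sees a malformed message:

    import json

    # Invented example message type: every field is required and
    # typed, and nothing else is permitted.
    TRANSFER_REQUEST = {"Account": str, "Amount": int}

    def parse_transfer_request(data: bytes) -> dict:
        obj = json.loads(data)
        if not isinstance(obj, dict) or set(obj) != set(TRANSFER_REQUEST):
            raise ValueError("malformed TransferRequest")
        for field, ftype in TRANSFER_REQUEST.items():
            if not isinstance(obj[field], ftype):
                raise ValueError("field %s: wrong type" % field)
        return obj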

[I do have an ASN.1 encoder in the system but it seems to have a bug
in it; I can't get other programs to accept my certs right now. Or
maybe I am just doing the byte ordering wrong on the signature...]


For each client/server interface there is a rigorous separation
between the message and the API. First the entire message is read in
and authenticated. Only if the message verifies is it passed to the
parser. And only messages that parse correctly are passed to the
handler.
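
In Python terms the ordering looks something like this (a sketch,
assuming an HMAC over the raw message bytes; the real system's framing
will differ):

    import hashlib
    import hmac
    import json

    def receive(raw: bytes, mac: bytes, key: bytes):
        # 1. Authenticate the entire message, as raw bytes.
        expected = hmac.new(key, raw, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, mac):
            raise ValueError("bad MAC: message never reaches the parser")
        # 2. Only an authenticated message is parsed.
        message = json.loads(raw)   # stand-in for the generated parser
        # 3. Only a correctly parsed message reaches the handler.
        return handle(message)

    def handle(message):
        pass                        # the actual API call goes here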

Rather oddly, there is quite a lot of resistance to this approach.
People don't seem to be able to grok that a network application is
really just a remote procedure call (though possibly with the server
making a call to the client rather than the reverse).


One of the reasons I want something more than TLS authentication is
that there is no guarantee that the TLS framing of authentication will
match up with the application's authentication requirements.

So for example, imagine we have a TLS session using RC4 encryption
that plops a MAC value onto the pipe every 64KB, and the target
environment has some sort of SSL accelerator in place. Unless the
application developer reaches into the TLS stack to assure themselves
that their packets really are going to be authenticated properly, it
is quite possible that the receiving end pulls a 2048 byte transaction
'fire the nuclear missiles' out of the TLS stream and acts on it
before the MAC value is found to be bad and the socket closes. And
since we are using a stream cipher, chosen plaintext makes it easy to
recover the keystream and reuse it to splice in ciphertext of the
attacker's choosing.
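
A caricature of the receiving side (invented code; no real TLS API
reads quite like this) shows why the record framing and the
transaction framing come apart:

    MAC_INTERVAL = 64 * 1024    # stream MAC only every 64KB

    def naive_receiver(stream, dispatch):
        # 'stream' yields already-decrypted application bytes.
        while True:
            txn = stream.read(2048)     # one 2048-byte transaction
            if not txn:
                break
            dispatch(txn)   # acted on immediately -- the MAC covering
                            # these bytes may be up to 62KB further
                            # down the stream and may yet prove bad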


Requiring every transaction to carry a separate application layer
authentication blob protects against any nonsense at the transport
layer, injection attacks included. There is no possibility of
injecting commands when every command requires a separate
authentication value.
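
In the JSON world that can be as simple as wrapping each command (a
sketch assuming HMAC-SHA256 and an invented wire format; a real design
would also bind a sequence number or nonce to stop replay):

    import hashlib
    import hmac
    import json

    def seal(command: dict, key: bytes) -> bytes:
        payload = json.dumps(command, sort_keys=True)
        tag = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
        return json.dumps({"payload": payload, "auth": tag}).encode()

    def open_sealed(blob: bytes, key: bytes) -> dict:
        outer = json.loads(blob)
        payload = outer["payload"]
        expected = hmac.new(key, payload.encode(),
                            hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, outer["auth"]):
            raise ValueError("command rejected: bad authentication blob")
        return json.loads(payload)

A spliced-in command fails its own check regardless of what games were
played with the transport.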

Using a parser generator tool to parse e.g. DNS protocol messages is
rather easier than writing the code without a tool. It also means that
a rigorous approach to buffer allocation can be taken, and that we can
abort on data streams with errors like length-A [ length-B [ Data ] ]
where length-B > length-A.
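
For instance (a minimal Python sketch of the bounds check, not DNS
itself):

    import struct

    def read_nested(buf: bytes) -> bytes:
        # Outer frame: 2-byte big-endian length-A, then length-A bytes.
        (length_a,) = struct.unpack_from(">H", buf, 0)
        outer = buf[2:2 + length_a]
        if len(outer) != length_a:
            raise ValueError("truncated: fewer bytes than length-A claims")
        # Inner frame: 2-byte length-B, which must fit inside length-A.
        (length_b,) = struct.unpack_from(">H", outer, 0)
        if length_b > length_a - 2:
            raise ValueError("length-B exceeds length-A: abort")
        return outer[2:2 + length_b]

    # read_nested(struct.pack(">HH", 4, 10) + b"xx")  raises: 10 > 2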

