[Cryptography] DIME // Pending Questions // Seeking Your Input

ianG iang at iang.org
Fri Feb 27 17:57:07 EST 2015


On 27/02/2015 16:08 pm, Ladar Levison wrote:
> Hi,
>
> I’m about to spend a significant amount of time working on the DIME
> specifications, with a focus towards writing the sections missing from
> the current version, adding some of the low-level details missing from
> the current draft, and incorporating the feedback the community has
> provided. My goal is to publish a revised draft in conjunction with the
> upcoming IETF meeting (March 22nd). To that end, I wanted to solicit
> input from members of this list who haven't already sent me feedback, so
> I can incorporate it and, more importantly, seek your input on some of
> the questions I've included below. Feel free to reply to me on the mailing
> list, or if you'd prefer, in the thread I've set up on the DIME message
> board.
>
> The December draft can be found here:
>
> https://darkmail.info/downloads/dark-internet-mail-environment-december-2014.pdf
>
> The forum topic I set up to discuss these questions is here:
>
> https://darkmail.info/forums/viewtopic.php?f=4&t=106


Nice write-up of questions.  For my money, responses below.  Note that I 
haven't had time to read the draft.


> Thanks,
> Ladar Levison
>
>
> *Protocol Questions*
>
> 1. While I’ve identified the majority of the functionality associated
> with the access protocol (DMAP), my attempts to document the specifics
> keep getting sidelined by a single question: */should DMAP be a line
> based protocol, like IMAP (and POP, and SMTP), or should it be designed
> as a JSON-RPC protocol, like the Magma camelface, or JMAP?/* See:
>
> https://github.com/lavabit/magma.classic/raw/master/docs/magma.web.api.pdf


For my money, binary.  Line-based or JSON are good if you are working 
with lots of random implementations of low quality, but for security 
work, it is easier to work in binary.  Be precise about sodding everything.


> 2. The current RFCs dictate that domain names are handled case
> insensitively, while mailbox names (what goes in front of the @ symbol),
> should be considered case sensitive. This behavior stems from the fact
> that most early email systems ran atop Unix, which has a case sensitive
> file system. Thus a capital letter in the mailbox would result in the
> email server saving a message in a different file. These days email
> systems generally operate case insensitively on mailbox names because of
> the obvious implications associated with allowing email addresses which
> are almost identical. The question is: */should DIME mandate mailbox
> names be compared case insensitively?/* Keep in mind that if names are
> considered case sensitive, a capitalized letter could result in a server
> returning a different signet (with different encryption keys)!


case insensitive.
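To make the comparison rule concrete, a minimal sketch in Python (the helper 
name and the use of a lowered string as the lookup key are my own 
illustration, not anything from the draft):

```python
def canonical_address(address: str) -> str:
    """Normalize an email address for comparison/lookup (sketch)."""
    mailbox, _, domain = address.rpartition("@")
    # Domains are already case-insensitive per the existing RFCs;
    # treating the mailbox the same way avoids two near-identical
    # addresses resolving to different signets.
    return mailbox.lower() + "@" + domain.lower()
```

Comparing (or looking up signets by) the canonical form means 
"Alice@example.com" and "alice@example.com" can never return different keys.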

> 3. Somewhat related to the previous question, */should support for
> international domain names and mailboxes (using UTF-8) be mandatory?/*
> Or should UTF-8 address support be optional? Systems without support for
> UTF-8 addresses would be forced to use the ASCII encoding scheme defined
> in the current email RFCs. Sadly, I am not an expert on
> internationalization, and the prospect of normalizing UTF-8 mailboxes
> and domains for comparison operations could be complicated, error prone,
> and a potential source for security problems. Do the benefits outweigh
> the drawbacks? (See RFCs 5890 and 6530 for a discussion of international
> mailboxes and domain names.)


I'd be comfortable accepting UTF-8 etc. if there were a canonical 
identifier defined (e.g. a hash) that made visual substitutions implausible.

I think the days of assuming ASCII are over.  Half the world out there 
wants something impenetrable to us.
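One way to get such a canonical identifier, sketched in Python: 
Unicode-normalize with NFKC (which folds many visually-confusable 
compatibility forms, e.g. fullwidth letters), case-fold, then hash. 
Everything here, including the choice of SHA-512, is illustrative, not a 
proposal from the draft:

```python
import hashlib
import unicodedata

def canonical_id(address: str) -> str:
    """Hypothetical canonical identifier for a UTF-8 address.

    Comparisons and signet lookups would use the digest, never
    the display string, so two visually-similar spellings either
    normalize to the same digest or compare as plainly different.
    """
    folded = unicodedata.normalize("NFKC", address).casefold()
    return hashlib.sha512(folded.encode("utf-8")).hexdigest()
```

NFKC alone does not catch cross-script confusables (Cyrillic "а" vs Latin 
"a"), so a real spec would still need the IDNA-style restrictions from RFCs 
5890/6530 on top of this.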


> *Cryptographic Questions*
>
> 4. */How should ECC public keys be encoded?/* In the past I’ve used the
> compressed form defined by the X9.62 standard. OpenPGP encodes the keys
> as uncompressed multi-precision integers (MPI). Both are big endian
> formats and rather long, leading some to suggest a more compact
> alternative be used. The Ed25519 paper, for example, defines its own
> compressed little-endian format to represent public keys (see endianness
> question above). Anyone care to make a suggestion?


Pass?  (Or, steal one of them, wrap it up in a packet with a leading 
version=1 number.  That's what I'd do.)
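A sketch of that "versioned packet" idea in Python.  The one-byte version 
and one-byte length fields are my own illustration; a real format would 
have to pin these down (a one-byte length caps keys at 255 bytes, which is 
fine for the curves under discussion):

```python
def wrap_public_key(raw_key: bytes, version: int = 1) -> bytes:
    """Wrap an encoded ECC public key in a packet with a leading
    version byte, so the key encoding can change later without
    ambiguity.  Field layout is illustrative only."""
    if not 0 <= version <= 255:
        raise ValueError("version must fit in one byte")
    if len(raw_key) > 255:
        raise ValueError("key too long for one-byte length field")
    return bytes([version, len(raw_key)]) + raw_key

def unwrap_public_key(packet: bytes) -> tuple[int, bytes]:
    """Return (version, raw_key); raise on truncation."""
    version, length = packet[0], packet[1]
    key = packet[2:2 + length]
    if len(key) != length:
        raise ValueError("truncated key packet")
    return version, key
```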


> 5. */What key derivation function (KDF) should be applied when deriving
> a symmetric key from a Diffie-Hellman key exchange? It would be nice if
> the same KDF could also be applied to user passwords. /*Note, the KDF
> must provide at least 48 bytes, and I'm looking for feedback on which of
> the potential alternatives we should use. The following have all been
> suggested to me: bcrypt, scrypt, Makwa, yescrypt. Some KDFs allow us
> to pick from different hash functions as well, with SHA-512, Skein
> and Blake being suggested. Currently the KDF is SHA-512.

I used SHA-512 last time I had this question.  In the future I'd 
probably use Keccak, as sponge is good.
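For what it's worth, scrypt (one of the candidates listed above) already 
ships in Python's hashlib and can emit the 48 bytes required.  A sketch; 
the cost parameters are purely illustrative, not a recommendation:

```python
import hashlib

def derive_key(secret: bytes, salt: bytes, dklen: int = 48) -> bytes:
    """Derive dklen bytes from a shared DH secret or a password
    using scrypt.  n/r/p below (16 MiB of memory) are placeholder
    costs; a spec would have to fix or negotiate them."""
    return hashlib.scrypt(secret, salt=salt, n=2**14, r=8, p=1,
                          dklen=dklen, maxmem=2**26)
```

The same call serves both cases in the question: the DH shared secret or 
the user password goes in as `secret`, and the salt keeps equal inputs from 
deriving equal keys.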


> 6. */What “special” considerations should be applied when encrypting
> private keys, knowing they will eventually be stored on an untrusted
> server? Should an AEAD cipher be used? /*Presumably the KDF discussed in
> the above could be used again to derive the symmetric keys protecting
> this encrypted data. The question is: beyond using a salt, what else
> should be done?


> 7. The current draft of the DIME spec says the E-521 curve will be used
> for “alternate” asymmetric encryption keys,


Don't do it, man!

The notion that an "alternate" is needed is some magical thinking from 
the golden age of irresponsibility.  If you need an alternate, it means 
(a) you can predict where the break is going to happen, and (b) didn't 
fix it!?

Also, if you have an alternate, you must have a rollout plan, and nobody 
has that, especially at RFC level.  The experience we've got when 
anything breaks in these protocols is that we're in a mess, and all 
these spare protocols only helped once, and even then only by walking 
backwards to a deprecated alg.

> but doesn’t define an alternate
> symmetric cipher to go along with it.


My preferred way to handle the need for a protocol break is to define 
++version.  As you roll out the current version, start working on the 
new one.  Have it in advanced readiness and tune it over time.


> */Should the alternate cipher
> suite incorporate a different symmetric algorithm? How about a different
> choice for signing?/*


An alternate everything.  If you can predict what is not changing in the 
alternate, then you're predicting what is going to break.  If you could 
do that, you could also fix it.  The only logical conclusion is that 
everything should change.  Our knowledge has a time component, and we're 
not yet at the point where we can predict the future.


> ChaCha20+Poly1305 has been suggested as the
> alternate symmetric cipher.

(Yes, that is good, it is the current fave indie AE suite, and there is 
a good choice of implementations to crib by now.  You might also want to 
look at the new Keccak modes coming from NIST this summer, rumour has it 
that it will include an AE mode.)


> 8. The current draft stipulates TLS certificates are verified using
> Ed25519 signatures created with an organization’s POK. */Is this
> significantly more secure than providing a SHA-512 fingerprint, or just
> unnecessary complexity?/*


Providing a SHA-512 fingerprint is a solution that assumes people talk 
to each other out of band.  Super.  Using key signing to verify a 
supplied cert assumes there is a hierarchy and someone at the top tells 
you what to do.  Can also work.  The answer is more about the business 
than the crypto.  There is a preference for the former because it is 
simpler in tech, and a preference for the latter because it is complex 
and requires lots of jobs.


> *Data Formats*
>
> 9. Should the data formats encode binary values in little endian, or big
> endian form? While most network protocols employ binary data formats
> using the big endian form, a number of more recent upstarts (dare I say
> rebels) have switched to little endian. The vast majority of computers
> today are natively little endian (or at least bi-endian). */Should we
> honor tradition, and continue using big endian, thus forcing a future
> generation of programmers to convert their integers before evaluating
> them. Or should we acknowledge the world has changed, and use little
> endian?/*


This is the weirdest question.  A security system should be in total 
control of its data, so it should never need to even ask whether the CPU 
is big/little endian.  It should read in the data, byte by byte, and 
construct the number needed at high level using logic.

(A more practical answer might be to just use network order.  I have no 
idea which that is...)
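(For the record, network order is big-endian.)  The byte-by-byte 
construction described above looks like this in Python; the host CPU's 
endianness never enters into it:

```python
def read_u32_big(buf: bytes, offset: int = 0) -> int:
    """Construct a 32-bit value from network-order (big-endian)
    bytes using shifts and ors, so the result is identical on any
    host, big-endian or little-endian."""
    b = buf[offset:offset + 4]
    if len(b) != 4:
        raise ValueError("short read")
    return (b[0] << 24) | (b[1] << 16) | (b[2] << 8) | b[3]
```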


> *Signet Questions*
>
> 10. In the current draft, the signet informational fields are divided
> into 2 ranges, with each range using either a 1-byte or 2-byte length
> variable. The size of the length variable provides a technical
> limitation on the amount of data a field could potentially hold. For
> example, the “Postal-Code” field has a limit of 255 bytes, which far
> exceeds any possible legitimate value. The fixed ranges were created so
> parsers would know the size of the length variable when they encountered
> an unrecognized field type, and skip over it. It has been argued that we
> simplify the parser by using a 2-byte length variable for all fields.
> The specification would then provide length limits which a parser would
> have to enforce by trimming any field value over the defined limit.
> */Which strategy is better?/*


This is what I do:  A number is a number of 7-bit encoded bytes. If the 
high bit is zero, you're on the last one.  If the high bit is set, there 
is another byte following.

This is simple.  It works.  There is no need for any other number type.

(OK, that's not quite true.  If there is a need for a negative, a float, 
or a bignum, you need something a bit special.  But that's what objects 
are for...  In practice, 99% of numbers are low positive numbers.)
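The 7-bit continuation scheme described above, sketched in Python.  A spec 
would still need to pin down the group order; this sketch emits the 
most-significant 7-bit group first:

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer in 7-bit groups: high bit set
    means another byte follows, high bit clear means last byte."""
    if n < 0:
        raise ValueError("only non-negative integers")
    groups = []
    while True:
        groups.append(n & 0x7F)
        n >>= 7
        if n == 0:
            break
    groups.reverse()  # most-significant group first
    return bytes(0x80 | g for g in groups[:-1]) + bytes([groups[-1]])

def decode_varint(buf: bytes) -> tuple[int, int]:
    """Return (value, bytes consumed); raise on truncation."""
    value = 0
    for i, b in enumerate(buf):
        value = (value << 7) | (b & 0x7F)
        if not b & 0x80:
            return value, i + 1
    raise ValueError("truncated varint")
```

Small lengths cost one byte, anything up to 16383 costs two, and a parser 
can always skip an unrecognized field without knowing its type in advance.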


> 11. */Should organizational signets include a self-signature following
> the cryptographic fields, like user signets, so they can be split?/*
> This change would allow space constrained clients to avoid storing
> information they don’t need/want about organizational domains, while
> retaining the signet in a form that could still be cryptographically
> validated against a management record.
>
>
> *Message Format Questions*
>
> 12. This question goes toward the user experience: when accessing
> displayable content, and binary attachments over a slow network
> connection, */should a DMAP client be able to find out the content-type,
> and/or any of the other meta information found in the MIME headers of a
> body-part/chunk without having to download the entire chunk? If the
> answer is yes, does the chunk’s content-type need to be encrypted? In
> other words, do we consider servers knowing a chunk is video, versus
> rich text, versus plain text a serious privacy leak?/* Note that if the
> answer is “yes” to both questions, the solutions will probably add a
> fair bit of additional complexity to the message format.


Yes (I guess) and yes (absolutely).  The notion that you're leaking 
metadata is right at the heart of security.



iang

