[Cryptography] DIME // Pending Questions // Seeking Your Input

Tue Mar 3 13:53:30 EST 2015

On Fri, Feb 27, 2015 at 10:08:17AM -0600, Ladar Levison wrote:

> 2. The current RFCs dictate that domain names are handled case
> insensitively, while mailbox names (what goes in front of the @ symbol),
> should be considered case sensitive. 

Yes, the traditional ASCII LDH domain parts of email addresses are
case-insensitive.  With IDNA 2008 and EAI, U-labels are simply
required to be lower case.  Punycode-encoded A-labels are then
also lower case, and case-folding A-labels that happen to contain
an upper-case ASCII letter is fine (it should generally be unnecessary
if the A-labels are generated by fully compliant software and not
subsequently mangled).

How and whether a user-agent chooses to convert a non-ASCII domain
name provided by a user to a (possibly corresponding) lower-case
U-label is left to the user-agent, with strong warnings against
naive automatic case folding by software that does not know
the language context.

> These days email
> systems generally operate case insensitively on mailbox names because of
> the obvious implications associated with allowing email addresses which
> are almost identical. 

Case-insensitive local parts are a legacy of gatewaying email
to/from mainframe computers (BITNET) where all addresses were
upper-case and email lists kept in mainframe databases where they
were generally also upper case.  This was then further entrenched
by LDAP schemas that define case-insensitive matching rules for
email addresses (and often define their encoding as IA5STRING
complicating the transition to UTF-8).

SMTP however has never *required* that local parts be case-insensitive,
and instead requires relays to preserve the case of localparts,
leaving the option of case folding to the destination system.

> The question is: */should DIME mandate mailbox
> names be compared case insensitively?/* Keep in mind that if names are
> considered case sensitive, a capitalized letter could result in a server
> returning a different signet (with different encryption keys)!

If you choose to define ASCII localparts as case-insensitive in
DIME, you'll probably not run into too many problems.  If you
attempt to extend this to UTF-8, things get much more complicated,
and you're not supposed to do that.
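
To make the ASCII-only rule concrete, here is a rough Python sketch.
The function name canonical_address is made up for illustration and is
not part of DIME.  It assumes the address contains an "@", folds the
localpart only when it is pure ASCII, and folds only the ASCII letters
of the domain, leaving any non-ASCII characters alone:

```python
def canonical_address(addr: str) -> str:
    """Hypothetical helper: fold case only where it is safe to do so."""
    local, sep, domain = addr.rpartition("@")
    if not sep:
        raise ValueError("address has no '@'")
    # Fold the localpart only when it is entirely ASCII; a localpart
    # with any non-ASCII character is compared exactly as given.
    if local.isascii():
        local = local.lower()
    # Domains: fold ASCII letters only, never touch non-ASCII characters.
    domain = "".join(c.lower() if c.isascii() else c for c in domain)
    return local + sep + domain
```

With this, "Viktor@Example.ORG" and "viktor@example.org" map to the
same key, while "Ärger@example.org" and "ärger@example.org" stay
distinct.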

> 3. Somewhat related to the previous question, */should support for
> international domain names and mailboxes (using UTF-8) be mandatory?/*

If you're building a brand new protocol suite, it may as well be
UTF-8 by definition.  In practice most users will want ASCII addresses,
as the usability of non-ASCII addresses will be limited to just systems
that support DIME or EAI (Postfix 3.0 has usable if not yet mature EAI
support).

> Or should UTF-8 address support be optional? Systems without support for
> UTF-8 addresses would be forced to use the ASCII encoding scheme defined
> in the current email RFCs.

That works for domains, but there is no standard ASCII encoding of
non-ASCII localparts.  (There probably should have been, and perhaps
will yet be such an encoding).
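
The asymmetry is easy to see in code.  The sketch below (ascii_form is
a hypothetical name) uses Python's built-in "idna" codec, which
implements the older IDNA 2003 rules rather than IDNA 2008, so treat
it as illustrative only.  The domain has an ASCII form; a non-ASCII
localpart does not, so all the function can do is refuse:

```python
def ascii_form(addr: str) -> str:
    """Hypothetical helper: ASCII fallback form of an address, if one exists."""
    local, sep, domain = addr.rpartition("@")
    if not sep:
        raise ValueError("address has no '@'")
    if not local.isascii():
        # No standard ASCII encoding exists for non-ASCII localparts.
        raise ValueError("non-ASCII localpart has no ASCII form")
    # The stdlib codec converts each non-ASCII label to its A-label.
    return local + sep + domain.encode("idna").decode("ascii")
```

For example, "info@bücher.example" becomes
"info@xn--bcher-kva.example", while "jörg@example.org" is rejected.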

> 12. Should a DMAP client be able to find out the content-type,
> and/or any of the other meta information found in the MIME headers of a
> body-part/chunk without having to download the entire chunk? If the
> answer is yes, does the chunk's content-type need to be encrypted? In
> other words, do we consider servers knowing a chunk is video, versus
> rich text, versus plain text a serious privacy leak?/* Note that if the
> answer is 'yes' to both questions, the solutions will probably add fair
> bit of additional complexity to the message format.

If this is to be useful on mobile devices that might want to download
message bodies sans attachments, then probably yes, and in that case
the message-part metadata should indeed be encrypted.

One way to do that is to use a modified format in which the message
starts with a "MIME skeleton" (all the headers for the body and
attachments) with the content of all leaf parts replaced by an
offset+length from a stream of content blobs appended to the message.
The boundaries between individual content parts should not be directly
apparent.  You could chunk the content encoding to allow integrity
verification of incompletely downloaded parts.
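
As a rough illustration of the chunking idea, assuming fixed-size
chunks and SHA-256 digests that travel inside the encrypted metadata
(the chunk size, the function names and the choice of a plain hash are
all my assumptions here, not a proposed format):

```python
import hashlib

CHUNK_SIZE = 4096  # hypothetical chunk size, chosen only for the example


def chunk_digests(blob: bytes, size: int = CHUNK_SIZE) -> list:
    """Return one SHA-256 hex digest per fixed-size chunk of blob."""
    return [hashlib.sha256(blob[i:i + size]).hexdigest()
            for i in range(0, len(blob), size)]


def verify_chunk(index: int, data: bytes, digests: list) -> bool:
    """Check a single downloaded chunk against its expected digest."""
    return hashlib.sha256(data).hexdigest() == digests[index]
```

A client that has fetched only the first chunk of a part can verify
that chunk on its own, without waiting for the rest of the part.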

That way the client can retrieve the complete MIME skeleton with
placeholders for the remote message parts, then download whichever
parts are small enough or are explicitly requested.
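
A minimal Python sketch of that layout, with encryption left out
entirely and JSON standing in for whatever the real skeleton encoding
would be.  The names pack, small_parts and fetch_part, and the
read_range callback, are invented for the example:

```python
import json


def pack(parts):
    """parts: list of (headers dict, content bytes).

    Returns (skeleton_bytes, blob_stream).  Each leaf's content is
    replaced in the skeleton by an offset+length into the stream.
    """
    skeleton, stream = [], bytearray()
    for headers, content in parts:
        skeleton.append({"headers": headers,
                         "offset": len(stream),
                         "length": len(content)})
        stream += content
    return json.dumps(skeleton).encode("utf-8"), bytes(stream)


def small_parts(skeleton_bytes, limit):
    """Indexes of the parts whose content is at most limit bytes."""
    entries = json.loads(skeleton_bytes)
    return [i for i, e in enumerate(entries) if e["length"] <= limit]


def fetch_part(skeleton_bytes, index, read_range):
    """Fetch one part via read_range(offset, length), e.g. a ranged download."""
    entry = json.loads(skeleton_bytes)[index]
    return read_range(entry["offset"], entry["length"])
```

A mobile client would call small_parts() on the skeleton to decide
what to pull right away, and fetch_part() later for anything the user
asks for explicitly.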

The MIME skeleton would be encrypted as a single object.  One could
encrypt just the primary headers separately from all other MIME
headers if there is a concern about edge cases in which the MIME
metadata is excessively large.

-- 
	Viktor.