[Cryptography] DIME // Pending Questions // Seeking Your Input

Jerry Leichter leichter at lrw.com
Sat Feb 28 17:47:26 EST 2015


On Feb 28, 2015, at 2:33 PM, Hasan Diwan <hasan.diwan at gmail.com> wrote:
> 2. The current RFCs dictate that domain names are handled case insensitively, while mailbox names (what goes in front of the @ symbol), should be considered case sensitive. This behavior stems from the fact that most early email systems ran atop Unix, which has a case sensitive file system. Thus a capital letter in the mailbox would result in the email server saving a message in a different file. These days email systems generally operate case insensitively on mailbox names because of the obvious implications associated with allowing email addresses which are almost identical. The question is: should DIME mandate mailbox names be compared case insensitively? Keep in mind that if names are considered case sensitive, a capitalized letter could result in a server returning a different signet (with different encryption keys)!
> 
> How would this work for languages like Arabic or Chinese, which have no notion of upper and lower case?
There are notions of normalization to remove information that is not semantically meaningful.  These are highly language-dependent, and only someone who really knows the language should attempt to define such things.  Even Western languages have traps for the unwary who try to generalize from their own native language - e.g., in German, upper case SS (two characters) corresponds to the single lower case character ß (eszett), which looks similar to a Greek beta.
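To make the German trap concrete, here is a minimal Python sketch using only the standard library. The helper name fold_mailbox is hypothetical and is not anything DIME or an RFC specifies; it only illustrates why "compare case insensitively" is not the same thing as "lowercase both sides".

```python
import unicodedata


def fold_mailbox(local_part: str) -> str:
    """Hypothetical helper: produce a comparison key for a mailbox name.

    Assumption for illustration only: NFKC normalization followed by
    Unicode case folding. A real standard might choose different rules.
    """
    normalized = unicodedata.normalize("NFKC", local_part)
    # casefold() is more aggressive than lower(); it maps "ß" to "ss".
    folded = normalized.casefold()
    # Normalize again, since folding can produce non-normalized sequences.
    return unicodedata.normalize("NFKC", folded)


# Upper-casing the single character yields two characters...
print("ß".upper())
# ...and lower-casing the result does not give the original back.
print("ß".upper().lower() == "ß")
# Plain lower() treats these two spellings as different mailboxes.
print("straße".lower() == "STRASSE".lower())
# Case folding treats them as the same mailbox.
print(fold_mailbox("straße") == fold_mailbox("STRASSE"))
```

Whether "straße" and "strasse" *should* name the same mailbox is exactly the kind of language-specific decision mentioned above; the sketch only shows that the choice of operation changes the answer.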

I have no idea what would be appropriate for Chinese or Arabic.  If I remember right, Arabic has no case distinction, but many letters come in positional variants: one used at the beginning of words, one in the middle, one at the end, plus a standalone (isolated) form.  But I don't know if there would be any reason to "normalize" these to a single form in name matching.  Perhaps there *is* no universal normalization for some languages; in that case, you have to leave the name alone.  (There are additional levels of complexity when a character set is used by more than one language, and the languages have different rules.  This occurs for sorting of names in the related but distinct Nordic languages.)
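As far as Unicode is concerned, Arabic text is normally stored as base letters, with the positional shapes chosen at rendering time. Separate code points for the isolated, final, initial and medial shapes do exist as compatibility characters, and NFKC normalization maps them back to the base letter. The sketch below checks this for one letter (BEH) using Python's standard unicodedata module; the helper name to_base_letters is hypothetical, and whether NFKC is the right rule for mailbox matching is an assumption, not a recommendation.

```python
import unicodedata

BEH = "\u0628"  # ARABIC LETTER BEH, the base letter

# Compatibility code points for the positional shapes of BEH
# (Arabic Presentation Forms-B block).
PRESENTATION_FORMS = {
    "isolated": "\uFE8F",
    "final": "\uFE90",
    "initial": "\uFE91",
    "medial": "\uFE92",
}


def to_base_letters(text: str) -> str:
    """Hypothetical helper: fold compatibility forms to base letters via NFKC."""
    return unicodedata.normalize("NFKC", text)


for position, ch in PRESENTATION_FORMS.items():
    # Print the official character name and whether NFKC yields the base letter.
    print(position, unicodedata.name(ch), to_base_letters(ch) == BEH)
```

Text that already uses base letters passes through unchanged, so for such input there is nothing position-specific left to normalize.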

I'm sure all of this has been beaten to death in the internationalization community.  A mail system standard shouldn't attempt to repeat the work - it should simply defer to existing standards on how to handle names in different languages and cultures.
                                                        -- Jerry
