[Cryptography] canonicalizing unicode strings.

John Levine johnl at iecc.com
Mon Jan 15 02:17:10 EST 2018


In article <0eecd4c2-e418-2c21-ad29-2010dc865b67 at sonic.net> you write:
>It's pretty much a given that strings which are in mixed alphabets AND
>contain characters in one of the alphabets that are homoglyphs for any
>character in another alphabet used in the same string, should never be
>allowed as URLs, identifiers, usernames, titles, product names,
>certificate identifiers, etc.
>
>It's also pretty much a given that VERY few CAs, chat boards, social
>media platforms, e-commerce sites, etc, check for and enforce any such rule.

Most popular web browsers do.  Domain names in mixed scripts get shown as punycode:

https://en.wikipedia.org/wiki/IDN_homograph_attack#Defending_against_the_attack

I realize this is nowhere near a complete solution, but it does defend against
"microsoft" spelled with a Cyrillic "o" and the like.
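
As a rough illustration of the kind of check involved, here is a minimal
Python sketch. It is not what any browser actually ships. Python's standard
library does not expose the Unicode Script property, so the sketch
approximates a character's script by the first word of its Unicode name,
which is an assumption that holds for Latin and Cyrillic but wrongly flags
legitimate combinations such as Japanese kana plus kanji. The helper names
scripts_in and display_form are made up for this example.

```python
import unicodedata


def scripts_in(label):
    """Return the approximate set of scripts used by the letters in label.

    Assumption: the first word of a letter's Unicode name ("LATIN",
    "CYRILLIC", "GREEK", ...) stands in for its script. This is a
    heuristic, not the real Unicode Script property.
    """
    found = set()
    for ch in label:
        if not ch.isalpha():
            continue  # digits and hyphens are shared across scripts
        name = unicodedata.name(ch, "")
        if name:
            found.add(name.split()[0])
    return found


def display_form(label):
    """Show a mixed-script label as punycode, anything else unchanged."""
    if len(scripts_in(label)) > 1:
        # Python's built-in "punycode" codec; "xn--" is the IDNA ACE prefix.
        return "xn--" + label.encode("punycode").decode("ascii")
    return label


if __name__ == "__main__":
    print(display_form("microsoft"))       # all Latin, shown as typed
    print(display_form("micr\u043esoft"))  # Cyrillic "o", shown as xn--...
```

A label written entirely in Cyrillic passes this check untouched, which is
one reason it is nothing like a complete solution.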

R's,
John

