[Cryptography] canonicalizing unicode strings.
John Levine
johnl at iecc.com
Mon Jan 15 02:17:10 EST 2018
In article <0eecd4c2-e418-2c21-ad29-2010dc865b67 at sonic.net> you write:
>It's pretty much a given that strings which are in mixed alphabets AND
>contain characters in one of the alphabets that are homoglyphs for any
>character in another alphabet used in the same string, should never be
>allowed as URLs, identifiers, usernames, titles, product names,
>certificate identifiers, etc.
>
>It's also pretty much a given that VERY few CAs, chat boards, social
>media platforms, e-commerce sites, etc, check for and enforce any such rule.
Most popular web browsers do. Domain names in mixed scripts get shown as punycode:
https://en.wikipedia.org/wiki/IDN_homograph_attack#Defending_against_the_attack
I realize this is nothing close to a complete solution, but it defends against
"microsoft" spelled with a Cyrillic "o" and the like.
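
For illustration only, here is a rough Python sketch of that kind of check. It is
not what any particular browser does; real browsers use the Unicode script
property plus further rules. The function names scripts_in and display_form are
made up for this example, and guessing a character's script from the first word
of its Unicode name is a simplifying assumption, since the standard library does
not expose the script property directly.

```python
import unicodedata


def scripts_in(label):
    """Approximate the set of scripts used in one domain label.

    Assumption: the first word of the Unicode character name
    (e.g. "LATIN", "CYRILLIC") stands in for the script property.
    """
    scripts = set()
    for ch in label:
        if ch.isascii() and not ch.isalpha():
            continue  # ASCII digits and hyphens are shared by all scripts
        name = unicodedata.name(ch, "")
        scripts.add(name.split(" ")[0] if name else "UNKNOWN")
    return scripts


def display_form(domain):
    """Return the Punycode form if any label mixes scripts, else the name as given."""
    labels = domain.split(".")
    if any(len(scripts_in(label)) > 1 for label in labels):
        # Python's built-in "idna" codec (IDNA 2003) yields the xn-- form.
        return domain.encode("idna").decode("ascii")
    return domain


if __name__ == "__main__":
    print(display_form("microsoft.com"))        # all Latin, shown unchanged
    print(display_form("micr\u043esoft.com"))   # Cyrillic "o", shown as xn--...
```

As noted above, this only catches mixed-script labels. A name written entirely
in one script that happens to look like a Latin name passes straight through.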
R's,
John