[Cryptography] filtering html

James A. Donald jamesd at echeque.com
Sun Oct 15 04:12:24 EDT 2017


An arbitrary and possibly hostile web page passes through a proxy or a 
server, which makes a record of it.

Is there any easy way to filter that web page, stripping out javascript 
and links to outside images and such, so that the record is guaranteed 
to display the same way, or a closely equivalent way, as the original?

Seems to me this is a job for an html compiler: you need to parse the 
page, filter the parse tree, and then regenerate a vanilla html document 
from the parse tree.  Which sounds like a great deal of work.
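
A rough sketch of the parse, filter, regenerate idea, using only the 
Python standard library.  The allowlists and the name sanitize are 
assumptions made up for illustration, not a vetted policy.  It filters 
the stream of parse events rather than building a full tree, and it 
drops all styling, so the output is at best a closely equivalent 
rendering, not a guaranteed identical one.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Hypothetical allowlists: anything not named here is dropped.
ALLOWED_TAGS = {"p", "br", "b", "i", "em", "strong", "ul", "ol", "li",
                "h1", "h2", "h3", "pre", "code", "blockquote", "a"}
VOID_TAGS = {"br"}                      # tags that never get a closing tag
DROP_WITH_CONTENT = {"script", "style"}  # drop the tag and the text inside it
ALLOWED_ATTRS = {"a": {"href"}}          # per-tag attribute allowlist
ALLOWED_SCHEMES = {"http", "https"}      # rejects javascript:, data:, etc.


def is_safe_url(value: str) -> bool:
    """Accept only absolute URLs whose scheme is on the allowlist."""
    return urlsplit(value.strip()).scheme.lower() in ALLOWED_SCHEMES


class Sanitizer(HTMLParser):
    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.out: list[str] = []
        self.skip_depth = 0  # > 0 while inside a script or style element

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return
        kept = []
        for name, value in attrs:
            # Event handlers such as onclick are never on the allowlist.
            if (name in ALLOWED_ATTRS.get(tag, ())
                    and value is not None and is_safe_url(value)):
                kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            if self.skip_depth:
                self.skip_depth -= 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS or tag in VOID_TAGS:
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Re-escape text so it cannot be reinterpreted as markup.
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def sanitize(html_text: str) -> str:
    """Regenerate html from parse events, keeping only allowlisted parts."""
    parser = Sanitizer()
    parser.feed(html_text)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.out)


if __name__ == "__main__":
    hostile = ('<p onclick="x()">Hi <b>there</b><script>alert(1)</script>'
               '<img src="http://evil.example/x.png"> '
               '<a href="javascript:alert(1)">a</a> '
               '<a href="https://example.com/">b</a></p>')
    print(sanitize(hostile))
```

The point of the allowlist is that unknown tags, attributes and URL 
schemes fail closed: the filter does not need to know every dangerous 
construct, only the small set it is willing to regenerate.  Comments, 
doctypes and images are simply never emitted.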

But there are lots of services that allow one client to generate html 
that will be seen by another client in the service's web page.  Which 
gives Bob the potential to do surprising things to Carol's subscription 
when Carol views content supplied by Bob.  So this problem, or somewhat 
similar problems, must have been solved many times before: the problem 
of rendering html incapable of doing surprising things.


