[Cryptography] Using AI to identify state secrets.

Henry Baker hbaker1 at pipeline.com
Tue Jan 24 09:44:46 EST 2017


At 09:42 AM 11/2/2016, Ray Dillinger wrote:
>From: Ray Dillinger <bear at sonic.net>
>
>So some researchers picked over recent dumps of unclassified and secret
>information with a neural network, training it to identify features
>more likely to occur in documents marked secret.  This turns out to be
>about 90% predictable, with a few caveats about some categories being
>far more predictable than others.
>
>https://arxiv.org/abs/1611.00356
>
>Interesting article.
>
>Obviously, this could also be used to pick over unclassified information
>like newspaper articles and identify things ("false positives") that
>some state actor *ought* to have made secret or might have *preferred*
>to have made secret but for some reason didn't or couldn't.  IMO that
>could turn it into a really productive intelligence asset.

It's probably already *far worse* than this article would indicate.

The US govt simply can't keep up with classification/declassification
anymore, and I suspect that it's *already* using "AI" -- perhaps even
Google/Siri/Alexa technology -- for such tasks.

Remember in the 1920s, when the trends projected that by the 1950s
100% of the workforce would be telephone operators?  Well, if the
govt actually did what it was supposed to do, 100% of govt
employees *today* would be reading & declassifying documents.

But the risks should be obvious to those on this list: massive
leakage of information & secrets.  We have a huge corpus of plaintext
to use to "query" an AI (neural net or other) bot, and using enough
*unclassified* probes, we can isolate words, phrases, etc., that the
AI bot considers "secret".  Assuming that the AI bot has been trained
on highly classified info, it should be relatively easy to probe the
bot until it gives up most of its secrets.
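
A minimal sketch of the probing idea above, under loud assumptions: the
"bot" is a black box that returns only a score for how classified a text
looks, and we may query it as often as we like.  The oracle below
(secret_score, with its hidden word weights) is a purely hypothetical
stand-in invented for illustration, not any real system; the probe simply
deletes one word at a time and records how far the score drops.

```python
import math

# HYPOTHETICAL black box: hidden weights the attacker is assumed not to see.
_HIDDEN_WEIGHTS = {"reactor": 2.0, "embassy": 1.5, "cable": 0.5, "weather": -1.0}


def secret_score(text):
    """Toy oracle: returns a 0..1 score for how 'classified' the text looks."""
    z = sum(_HIDDEN_WEIGHTS.get(w, 0.0) for w in text.lower().split()) - 1.0
    return 1.0 / (1.0 + math.exp(-z))


def probe_words(oracle, probes):
    """Leave-one-out probing: average score drop when each word is removed."""
    totals, counts = {}, {}
    for text in probes:
        words = text.lower().split()
        base = oracle(" ".join(words))
        for i, w in enumerate(words):
            ablated = " ".join(words[:i] + words[i + 1:])
            drop = base - oracle(ablated)
            totals[w] = totals.get(w, 0.0) + drop
            counts[w] = counts.get(w, 0) + 1
    # Largest average drop first: the words the oracle leans on most.
    return sorted(((totals[w] / counts[w], w) for w in totals), reverse=True)


# Unclassified probe sentences (made up for this example).
probes = [
    "the embassy sent a cable about the weather",
    "the reactor near the embassy was inspected",
    "weather delayed the reactor inspection",
]

ranking = probe_words(secret_score, probes)
top_words = [w for _, w in ranking[:2]]
print(top_words)
```

Note what this does and doesn't show: the probe recovers which words the
oracle treats as sensitive, using nothing but unclassified queries.  A
real classifier would be far less linear than this toy, so many more
probes (and phrase-level rather than word-level deletions) would be needed.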

The conversation with the AI bot would sound very much like the
conversation Senator Ron Wyden and Representative Justin Amash had
with our recent DNI, but it could be automated to work at considerably
higher speed.

Now, it's not out of the question that Google/Siri/etc. may *already*
be used for this purpose, so it may be time for a researcher to start
querying Google & Siri to see what they know.  Perhaps Russian &
Chinese researchers are *already* performing such queries.

With Alexa finding itself in heavy use around the DC & San Antonio
areas, Alexa may already be a prime target of Russian & Chinese hackers.


