[Cryptography] Reading encrypted generative AI chats
Jerry Leichter
leichter at lrw.com
Sat Mar 16 09:22:56 EDT 2024
https://arstechnica.com/security/2024/03/hackers-can-read-private-ai-assistant-chats-even-though-theyre-encrypted/ describes an attack that reads - actually, guesses with good accuracy - the responses of generative AI programs, even though they are sent through a TLS connection.
The attack relies on a couple of features of these interactions:
1. All the AI's they attack return responses one token at a time, as each token is generated; and there are substantial gaps between tokens. This in turn means that the tokens are delivered as separate TLS packets.
2. The underlying distribution of tokens is far from random. Not only is it English text, but apparently even compared to English, there are predictable patterns.
Because of 1, they can extract a sequence of token lengths. Because of 2, they can train an AI to make very good guesses about the underlying token sequence.
We've seen this before! Well over a decade ago, there were demonstrated attacks against encrypted, compressed voice, based on exactly the corresponding properties. A similar attack does really well at determining which sites someone is browsing even if all of them are https sites, or even through a VPN in some cases: Pages on the Web today are constructed from many components and link to other pages on the same site. The lengths of page and page segment downloads are highly description of the actual sites being browsed.
The attack is described (probably even by the authors) as a side-channel attack. This is *wrong*, and it's exactly the kind of thinking that leads us to overlook these attacks repeatedly. In the ideal world of math, an encryption algorithm simply maps strings to strings. That's probably a reasonable description for encrypted data at rest, but it's dead wrong for what's probably the majority of encryption usage today: Encryption of streams of data in segments delivered over time. The lengths of those segments aren't "side channel" information - they are part of the data being transmitted. Those lengths, in many protocols, may even appear explicitly inside the data being transmitted.
Attacks based on characterizing sequences of lengths should be seen as akin to dictionary attacks. Just as we expect cryptosystems today to resist dictionary attacks (by adding randomness to the encryption thus avoiding encrypting the same data repeatedly), we should expect them to resist attacks against segment lengths. There really only seems to be one way to do this: Add noise data to the encrypted channel thus adding enough noise to the recovered segment lengths to eliminate expected attacks; or, in the limit, always keep the channel busy so as to hide segment boundaries (the same technique used to defeat traffic analysis). We seem to resist doing anything in this direction because it increases the amount of data that needs to be transmitted, and despite the incredible increases in bandwidth over the last couple of decades, we still worry about efficient encodings (because we manage to add so much "useful" stuff to everything we send). It's time to realize that there is an inherent tradeoff between security and efficiency in the use of the channel and start making appropriate tradeoffs.
-- Jerry
More information about the cryptography
mailing list