the futility of DRM (Re: DMCA Crypto Software)

Fri Apr 18 17:12:30 EDT 2003

So I think the problem is at a higher level: the traitor tracing model
is not practically usable.  I.e., to have traitor tracing, you have to
personalize the content by embedding identity in the watermark.  (The
idea is that the watermark should be hard to remove, even if the
content is readily obtainable in digital form.)

But the low value of the service (a few $ for a movie rental) limits
what can economically be spent on assuring identity.  Therefore, to
break the system, you don't even bother renting enough different
copies to overcome the traitor tracing system; you simply obtain a
movie under a fake identity, or plausibly deniably have someone
"steal" your copy or remotely compromise the machine that is playing
it.

And on a slightly different aspect of the picture: if you did assume
that the digital copy would be hard to obtain (a tamper resistant
player, such as the DVD player model without the software player
option), then you don't need watermarks.  All you need is a signed
statement of the movie renter's identity attached to the content, and
for players to have a policy of not playing unsigned content.
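
A rough sketch of that "refuse unsigned content" policy, under loud
assumptions: the key, the function names and the use of HMAC are all
made up for illustration.  A real scheme would presumably use a
public-key signature so that players hold only a verification key;
HMAC is just a standard-library stand-in, and the sketch only means
anything if the player really is tamper resistant.

```python
import hashlib
import hmac

# Hypothetical distributor key, for illustration only.  With HMAC the player
# must hold the same secret, which a real design would avoid by using a
# public-key signature instead.
DISTRIBUTOR_KEY = b"demo-key-not-for-real-use"


def sign_rental(content: bytes, renter_id: str,
                key: bytes = DISTRIBUTOR_KEY) -> bytes:
    """Bind the renter's identity to this exact content with a MAC tag."""
    digest = hashlib.sha256(content).digest()
    message = digest + renter_id.encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).digest()


def player_will_play(content: bytes, renter_id: str, tag: bytes,
                     key: bytes = DISTRIBUTOR_KEY) -> bool:
    """Player policy: play only if the tag verifies for this content and renter."""
    expected = sign_rental(content, renter_id, key)
    return hmac.compare_digest(expected, tag)


if __name__ == "__main__":
    movie = b"...movie bytes..."
    tag = sign_rental(movie, "renter-1234")
    print(player_will_play(movie, "renter-1234", tag))   # tag matches
    print(player_will_play(movie, "someone-else", tag))  # identity mismatch
```

Note that this identifies the renter only as well as the rental
service verified the identity in the first place.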

But overall I think DRM is economically stupid, and that we are stuck
in a bad local optimum for the content distribution industry, which is
bad for them, bad for freedoms, and bad for the computer hardware
industry.  DRM generically _cannot_ stop copying, because watermarking
doesn't work technically (traitor tracing fails past some low
threshold of colluding copies), and doesn't work economically (because
you can't afford identity assurances good enough to avoid plausibly
deniable but still watermarked copies, or copies obtained with a
forged identity).

Also, encrypting digital content all the way out to the monitor, video
card, or speakers, with keys that the output device negotiates with
the content provider, is stupid.  It places a silly burden on
hardware, and won't stop copying.  High quality output devices,
together with high quality personal capture devices, plus the
existence of digital content inside the output devices, mean that
content will be captured digitally and re-encoded, or simply undergo a
high quality D->A->D path.

QED.

On the economic side, as for models which make sense once one
presumes that copying exists and can't be stopped: we already have
that model.  I.e., the content distribution industry can sell digital
copies and compete with pirate distribution channels (such as kazaa,
etc.) on:

- convenience -- if the price is reasonably low, and the rental model
is convenient, it is simply not worth people's time to find and
download copies

- quality -- original digital copies tend to be higher quality than
pirate downloads, because the downloads are typically re-encoded at
lower bit rates to conserve bandwidth;

- reliability -- also current generation file trading does not
generally deal with spoofing, where you end up with something other
than what you wanted, or you end up with a file full of 0s.  I'd think
this is not a limiting factor and will likely be fixed.

- branding, visibility -- if the content distributors work on better
placing in search engines, more visible brands, content available from
official sites etc., they compete again on convenience

I think the content industry could make more money if they lowered
prices, and improved convenience to compete on value for money and
convenience with the free quasi-illegal services.

Basically the only licensed, legitimate content distribution industry
move that I saw that tried to do this was movie88, which promptly got
attacked by the MPAA and lost their internet connectivity.  (Movie88
was a streaming movie rental business which rented a single viewing of
streamed video for $1-$2; at that price I think they would have
competed very well with kazaa et al on overall
price-convenience-reliability).

Adam

On Fri, Apr 18, 2003 at 04:39:04AM -0700, Ian Clelland wrote:
> On Fri, Apr 18, 2003 at 01:50:04AM +0200, Nomen Nescio wrote:
> > CR advocates "forensic watermarking".  In the longer report (available by
> > email request) they describe this as a system where there are two versions
> > of selected portions of the content - for example, two alternate versions
> > of a particular movie frame.  There would be multiple such "polymorphs"
> > throughout the content, and each device would have keys such that for each
> > polymorph it would see only one version.  By randomizing and encrypting
> > the frames it can be arranged that the devices can't even tell which
> > frames are polymorphic.  The set of keys assigned to a playback device
> > implicitly identifies the device itself, so that if an unprotected version
> > of the movie is released, the specific versions of the polymorphs that
> > are present will reveal which device did the decryption.
> > 
> > The obvious attack is to combine the output from multiple devices
> > from which keys have been scraped, but this does not work (up to a
> > point) because even when multiple devices are used, there is still
> > enough information in the output to identify which specific devices
> > were involved.  CR gives an example of a 90 minute movie, 30 frames per
> > second, with 1% of the frames being polymorphic - 1620 frames.  Even if
> > an adversary breaks into 4 playback devices and gets their keys in order
> > to identify the polymorph frames, the manufacturer can identify those four
> > devices with an error probability, according to the formula derived by the
> > CR report, of less than 4 x 10^(-10), an extremely good detection rate.
> > 
> > But what happens if you use the CR formula with the assumption that
> > the attacker cracks one more device for a total of 5?  Suddenly the
> > system doesn't work so well, and there are over 10^20 possible sets
> > of 5 devices that could produce the combined output!  We go from
> > 4 x 10^(-10) to 10^20 with just one more device.  This kind of exponential
> > explosion is common to many traitor tracing schemes.  The attackers
> > have an inherent mathematical advantage which is very hard to address.
> > All this is glossed over in the CR analysis.
> 
> Can you post at least an overview of this formula, or describe how it is
> constructed?
> 
> My take on the math is this: Suppose that a 90 minute movie contains
> 1620 'polymorphic frames,' and that each version of the movie contains
> one of two alternate versions of each such frame. Then every version of the
> movie has effectively been watermarked with a 1620-bit id. All an
> attacker needs to do to release a watermark-stripped version is to find
> both versions of each of the 1620 key frames, and put together a new
> version of the movie, with a random selection for each frame. This new
> version cannot be traced back to any particular source, so the attacker
> cannot be identified.
> 
> If an attacker can only obtain a small number of 'official' versions
> with which to work, then his chances of success depend to a large extent
> on the total number of official versions in existence. (Someone, whether
> the company releasing the movie, or the company doing the watermarking,
> is going to have to keep track of all of the official versions, and whom
> they are licensed to).
> 
> If the attacker can only obtain four versions, then he will have access
> to both versions of 15/16 of the polymorphic frames (assuming that
> watermark bits appear random). Knowing their positions within the movie
> file, he can randomise them, effectively obscuring 1518 bits of the
> watermark. However, the remaining 102 bits will identify the four source
> files with very high probability.
> 
> Assuming that there are 1 billion legitimate copies of the movie
> (roughly 2^30), then the odds are only 2^30/2^102 = 1 in 2^72, or 1 in
> 4*10^21, that any other copy shares those 102 bits with the four source
> files.
> 
> With five source files, the situation changes a bit - there are now only
> 51 identifying bits, and a 1 in 2^21 (1 in 2*10^6) chance that another
> source file might match. Still not good, though.
> 
> With a sixth source, the attacker has cut the number of identifying bits
> to 25. In this scenario, you can expect there to be about 32 copies out
> of 1 billion which share those bits. That's 26 others on top of his 6.
> Much better odds, but still not a lot to hide behind.
> 
> Of course, the attack only gets better with each additional source, and
> after a while (after about lg 1620 = 11 sources,) it is very likely that
> an attacker can put together all versions of every polymorphic frame,
> and there is no way to track it at all.
> 
> 
> Anyway, I'm very curious to know where the original numbers (4*10^-10
> and 10^20) came from. Or perhaps my reasoning is just way off on this.
> 
> 
> Ian Clelland
> <ian at veryfresh.com>
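
The arithmetic in the quoted message can be reproduced with a short
sketch.  The function names are made up for illustration, and the
halving rule (each extra source copy halves the number of identifying
bits) is taken from the quoted reasoning as an assumption, not from
the CR report.  As a side note, if watermark bits are independent and
uniformly random, the chance that k copies all agree at one position
is 2^(1-k) rather than 2^(-k), which would roughly double each
residual figure; the sketch keeps the quoted rule.

```python
def polymorphic_frames(minutes: int, fps: int, percent: int) -> int:
    """Number of polymorphic frames when `percent` % of all frames are polymorphic."""
    total_frames = minutes * 60 * fps
    return total_frames * percent // 100


def residual_bits(watermark_bits: int, sources: int) -> float:
    """Identifying bits left after mixing `sources` copies, using the quoted
    halving rule (watermark_bits / 2**sources)."""
    return watermark_bits / 2 ** sources


def expected_matches(copies: int, bits: int) -> float:
    """Expected number of copies, out of `copies`, that match `bits`
    fixed bits purely by chance (bits assumed uniformly random)."""
    return copies / 2 ** bits


if __name__ == "__main__":
    frames = polymorphic_frames(minutes=90, fps=30, percent=1)
    print("polymorphic frames:", frames)

    copies = 2 ** 30  # "1 billion legitimate copies", roughly 2^30
    # Bit counts as quoted in the message for 4, 5 and 6 source copies.
    for sources, quoted_bits in ((4, 102), (5, 51), (6, 25)):
        print(sources, "sources:",
              "halving rule gives", residual_bits(frames, sources), "bits;",
              "quoted", quoted_bits, "bits;",
              "expected chance matches", expected_matches(copies, quoted_bits))
```

The quoted bit counts are rounded inconsistently relative to the
halving rule (101.25, 50.625 and 25.3125), which is why the sketch
feeds the quoted integers into the match estimate rather than
recomputing them.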

---------------------------------------------------------------------
The Cryptography Mailing List