[Cryptography] How does the size of a set of target results influence the complexity of a preimage attack?

Rob Meijer pibara at gmail.com
Sat Oct 17 03:21:35 EDT 2015


2015-10-16 2:04 GMT+02:00 Ryan Carboni <ryacko at gmail.com>:

> Aren't collisions checked manually to ensure that they actually are the
> same file? I don't really understand the intent of the question.
>

No. There are large sets of hashes distributed by organisations that
spend time categorizing files; they distribute the hashes, together with
their categorization, to parties who want to use these hash sets in
forensic investigations. In an investigation, the hash sets are compared,
in a set-theoretic way, against the hashes calculated over the files found
on the media, in order to focus both the automated and the manual part of
the digital forensic investigation. Some of the hash sets are of the "we
can safely ignore these files" type. These include files from OS
distributions, large collections of software packages, clipart
collections, etc.: basically the stuff you don't want to waste man-hours,
CPU cycles or I/O activity on. Other sets are of the "a person needs to
look at these" type, such as collections of the hashes of child
pornography images.
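
To make the mechanics concrete, here is a minimal Python sketch of this
kind of hash-set triage. The set filenames and the evidence mount point
are hypothetical, chosen only for illustration; real distributed sets
(the NSRL ones, for instance) are far larger but work on the same
principle:

import hashlib
from pathlib import Path

def sha256_of(path):
    # Hash in chunks so large evidence files don't exhaust memory.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def load_hash_set(path):
    # One lowercase hex digest per line, a common format for such sets.
    with open(path) as f:
        return {line.strip().lower() for line in f if line.strip()}

# Hypothetical set files, standing in for distributed categorized sets.
known_good = load_hash_set("ignore_these.sha256")
known_bad = load_hash_set("person_must_look.sha256")

for f in Path("/mnt/evidence").rglob("*"):
    if not f.is_file():
        continue
    digest = sha256_of(f)
    if digest in known_good:
        continue                       # safe to ignore, spend no time on it
    if digest in known_bad:
        print("FLAG FOR HUMAN REVIEW:", f)
    else:
        print("unknown, needs triage:", f)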

So there are three scenarios where a preimage attack could be a problem:

1) As an anti-forensic technique: making an evidence-containing file have
a hash from one of the "we can safely ignore these files" sets.
2) As a denial-of-service attack on the human component of the
investigation: making a relatively large set of non-legally-problematic
files match hashes from an "a person needs to look at these" set.
3) As part of a targeted anti-forensic attack: making a file that exploits
a flaw in a viewer for its file type match a hash from an "a person needs
to look at these" set.

So basically: the concern is an attacker who can successfully mount a
first-preimage attack in which the hash of the modified file matches ANY
ONE OF the hashes from the large set (for most of which we don't have the
original files).

Now my question is the following:

If the complexity of a first-preimage attack on the hashing algorithm used
is defined as 2^{N}, and the size of the hash set that our "ANY ONE OF"
ranges over is 2^{M}, what would be the worst-case complexity of an "ANY
ONE OF"-geared first-preimage attack?

I gather it would be smaller than 2^{N}, yet probably not as small as the
naive assumption of 2^{N-M}.
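
To get a feel for the brute-force baseline, here is a small Monte Carlo
sketch of my own. All parameters are toy values, and SHA-256 truncated to
N bits stands in for an ideal N-bit hash: it builds a set of 2^M random
targets and counts random guesses until one hashes into the set.

import hashlib, os, statistics

N, M = 20, 6  # toy parameters: an ideal 20-bit hash, 2^6 targets

def h(data, bits=N):
    # Truncate SHA-256 to `bits` bits, a stand-in for an ideal N-bit hash.
    d = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return d >> (256 - bits)

trials = []
for _ in range(50):
    # 2^M random target digests, standing in for a distributed hash set.
    targets = {h(os.urandom(16)) for _ in range(2 ** M)}
    guesses = 0
    while True:
        guesses += 1
        if h(os.urandom(16)) in targets:
            break
    trials.append(guesses)

print("mean guesses over trials:", statistics.mean(trials))
print("naive estimate 2^(N-M)  :", 2 ** (N - M))

Against such an idealized hash, each blind guess lands in the target set
with probability about 2^{M}/2^{N}, so the measured mean sits near
2^{N-M}; whether the structure a real first-preimage attack would exploit
scales the same way is exactly what I am asking about.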

I hope my question is making more sense now.



