[Cryptography] graph theory and sybil attack
jamesd at echeque.com
jamesd at echeque.com
Wed Jul 3 21:02:45 EDT 2019
The sibyl attack is a graph theory problem, not a cryptography problem,
and its application is in distributed systems.
But distributed systems are a cryptography problem, so, if the
moderators permit, I would like to ask about this problem here:
Yacy is an open source distributed search engine for publishing, but the
hard problem is the sibyl attack, the problem that Google refers to a a
link farm – a bunch of fake sites that link to each other to give each
other fake reputation. Not clear how Yacy addresses the sibyl attack,
and I suspect that if usage of Yacy increases, it will be subject to
Similarly, systems for giving people reputation for supplying goods are
subject to sibyl attack with fake customers. Big problem with Dark Web
sites that attempt to do an ebay on Tor. Distributed systems tend to be
subject to sibyl attacks.
Resolving the sibyl attack is a problem of partitioning the graph to see
if a group of nodes well connected to each other are, considered as a
single node, well connected to rest of the network. Page rank and
analogous measures such as seller reputation are a measure of connection
to the graph as a whole, and the sibyl attack is faking good connection
to the graph as a whole by creating a seemingly large subgraph that is
well connected to itself but poorly connected to the rest of the graph.
Considering the sibyl attack as a graph theory problem, we have some
ranking criterion that is in some sense or some part a measure of
connection to the graph as a whole, such as page rank, seller
reputation, or the node's provision of services (we want to penalize
leecher nodes and reward seeder nodes) and we want the ranking criterion
to exclude relatively isolated large subgraphs from having
inappropriately high rank.
Assume for example that an pseudonymous purchaser can publish a review
that is demonstrably linked to a payment actually made (but which might
have been made by his left hand to his right hand), and we want to give
sellers rank on the basis of reviews. But we want to give high impact
to unfavorable reviews by purchasers that have issued lots of favorable
reviews to real sellers, and to give favorable reviews impact only if
the purchaser nym has made lots of reviews of real sellers.
This is similar to the google page rank problem. A site is high rank if
it has lots of links from real sites that have high rank. So what is a
It would be nice to be able to state this problem as a graph theoretic
problem and to relate it to well known graph traversal algorithms.
Excluding the sibyl attack and excluding a link farm is identifying a
subgraph that is strongly internally connected, and then ranking that
subgraph as a single node. We want a rank value for nodes that is
unaffected or not strongly affected by collapsing subgraphs that are
strongly internally connected.
This is also related to the DuckDuckGo algorithm, where as search query
is identified as a search on a topic, and a topic is associated with a
small, human identified and human generated, set of subsets of the whole
It is also related to the optimization of performing one's rank
algorithm on a model of the graph that is substantially smaller than the
actual graph, in that groups of nodes are collapsed into a single node.
A ranking algorithm optimized in this manner is likely to inherently be
able to deal with the sibyl attack.
A distributed search engine like Yacy gets an inherently large number of
humans contributing groupings, but it is hard to deal with such human
DuckDuckGo is apt to generate politically incorrect search results, and
making AI engage in crimestop is currently a major research issue.
DuckDuckGo resolves this problem as part of its topic identification
algorithm, by passing searches likely to generate politically incorrect
search results to Bing, which is of course not part of the graph theory
question that I am asking. What I am asking is how do we abstract these
messy particular problems as graph theory problems of treating groups of
nodes with strong internal connection as a single node.
Google does this analysis with respect to politically incorrect groups
of nodes. I don't know how Bing does it - I suspect they rely primarily
on humans identifying major politically incorrect nodes, and then down
ranking nodes that link to them.
This email has been checked for viruses by Avast antivirus software.
More information about the cryptography