[Cryptography] graph theory and sybil attack

jamesd at echeque.com jamesd at echeque.com
Mon Jul 15 07:35:19 EDT 2019

```On 2019-07-11 6:17 pm, Jerry Leichter wrote:
>>> There are going to be a lot of naturally-occurring, legitimate
>>> (closed) groups of nodes in a network of buyers....

>> Legitimate fans are likely to be fans of other musicians as well, and
>> likely to have genuine interactions with fans of other musicians.

> Only experiment will determine if this is really true, and to what
> degree - but my guess is you'll have plenty of false positives.  In the
> end this will be a ROC problem - you have to trade off among
> sensitivity, specificity, and so on.  Since we're talking about an
> active opponent here, not random noise ... expect this to be very hard.

The Sybil Guard paper, linked by Kate Sills, provided me with relevant
information:""SybilGuard and SumUp are two graph theory solutions for
sybils in social networks that you might be interested in. "

According to Sybil Guard, social networks are empirically determined to
be "fast mixing"

"social networks tend to be fast mixing (defined in the next section),
which necessarily means that subsets of honest nodes have good
connectivity to the rest of the social network"

So the legitimate musician with legitimate fans will be connected to the
rest of the social network - his legitimate fans will have interactions
with legitimate fans of other legitimate musicians, and will purchase
stuff from other legitimate musicians, while the fake fans making fake
purchases will not (though they could do any number of fake reviews of
legitimate musicians).

"in a fast mixing graph, after a small number of hops a random walk is
equally likely to be traversing any edge in a given hop."

The paper argues that sybil subsets of the network are substantially
less likely to be entered after a small number of hops.

"With a length w random walk, clearly the distribution of the ending
point of the walk depends on the starting point. However, for connected
and nonbipartite graphs, the ending point distribution becomes
independent of the starting point when w → ∞. This distribution is
called the stationary distribution of the graph. The mixing time T of a
graph quantifies how fast the ending point of a random walk approach the
stationary distribution. In other words, after Θ(T) steps, the node
on the random walk becomes roughly independent of the starting
point. If T = Θ(logn), the graph is called fast mixing."

If we mix the graph for a suitable time, legitimate nodes will be well
mixed with each other, and sybil nodes will not be well mixed with
legitimate nodes.

Legitimate nodes will have positions that are close to each other in the
space of distributions of end probabilities, and sybil nodes will be far
from legitimate nodes in that space.

This algorithm is O[n*n] in data size (the distribution for each
starting point), and O[n*n*log(n)] in computational time, which is a lot
better than np, though still apt to be inconveniently large.  It can
probably be speeded up considerably by random dimensional reduction,
wherein a vector of very large dimension (the probability of reaching
each end point, the distribution of end points) is randomly mapped to a
vector of dimension O[log of the original dimension]

Which will, I think, speed it up to O[n * (log n)^2] which is acceptable.

Mixing for a suitable time, a suitable path distance, divides the graph
into well mixed regions, one of which is the legitimate nodes.  We
identify which of the regions is non sybil by checking the region of a
small number of known legitimate nodes.  If they all belong to the same
region, our mixing distance is sufficient, and chances are the sybil
nodes are not in that region.

During mixing, the points draw closer to each other in the space of
distributions.  The fan group of a particular musician will coalesce
rapidly, and then the fan groups of all legitimate coalitions will, more
slowly draw together, while the fake fan group drifts very slowly
indeed. We halt the mixing when the known legitimate nodes are all close
to each other, because by then everyone who is a legitimate node is
close to them.

We then downrank reviews or page ranks that are distant from the
coalescence.

---
This email has been checked for viruses by Avast antivirus software.
https://www.avast.com/antivirus

```