[Cryptography] MaidSafe

Lee Clagett forum at leeclagett.com
Sat Aug 27 23:24:49 EDT 2016

On Sat, 20 Aug 2016 18:04:05 -0400
"J.M. Porup" <jm at porup.com> wrote:

> Given the fervor over Bitcoin on this list lately, I wonder if anyone
> has any strong thoughts on MaidSafe?
> For an article, please contact me off list if you'd like to be
> attributed.
> thanks
> Jens

I worked under contract for Maidsafe for several months until roughly
when they switched to Rust (~16 months ago). They hired/tasked me with
fixing bugs in the filesystem emulation and creating their C++ storage
API. Distributed systems were new to me, and I am still heavily relying
on intuition from my background in standard client/server networking so
maybe someone on this list knows a bit more about research concepts
that could solve the problems I think Maidsafe has.

I do not see any advancements from existing distributed storage
networks. The C++ code required that every node in the quorum group
receive and successfully process messages in identical order, and a
node making a request assumed the first "quorum-size" messages received
were from the quorum. I do not know how they intend to properly keep
the group in consensus with p2p churn, and they had a non-implemented
solution for the group verification which [Christophe Aguettaz
discovered would likely have security issues][0]. The Rust code
currently mirrors the C++ code in these two areas, and potential
solutions to the problems are vague.

Group Consensus
Maidsafe tries to reduce the number of nodes in a consensus operation
by "randomly" selecting a subset of nodes instead of requiring the
entire network to achieve consensus. Keeping the group in consensus
will still be difficult due to the usual problems - network latencies,
network partitioning, faulty nodes, "churn" (users connecting /
disconnecting). The current Rust implementation assumes that every node
in the group receives and successfully processes the exact same
messages in the exact same order. If either of these constraints is not
true, then one or more nodes in the group will have a different state
which will lower the number of nodes in the quorum. An additional
problematic area is when nodes connect or disconnect. A group can lose
quorum if enough nodes drop out simultaneously, which would mean that
no more updates can be made to the resource. OR new nodes could
replicate the data from the remaining members of the group which are
below quorum. But if the network were partitioned instead, both sides
of the partition would replicate/repair the quorum and continue to
accept writes. The network would then need an algorithm for merging two
data forks.

Bitcoin always goes with the longest chain, so the history of the
partition with the most "hashing power" is more likely to be chosen
with the probability increasing as the duration of the partition
increases. Bitcoin also has an economic incentive for miners to
continue on the longest chain. I am not sure how Maidsafe plans to
resolve forked history. Perhaps some calculation of total closeness
between the competing groups? On April 25th an [update mentioned
something about when "groups go below quorum"][1], and on april 26th
["network segmentation prevention" was listed in the update][2]. This
is not a solution to the fundamental churn and partition problem. A
[newer document][3] mentions group merging, but does not describe how
groups with different states will be resolved. At various points there
have also been mentions of algorithms for selecting better behaved
peers (i.e. network selected super nodes), but these ideas have
generally been vague and add complexity. I am not aware of any other
proposed solutions by Maidsafe. Existing literature on p2p quorum
systems have a tradeoff between bandwidth and consistency /
availabilitiy. A reduction in consistency and availability seem
unsuitable for a system that needs to track virtual coin ownership.
Perhaps someone on this list know more about this research topic, and
some system that could be applied here.

Group Selection
The security of Maidsafe relies on the difficulty of generating a
public-key whose cryptographic hash is the quorum-sized "closest" to
the cryptographic hash of some resource (a storage directory, coin,
etc.). The "distance" or "closeness" between the hashes is determined
by the XOR distance. If it is easy to forcefully join a group, then
the status of the resource can be changed by an attacker easily too.

The problem is determining what nodes are actually the closest in the
dynamically changing p2p network. David Irvine [proposed an idea][4] in
which the requesting node would estimate the size of the network, and
then filter out responses from nodes that were outside of a range based
on the network size. Cryptographic hashes should have an even
distribution, so the range of acceptable hashes would have to be quite
large. For example, a network with 16 million nodes would filter on a
~24-bit prefix (a fudge factor to allow a larger range would likely be
necessary). Unless the network is extremely large, an attacker should
be able to generate keys close enough to the target to have messages
accepted as a legitimate response. In the change to Rust, Maidsafe
moved from RSA-4096 to Ed25519, so the guess rate should have increased
as well, making the situation worse.

In the [blog post][4] mentioned previously, David also discussed having
a group signing key to detect fraudulent messages. However, the
requesting node would need a way to verify the correct public key which
should change as the group membership changes. So a group signing key
adds complexity, but does not add any security if an attacker can gain
control of any group controlling a resource.

The proposed solutions do not appear to be implemented in the Rust
version, so it is possible that I misunderstood the proposed
solution, since the details are sparse. And it appears it does not
matter, because there is a [newer proposal][3]. I do not believe
this to be a fix for the sybil issue; the difficulty of joining a
particular group will depend on the size of the network. Comparative
systems such as Freenet have similar difficulties.

Various Issues
The team also had issues in other areas in the C++ version, which I
brought to their attention previously. They have repeated those
mistakes in a few instances in the Rust version, but all of these
mistakes are more easily solved than the critical issues above. I am
highlighting these because their continued existence makes me question
whether they can solve the more critical issues.

Directory Forking
Maidsafe supports subdirectories, where each subdirectory is maintained
by a separate quorum group. A randomly generated ID determines these
different quorum groups; the quorum groups for subdirectories are not
deterministic. If a subdirectory is deleted and then a new
subdirectory is created with the same name, there are at least two
different quorum groups responsible for tracking the contents of the
subdirectory. If a client accessed the subdirectory before the
deletion, it could continue to write to the directory unaware that its
updates are being ignored by clients that connected to the top-level
folder at a later point in the history.

The solutions are to disallow subdirectories (like most cloud storage
services), not allow subdirectory deletion, or create subdirectory ids
deterministically based on the parent directory with only the root
being randomly generated. The latter solution still has issues when
deleting a subdirectory because the subdirectory contents need to be
deleted atomically when the subdirectory is deleted, which requires two
quorum groups to synchronize the operation somehow.

Convergent Encryption
Maidsafe owns a patent for a convergent encryption variant where a file
is split into chunks, and the encryption key for each chunk is
dependent on the previous 2 chunks. If an entire file has low entropy,
the system is susceptible to the same issue that [Tahoe-LAFS had with
convergent encryption][5]. However, a single block with low entropy is
now protected iff either of the previous two blocks have good entropy.
System-wide Convergent encryption should likely be dropped like it was
in Tahoe-LAFS.

Nonce Re-use
The user account information is encrypted with XSalsa20 and a [reused
key/nonce pair][6]. The construction of XSalsa20 is similar to a
block-ciper in CTR mode, so nonce key/nonce reuse can result in
[plaintext recovery][7]. Directory listings are encrypted with
25519+XSalsa20 with a [reused nonce][8] and [key][9]. The randomly
generated key/pair used in the last function call is irrelevant because
they are encrypted with XSalsa by libsodium's public key encryption
function with the reused nonce and key.

Luckily, the pertinent information in both cases will be cryptographic
keys and therefore harder to analyze via plaintext XORing, but this is
not reassuring. Any changes to the plaintext serialization would affect
the privacy.



More information about the cryptography mailing list