How do group participants in a decentralized, metadata resistant, asynchronous environment come to a consensus on transcript consistency? This is a fundamental problem when designing such systems and one that we need to solve in order to advance the discipline and build usable tools that people can rely on.
Cwtch presents a partial solution to the problem through the introduction of a concept called “Untrusted Infrastructure”. All participants in a group transcript relay their messages through a Cwtch Server. The metadata resistant properties of the system mean that the server gains no information on which to base a targeted manipulation of the transcript, and as such peers can have some assurance that the transcript they ultimately see reflects reality.
There are conditions where such a structure fails, most notably if there is collusion between a conversation participant and the Cwtch Server. At that point the server can gain sufficient information to present different transcripts to different participants.
In Cwtch, such manipulation is detectable as each Cwtch message contains a collection of reference signatures to previous messages - as such the dropping of a message to manipulate the transcript won’t go unnoticed.
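To make the detection idea concrete, here is a minimal sketch. It is not Cwtch's wire format: the `Message` type, the `missing_references` helper, and the use of a SHA-256 hash in place of a real reference signature are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """Hypothetical message type; real Cwtch messages carry signatures."""
    author: str
    body: str
    refs: tuple = ()  # IDs of earlier messages this message refers to

    @property
    def msg_id(self) -> str:
        # A hash stands in for a reference signature in this sketch.
        payload = "|".join((self.author, self.body, *self.refs))
        return hashlib.sha256(payload.encode()).hexdigest()


def missing_references(transcript):
    """Return IDs that are referenced by some message but absent from the transcript."""
    seen = {m.msg_id for m in transcript}
    return {ref for m in transcript for ref in m.refs if ref not in seen}


a = Message("alice", "hello")
b = Message("bob", "hi alice", (a.msg_id,))
c = Message("carol", "hi both", (a.msg_id, b.msg_id))

full = [a, b, c]
tampered = [a, c]  # a server silently drops bob's message

print(missing_references(full))      # empty set: nothing is missing
print(missing_references(tampered))  # contains bob's message ID
```

A recipient of the tampered view cannot recover bob's message from this check alone, but they do learn that something they were not shown exists.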
As such, in Cwtch, we can guarantee transcript consistency under a set of assumptions that I believe are acceptable:
- If we trust all group participants not to collude with the Server, then we guarantee transcript consistency.
- If we are willing to use multiple servers that we assume don’t collude, then we guarantee transcript consistency (even if one server drops messages under collusion, we assume the other(s) will not) - this comes with a performance penalty (bandwidth and servers aren’t free).
As our trust assumptions in group participants weaken, we are forced to rely on resolving issues in a less than ideal manner, and we hit up against some of the problems encountered by Briar.
How we resolve such inconsistency is the heart of the problem.
- If we trust at least two thirds of group participants, we can detect and resolve attacks on transcript consistency (in an eventually consistent manner).
Any PBFT protocol can resolve transcript consistency where we trust two thirds of participants. However, such algorithms rely on the assumptions of bounded delay and guaranteed delivery, assumptions which are not compatible with Briar’s threat model, and only partially compatible with Cwtch’s threat model.
Asynchronous PBFT protocols are possible with some element of trusted or synchronous setup. Trusted setup isn’t usually a term that is applicable to metadata resistant systems - though interestingly in Cwtch we have the concept of a group originator, which might be compatible with a trusted dealer concept (one I might explore as an academic exercise).
Practically speaking however, the two-thirds honest assumption might be a step too far into the abstract. Metadata resistant systems are very adversarial. Collusion should be expected, and we generally assume small numbers of participants which makes violating the assumption trivial.
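A quick calculation shows how little room small groups leave. The sketch below assumes the standard Byzantine fault tolerance bound, under which a group of n participants tolerates f faulty ones only when n >= 3f + 1. The function name `max_faulty` is hypothetical.

```python
def max_faulty(n: int) -> int:
    """Largest number of dishonest participants a group of size n can
    tolerate under the usual n >= 3f + 1 bound."""
    if n < 1:
        raise ValueError("a group needs at least one participant")
    return (n - 1) // 3


for n in (2, 3, 4, 7, 10):
    print(f"group of {n}: tolerates {max_faulty(n)} dishonest participant(s)")
```

Under that bound, a group of two or three tolerates no dishonest participants at all, and a group needs at least four members before it can absorb a single colluder.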
- If we trust at least one group participant to be consistently honest, we can detect and resolve attacks on transcript consistency (in an eventually consistent manner).
And so we come back to the idea of consistently-honest explicit trust, an assumption that I think is more compatible with metadata resistant systems. If I unconditionally trust at least one group participant, I can eventually gain an honest view of the network through the use of a standard CRDT.
If coupled with a non-collusion assumption (or at least a no-practical-collusion assumption), then we can derive a strong guarantee of long-term transcript consistency in both Cwtch and Briar.
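As a rough illustration of why one consistently honest peer is enough for eventual consistency, consider the simplest conflict-free replicated data type, a grow-only set whose merge is set union. This is a sketch under my own assumptions (message IDs as plain strings, a hypothetical `merge` function), not a description of how Cwtch or Briar synchronise state.

```python
def merge(local: frozenset, remote: frozenset) -> frozenset:
    """Grow-only set merge: union is commutative, associative and idempotent,
    so replicas converge regardless of the order in which views arrive."""
    return local | remote


mine = frozenset({"m1", "m3"})           # my view, with m2 withheld from me
trusted = frozenset({"m1", "m2", "m3"})  # view held by a peer I trust
other = frozenset({"m3", "m4"})          # view from some other peer

# Merging with the trusted peer restores the message I was never shown.
print(sorted(merge(mine, trusted)))

# The order and repetition of merges does not change the result.
print(merge(mine, trusted) == merge(trusted, mine))
print(merge(merge(mine, trusted), other) == merge(mine, merge(trusted, other)))
print(merge(mine, mine) == mine)
```

Because merging never removes anything, an adversary can delay my view but cannot permanently hide a message the trusted peer has seen, provided I eventually get to merge with that peer.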
- If we trust group participants to be inconsistently honest, such that in any given period at least one is honest, we can detect, but not resolve, attacks on transcript consistency (in an eventually consistent manner).
This is our worst case scenario. If we can trust no participants to be consistently honest, it is impossible to resolve a transcript - indeed, the very idea of a transcript is thrown out the window. The only way to resolve such a scenario is to fall back to a many-server (i.e. a many-observer) architecture, in which we can assume that no single entity is able to collude with every server. Because every message refers to other messages, the aggregate of all messages received from all servers then represents the consistent transcript.
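The many-observer fallback can be sketched as follows. Each server's view is modelled as a mapping from a message ID to the IDs that message refers to. The names `aggregate` and `dangling` and the example data are hypothetical, and the sketch assumes an ID is bound to its content so that two servers never disagree about what a given ID refers to.

```python
def aggregate(views):
    """Union the messages received from every server into one transcript."""
    merged = {}
    for view in views:
        # Assumes an ID always maps to the same references on every server.
        merged.update(view)
    return merged


def dangling(transcript):
    """Return referenced IDs that do not appear in the transcript."""
    return {
        ref
        for refs in transcript.values()
        for ref in refs
        if ref not in transcript
    }


# server_a colludes with a participant and drops m2; server_b does not.
server_a = {"m1": (), "m3": ("m1", "m2")}
server_b = {"m1": (), "m2": ("m1",), "m3": ("m1", "m2")}

print(dangling(server_a))                         # m2 is referenced but absent
print(dangling(aggregate([server_a, server_b])))  # empty: the aggregate is closed
```

As long as at least one server relays each message, the aggregate has no dangling references, which is the property the non-collusion assumption is meant to buy.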
The idea of a set of independent observers is, I think, an important one. We can sufficiently bind such infrastructure to the operating assumptions of untrusted infrastructure and as such gain a view of the world we can trust.
There are a number of practical considerations (reliance, availability and setup to name a few - see the Cwtch paper for more details), but I believe these are fundamentally solvable.
Whatever the case, the tradeoff between trusting group participants and relying on independent observers is evident, and any metadata resistant system needs to be explicit about where on that spectrum it wants to base its assumptions, as that design choice clearly impacts many other parts of the system.