Raft Consensus with a Minority of Nodes

(padhye.org)

105 points | by moarbugs a day ago

17 comments

rubiquity 5 hours ago
This was already known from Heidi Howard’s research, which yielded Flexible Paxos[0], Relaxed Paxos[1], and her more general thesis on distributed consensus[2].
0 - https://fpaxos.github.io/
1 - https://dl.acm.org/doi/10.1145/3517209.3524040
2 - https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-935.pdf
- dgacmu 8 minutes ago
  I don't think this is correct. Heidi's work made a different observation: that you can smear quorum intersection across the phases of Paxos, whereas the blog post in this submission observes that you can do bog-standard quorum intersection in a way other than majority intersection, via algebraic structures. I believe these are generally orthogonal observations.
  (Heidi's work is both deeper and more practical; this post is just a really cute observation that there's something mathematically deeper underlying the idea of intersecting quora.)
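A minimal sketch of the "smearing across phases" idea, assuming simple size-based quorums on a hypothetical 6-node cluster (the names `phase1`, `phase2`, and the sizes 5 and 2 are illustrative, not taken from the papers): only a phase-1 quorum and a phase-2 quorum need to intersect, so phase-2 quorums can be small and need not intersect each other.

```python
from itertools import combinations

def all_pairs_intersect(family_a, family_b):
    """True if every set in family_a shares a node with every set in family_b."""
    return all(a & b for a in family_a for b in family_b)

def size_quorums(nodes, k):
    """All subsets of `nodes` with exactly k members."""
    return [frozenset(c) for c in combinations(nodes, k)]

nodes = range(6)                 # hypothetical 6-node cluster
phase1 = size_quorums(nodes, 5)  # leader-election quorums (assumed size 5)
phase2 = size_quorums(nodes, 2)  # replication quorums (assumed size 2)

# 5 + 2 > 6, so a phase-1 quorum always meets a phase-2 quorum.
print(all_pairs_intersect(phase1, phase2))  # True
# Two phase-2 quorums can be disjoint, e.g. {0, 1} and {2, 3}.
print(all_pairs_intersect(phase2, phase2))  # False
```

This only checks the intersection property; it says nothing about the rest of the protocol.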
- senderista 4 hours ago
  Research on quorum systems (such as the finite projective planes described in the article) dates back to the 80s.
  - mjb an hour ago
    The 70s, if you want to be pedantic (e.g. Gifford's "Weighted Voting for Replicated Data" or Thomas's "A Majority Consensus Approach to Concurrency Control for Multiple Copy Databases", both from '79).
danbruc 8 hours ago
> The key correctness insight is this: any two majorities of nodes must overlap in at least one node. So between any two consecutive global state changes [...] at least one node participated in both. This single overlapping node carries forward the knowledge of what was previously committed, preventing conflicts and ensuring consistency.
There is another side to this: it must not be possible for two »majorities« to coexist, otherwise they could independently move on in case of a split cluster. This also rules out allowing consensus by majority in addition to consensus by a bloc. In the seven-node example, there could be a { 1, 2, 3 } and { 4, 5, 6, 7 } split, the first partition being a bloc and the second one being a majority but not containing a bloc.
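The split described above can be checked mechanically. This is a sketch under one assumption: it uses a common labelling of the Fano plane's seven lines as the blocs, which may not match the article's numbering. The helper names (`has_bloc`, `bloc_or_majority`) are made up for illustration.

```python
from itertools import combinations

# One common labelling of the Fano plane's seven lines (an assumption;
# the article's labelling may differ).
BLOCS = [frozenset(b) for b in (
    {1, 2, 3}, {1, 4, 5}, {1, 6, 7},
    {2, 4, 6}, {2, 5, 7}, {3, 4, 7}, {3, 5, 6},
)]
N = 7

def has_bloc(live):
    """True if the live set contains at least one complete bloc."""
    return any(b <= live for b in BLOCS)

def has_majority(live):
    """True if the live set is a strict majority of the N nodes."""
    return len(live) > N // 2

def bloc_or_majority(live):
    """The combined rule that the comment argues is unsafe."""
    return has_bloc(live) or has_majority(live)

# Every pair of blocs shares a node, so blocs alone are a safe quorum rule.
blocs_intersect = all(a & b for a, b in combinations(BLOCS, 2))

left, right = frozenset({1, 2, 3}), frozenset({4, 5, 6, 7})
print(blocs_intersect)                                   # True
print(has_bloc(left), has_bloc(right))                   # True False
print(bloc_or_majority(left), bloc_or_majority(right))   # True True
```

Under the combined rule both sides of the partition would consider themselves a quorum, which is exactly the split-brain the bloc-only rule avoids.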
- matthewaveryusa an hour ago
  Ah, thanks for the insight. I was wondering why not ‘bloc || majority’ to get around scenario 5. Quorum has nothing to do with majority and everything to do with trust structure. My mind is blown.
MathiasPius 8 hours ago
I really enjoy it when someone injects a dose of "wacky" into something that is taken more or less for granted (Raft) to challenge the standard way of thinking about it.
This article flipped my understanding of split-brain and network partitions on its head: you don't actually have to have a majority to ensure progress, you just have to design your quorum selection criteria in such a way that no other partition believes it is authoritative, and these finite projective planes are an interesting way of proving that (with caveats).
senderista 4 hours ago
https://www.cs.yale.edu/homes/aspnes/pinewiki/QuorumSystems....
https://vukolic.com/QuorumsOrigin.pdf
https://link.springer.com/book/10.1007/978-3-031-02007-0
https://dl.acm.org/doi/10.1145/3447865.3457962
mjb an hour ago
This is cool, and a really fun reminder that "majority" isn't required for quorum systems (it just happens to be the simplest way of thinking about it, and optimal in some senses). Moving from majorities to some other definition of quorum isn't super practical all that often, but is an interesting tool when you think about systems that don't have a uniform probability of failure or disconnection. That's not infrequent - large scale networks have very variable amounts of redundancy depending on geography and distance.
The idea of non-MDS erasure codes isn't quite the same, but they're related in the way that MDS codes are the easiest to think about, and non-MDS codes come with interesting complexities while opening up some cool new options for system design and recovery.
Using "majority" as the criterion has been around for a long time (e.g. Gifford in '79 https://pages.cs.wisc.edu/~remzi/Classes/739/Fall2015/Papers..., and Thomas also in '79 https://dl.acm.org/doi/10.1145/320071.320076). Also related is the idea of weighted voting (e.g. Peleg and Wool in '95 https://www.sciencedirect.com/science/article/pii/S089054018...).
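As a rough illustration of non-uniform quorums, here is a simplified sketch in the spirit of weighted voting. It is not Gifford's read/write scheme: it assumes a single quorum type that needs strictly more than half of all votes, and the node names and weights are hypothetical.

```python
from itertools import combinations

# Hypothetical vote weights: one well-connected node gets extra votes.
WEIGHTS = {"a": 3, "b": 1, "c": 1, "d": 1, "e": 1}
TOTAL = sum(WEIGHTS.values())

def is_quorum(live):
    """A set is a quorum if it holds strictly more than half of all votes."""
    return 2 * sum(WEIGHTS[n] for n in live) > TOTAL

def quorums():
    """Enumerate every quorum under the weights above."""
    names = sorted(WEIGHTS)
    return [frozenset(c)
            for k in range(1, len(names) + 1)
            for c in combinations(names, k)
            if is_quorum(c)]

qs = quorums()
# The heavy node plus any one other node is already a quorum (2 of 5 nodes).
print(min(len(q) for q in qs))                      # 2
# Two sets that each hold more than half the votes must share a node.
print(all(p & q for p, q in combinations(qs, 2)))   # True
# The four light nodes together also form a quorum without "a".
print(is_quorum({"b", "c", "d", "e"}))              # True
```

The smallest quorum here is a minority by node count, yet every pair of quorums still intersects because the intersection argument runs over votes rather than nodes.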
vessenes 8 hours ago
Really enjoyed this, and learning a little bit of combinatorics at the same time.
As danbruc mentions below we also would really like our networks to only ever split into sets such that there is at most one set which could include a leader; otherwise we might have a more durable consensus split.
That said, algebraic structures are a tool for working with consensus problems, but there’s also process. Together we get consensus protocols. So, for example, you could have a healing process step that privileges the larger group and forces a merge even if at some moment you had two candidates that believed they were a valid leader for their own split network view.
- danbruc 7 hours ago
  Just to be clear, this is not a problem with this construction. As any two blocs overlap, there can be no split with a bloc on each side. But that is also the problem: containing a bloc is a relatively rare property for a subset. So while at first it seems that this is all great because you only need a few live nodes to potentially form a bloc, it turns out that a random set of nodes contains a bloc too rarely to buy you much, if anything. In the worst case, if you choose your blocs naively, you could have 99 of 100 nodes live and still not have a bloc.
  And for the merging, if you can do that, then why bother with consensus to begin with? The problem is that things that got committed are usually not just sitting in a database, they get read and acted upon. Webservice calls made, credit card transaction processed, parcel shipped, ... You can merge and undo commits in one database easily, controlling the ripple effects of those changes in other systems and the real world becomes impossible quickly.
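To put numbers on how often a set of live nodes contains a bloc, here is a small brute-force sketch. It assumes one common labelling of the Fano plane as the blocs, and it uses a made-up "star" system (every bloc is node 0 plus one other node) as a stand-in for a naive choice of blocs on 100 nodes.

```python
from fractions import Fraction
from itertools import combinations

# Fano plane lines under one common labelling (an assumption).
FANO = [frozenset(b) for b in (
    {1, 2, 3}, {1, 4, 5}, {1, 6, 7},
    {2, 4, 6}, {2, 5, 7}, {3, 4, 7}, {3, 5, 6},
)]

def contains_bloc(live, blocs):
    """True if the live set contains at least one complete bloc."""
    return any(b <= live for b in blocs)

def fraction_with_bloc(nodes, blocs, k):
    """Fraction of all k-node subsets that contain at least one bloc."""
    subsets = [frozenset(c) for c in combinations(nodes, k)]
    hits = sum(contains_bloc(s, blocs) for s in subsets)
    return Fraction(hits, len(subsets))

for k in range(3, 6):
    print(k, fraction_with_bloc(range(1, 8), FANO, k))
# 3 1/5
# 4 4/5
# 5 1

# Hypothetical naive system on 100 nodes: every bloc is {0, i}.
# All blocs intersect (in node 0), but losing node 0 alone blocks progress.
star = [frozenset({0, i}) for i in range(1, 100)]
live = frozenset(range(1, 100))
print(len(live), contains_bloc(live, star))  # 99 False
```

In the seven-node case only one in five 3-node subsets is a bloc, and one in five 4-node majorities (the complements of the blocs) contains none.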
ryanshrott 4 hours ago
This is neat, but I wonder how it handles asymmetric partitions. The Fano plane math assumes clean splits. Real networks don’t cooperate like that.
- smallerize 3 hours ago
  Asymmetric meaning that a node can receive messages, but can't send them?
oa335 8 hours ago
excellent article.
> The key correctness insight is this: any two majorities of nodes must overlap in at least one node. So between any two consecutive global state changes — whether two commits, two leader elections, or one of each — at least one node participated in both.
intuitively makes sense, but would be nice to see this result explicitly derived or illustrated the same way the fano planes were.
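For what it's worth, the majority case follows from a short counting argument. A sketch, assuming a cluster of n nodes and two majorities A and B, each with at least ⌊n/2⌋ + 1 members (the symbols are introduced here, not taken from the article):

```latex
\begin{align*}
|A \cap B| &= |A| + |B| - |A \cup B| \\
           &\ge 2\left(\left\lfloor n/2 \right\rfloor + 1\right) - n
             && \text{since } |A \cup B| \le n \\
           &\ge (n - 1) + 2 - n = 1
             && \text{since } 2\left\lfloor n/2 \right\rfloor \ge n - 1 .
\end{align*}
```

So any two majorities share at least one node, whether n is odd or even.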
PunchyHamster 3 hours ago
> our modified protocol is not guaranteed to make progress whenever a majority of nodes is active
That seems like a horrible tradeoff.
- teraflop an hour ago
  This isn't being proposed as a serious, useful version of Raft. It's just a thought experiment.
  The sentence you quote is inevitably going to be true for any type of Raft quorum that can reach consensus with a minority of nodes. You don't even need to get into the specifics of the math.
  Suppose you have a quorum Q. Then its complement Q' must not also be able to form a quorum; if it did, a network partition between Q and Q' would create a split-brain. So if Q is a minority subset, then Q' is a majority that cannot reach consensus on its own.
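The complement argument can be seen on a toy example. This sketch assumes a hypothetical 5-node cluster in which a quorum is any two of nodes 1, 2, 3 (nodes 4 and 5 never count); the rule and the function names are invented for illustration and are not from the article.

```python
NODES = frozenset(range(1, 6))  # hypothetical 5-node cluster
# Hypothetical quorum rule: any two of nodes 1, 2, 3.
QUORUMS = [frozenset(q) for q in ({1, 2}, {1, 3}, {2, 3})]

def can_decide(live):
    """True if the live set contains at least one quorum."""
    return any(q <= live for q in QUORUMS)

def is_majority(live):
    """True if the live set is a strict majority of all nodes."""
    return 2 * len(live) > len(NODES)

for q in QUORUMS:
    rest = NODES - q  # the complement Q' of the quorum Q
    print(sorted(q), sorted(rest), is_majority(rest), can_decide(rest))
# [1, 2] [3, 4, 5] True False
# [1, 3] [2, 4, 5] True False
# [2, 3] [1, 4, 5] True False
```

Each quorum is a 2-node minority, and each complement is a 3-node majority that cannot decide on its own.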
throwaway27448 4 hours ago
Private capital, undermining democracy as is typical.