¿Cuál es la diferencia entre Paxos y W+R>=N en Cassandra?

Dynamo-like databases (e.g. Cassandra) can enforce consistency by means of quorum, i.e. a number of synchronously written replicas (W) and a number of replicas to read (R) should be chosen in such a way that W+R>N where N is a replication factor. On the other hand, PAXOS-based systems like Zookeeper are also used as a consistent fault-tolerant storage.

What is the difference between these two approaches? Does PAXOS provide guarantees that are not provided by W+R>N schema?

preguntado el 28 de agosto de 12 a las 10:08

FWIW, Zookeeper isn't Paxos-based, it is a two phase commit protocol (sans aborts) with a separate custom leader election protocol when the master goes down. Granted, you can consider that as an implementation of Vertical Paxos, but in the end, all correct consensus algorithms can be mapped onto Paxos. -

5 Respuestas

Yes, Paxos provides guarantees that are not provided by the Dynamo-like systems and their read-write quorums. The difference is how failures are handled and what happens during a write. After a successful write, both kind of systems behave similarly. The data will be saved and available for reading afterwards (until overwritten or deleted) and so on.

The difference appears during a write and after failures. Until you get a successful answer from W nodes when writing something to the eventually consistent systems, then the data may have been written to some nodes and not to others and there is no guarantee that the whole system agrees on the current value. If you try to read the data back at this point, some clients may get the new data back and some the old data back. In other words, the system is not immediately consistent. This is because writes aren't atomic across nodes in these systems. There are usually mechanisms to "heal" an inconsistency like this and "eventually" the system will become consistent again (i.e. reads will once again always return the same value, until something new is written). This is the reason why they are often called "eventually consistent". Inconsistencies can (and will) appear, but they will always be dealt with and reconciled eventually.

With Paxos, writes can be made atomic across nodes and inconsistencies between nodes are therefore possible to avoid. The Paxos algorithm makes it possible to guarantee that non-faulty nodes never disagree on the outcome of a write, at any point in time. Either the write succeeded everywhere or nowhere. There will never be any inconsistent reads at any point (if it's correctly implemented and if all the assumptions hold, of course). This comes at a cost, however. Mainly, the system may need to delay some requests and be unavailable when for example too many nodes (or the communication between them) aren't working. This is necessary to assure that no inconsistent replies are given.

To summarize: the main difference is that the Dynamo-like systems can return inconsistent results during writes or after failures for some time (but will eventually recover from it), whereas Paxos based systems can guarantee that there are never any such inconsistencies by sometimes being unavailable and delaying requests instead.

Respondido 17 Jul 14, 21:07

Paxos is non-trivial to implement, and expensive enough that many systems using it use hints as well, or use it only for leader election, or something. However, it does provide guaranteed consistency in the presence of failures - subject of course to the limits of its particular failure model.

The first quorum based systems I saw assumed some sort of leader or transaction infrastructure that would ensure enough consistency that you could trust that the quorum mechanism worked. This infrastructure might well be Paxos-based.

Looking at descriptions such as https://cloudant.com/blog/dynamo-and-couchdb-clusters/, it would appear that Dynamo is no based on an infrastructure that guarantees consistency for its quorum system - so is it being very clever or cutting corners? According to http://muratbuffalo.blogspot.co.uk/2010/11/dynamo-amazons-highly-available-key.html, "The Dynamo system emphasizes availability to the extent of sacrificing consistency. The abstract reads "Dynamo sacrifices consistency under certain failure scenarios". Actually, later it becomes clear that Dynamo sacrifices consistency even in the absence of failures: Dynamo may become inconsistent in the presence of multiple concurrent write requests since the replicas may diverge due to multiple coordinators." (end quote)

So, it would appear that in the case of quorums as implemented in Dynamo, Paxos provides stronger reliability guarantees.

Respondido 08 Abr '19, 14:04

Paxos and the W+R>N quorum try to solve slightly different problems. Paxos is usually described as a way to replicate a state machine, but in fact it is more of a distributed log: each item written to the log gets an index, and the different servers eventually hold the same log items + their index. (Replicated state machine can be achieved by writing to the log the inputs to the state machine and each server replays the state machine on the agreed inputs according to their index). You can read more about Paxos in a blog post I wrote aquí.

The W+R>N quorum solves the problem of sharing a single value among multiple servers. In the academia it is called "shared register". A shared register has two operations: read and write, where we expect the read to return the value of the previous write.

So, Paxos and the W+R>N quorum live in different domains, and have different properties (e.g., Paxos saves an ordered list of items). However, Paxos can be used to implement a shared register, and a W+R>N quorum can be used to implement a distributed log (although, very inefficiently).

Saying all the above, sometimes the W+R>N quorums aren't implemented in their "fully robust" way, as it will require more than one communication round. Thus, in systems that want low latency, it is possible that their implementation of W+R>N quorums provide weaker properties (e.g., conflicting values can co exist).

To sum up, theoretically, Paxos and the W+R>N can achieve the same goals. Practically, it would be very inefficient, and each one is better for something slightly different. Even more practically, W+R>N isn't always implemented fully, thus scarifying some consistency properties for speed.

Noticias: Paxos supports a very general failure model: messages can be dropped, nodes can crash and restart. The W+R>N quorum scheme has dfferent implementations, many of which assume less general failures. So, the difference between the two also depends on the assumption on the possible failures that are supported.

Respondido el 05 de diciembre de 16 a las 21:12

There is no difference. The definition of a quorum says that any two quorums' intersection is not empty. Simple majority quorum is an example NOT a definition. Take a look at Dr. Lamport's later paper "Vertical Paxos", where he gave some other possible configuration of quorums.

Multi-decree paxos protocol (AKA Multi-Paxos), in steady state it's just two phase commit. Ballot number changes are only needed when the leader fails.

Zookeeper's replication protocol (ZAB) , and RAFT are all based on Paxos. The differences are in fault-detection and transition after a leader fails.

Respondido 13 Jul 15, 19:07

Raft is not based on Paxos as follows from the first paragraph of the "Raft paper" raft.github.io/raft.pdf - Ilia Silvestrov

As mentioned in other answers, in an R+W > N system, the writes are not atomic on all nodes which means that when a write is in progress (or during a write failure) some nodes will have newer values and some older ones. Take an example of a system where n=3, r=2, and w=2. For clarity let's assume the 3 nodes are named A, B, and C. Consider this scenario: a write is in progress; node A has been updated while B and C are still in process of receiving the updated value. Clients reading from A and B will see the newer value (resolved using version vectors or last write wins) and clients reading from B and C will see old values. This type of read is not considered linearizable. Such issues will not occur with proper linearizable systems such as Paxos or Raft.

Respondido el 25 de diciembre de 21 a las 16:12

Su respuesta podría mejorarse con información de apoyo adicional. Por favor editar para agregar más detalles, como citas o documentación, para que otros puedan confirmar que su respuesta es correcta. Puede encontrar más información sobre cómo escribir buenas respuestas. en el centro de ayuda. - Comunidad

No es la respuesta que estás buscando? Examinar otras preguntas etiquetadas or haz tu propia pregunta.