Data Consistency Concepts

Without a transaction (transaction isolation set to NONE), GemFire XD ensures FIFO consistency for table updates. Writes performed by a single thread are seen by all other threads in the order in which they were issued, but writes from different threads may be seen in a different order by other threads.
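As a minimal sketch of this behavior, the following JDBC snippet issues two updates from a single thread with no transaction in effect (NONE is the default isolation level, so the explicit call is shown only for clarity). The connection URL, port, and the flights table are illustrative placeholders, not values taken from this documentation.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;

    public class FifoUpdateExample {
        public static void main(String[] args) throws Exception {
            // Connection URL and port are illustrative; adjust them for your cluster.
            Connection conn = DriverManager.getConnection("jdbc:gemfirexd://localhost:1527/");

            // No transaction is in effect: NONE is the default isolation level,
            // and the explicit call simply makes that visible in the example.
            conn.setTransactionIsolation(Connection.TRANSACTION_NONE);
            conn.setAutoCommit(true);

            PreparedStatement ps = conn.prepareStatement(
                "UPDATE flights SET status = ? WHERE flight_id = ?");

            // Two writes issued by this single thread. FIFO consistency guarantees
            // that every other thread observes them in this order; writes issued by
            // different threads carry no such ordering guarantee relative to each other.
            ps.setString(1, "DELAYED");
            ps.setInt(2, 101);
            ps.executeUpdate();

            ps.setString(1, "BOARDING");
            ps.setInt(2, 102);
            ps.executeUpdate();

            ps.close();
            conn.close();
        }
    }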

When a table is partitioned across members of the distributed system, GemFire XD uniformly distributes the data set across members that host the table so that no single member becomes a bottleneck for scalability. GemFire XD ensures that a single member owns a particular row (identified by a primary key) at any given time. When a member that owns a particular row fails, the ownership of the row is transferred to an alternate member in a consistent manner so that all peer servers have a consistent view of the new owner.
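A sketch of how such a table might be created follows: the hypothetical flights table is partitioned by its primary key, and the REDUNDANCY 1 clause keeps one extra copy of each row so that an alternate member has a copy to take ownership of if the owning member fails. The connection URL and schema are illustrative.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class PartitionedFlightsTable {
        public static void main(String[] args) throws Exception {
            // Connection URL and port are illustrative; adjust them for your cluster.
            Connection conn = DriverManager.getConnection("jdbc:gemfirexd://localhost:1527/");
            Statement stmt = conn.createStatement();

            // Rows are spread across the hosting members by primary key, and
            // REDUNDANCY 1 keeps one extra copy of each row so that an alternate
            // member can take over ownership if the owning member fails.
            stmt.execute(
                "CREATE TABLE flights (" +
                "  flight_id INT NOT NULL PRIMARY KEY," +
                "  status    VARCHAR(20)" +
                ") PARTITION BY PRIMARY KEY REDUNDANCY 1");

            stmt.close();
            conn.close();
        }
    }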

It is the responsibility of the owning member to propagate row changes to configured replicas. Concurrent operations on the same row are serialized through the owning member before those operations are applied to replicas. In this way, all replicas see the row updates in the same order. For partitioned tables, GemFire XD ensures that all concurrent modifications to a given row on the owning member are atomic and isolated from each other, and that this total ordering is preserved across configured replicas. Keep in mind, however, that the update of a row or an index on a replica member is a non-atomic operation. If you update a primary row outside of a transaction, you are subject to an array of possible failure scenarios, as described in Failure Scenarios with Non-Atomic Operations.
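The serialization through the owning member can be pictured with two client threads racing to update the same row outside of a transaction. This sketch reuses the hypothetical flights table and connection URL from the earlier examples.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;

    public class ConcurrentRowUpdate {
        // Each thread updates the same row (primary key 101) outside of a transaction.
        static void updateStatus(String status) {
            try (Connection conn =
                     DriverManager.getConnection("jdbc:gemfirexd://localhost:1527/");
                 PreparedStatement ps = conn.prepareStatement(
                     "UPDATE flights SET status = ? WHERE flight_id = ?")) {
                ps.setString(1, status);
                ps.setInt(2, 101);
                // Both updates are routed to the member that owns row 101. The owner
                // applies them one at a time and forwards them to its replicas in the
                // same order, so every replica converges on the same final value.
                ps.executeUpdate();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }

        public static void main(String[] args) throws Exception {
            Thread t1 = new Thread(() -> updateStatus("DELAYED"));
            Thread t2 = new Thread(() -> updateStatus("CANCELLED"));
            t1.start();
            t2.start();
            t1.join();
            t2.join();
        }
    }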

GemFire XD uses an eager replication model between peers, propagating row changes from the owning member to each replica in parallel and synchronously. Each replica is responsible for processing the operation and responds with an acknowledgment (ACK). Only after receiving ACKs from all replicas does the owning member return control to the caller. This approach favors data availability and low latency for propagating data changes. Because changes are eagerly propagated to every replica, clients reading data can be load balanced to any of the replicas. The model assumes that network partitions are rare in practice, and that when they do occur within a clustered environment, the application ecosystem typically involves many distributed processes and applications, most of which are not designed to cope with partitioning problems.
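Reduced to its essentials, the fan-out-and-wait pattern described above looks roughly like the following. This is a conceptual sketch of eager, synchronous propagation, not GemFire XD's internal API; the Replica interface, the propagate() method, and the thread pool are hypothetical stand-ins.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.CompletableFuture;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class EagerReplicationSketch {

        // Hypothetical stand-in for a replica: apply() returns only once the
        // replica has processed (acknowledged) the operation.
        interface Replica {
            void apply(String rowChange);
        }

        private final List<Replica> replicas;
        private final ExecutorService pool = Executors.newCachedThreadPool();

        EagerReplicationSketch(List<Replica> replicas) {
            this.replicas = replicas;
        }

        // Called on the owning member after it has applied a row change locally.
        void propagate(String rowChange) {
            // Fan the change out to every replica in parallel (eager replication).
            List<CompletableFuture<Void>> acks = new ArrayList<>();
            for (Replica r : replicas) {
                acks.add(CompletableFuture.runAsync(() -> r.apply(rowChange), pool));
            }

            // Block until every replica has acknowledged; only then does control
            // return to the caller that issued the write.
            CompletableFuture.allOf(acks.toArray(new CompletableFuture[0])).join();
        }
    }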

Note: Other optimistic and "eventually consistent" replication schemes use lazy replication techniques that are designed to conserve bandwidth. These systems increase throughput by batching and lazily forwarding messages, and they resolve conflicts after the fact by incrementally reaching agreement on the final contents. This class of systems favors availability even in the presence of network partitions, but it either compromises data consistency for reads or makes those reads very expensive by requiring a read from every replica.