Data Consistency Concepts

Without a transaction (transaction isolation set to NONE), GemFire XD ensures FIFO consistency for table updates. Writes performed by a single thread are seen by all other threads in the order in which they were issued, but writes from different threads may be seen in a different order by other threads.
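To make the FIFO guarantee concrete, the following is a minimal sketch of a checker that tests whether one observer's view of a series of writes respects per-writer ordering. The class FifoCheck, its Write type, and the writer names are hypothetical and exist only for illustration; none of them are part of the GemFire XD API.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FifoCheck {
    /** One observed write: the issuing writer and its position in that writer's own sequence (hypothetical type). */
    public static final class Write {
        final String writer;
        final int seq;

        public Write(String writer, int seq) {
            this.writer = writer;
            this.seq = seq;
        }
    }

    /** Returns true if every writer's writes appear in the order that writer issued them. */
    public static boolean respectsFifo(List<Write> observed) {
        Map<String, Integer> lastSeen = new HashMap<>();
        for (Write w : observed) {
            Integer prev = lastSeen.put(w.writer, w.seq);
            if (prev != null && prev >= w.seq) {
                return false; // a later write from this writer was seen before an earlier one
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Two observers interleave T1 and T2 differently; both views are allowed under FIFO consistency.
        List<Write> observerA = List.of(new Write("T1", 1), new Write("T2", 1), new Write("T1", 2));
        List<Write> observerB = List.of(new Write("T2", 1), new Write("T1", 1), new Write("T1", 2));
        // This view reorders T1's own writes, which FIFO consistency does not allow.
        List<Write> reordered = List.of(new Write("T1", 2), new Write("T1", 1));

        System.out.println(respectsFifo(observerA));
        System.out.println(respectsFifo(observerB));
        System.out.println(respectsFifo(reordered));
    }
}
```

The first two views differ in how they interleave the two writers, yet both pass the check; only the view that reorders a single writer's updates fails.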

When a table is partitioned across members of the distributed system, GemFire XD uniformly distributes the data set across members that host the table so that no single member becomes a bottleneck for scalability. GemFire XD ensures that a single member owns a particular row (identified by a primary key) at any given time. When an owning member fails, the ownership of the row is transferred to an alternate member in a consistent manner so that all peer servers have a consistent view of the new owner.
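The single-owner rule and the transfer of ownership on failure can be illustrated with a toy model. The sketch below assumes a fixed number of buckets, each hosted by an ordered list of members in which the first entry is the owner and the rest are alternates. The class BucketOwnership, the bucket scheme, and the member names are assumptions made for illustration and do not describe how GemFire XD actually places or rebalances data.

```java
import java.util.ArrayList;
import java.util.List;

public class BucketOwnership {
    // For each bucket: hosting members in order, owner first, then alternates (hypothetical layout).
    private final List<List<String>> bucketHosts = new ArrayList<>();

    /** Spreads buckets round-robin over the members; copies must not exceed members.size(). */
    public BucketOwnership(List<String> members, int bucketCount, int copies) {
        for (int b = 0; b < bucketCount; b++) {
            List<String> hosts = new ArrayList<>();
            for (int c = 0; c < copies; c++) {
                hosts.add(members.get((b + c) % members.size()));
            }
            bucketHosts.add(hosts);
        }
    }

    /** Exactly one member owns the row identified by this primary key at any given time. */
    public synchronized String ownerOf(Object primaryKey) {
        int bucket = Math.floorMod(primaryKey.hashCode(), bucketHosts.size());
        List<String> hosts = bucketHosts.get(bucket);
        if (hosts.isEmpty()) {
            throw new IllegalStateException("no surviving copy for bucket " + bucket);
        }
        return hosts.get(0);
    }

    /** On failure, the next alternate in each affected bucket becomes the owner. */
    public synchronized void memberFailed(String member) {
        for (List<String> hosts : bucketHosts) {
            hosts.remove(member);
        }
    }

    public static void main(String[] args) {
        BucketOwnership table = new BucketOwnership(List.of("A", "B", "C"), 6, 2);
        System.out.println(table.ownerOf(0)); // owner before the failure
        table.memberFailed("A");
        System.out.println(table.ownerOf(0)); // owner after the failure
    }
}
```

Because both methods are synchronized on the same object, every caller in this model observes the same owner before and after a failure, and rows whose owner did not fail keep their owner.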

It is the responsibility of the owning member to propagate row changes to configured replicas. All concurrent operations on the same row are serialized through the owning member before the operations are applied to replicas. All replicas see the row updates in the exact same order. Essentially, for partitioned tables GemFire XD ensures that all concurrent modifications to a given row on the owning member are atomic and isolated from each other, and that the 'total ordering' is preserved across configured replicas. Keep in mind, however, that the update of the row on a replica member and the update of an index are non-atomic operations. If you update a primary row outside of a transaction, you are subject to an array of possible failure scenarios as described in Failure Scenarios with Non-Atomic Operations.

Operations are propagated in parallel from the owning member to all configured replicas. Each replica is responsible for processing the operation, and it responds with an acknowledgment (ACK). Only after receiving ACKs from all replicas does the owning member return control to the caller. This ensures that all operations that are sequentially carried out by a single thread are applied to all replicas in the same order.
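The propagation protocol described above can be sketched as follows. This is a simplified in-process model, not GemFire XD code: the classes EagerReplication, Owner, and Replica are hypothetical, a thread pool stands in for the network, and a single lock on the owner stands in for serialization of operations on a row.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class EagerReplication {
    /** A replica records updates in the order in which it applied them (hypothetical type). */
    public static class Replica {
        private final List<String> log = new ArrayList<>();

        /** Applies the update and returns true as the acknowledgment (ACK). */
        public synchronized boolean apply(String update) {
            log.add(update);
            return true;
        }

        public synchronized List<String> log() {
            return new ArrayList<>(log);
        }
    }

    /** The owning member: serializes updates and propagates each one to every replica. */
    public static class Owner {
        private final List<Replica> replicas;
        private final ExecutorService pool;

        public Owner(List<Replica> replicas) {
            this.replicas = replicas;
            this.pool = Executors.newFixedThreadPool(Math.max(1, replicas.size()));
        }

        /** Sends the update to all replicas in parallel and returns only after every ACK has arrived. */
        public synchronized void update(String update) {
            List<CompletableFuture<Boolean>> acks = new ArrayList<>();
            for (Replica replica : replicas) {
                acks.add(CompletableFuture.supplyAsync(() -> replica.apply(update), pool));
            }
            // Block the caller until all replicas have acknowledged this update.
            CompletableFuture.allOf(acks.toArray(new CompletableFuture<?>[0])).join();
        }

        public void shutdown() {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Replica r1 = new Replica();
        Replica r2 = new Replica();
        Owner owner = new Owner(List.of(r1, r2));
        owner.update("v1");
        owner.update("v2");
        owner.shutdown();
        System.out.println(r1.log().equals(r2.log()));
    }
}
```

Because update holds the owner's lock until every ACK has arrived, the next update cannot be dispatched before the previous one is applied everywhere, which is why both replica logs end up in the same order in this model.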

Several other optimistic and eventually consistent replication schemes use lazy replication techniques that are designed to conserve bandwidth and to increase throughput by batching and lazily forwarding messages. In these schemes, conflicts are discovered after they happen, and replicas reach agreement on the final contents incrementally. This class of systems favors availability even in the presence of network partitions, but it either compromises consistency on reads or makes reads very expensive by reading from each replica.

GemFire XD instead uses an eager replication model between peers, propagating changes to each replica in parallel and synchronously. This approach favors data availability and low latency for propagating data changes. Because changes are eagerly propagated to every replica, clients that read data can be load balanced to any of the replicas. GemFire XD assumes that network partitions are rare in practice. When they do occur within a clustered environment, the application ecosystem is typically dealing with many distributed processes and applications, most of which are not designed to cope with partitioning problems.