Because there is no atomicity for multiple operations that occur outside of a
transaction, different GemFire XD member failure scenarios can lead to data inconsistencies
that are hard to diagnose. Some potential failure scenarios and their consequences are
described in this section.
- Partitioned table updates with secondaries. When a client sends an update operation
to a partitioned table with redundancy configured, GemFire XD first puts the
updated value on the member that holds the primary partition for the table.
Afterwards, the insert is applied to an internal global index, which may reside
on a different node. The insert is then applied to any secondaries that are
configured for the partition. Without a transaction, each of these operations
can fail independently of the others, which may leave the data inconsistent.
For example, a failure may occur after updating the global index but
before updating secondary partitions, which leads to an inconsistency
between the table and its index. (Retrying the update to apply to
secondaries fails with an EntryExistsException, because the GemFire XD
internal global index already contains the entry.) The client itself may
also fail along with one or more servers before having the opportunity to
retry the operation, which can lead to inconsistencies. In either case, the
table and index remain inconsistent, which can lead to incorrect foreign key
checks for future operations against the table.
If the update operation in this scenario involved GemFire XD deleting an entry
from its global index, then failures can result in the entry being deleted from
the index but not from secondary partitions. Retrying the delete results in no
object being found, and the delete fails without an error.
In any of the above failure scenarios, the update or delete operation would also
fail to be replicated over a WAN configuration, leading to inconsistencies
between multiple GemFire XD clusters as well as inconsistencies within the
local cluster.
- Tables with unique constraints on a non-partitioned column. With
partitioned tables, if a unique constraint is created on a column that is not
used for partitioning the table, GemFire XD once again uses an internal global
index to manage updates to the column. Updates to the column itself and the
internal global index are again treated as separate operations, and failures or
concurrent operations can result in the table holding data that violates the
unique constraint.
- Parent/Child table updates when tables are not colocated on the foreign key column.
When parent/child tables are not colocated on the foreign key column, GemFire XD
again uses an internal index when managing updates. An insert operation on a
child table may look up the foreign key of a parent table in the internal global
index. If the parent key exists in that index at the time of the insert, then
the insert to the child succeeds. However, a concurrent delete operation on the
parent table could scan the child table at a time before the child insert
operation has completed in the table, so the delete operation would also
succeed. As a result, both tables can contain values that violate the
parent-child constraint between the two tables.
- Bulk DML operations. Bulk DML operations with overlapping qualifying rows
can update or delete incorrect rows. See Atomicity for Bulk Updates.
- Automatic Retry and Consistency. GemFire XD client drivers are designed to
detect internal system failures and retry a failed operation on any available
redundant copy. However, retries can result in non-idempotent behavior in these
cases:
  - When a trigger is fired a second time, any changes to the data that were
    made from within the trigger may be applied twice.
  - Accumulative expressions in statements (such as setting a column "x" to the
    value "x+1") can result in the column value being incremented twice
    after an automatic retry.
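The retry hazard for accumulative expressions can be illustrated with a small
toy model. This is only a sketch and does not use any GemFire XD API: the
RetryDemo class, the Cell type, and the applyWithLostAck method are hypothetical
names, and the sketch assumes one particular failure, namely that the first
attempt is applied but its acknowledgement never reaches the client, so the
operation is sent again.

```java
import java.util.function.IntUnaryOperator;

/** Toy model of an automatic retry; hypothetical, not GemFire XD driver code. */
final class RetryDemo {
    /** Stands in for a single column value stored on a server. */
    static final class Cell {
        int value;
        Cell(int value) { this.value = value; }
    }

    /**
     * Applies the operation, then simulates a failure that hides the first
     * acknowledgement from the client. The retry applies the operation again.
     * Returns the value stored after the retry has been acknowledged.
     */
    static int applyWithLostAck(Cell cell, IntUnaryOperator op) {
        int attempts = 0;
        boolean acknowledged = false;
        while (!acknowledged) {
            cell.value = op.applyAsInt(cell.value); // the change is applied
            attempts++;
            acknowledged = attempts > 1;            // assumed: first ack is lost
        }
        return cell.value;
    }
}
```

In this model, an accumulative operation such as x -> x + 1 on a starting value
of 10 ends at 12 rather than 11, while an idempotent assignment such as x -> 5
ends at 5 no matter how many times it is applied.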
Best Practices for Avoiding Data Inconsistencies
Each of the above failure scenarios can cause data inconsistency problems that may be
hard to detect, and that cannot be resolved by simply retrying operations from the
client. Always keep in mind the potential pitfalls associated with using non-atomic
operations, and use transactions where necessary to preserve the integrity of your
data. To ensure data consistency at all times, always perform the following
operations from within a transaction with REPEATABLE_READ isolation:
- Insert, update, or delete operations on rows in tables with unique or
foreign key constraints.
- Operations that invoke triggers.
- Statements that use accumulative expressions (such as setting a column "x" to
the value "x+1").
- Bulk DML operations.
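As a minimal sketch of running such operations inside a transaction, the
following helper uses only the standard JDBC Connection interface. The TxHelper
class, the Work callback, and the inRepeatableRead method are hypothetical names
introduced for illustration. The sketch also assumes that the driver maps the
standard Connection.TRANSACTION_REPEATABLE_READ constant to the REPEATABLE_READ
isolation level described above, and that the connection is not already inside
a transaction when the helper is called.

```java
import java.sql.Connection;
import java.sql.SQLException;

/** Hypothetical helper; wraps a unit of work in one JDBC transaction. */
final class TxHelper {
    /** Hypothetical callback holding the statements that must run atomically. */
    interface Work {
        void run(Connection conn) throws SQLException;
    }

    static void inRepeatableRead(Connection conn, Work work) throws SQLException {
        boolean previousAutoCommit = conn.getAutoCommit();
        // Set the isolation level before the transaction starts.
        conn.setTransactionIsolation(Connection.TRANSACTION_REPEATABLE_READ);
        conn.setAutoCommit(false);
        try {
            work.run(conn);   // for example, an UPDATE that sets x = x + 1
            conn.commit();    // all statements take effect together
        } catch (SQLException | RuntimeException e) {
            conn.rollback();  // none of the statements take effect
            throw e;
        } finally {
            conn.setAutoCommit(previousAutoCommit);
        }
    }
}
```

A caller would pass the inserts, updates, deletes, or bulk DML statements as the
Work callback, so that they either all commit or all roll back. Whether it is
safe to run the whole callback again after a rollback depends on the application.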
In addition, always configure partitioned tables with at least one redundant