Overview of Disk Stores
Disk stores support two options, overflow and persistence, which can be used individually or together. Overflow uses disk stores as an extension of in-memory table management for both partitioned and replicated tables. Persistence stores a redundant copy of all table data in disk stores.
See Evicting Table Data from Memory for more information about configuring tables to overflow to disk.
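As a rough illustration of the overflow option, the following sketch creates a table that keeps a bounded number of rows in memory and overflows the rest to disk. The table name, columns, and the LRUCOUNT limit are hypothetical values chosen for the example; check the CREATE TABLE reference for the eviction settings that apply to your release.

```sql
-- Hypothetical table: once more than 1000 rows are held in memory,
-- least-recently-used rows overflow to disk instead of being destroyed.
-- No disk store is named here, so the default disk store is assumed.
CREATE TABLE APP.ORDER_HISTORY
(
  ORDER_ID INT NOT NULL PRIMARY KEY,
  ITEM_ID  INT
)
PARTITION BY PRIMARY KEY
EVICTION BY LRUCOUNT 1000 EVICTACTION OVERFLOW;
```

Because this table specifies overflow without persistence, the disk store acts only as an extension of memory and is not a recoverable copy of the table data.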
Shared-Nothing Disk Store Design
Individual GemFire XD peers that host table data manage their own disk store files, completely separate from the disk store files of any other member. When you create a disk store, you can define properties that specify where and how each GemFire XD peer should manage disk store files on its local filesystem.
GemFire XD supports persistence for replicated and partitioned tables. The disk store mechanism is designed for locally-attached disks. Each peer manages its own local disk store files, and does not share any disk artifacts with other members of the cluster. This shared-nothing design eliminates the process-level contention that is normally associated with traditional clustered databases. Disk stores use rolling, append-only log files to avoid disk seeks completely. No complex B-Tree data structures are stored on disk; instead, GemFire XD always assumes that complex query navigation is performed using in-memory indexes.
Disk stores also support the GemFire XD data rebalancing model. When you increase or decrease capacity by adding or removing peers in a cluster, the disk data also relocates itself as necessary.
Data Types for Disk Storage
- Table data. Persist and/or overflow table data managed in GemFire XD peers.
- Gateway sender queues. Persist gateway sender queues for high availability in a WAN deployment. These queues always overflow, and can be persistent.
- AsyncEventListener and DBSynchronizer queues. Persist these queues for high availability. These queues always overflow, and can be persistent.
Creating Disk Stores and Using the Default Disk Store
Create named disk stores in the data dictionary using the CREATE DISKSTORE DDL statement. You can then specify named disk stores for individual tables in the CREATE TABLE DDL statements for persistence and/or overflow. You can store data from multiple tables and queues in the same named disk store. See Guidelines for Designing Disk Stores.
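The following sketch shows one way the two statements might fit together: a named disk store is created first, and a table then references it for persistence. The store name, directory path, size values, and table definition are all hypothetical; only a few of the available CREATE DISKSTORE options are shown.

```sql
-- Hypothetical named disk store. MAXLOGSIZE is in megabytes; the
-- directory path and its size limit (also in megabytes) are
-- placeholders to adapt to each member's local filesystem.
CREATE DISKSTORE ORDERS_STORE
  MAXLOGSIZE 128
  AUTOCOMPACT TRUE
  ('/data/gfxd/orders_store' 10240);

-- Hypothetical table that persists its data to the named disk store.
CREATE TABLE APP.ORDERS
(
  ORDER_ID INT NOT NULL PRIMARY KEY,
  CUSTOMER VARCHAR(64)
)
PARTITION BY PRIMARY KEY
PERSISTENT 'ORDERS_STORE' ASYNCHRONOUS;
```

Additional tables can name the same disk store in their own PERSISTENT clauses if you want them to share the same set of files.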
Tables that do not name a disk store but specify persistence or overflow in their CREATE TABLE statement use the default disk store. The location of the default disk store is determined by the value of the sys-disk-dir boot property. The default disk store is named GFXD-DEFAULT-DISKSTORE.
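For comparison, a table that requests persistence without naming a disk store might look like the sketch below. The table and column names are hypothetical.

```sql
-- Hypothetical replicated table. PERSISTENT is specified with no
-- disk store name, so the data is written to GFXD-DEFAULT-DISKSTORE
-- in the directory given by the sys-disk-dir boot property.
CREATE TABLE APP.COUNTRY_CODES
(
  CODE CHAR(2) NOT NULL PRIMARY KEY,
  NAME VARCHAR(64)
)
REPLICATE
PERSISTENT;
```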
Gateway sender queues, AsyncEventListener queues, and DBSynchronizer queues can also be configured to use a named disk store. The default disk store is used if you do not specify a named disk store when creating the queue. See CREATE GATEWAYSENDER or CREATE ASYNCEVENTLISTENER.
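As a hedged sketch of a queue that uses a named disk store, the statement below creates a persistent gateway sender queue. The sender name, remote distributed system ID, disk store name, and server group are hypothetical, and the option names should be verified against the CREATE GATEWAYSENDER reference for your release. The example assumes that a disk store named QUEUE_STORE already exists.

```sql
-- Hypothetical gateway sender whose queue is persisted to a named
-- disk store. Omitting DISKSTORENAME would use the default disk store.
CREATE GATEWAYSENDER WAN_SENDER
(
  REMOTEDSID 2
  ENABLEPERSISTENCE TRUE
  DISKSTORENAME QUEUE_STORE
)
SERVER GROUPS (SENDER_GROUP);
```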
Peer Client Considerations for Persistent Data
Peer clients (clients started using the host-data=false property) do not use disk stores and can never persist the GemFire XD data dictionary. Instead, peer clients rely on other data stores or locators in the distributed system for persisting data. If you use a peer client to execute DDL statements that require persistence and there are no data stores available in the distributed system, GemFire XD throws a data store unavailable exception (SQLState: X0Z08).
You must start locators and data stores before starting peer clients in your distributed system. If you start a peer client as the first member of a distributed system, the client initializes an empty data dictionary for the distributed system as a whole. Any data store that subsequently attempts to join the system conflicts with the empty data dictionary and fails to start with a ConflictingPersistentDataException.