The container for data that determines its storage site (or sites when there is redundancy), and the unit of migration for rebalancing.
A relationship between two tables whereby the buckets that correspond to the same values of their partitioning fields are guaranteed to be physically located in the same server or peer client. In GemFire XD, a table configured to be colocated with another table has a dependency on the other table. If the other table needs to be dropped, then the colocated tables must be dropped first.
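The colocation guarantee can be pictured with a toy model. This is only a sketch, not GemFire XD's actual placement logic: the bucket count, the server names, the CRC32 hash, and the round-robin bucket-to-server mapping are all assumptions chosen for illustration.

```python
import zlib

NUM_BUCKETS = 8                                  # hypothetical bucket count
SERVERS = ["server-a", "server-b", "server-c"]   # hypothetical member names


def bucket_for(value):
    """Map a partitioning-field value to a bucket id (illustrative hash only)."""
    return zlib.crc32(str(value).encode("utf-8")) % NUM_BUCKETS


def server_for(bucket_id):
    """Map a bucket id to a server; colocated tables share this mapping."""
    return SERVERS[bucket_id % len(SERVERS)]


def place(rows, partitioning_field):
    """Group rows by the server that would host their bucket."""
    placement = {}
    for row in rows:
        server = server_for(bucket_for(row[partitioning_field]))
        placement.setdefault(server, []).append(row)
    return placement


customers = [{"customer_id": i, "name": f"customer-{i}"} for i in range(5)]
orders = [{"order_id": 100 + i, "customer_id": i % 5} for i in range(10)]

customer_placement = place(customers, "customer_id")
order_placement = place(orders, "customer_id")
```

Because both tables use the same partitioning value and the same mappings, every order lands on the server that holds its customer, so a join on `customer_id` would not need to move rows between members in this model.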
A server or peer client process that is connected to the distributed system and has the host-data property set to true. A data store is automatically part of the default server group, and may be configured to be part of other server groups.
A colocation relationship that is set up automatically between tables when there is no COLOCATE WITH clause in the CREATE TABLE statement.
The anonymous server group that implicitly includes all servers in the distributed system. This is the server group that hosts the data for a table where there is no SERVER GROUPS clause in the CREATE TABLE statement, and there were no server groups specified in the CREATE SCHEMA statement for the schema that this table belongs to.
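The fallback described above can be sketched as a small lookup. The function name, the `"<default>"` label, and the assumption that table-level groups take precedence over schema-level groups are illustrative choices, not product API.

```python
DEFAULT_SERVER_GROUP = "<default>"  # hypothetical label for the anonymous group


def hosting_server_groups(table_groups=None, schema_groups=None):
    """Return the server groups that would host a table's data.

    Assumed order: groups from the table's SERVER GROUPS clause, else groups
    given for the schema in CREATE SCHEMA, else the default server group.
    """
    if table_groups:
        return list(table_groups)
    if schema_groups:
        return list(schema_groups)
    return [DEFAULT_SERVER_GROUP]


groups_for_plain_table = hosting_server_groups()
groups_from_schema = hosting_server_groups(schema_groups=["orders_sg"])
```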
A typical GemFire XD deployment is made up of a number of distributed processes that connect to each other to form a peer-to-peer network. These member processes may or may not host any data. The JDBC client process and all servers are always peer members of the distributed system. The members discover each other dynamically, either through a built-in multicast-based discovery mechanism or, when TCP is preferable, through the GemFire XD locator service. A distributed system is sometimes also referred to as a GemFire XD cluster.
A parallel SQL query engine that can read and write data to HDFS. HAWQ uses standards-compliant SQL. GemFire XD uses the PXF driver installed with HAWQ to provide HDFS table data to HAWQ external tables.
Hadoop Distributed File System. GemFire XD supports the HDFS implementation provided with Pivotal HD Enterprise.
Memory allocated for use by the JVM. Heap memory undergoes garbage collection.
Horizontal partitioning refers to partitioning strategies where a table is split by rows so that a bucket always contains entire rows. Vertical partitioning refers to strategies where a table is split by columns so that a bucket always contains entire columns. GemFire XD currently only supports horizontal partitioning strategies.
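The difference between the two approaches can be shown on a tiny in-memory table. This is a conceptual sketch only; the sample rows and the modulo rule for picking a bucket are assumptions made for the example.

```python
rows = [
    {"id": 1, "name": "Ann", "city": "Oslo"},
    {"id": 2, "name": "Bo", "city": "Lima"},
    {"id": 3, "name": "Cy", "city": "Pune"},
]


def split_horizontally(table_rows, num_buckets):
    """Split by rows: every bucket holds complete rows."""
    buckets = [[] for _ in range(num_buckets)]
    for row in table_rows:
        buckets[row["id"] % num_buckets].append(row)
    return buckets


def split_vertically(table_rows):
    """Split by columns: every bucket holds one complete column."""
    columns = {}
    for row in table_rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


horizontal = split_horizontally(rows, 2)
vertical = split_vertically(rows)
```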
A partitioning strategy based on specified lists of values of one or more fields. For example, a table could be list-partitioned on a string-valued field so that all the values for a specified list of string values are placed in the same bucket.
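As a rough illustration of list partitioning, the sketch below maps each listed string value to the bucket of its list. The value lists are invented for the example, and returning `None` for an unlisted value is a placeholder: the text above does not say how such values are handled.

```python
# Hypothetical value lists; each list corresponds to one bucket.
VALUE_LISTS = [
    ["UK", "DE", "FR"],  # bucket 0
    ["US", "CA"],        # bucket 1
    ["JP", "IN"],        # bucket 2
]

# Reverse index from value to bucket id.
VALUE_TO_BUCKET = {
    value: bucket_id
    for bucket_id, values in enumerate(VALUE_LISTS)
    for value in values
}


def list_bucket(value):
    """Return the bucket id for a listed value, or None if it is not listed."""
    return VALUE_TO_BUCKET.get(value)
```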
A locator facilitates discovery of all members in a distributed system. It is a component that maintains a registry of all peer members in the distributed system at any given moment. Though typically started as a separate process (with redundancy for HA), a locator can also be embedded in any peer member (such as a server). The locator opens a TCP port, and all new members connect to it to get initial membership information for the distributed system.
Memory that is not part of the JVM's allocated heap but is allocated upon server startup for data storage. Off-heap memory is not managed by JVM garbage collection processes.
A table that manages large volumes of data by partitioning it into manageable chunks and distributing it across all the servers in its hosting server groups. Partitioning attributes, including the partitioning strategy, can be specified by supplying a PARTITION BY clause in a CREATE TABLE statement. See also replicated table, partitioning strategy.
The policy used to determine the specific bucket for each row in a partitioned table. GemFire XD currently only supports horizontal partitioning, so an entire row is stored in the same bucket. You can hash-partition a table based on its primary key or on an internally-generated unique row id if the table has no primary key. Other partitioning strategies can be specified in the PARTITION BY clause in a CREATE TABLE statement. The strategies that are supported by GemFire XD include hash-partitioning on columns other than the primary key, range-partitioning, and list-partitioning.
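The hash-partitioning rule, including the fallback to a generated row id, can be sketched as follows. The bucket count, the CRC32 hash, and the counter used as a row id generator are assumptions for illustration and do not reflect GemFire XD internals.

```python
import itertools
import zlib

NUM_BUCKETS = 8                  # hypothetical bucket count
_row_ids = itertools.count(1)    # stand-in for an internal row id generator


def partitioning_value(row, primary_key=None):
    """Use the primary key value if there is one, else a generated row id."""
    if primary_key is not None:
        return row[primary_key]
    return next(_row_ids)


def hash_bucket(value):
    """Hash a partitioning value onto a bucket id (illustrative hash only)."""
    return zlib.crc32(repr(value).encode("utf-8")) % NUM_BUCKETS


keyed_bucket = hash_bucket(partitioning_value({"id": 7, "qty": 3}, "id"))
unkeyed_bucket = hash_bucket(partitioning_value({"qty": 3}))
```

In this model a row with a primary key always maps to the same bucket, while rows without one are spread by whatever id they happen to receive.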
Also known as the embedded client, this is a process that is connected to the distributed system using the GemFire Peer Driver. The member may or may not host any data depending on the configuration property host-data. By default, all peer clients will host data. Configuration describes how this property can be set at connection time. Essentially, the peer client can be configured to just be a "pure" client or can be a client as well as a data store. When hosting data, the member can be part of one or more server groups.
JDBC driver packaged in gemfirexd.jar. The client connects to the distributed system using the GemFire XD driver with the URL jdbc:gemfirexd: and doesn't specify a host and port in the URL. This driver provides single-hop access to all the data managed in the distributed members. (The GemFire XD JDBC thin-client driver also supports one-hop access for lightweight client applications.)
A driver plug-in that enables HAWQ to query HDFS table data as an external table. The PXF driver is installed with HAWQ.
The process that executes the query and determines the overall plan. It may distribute the query to the appropriate servers that host the data. When using a peer client, the query coordinator is the peer client itself. When using a thin client, the query coordinator is the server member to which the client is connected.
A partitioning strategy based on specified contiguous ranges of values of one or more fields. For example, a table could be range-partitioned on a date field so that all the values within a range of years are placed into the same bucket.
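A minimal sketch of range partitioning on a year value follows. The boundaries are invented, each range is treated as including its lower bound and excluding its upper bound, and `None` is returned for out-of-range values only as a placeholder, since the text above does not describe that case.

```python
import bisect

# Hypothetical boundaries defining the contiguous ranges
# [1990, 2000), [2000, 2010) and [2010, 2020).
BOUNDARIES = [1990, 2000, 2010, 2020]


def range_bucket(year):
    """Return the index of the range containing year, or None if outside."""
    if year < BOUNDARIES[0] or year >= BOUNDARIES[-1]:
        return None
    # bisect_right counts the boundaries that are <= year.
    return bisect.bisect_right(BOUNDARIES, year) - 1
```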
A table that keeps a copy of its entire dataset locally on every data store in its server groups. GemFire XD creates replicated tables by default if you do not specify a PARTITION BY clause. See also partitioned table.
A JVM started with the gfxd server command, or any JVM that calls the FabricServer.start method. A GemFire XD server may or may not also be a data store, and may or may not also be a network server.
A process that is not part of the distributed system but is connected to the distributed system through a thin driver. The thin client connects to a single server in the distributed system, which in turn may delegate requests to other members of the distributed system. JDBC thin clients can also be configured to provide one-hop access to data for lightweight client applications.
The JDBC thin driver bundled in the product (gemfirexdclient.jar). A process that is not part of the distributed system uses this driver to connect to it. The connection URL for this driver is of the form jdbc:gemfirexd://hostname:port/.
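The two connection URL forms mentioned in this glossary can be compared side by side. The helper names below are hypothetical, and the host name and port in the usage lines are placeholders rather than recommended values.

```python
def peer_url():
    """URL form for the peer driver: no host or port is given."""
    return "jdbc:gemfirexd:"


def thin_client_url(hostname, port):
    """URL form for the thin driver: names the server to connect to."""
    return f"jdbc:gemfirexd://{hostname}:{port}/"


peer = peer_url()
thin = thin_client_url("myhost", 1527)  # placeholder host and port
```

The visible difference is that only the thin-client URL carries a host and port, which matches the thin client connecting to a single server rather than joining the distributed system as a peer.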