Column Definition

The column definition defines the name of a column and its data-type.

Syntax

column-name data-type
    [ column-constraint ] *
    [ [ WITH ] DEFAULT { constant-expression | NULL } 
      | [ GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY
          [ ( START WITH integer-constant [, INCREMENT BY integer-constant ] ) ] ] ]
    [ column-constraint ] *

The set of supported data-types are detailed in Data Types. Identity columns must be of type BIGINT or INTEGER.

Identity Columns

GemFire XD supports both GENERATED ALWAYS and GENERATED BY DEFAULT identity columns only for BIGINT and INTEGER data types. The START WITH and INCREMENT BY clauses are supported only for GENERATED BY DEFAULT identity columns. Creating an identity column does not create an index on the column.

Note: DBSynchronizer does not apply the generated identity column value to a synchronized database. DBSynchronizer expects that the back-end database table also has an identity column, and that the database generates its own identity value.

For a GENERATED ALWAYS identity column, GemFire XD increments the default value on every insertion, and stores the incremented value in the column. You cannot insert a value directly into a GENERATED ALWAYS identity column, and you cannot update a value in a GENERATED ALWAYS identity column. Instead, you must either specify the DEFAULT keyword when inserting data into the table or you must leave the identity column out of the insertion column list.

Consider a table with the following column definition:

create table greetings (i int generated always as identity, ch char(50));

You can insert rows into the table using either the DEFAULT keyword, or by omitting the identity column from the INSERT statement:

insert into greetings values (DEFAULT, 'hello');
insert into greetings(ch) values ('hi');

The values that GemFire XD automatically generates for a GENERATED ALWAYS identity column are unique.

For a GENERATED BY DEFAULT identity column, GemFire XD increments and uses a default value for an INSERT only when no explicit value is given. To use the generated default value, either specify the DEFAULT keyword when inserting into the identity column, or leave the identity column out of the INSERT column list.

In contrast to GENERATED ALWAYS identity columns, with a GENERATED BY DEFAULT column you can specify an identity value to use instead of the generated default value. To specify a value, include it in the INSERT statement.

For example, consider a table created using the statement:
create table greetings (i int generated by default as identity, ch char(50)); 
The following statement specifies the value "1" for the identity column:
insert into greetings values (1, 'hi'); 
These statements both use generated default values:
insert into greetings values (DEFAULT, 'hello');
insert into greetings(ch) values ('bye');

Although the automatically-generated values in a GENERATED BY DEFAULT identity column are unique, a GENERATED BY DEFAULT column does not guarantee unique identity values for all rows in the table. For example, in the above statements the rows containing "hi" and "hello" both have an identity value of "1." This occurs because the generated column starts at "1" and the user-specified value was also "1."

To avoid duplicating identity values (for example, during an import operation), you can use the START WITH clause to specify the first identity value that GemFire XD should assign and increment. Or, you can use a primary key or a unique constraint on the GENERATED BY DEFAULT identity column to check for and disallow duplicates.

By default, the initial value of a GENERATED BY DEFAULT identity column is 1, and the value is incremented by 1 for each INSERT. Use the optional START WITH clause to specify a new initial value. Use the optional INCREMENT BY clause to change the increment value used during each INSERT.

See also Auto-Generated Column Limitations for information about limitations with identity columns.

CONSTRAINT Clause

A CONSTRAINT clause is an optional part of a CREATE TABLE or ALTER TABLE statement that defines a rule to which table data must conform.

Note: Within the scope of a transaction, GemFire XD automatically initiates a rollback if it encounters a constraint violation. Errors that occur while parsing queries (such as syntax errors) or while binding parameters in a SQL statement do not cause a rollback.
There are two types of constraint:
  • Column-level constraints refer to a single column in the table and do not specify a column name (except check constraints). They refer to the column that they follow.
  • Table-level constraints refer to one or more columns in the table. Table-level constraints specify the names of the columns to which they apply. Table-level CHECK constraints can refer to 0 or more columns in the table.
Column and table constraints include:
  • NOT NULL— Specifies that a column cannot hold NULL values (constraints of this type are not nameable).
  • PRIMARY KEY— Specifies a column (or multiple columns if specified in a table constraint) that uniquely identifies a row in the table. The identified columns must be defined as NOT NULL.
  • UNIQUE— Specifies that values in the column must be unique. NULL values are not allowed.
  • FOREIGN KEY— Specifies that the values in the columns must correspond to values in referenced primary key or unique columns or that they are NULL.

    If the foreign key consists of multiple columns and any column is NULL, then the whole key is considered NULL. GemFire XD permits the insert no matter what is on the non-null columns.

    GemFire XD supports only the ON DELETE RESTRICT clause for foreign key references, and can be optionally specified that way. GemFire XD checks dependent tables for foreign key constraints. If any row in a dependent table violates a foreign key constraint, the transaction is rolled back and an exception is thrown. (GemFire XD does not support cascade deletion.)

    Note: GemFire XD implicitly creates an index on child table columns that define a foreign key reference. If you attempt to create an index on that same columns manually, you will receive a warning to indicate that the index is a duplicate of an existing index.
    Note: HDFS read/write tables that use eviction criteria cannot have foreign key constraints, because the table data needed to enforce such constraints would require scanning persisted HDFS data. To improve query performance for these tables, create indexes on the columns where you would normally assign foreign key constraints.
  • CHECK— Specifies rules for values in a column, or specifies a wide range of rules for values when included as a table constraint. The CHECK constraint has the same format and restrictions for column and table constraints.

Column constraints and table constraints have the same function; the difference is in where you specify them. Table constraints allow you to specify more than one column in a PRIMARY KEY, UNIQUE, CHECK, or FOREIGN KEY constraint definition. Column-level constraints (except for check constraints) refer to only one column.

If you do not specify a name for a column or table constraint, then GemFire XD generates a unique name.

Column Constraint

A column constraint applies a rule to a single column in the table. You include a column constraint as part of the column definition.

{ 
    NOT NULL | 
    [ [ CONSTRAINT constraint-name ] 
    { 
        CHECK ( search-condition ) | 
        { 
            PRIMARY KEY | 
            UNIQUE | 
            REFERENCES table-name [ ( column-name [ , column-name ] * ) ] 
                [ ON DELETE RESTRICT ] 
        } 
    } 
} 
Note: If you use the REFERENCES clause to create a foreign key reference, then GemFire XD implicitly creates an index the child table column. If you attempt to create an index on that same column manually, you will receive a warning to indicate that the index is a duplicate of an existing index.

Table Constraint

This constraint allows you to specify table-level constraints, including the primary key, a unique key, or a foreign key. Alternatively, a check constraint can be specified that is tested for each INSERT, UPDATE, and DELETE operation.

[ CONSTRAINT constraint-name ] 
{ 
    CHECK ( search-condition ) | 
    { 
        PRIMARY KEY ( column-name [ , column-name ] * ) | 
        UNIQUE ( column-name [ , column-name ] * ) | 
        FOREIGN KEY ( column-name [ , column-name ] * ) 
            REFERENCES table-name [ ( column-name [ , column-name ]* ) ] 
            [ ON DELETE RESTRICT ] 
    } 
}
Note: GemFire XD implicitly creates an index on child table columns that define a foreign key reference. If you attempt to create an index on that same columns manually, you will receive a warning to indicate that the index is a duplicate of an existing index.

CHECK Constraint Search Condition

A search condition is a boolean expression that is specified in a CHECK constraint. The expression cannot be dynamic and so cannot contain dynamic parameters, date/time functions, user functions, or sub-queries.

If a check constraint is included as part of a column-definition, a column reference can only be made to the same column.

If a check constraint is specified as part of a table definition, column references can specify any columns that were previously defined in the CREATE TABLE statement. When specified in a table constraint, the search condition must be satisfied for all rows in the table. The search condition is applied to each row that is modified on an INSERT or UPDATE at the time of the row modification. The entire statement is aborted if any check constraint is violated.

Examples of Constraints

-- create column-level primary key constraint named OUT_TRAY_PK: 

CREATE TABLE SAMP.OUT_TRAY 
( 
    SENT TIMESTAMP, 
    DESTINATION CHAR(8), 
    SUBJECT CHAR(64) NOT NULL CONSTRAINT OUT_TRAY_PK PRIMARY KEY, 
    NOTE_TEXT VARCHAR(3000) 
 ); 

-- the table-level primary key definition allows you to 
-- include more than one columns in the primary key definition: 

CREATE TABLE SAMP.SCHED 
 ( 
    CLASS_CODE CHAR(7) NOT NULL, 
    DAY SMALLINT NOT NULL, 
    STARTING TIME, 
    ENDING TIME, 
    PRIMARY KEY (CLASS_CODE, DAY) 
 );


-- Use a column-level constraint for an arithmetic check 
-- Use a table-level constraint 
-- to make sure that a employee's taxes does not 
-- exceed the bonus 

CREATE TABLE SAMP.EMP   
 ( 
    EMPNO CHAR(6) NOT NULL CONSTRAINT EMP_PK PRIMARY KEY, 
    FIRSTNME CHAR(12) NOT NULL, 
    MIDINIT VARCHAR(12) NOT NULL, 
    LASTNAME VARCHAR(15) NOT NULL, 
    SALARY DECIMAL(9,2) CONSTRAINT SAL_CK CHECK (SALARY >= 
      10000),    BONUS DECIMAL(9,2), 
    TAX DECIMAL(9,2), 
    CONSTRAINT BONUS_CK CHECK (BONUS > TAX) 
  );

-- use a check constraint to allow only appropriate 
-- abbreviations for the meals 

CREATE TABLE FLIGHTS 
 ( 
    FLIGHT_ID CHAR(6) NOT NULL , 
    SEGMENT_NUMBER INTEGER NOT NULL , 
    ORIG_AIRPORT CHAR(3), 
    DEPART_TIME TIME, 
    DEST_AIRPORT CHAR(3), 
    ARRIVE_TIME TIME, 
    MEAL CHAR(1) CONSTRAINT MEAL_CONSTRAINT 
    CHECK (MEAL IN ('B', 'L', 'D', 'S')), 
    PRIMARY KEY (FLIGHT_ID, SEGMENT_NUMBER) 
 );

-- create a table with a table-level primary key constraint 

CREATE TABLE CITIES
( 
   CITY_ID INTEGER NOT NULL CONSTRAINT CITIES_PK Primary Key ,
   CITY_NAME VARCHAR(24) NOT NULL, 
   COUNTRY VARCHAR(26) NOT NULL, 
   AIRPORT VARCHAR(3), 
   LANGUAGE VARCHAR(16) 
 );

-- and a table-level foreign key constraint
-- Note that GemFire XD implicitly creates an index on CITY_ID for
-- the foreign key reference.

CREATE TABLE METROPOLITAN 
( 
  HOTEL_ID INT NOT NULL CONSTRAINT HOTELS_PK PRIMARY KEY, 
  HOTEL_NAME VARCHAR(40) NOT NULL, 
  CITY_ID INT CONSTRAINT METRO_FK REFERENCES CITIES 
);

-- create a table whose city_id column references the 
-- primary key in the Cities table 
-- using a column-level foreign key constraint 
-- Note that GemFire XD implicitly creates an index on CITY_ID for
-- the foreign key reference.

CREATE TABLE CONDOS 
( 
  CONDO_ID INT NOT NULL CONSTRAINT Condos_PK PRIMARY KEY, 
  CONDO_NAME VARCHAR(40) NOT NULL, 
  CITY_ID INT CONSTRAINT city_fk REFERENCES Cities ON DELETE RESTRICT 
);

PARTITION BY Clause

The GemFire XD partitioning clause controls the locality and distribution of data in the given server groups. This is important for optimizing queries, and it is essential for cross-table joins. The clause can be one of column partitioning, range partitioning, list partitioning, or generic expression partitioning.

Syntax

{
    {
      PARTITION BY PRIMARY KEY | Partitioning by Column ( column-name [ , column-name ] * ) }
      |
      Partitioning by a Range of Values ( column-name )
      (
          VALUES BETWEEN value AND value
              [ , VALUES BETWEEN value AND value ] *
      )
      |
      Partitioning by a List of Values ( column-name )
      (
          VALUES ( value [ , value ] * )
              [ , VALUES ( value [ , value ] * ) ] *
      )
      |
      Partitioning by Expression
    }
  [ Colocating Related Rows ( table-name [ , table_name ] *  ) ]
}
[ REDUNDANCY Clause integer-constant ]
[ BUCKETS Clause integer-constant ]
[ RECOVERYDELAY Clause integer-constant ]
[ MAXPARTSIZE Clause integer-constant ]
Note: You cannot use an expression to provide a value in the PARTITION BY RANGE or PARTITION BY LIST clauses.

Description

If no partitioning clause is specified, then the table's data is replicated on all the members of the server groups of the table. REPLICATE can also be explicitly used to create a replicated table.

Partitioning by Column

Use the PARTITION BY COLUMN clause to provide a set of column names that will determine the partitioning. As a shortcut you can use PARTITION BY PRIMARY KEY to refer to the primary key columns defined for the table. GemFire XD uses an internal hash function that typically returns the EXCLUSIVE OR of the hashcode() return values for the columns' underlying Java types.

-- partition by primary key 
CREATE TABLE HOTELAVAILABILITY 
( HOTEL_ID INT NOT NULL, 
  BOOKING_DATE DATE NOT NULL, 
  ROOMS_TAKEN INT DEFAULT 0,
  PRIMARY KEY (HOTEL_ID, BOOKING_DATE) 
) PARTITION BY PRIMARY KEY

-- partition by column (booking date only) 
CREATE TABLE HOTELAVAILABILITY 
( HOTEL_ID INT NOT NULL, 
  BOOKING_DATE DATE NOT NULL, 
  ROOMS_TAKEN INT DEFAULT 0, 
  PRIMARY KEY (HOTEL_ID, BOOKING_DATE) 
) PARTITION BY COLUMN (BOOKING_DATE)

Partitioning by a Range of Values

Use the PARTITION BY RANGE clause to specify the ranges of a field that should be colocated. Doing so ensures the locality of data in case of range queries and for cross table joins. The lower limit of the range is inclusive and the upper limit is exclusive. It is not necessary for the ranges to cover the whole spectrum of the possible values for the field. The ranges not covered are automatically partitioned on the available servers by GemFire XD with no guarantee of locality for those values.

Note: You cannot use an expression to provide a value in the PARTITION BY RANGE or PARTITION BY LIST clauses.
-- partition by range 
CREATE TABLE HOTELAVAILABILITY 
( HOTEL_ID INT NOT NULL, 
  BOOKING_DATE DATE NOT NULL, 
  ROOMS_TAKEN INT DEFAULT 0,
  PRIMARY KEY (HOTEL_ID, BOOKING_DATE)
) PARTITION BY RANGE (BOOKING_DATE) 
( VALUES BETWEEN '2010-01-01' AND '2010-04-01', 
  VALUES BETWEEN '2010-04-01' AND '2010-07-01', 
  VALUES BETWEEN '2010-07-01' AND '2010-10-01', 
  VALUES BETWEEN '2010-10-01' AND '2011-01-01'
)

Partitioning by a List of Values

Use the PARTITION BY LIST clause to specify the set of values of a field that should be colocated to help optimization of queries and for cross table joins. It is not necessary to list all the possible values for the field, and the values that are not covered are partitioned automatically.

Note: You cannot use an expression to provide a value in the PARTITION BY RANGE or PARTITION BY LIST clauses.
-- partition by list 
CREATE TABLE FLIGHTS 
( 
  FLIGHT_ID CHAR(6) NOT NULL , 
  SEGMENT_NUMBER INTEGER NOT NULL , 
  ORIG_AIRPORT CHAR(3), 
  PRIMARY KEY (FLIGHT_ID, SEGMENT_NUMBER) 
) PARTITION BY LIST (ORIG_AIRPORT) 
( VALUES ('PDX', 'LAX'), 
  VALUES ('AMS', 'DUB'), 
  VALUES ('DTW', 'ORL')
) 

Partitioning by Expression

The PARTITION BY clause with an expression is a type of hash-like partitioning where the value to hash on is evaluated from the expression. The expression must reference only field names from the table. This allows rows to be colocated based on some function of their values.

Note: When you colocate tables that are partitioned by expression, you must ensure that the expressions resolve to the same values in both tables, so that data can be colocated. GemFire XD ensures that the parent table uses an expression for partitioning, but it performs no other compatibility checks.
-- partition by expression 
-- This example will colocate all the rows that are booked 
-- in the same month together 
CREATE TABLE HOTELAVAILABILITY 
(
  HOTEL_ID INT NOT NULL, 
  BOOKING_DATE DATE NOT NULL, 
  ROOMS_TAKEN INT DEFAULT 0, 
  PRIMARY KEY (HOTEL_ID, 
  BOOKING_DATE)
) PARTITION BY (MONTH(BOOKING_DATE)) 

Colocating Related Rows

The COLOCATE WITH clause specifies a partitioned table with which the new partitioned table must be colocated. The referenced table must already exist.

When two tables are partitioned using the same column values and colocated, it forces partitions with the same values for those columns in the two tables to be located on the same GemFire XD member. For example, in case of range or list partitioning, the rows that satisfy a particular range or list are colocated on the same member for all of the colocated tables.

Note: All rows that need to be colocated across multiple tables must carry the same partition key. While the column names in each table may be different, the values must be the same.

The COLOCATE WITH clause is only used as part of a partitioning clause. If no PARTITION BY clause is used when there is a COLOCATE WITH clause, GemFire XD throws a SQLException. The partitioning clause that you specify for the new table must specify the same partitioning values and partitioning strategy that are used in the target table (the table referenced in the COLOCATE WITH clause). For example, if you use PARTITION BY PRIMARY KEY or PARTITION BY COLUMN in the target table, then you should use one of the same clauses to partition the colocated table on the same column values. If you partitioned the target table using multiple columns, then specify the columns in the same order when colocating the new table.

When you colocate tables that are partitioned by expression, GemFire XD only ensures that the parent table is also partitioned by an expression. No additional compatibility checks are performed. You must ensure that the expressions resolve to the same values in both tables, so that data can be colocated.

The REDUNDANCY and BUCKETS, clauses must be identical for colocated tables. Colocated tables must either specify the same server groups (identical SERVER GROUPS clauses), or one table's server groups must be a subset of the other table's server groups.

If you do not specify a COLOCATE WITH clause, GemFire XD attempts to perform default colocation by using foreign key references. Default colocation is performed only if you omit the BUCKETS clause entirely, or if you specify a BUCKETS value that matches the BUCKETS value of the referenced tables.

See also Colocating Related Rows from Multiple Tables.

CREATE TABLE COUNTRIES 
( 
  COUNTRY VARCHAR(26) NOT NULL CONSTRAINT COUNTRIES_UNQ_NM Unique, 
  COUNTRY_ISO_CODE CHAR(2) NOT NULL CONSTRAINT COUNTRIES_PK PRIMARY KEY, 
  REGION VARCHAR(26), 
  CONSTRAINT COUNTRIES_UC
    CHECK (country_ISO_code = upper(country_ISO_code) ) 
) PARTITION BY PRIMARY KEY

-- Explicitly colocate the CITIES table with the COUNTRIES table. 
-- Both of these tables will be hash partitioned on the 
-- COUNTRY_ISO_CODE 
-- field. This guarantees that cities are colocated with 
-- the country they are in. 
CREATE TABLE CITIES 
( 
  CITY_ID INTEGER NOT NULL CONSTRAINT CITIES_PK Primary key,
  CITY_NAME VARCHAR(24) NOT NULL, 
  COUNTRY VARCHAR(26) NOT NULL, 
  AIRPORT VARCHAR(3), 
  LANGUAGE VARCHAR(16), 
  COUNTRY_ISO_CODE CHAR(2) CONSTRAINT COUNTRIES_FK 
  REFERENCES COUNTRIES (COUNTRY_ISO_CODE) 
) PARTITION BY COLUMN (COUNTRY_ISO_CODE) 
  COLOCATE WITH (COUNTRIES)

REDUNDANCY Clause

Use the REDUNDANCY clause to specify the number of redundant copies that should be maintained for each partition, to ensure that the partitioned table is highly available even if members fail. Because GemFire XD is primarily a memory-based data management system, it is important to configure this carefully to enable failover in case of failures. A value too large adversely affects performance, network usage, and memory usage. Typically a value of 1 is recommended; this setting maintains one copy in addition to the primary copy of the table data.

Note: Colocated partitioned tables must have the same redundancy.

In this example it is assumed that you have configured GemFire XD to create partitioned tables (rather than replicated tables) by default.

CREATE TABLE COUNTRIES
(
  COUNTRY VARCHAR(26) NOT NULL,
  COUNTRY_ISO_CODE CHAR(2) NOT PRIMARY KEY,
  REGION VARCHAR(26),
)
REDUNDANCY 1

BUCKETS Clause

The optional BUCKETS attribute specifies the fixed number of "buckets," the smallest unit of data containment for the table that can be moved around. Data in a single bucket resides and moves together. If not specified, the number of buckets defaults to 113.

Note: Colocated tables must all have the same number of buckets. If you specify a mismatched BUCKETS value when using the COLOCATE WITH clause, GemFire XD throws an exception. If you specify a mismatched BUCKETS value without using the COLOCATE WITH clause, GemFire XD does not perform default location for the new table. If you specify the COLOCATE WITH clause but do not specify a BUCKETS value, GemFire XD inherits the BUCKETS value from the colocated table.

For range or list partitioning, each range or list is mapped to one bucket for the table, so total number of buckets cannot be less than the number of ranger or lists. If total number of buckets is larger than the total number of ranges or lists, then the values that fall outside of the specified ranges or lists are distributed among all the available buckets including those that are not mapped to any of the ranges or lists.

RECOVERYDELAY Clause

Use the RECOVERYDELAY clause to specify the default time in milliseconds that existing members will wait before satisfying redundancy after a member crashes. The default is -1, which indicates that redundancy is not recovered after a member fails. By default a GemFire XD system always attempts to satisfy redundancy if required when a new server member that is hosting data for the table is started. Redundancy recovery also occurs for redundant, partitioned tables if you configure the default-recovery-delay boot property.

MAXPARTSIZE Clause

The MAXPARTSIZE attribute specifies the maximum memory for any partition on a member in megabytes. Use it to load-balance partitions among available members. If you omit MAXPARTSIZE, then GemFire XD calculates a default value for the table based on available heap memory. You can view the MAXPARTSIZE setting by querying the EVICTIONATTRS column in SYSTABLES.

Note: If you also configure eviction for the table by memory size (LRUMEMSIZE), then Pivotal recommends that you set both LRUMEMSIZE and MAXPARTSIZE to the same value. If you specify an LRUMEMSIZE value that is larger than the table's MAXPARTSIZE, then GemFire XD automatically sets LRUMEMSIZE equal to MAXPARTSIZE. See EVICTION BY Clause.
-- example showing how to specify the BUCKETS, RECOVERYDELAY
-- and MAXPARTSIZE
CREATE TABLE COUNTRIES 
( 
  COUNTRY VARCHAR(26) NOT NULL, 
  COUNTRY_ISO_CODE CHAR(2) NOT PRIMARY KEY, 
  REGION VARCHAR(26), 
) PARTITION BY PRIMARY KEY BUCKETS 10 RECOVERYDELAY 10 MAXPARTSIZE 50

SERVER GROUPS Clause

Configure the GemFire XD datastore members that host data for table.

Syntax

[ SERVER GROUPS ( server_group_name [, server_group_name ] * ) ]
Note: The server group names that you specify are automatically converted to all-uppercase characters.

Description

The group of GemFire XD servers that store data for the table is either specified using the SERVER GROUPS clause or is deduced from the colocated parent table's server groups if a colocation clause is specified. If both the SERVER GROUPS clause and colocation clause are omitted, then GemFire XD stores data for the table on all members of the default server group for the schema.

Note: The default group is specified in the DEFAULT SERVER GROUP directive of the CREATE SCHEMA statement. If no server group is specified for the schema, the implicit default server group is used, which includes all servers in the distributed system.

A GemFire XD data store's server groups are specified when you start the member, using the -server-groups option. See server.

Keep in mind that all data store and accessor members in the GemFire XD system create the table, even if they do not store data for the table. Data stores that are not a member of one of the table's server groups act as accessors when performing operations against the table.

At least one data store member that hosts data for the table must be available in order to execute statements against the table. If all data stores in all of the table's server groups are offline, then statements involving the table fail with ERROR X0Z08: No Datastore found in the Distributed System for 'statement'.

When you include the SERVER GROUPS clause in a CREATE TABLE statement, consider these requirements for partitioned and replicated tables:
  • In order to colocate the data for two partitioned tables, you must specify identical SERVER GROUPS clauses in both table's CREATE TABLE statements.
  • In order to colocate the data for two replicated tables, both tables must specify the same server groups or one table's server groups must be a subset of the other table's server groups.
  • You can configure both partitioned tables and replicated tables to use a gateway sender. However, you must create replicated tables in the same server groups as the gateway sender(s) that you specify in the CREATE TABLE statement. See CREATE GATEWAYSENDER.

Example with SERVER GROUPS Clause

CREATE TABLE COUNTRIES
(
  COUNTRY VARCHAR(26) NOT NULL,
  COUNTRY_ISO_CODE CHAR(2) NOT NULL PRIMARY KEY,
  REGION VARCHAR(26)	
) SERVER GROUPS (SG1, SG2)

GATEWAYSENDER Clause

The GATEWAYSENDER keyword associates the new table with one or more named gateway senders, so that DML operations against the table are replicated to the remote GemFire XD cluster associated with the sender(s).

Note: You can optionally use the GemFire XD ALTER TABLE command to add or remove gateway senders after you have created a table.

Syntax

[ GATEWAYSENDER ( sender-name [,sender-name ] * ) ]

Example with GATEWAYSENDER Clause

The following example associates a table with the gateway sender, TEST_SENDER.

CREATE TABLE TESTTABLE
(
  ID INT NOT NULL, 
  DESCRIPTION VARCHAR(1024), 
  ADDRESS VARCHAR(1024)
)
GATEWAYSENDER(TEST_SENDER)

The following example associates a table with two gateway senders in the system.

CREATE TABLE TESTTABLE
(
  ID INT NOT NULL, 
  DESCRIPTION VARCHAR(1024), 
  ADDRESS VARCHAR(1024) 
)
GATEWAYSENDER(TEST_SENDER1, TEST_SENDER2)

ASYNCEVENTLISTENER Clause

The ASYNCEVENTLISTENER keyword associates a table with one or more named AsyncEventListener implementations.

Syntax

[ ASYNCEVENTLISTENER (async-listener-id [,async-listener-id] * ) ]

Description

See Configuring and Using an AsyncEventListener.

Note: The listener configuration that you specify does not have to be available at the time you create the table, so GemFire XD does not display an error message if the specified listener name does not exist. Make sure that you use the same listener name with both the CREATE ASYNCEVENTLISTENER command and the CREATE TABLE command.
Note: You can optionally use the GemFire XD ALTER TABLE command to add or remove asynceventlistener configurations after you have created a table.

Example with ASYNCLISTENER Keyword

The following example associates a new table with two AsyncEventListener implementations.

CREATE TABLE TESTTABLE
(
   ID INT NOT NULL,
   DESCRIPTION VARCHAR(1024),
   ADDRESS VARCHAR(1024)
)
ASYNCEVENTLISTENER(TEST_LISTENER1, TEST_LISTENER2)

EVICTION BY Clause

Use the EVICTION BY clause to evict rows automatically from the in-memory table based on different criteria. You can use this clause to create an overflow table where evicted rows are written to a local GemFire XD disk store, or you can simply destroy the evicted rows to control memory usage.

Note: You cannot use this form of the EVICTION BY clause to control memory usage for a table that is configured with HDFS persistence. For HDFS tables, use the EVICTION BY CRITERIA Clause to manage in-memory data.

Syntax

[ EVICTION BY 
   {
      LRUMEMSIZE integer-constant
   |
      LRUHEAPPERCENT
   |
      LRUCOUNT integer-constant 
   } 
[ EVICTACTION { OVERFLOW | DESTROY } ] ]
    

Description

Note: GemFire XD does not support transactions on tables that are configured with the DESTROY evict action. This restriction exists because the requirements of ACID transactions can conflict with the semantics of destroying evicted entries. For example, a transaction may need to update a number of entries that is greater than the amount allowed by the eviction setting. Transactions are supported with the OVERFLOW evict action, because the required entries can be loaded into memory as necessary to support transaction semantics.
Note: If you configure a table with the DESTROY eviction action, you must ensure that all queries against the table filter results using a primary key value. Queries that do not filter on a primary key may yield partial results if rows are destroyed on eviction. This limitation does not apply to tables that are configured with the OVERFLOW eviction action; query completeness is assured using both in-memory and overflow table data as necessary.
Note: GemFire XD does not propagate the DESTROY evict action to configured callback implementations, such as DBSynchronizer. Do not configure eviction with the DESTROY action on a table that has dependent tables (for example, child rows with foreign keys). If a DELETE statement is called for a parent table row that was locally destroyed through eviction, the DELETE succeeds in GemFire XD. However, the DELETE operation can later fail in the backend database when DBSynchronizer asynchronously sends the DELETE command, if dependent rows still exist.

If eviction with the DESTROY action is required for dependent tables, consider using a trigger or a synchronous Writer plug-in to respond to DELETE events on the parent table. The trigger or writer should fail the DELETE operation if child rows are found to exist in the backend database.

See also Limitations of Eviction.

All eviction is performed using a least-recently-used (LRU) algorithm, but you can specify whether GemFire XD performs eviction based on memory size, heap percentage, or LRU count.

You must enable eviction for a table when the table is created; you cannot enable eviction at a later time using ALTER TABLE, although you can use ALTER TABLE to change the LRUMEMSIZE value.

Eviction is performed locally by each member in the server groups that host the table. After the configured memory size, heap size, or LRU count is reached, inserts to the table succeed after a corresponding eviction of least-recently-used rows.

If you chose eviction by heap percentage (LRUHEAPPERCENT), individual GemFire XD members begin evicting data when their heap usage reaches a configured percent. You can configure a global heap percentage for all GemFire XD data stores, or configure different heap percentages for one or more server groups. Procedures describes how to configure the heap percentage.

If you configure eviction by memory size (LRUMEMSIZE), specify the memory size in megabytes. If you configure LRUMEMSIZE for a partitioned table, Pivotal recommends that you set both LRUMEMSIZE and MAXPARTSIZE to the same value. See MAXPARTSIZE Clause.
Note: If you specify an LRUMEMSIZE value that is larger than a partitioned table's MAXPARTSIZE, then GemFire XD automatically sets LRUMEMSIZE equal to MAXPARTSIZE.

The eviction action specified using the EVICTACTION clause specifies whether the least-recently-used data should be destroyed (DESTROY) or whether it should be overflowed to a disk store (OVERFLOW). Overflow tables use the disk-store-name (and ASYNCHRONOUS or SYNCHRONOUS settings) for writing to disk if they are specified in the CREATE TABLE statement. If no disk store configuration is specified, overflow tables use the default disk store for evicted rows. The OVERFLOW_ALL eviction action indicates that both table keys and values (instead of just values) should overflow to disk.

You can optionally persist an overflow table to disk by using the PERSISTENT clause.

EVICTION BY CRITERIA Clause

Use the EVICTION BY CRITERIA clause to define the working (in-memory) data set for HDFS read/write or HDFS write-only tables. If you use the EVICTION BY CRITERIA clause, you must also include the HDFSSTORE clause to enable HDFS persistence.

Note: You cannot use EVICTION BY CRITERIA with non HDFS-persistent tables. Use the EVICTION BY Clause to manage in-memory data for non-HDFS tables.
Note: HDFS read/write tables that use eviction criteria cannot have foreign key constraints, because the table data needed to enforce such constraints would require scanning persisted HDFS data. To improve query performance for these tables, create indexes on the columns where you would normally assign foreign key constraints.

Syntax

[ EVICTION BY CRITERIA ( sql-predicate ) 
  {
    { EVICTION FREQUENCY integer-constant { SECONDS | MINUTES | HOURS | DAYS } 
    [ START { D 'date-constant' | T 'time-constant' | TS 'timestamp-constant' } ] }
  | 
    { EVICT INCOMING }
  } 
]

Description

EVICTION BY CRITERIA uses a SQL predicate to define the table data that GemFire XD can evict from memory. Data that matches the provided sql-predicate is evicted from memory at regular intervals (or immediately, if EVICT INCOMING is specified). The remaining data forms the operational set of data maintained in GemFire XD memory. For example, EVICTION BY CRITERIA ( mycolumn = 'archive') EVICT INCOMING immediately evicts columns on insert or update when the value of the mycolumn is set to 'archive.'

Keep in mind that the EVICTION BY CRITERIA clause defines the data that you do not want to keep in the operational data set. Table rows that match the specified criteria are evicted, while the remaining rows are kept in memory for high-performance querying.

Note: GemFire XD applies eviction criteria only to insert and delete operations against the specified table. Table data that was persisted to HDFS and then retrieved from HDFS to satisfy a query is not returned to the operational data set.

GemFire XD creates and maintains indexes only for the operational data set. Queries against the full table data should reference a primary key, for performance reasons. Non primary key-based queries require a full table scan, which can lead to poor performance when accessing large volumes of data in HDFS.

For HDFS read/write tables, clients can optionally use the queryHDFS=false query hint to limit a query to the operational data (ignoring any data persisted in HDFS). This provides flexibility for performing high-speed querying of operational data in GemFire XD, while still providing the capability to analyze large volumes of historical data. Table data that is persisted in HDFS can also be analyzed using the map/reduce API.

Note: The queryHDFS hint is case-sensitive.
Note: Most DML commands (update and delete operations) always operate against the full data set of the table, even if you do not specify the queryHDFS hint. For HDFS Write-only tables, TRUNCATE TABLE removes only the in-memory data for the table, but leaves the HDFS log files available for later processing with MapReduce and HAWQ. This occurs regardless of the queryHDFS setting.

For HDFS Read-Write tables, TRUNCATE TABLE removes the in-memory data for the table and marks the table's HDFS log files for expiry. This means that GemFire XD queries, MapReduce jobs with CHECKPOINT mode enabled, and HAWQ external tables with CHECKPOINT mode enabled no longer return data for the truncated table. MapReduce jobs and HAWQ external tables that do not use CHECKPOINT mode will continue to return some table values until the log files expire.

When you configure eviction by criteria, you must also include a clause to configure when GemFire XD evicts data from memory:
  • The EVICTION FREQUENCY clause defines a time interval to control when GemFire XD compares table data with the sql-predicate and evicts the matching data from memory. Eviction occurs at the end of each interval that you define. For example, if you specify EVICTION FREQUENCY 2 HOURS, then GemFire XD performs an eviction cycle immediately, and then after every hour.

    You can optionally use the START clause to specify the time when the first eviction cycle starts; eviction then occurs regularly after the specified interval. The START clause accepts a DATE, TIME or TIMESTAMP datatype. You would typically include a START clause to ensure consistent eviction behavior across all members of the distributed system. For example, you could start all eviction on a fixed date, and then perform regular eviction after an interval of 7 days in order to evict data weekly.

    Note: If you use the START clause, you must specify times using the GMT (Greenwich Mean Time) time zone.
  • If the eviction criteria for the table uses only column values (rather than time-based functions), you can use the EVICT INCOMING clause instead of EVICTION FREQUENCY. With EVICT INCOMING, GemFire XD applies the sql-predicate to incoming data immediately to all insert and delete operations. For example, if you want to evict rows when a certain column value is greater than a specified threshold, EVICT INCOMING checks the column criteria on every insert or update, and immediately evicts the row to HDFS when the criteria match.

Local and HDFS Persistence

Tables the are defined with the EVICTION BY CRITERIA clause maintain an operational, in-memory data set as well as data that is persisted in HDFS. The operational data set and associated indexes can be persisted in local GemFire XD disk store files, in order to preserve the operational data set through server restarts. To enable local persistence for the operational data, specify the PERSISTENT Clause. If you do not use the PERSISTENT clause, then operational data must be manually inserted or recreated after you restart the distributed system.

Note: Querying data using queryHDFS=true does not place the data back into operational memory for later queries; operational data can only be re-created using insert statements (subject to the EVICTION BY CRITERIA clause or EXPIRE clause of the table) or by automatic recovery of the operational data from local persistence files. See Using Local Disk Store Persistence to Preserve the In-Memory Operational Data.

Examples

The following statement creates a table that uses HDFS read/write persistence. Every 180 seconds, GemFire XD evicts all table rows that have an ID value less then the specified value:
CREATE TABLE CALL_DATA (
  ID INT NOT NULL,
  ...
)  PARTITION BY PRIMARY KEY REDUNDANCY 1
EVICTION BY CRITERIA (ID < 300000)
EVICTION FREQUENCY 180 SECONDS
PERSISTENT 'hdfsLocalData'
HDFSSTORE (hdfsData)
The operational data for the above table is persisted in a GemFire XD disk store named hdfsLocalData. If no disk store name was specified, the operational data would be persisted in the default disk store. The entire data set for the table is persisted to HDFS using the hdfsData HDFS store.

EXPIRE Clause

Both tables and table entries can be configured to expire using the EXPIRE clause.

Syntax

[ EXPIRE { TABLE | ENTRY } WITH { IDLETIME seconds | TIMETOLIVE seconds} ACTION DESTROY ] *

Description

You can use the EXPIRE clause with HDFS-persistent tables as well as non-HDFS tables to control GemFire XD memory usage. Keep in mind that the DESTROY action is not propagated to data stored in HDFS; only the in-memory table data is destroyed.

For example, if you specify EXPIRE TABLE WITH TIMETOLIVE secs ACTION DESTROY, then GemFire XD destroys the entire table if there have been no writes to the table for the specified number of seconds. Similarly for EXPIRE ENTRY WITH IDLETIME secs ACTION DESTROY, GemFire XD destroys a table entry if the entry has been not updated for the configured number seconds.

Note: EXPIRE ENTRY WITH IDLETIME works only when a primary key-based query is used to access the table data. The system does not modify the access time for queries that use table scans or index scans instead of primary keys. This means that entries may be destroyed as "idle" even if they were recently accessed by a table or index scan.
Note: GemFire XD does not propagate the DESTROY action to configured callback implementations, such as DBSynchronizer, or to configured HDFS stores. Dependent tables (for example, child rows with foreign keys) are not affected by the local destroy action.

PERSISTENT Clause

When you specify the PERSISTENT keyword, GemFire XD persists the in-memory table data to a local GemFire XD disk store configuration. The PERSISTENT keyword is generally used with an existing 'disk-store-name' configuration to use for persisting the data. If you do not specify a named disk store, GemFire XD persists the table in the default disk store.

Syntax

[ PERSISTENT [ 'disk-store-name' ] [ ASYNCHRONOUS | SYNCHRONOUS ] ]
Note: You must create any named disk store using CREATE DISKSTORE, before specifying that disk store in CREATE TABLE.

Description

CAUTION:
When tables are replicated or partitioned with redundancy, GemFire XD persists the data on disk for each copy independently. All disk storage is owned by each member and there is nothing shared at the storage level. The advantage is higher availability and there are no restrictions on the disk storage layer. But when multiple copies are stored on disk, GemFire XD has to ensure the data consistency for each replica at all times. This includes the case when members go down in a arbitrary order and come back up again in a arbitrary order. To ensure consistency, GemFire XD requires each persistent member to come up before the data member becomes available.

See CREATE DISKSTORE.

See Persisting Table Data to GemFire XD Disk Stores.

The SYNCHRONOUS and ASYNCHRONOUS attributes specify whether the data has to be written synchronously or asynchronously to the disk store. For ASYNCHRONOUS writes, GemFire XD uses additional attributes in the associated disk store to configure the write behavior. See CREATE DISKSTORE.

Local, disk store persistence can be combined with HDFS persistence (the HDFSSTORE Clause) for HDFS tables. When you include both clauses, the in-memory, operational data set is persisted to the local disk store, while data that matches the eviction criteria is persisted to HDFS operational logs. See Using Local Disk Store Persistence to Preserve the In-Memory Operational Data.

Examples

This statement persists the FLIGHTS table to the default disk store using asynchronous writes:

CREATE TABLE FLIGHTS
    (
    FLIGHT_ID CHAR(6) NOT NULL,
    SEGMENT_NUMBER INTEGER NOT NULL,
    ORIG_AIRPORT CHAR(3),
    DEPART_TIME TIME,
    DEST_AIRPORT CHAR(3),
    ARRIVE_TIME TIME,
    MEAL CHAR(1) CONSTRAINT MEAL_CONSTRAINT
    CHECK (MEAL IN ('B', 'L', 'D', 'S')),
    PRIMARY KEY (FLIGHT_ID, SEGMENT_NUMBER)
    )
    PERSISTENT ASYNCHRONOUS
This statement creates a table that persists in-memory data to the default, local disk store. Rows older than 5 minutes are evicted from memory and persisted to HDFS:
CREATE TABLE CALL_DATA (
  ID int not null primary key,
  ...
)  PARTITION BY PRIMARY KEY REDUNDANCY 1
EVICTION BY CRITERIA (ID < 300000)
EVICTION FREQUENCY 180 SECONDS
PERSISTENT
HDFSSTORE (hdfsData)

HDFSSTORE Clause

GemFire XD supports persisting table data to a configured HDFSSTORE either for write-only access (streaming data) or read/write access.

SYNTAX

[ HDFSSTORE ( store-name ) [ WRITEONLY ] ]

Description

See Persisting Table Data to Hadoop for more information.

Use CREATE HDFSSTORE to define the HDFS store configuration (store-name) before you execute CREATE TABLE.

Include the WRITEONLY option to use the HDFS write-only model. All table data is persisted to in Pivotal HD using the HDFS store, and you can access the stored data using the MapReduce API. Omit the WRITEONLY option to support read/write access to the persisted Pivotal HD data in GemFire XD. If you omit the WRITEONLY option, also define the working set of in-memory data using the EVICTION BY CRITERIA Clause.

You can combine HDFS persistence with local, disk store persistence for HDFS tables; specify the PERSISTENT Clause in addition to the HDFSSTORE clause. See Using Local Disk Store Persistence to Preserve the In-Memory Operational Data.

OFFHEAP Clause

GemFire XD enables you to store the data for selected tables outside of the JVM heap. Storing a table in off-heap memory can improve performance for the table by reducing the CPU resources required to manage the table's data in the heap (garbage collection).

Syntax

[ OFFHEAP ]
Note: If you include the OFFHEAP clause, then all data stores that host the table must have the same off-heap-memory-size setting.

If you do not include the above clause, then GemFire XD stores all of the table's data in the JVM heap.

See Storing Tables in Off-Heap Memory.