Relaxing Column Constraints During Bulk Import

The import system procedures generally provide the best performance for bulk data import, because they can use multiple threads and can relax constraints during import with techniques similar to the "UPSERT" functionality of traditional RDBMSs.

The _EX versions of the import procedures (SYSCS_UTIL.IMPORT_TABLE_EX and SYSCS_UTIL.IMPORT_DATA_EX) enable you to use the GemFire XD PUT INTO DML statement to insert data from text files. The PUT INTO statement is similar to the "UPSERT" command or capability that many RDBMSs provide to skip the primary key uniqueness check during an insert. If a row with the same primary key value exists in the table, PUT INTO overwrites the older row. If no row with that primary key exists, PUT INTO operates like a standard INSERT. As a result, only the most recently inserted or updated row for each primary key value remains in the system, which preserves the primary key constraint.
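The overwrite-or-insert behavior described above can be illustrated with a small sketch. It does not use GemFire XD: it assumes SQLite (through Python's standard sqlite3 module) as a stand-in, with SQLite's INSERT OR REPLACE playing the role of PUT INTO. The items table, the put_into helper, and the sample rows are hypothetical names chosen for illustration only.

```python
import sqlite3


def put_into(conn, key, value):
    """Insert a row, or overwrite the existing row that has the same primary key.

    Assumption: SQLite's INSERT OR REPLACE stands in for GemFire XD's PUT INTO.
    """
    conn.execute(
        "INSERT OR REPLACE INTO items (id, name) VALUES (?, ?)", (key, value)
    )


def demo():
    # In-memory database so the sketch leaves nothing behind.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")

    # Simulated import file: primary key 1 appears twice.
    rows = [(1, "first"), (2, "second"), (1, "first-updated")]
    for key, value in rows:
        put_into(conn, key, value)

    result = conn.execute("SELECT id, name FROM items ORDER BY id").fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    # Only the last row written for each primary key survives.
    print(demo())
```

With a plain INSERT, the third row would fail with a primary key violation. With the upsert-style statement, the later row replaces the earlier one, so the table still holds exactly one row per key.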

If you are certain that the data you want to import does not violate other established column constraints, you can additionally use the skip-constraint-checks and skip-listeners connection properties to relax table constraints and bypass listeners when importing data. However, keep in mind that foreign key and unique constraints can be violated when skip-constraint-checks is set. This can lead to undefined behavior in the system, because other connections that do not enable skip-constraint-checks still require constraint checks for correct operation.
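The risk described above can be sketched in the same hedged way. This example does not use GemFire XD or its connection properties. It assumes SQLite, where PRAGMA foreign_keys = OFF serves as a rough analog of a connection that sets skip-constraint-checks. The customers and orders tables and the load_orders helper are hypothetical.

```python
import sqlite3


def load_orders(check_constraints):
    """Load two orders, one of which references a missing customer.

    Returns (rejected_rows, orphaned_rows). Assumption: SQLite's foreign_keys
    pragma stands in for GemFire XD's skip-constraint-checks property.
    """
    conn = sqlite3.connect(":memory:")
    # The pragma must be set before any transaction is open on the connection.
    conn.execute(
        "PRAGMA foreign_keys = ON" if check_constraints else "PRAGMA foreign_keys = OFF"
    )
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, "
        "customer_id INTEGER REFERENCES customers(id))"
    )
    conn.execute("INSERT INTO customers (id) VALUES (1)")

    rejected = 0
    # The second order points at customer 99, which does not exist.
    for order in [(10, 1), (11, 99)]:
        try:
            conn.execute("INSERT INTO orders (id, customer_id) VALUES (?, ?)", order)
        except sqlite3.IntegrityError:
            rejected += 1

    # Count orders whose customer is missing (foreign key violations).
    orphans = conn.execute(
        "SELECT COUNT(*) FROM orders o "
        "LEFT JOIN customers c ON c.id = o.customer_id "
        "WHERE c.id IS NULL"
    ).fetchone()[0]
    conn.close()
    return rejected, orphans


if __name__ == "__main__":
    print(load_orders(check_constraints=True))
    print(load_orders(check_constraints=False))
```

With checks enabled, the bad row is rejected and no orphan is stored. With checks disabled, the load accepts the row and leaves an orphan that code relying on the foreign key will not expect. This is why the import data should be validated before constraint checks are skipped.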

Note: PUT INTO and skip-constraint-checks can also be used when updating HDFS-persistent tables, so that GemFire XD does not need to load data from HDFS to perform column constraint checks.