Parallel Execution of Data-Aware Stored Procedures

In a traditional relational database, stored procedures are application routines that are stored as part of the data dictionary and executed on the database system itself. Stored procedures generally offer high performance because they execute in close proximity to data required by the application logic. GemFire XD extends this basic stored procedure capability to support parallel execution of application logic on table data that is partitioned across many peers.

GemFire XD applications can execute stored procedures on specific data hosts, in parallel on all the members of a server group, or can target specific members based on the data requirements for the procedure. Essentially, application behavior that is encapsulated in stored procedures is moved to the process that hosts the associated data set, and it is executed there. If the required data set is spread across multiple partitions, the procedure is executed in parallel on the partition members. Results are streamed to a coordinating member and aggregated for the client invoking the procedure.

For example, consider an 'Order' table that is partitioned by its 'customer_id', and an application wanting to execute an expensive 'credit check' for several customers. Assume the credit test requires iteration over all the order history. You can parallelize the execution on all members that manage data for these customers and stream the results to the client. All order history required by each execution is locally available in-process.

// typical procedure call 
   CallableStatement callableStmt = connection.prepareCall("{CALL order_credit_check(?) ");
   callableStmt.setArray(1, <list of customer IDs>); 

// GemFire XD data-aware procedure invocation 
      CallableStatement callableStmt = connection.prepareCall("{CALL order_credit_check() " 
        + "ON TABLE Orders WHERE customerID IN (?)}"); 
      callableStmt.setArray(1, <list of customer IDs>); 

// order_credit_check will be executed in parallel on all members where the orders
// corresponding to the customerIDs are managed