Tune FabricServers

These JVM-related recommendations pertain to the 64-bit version 1.6 JVM running on Linux.

  • Use JDK 1.6.0_26 or higher. This provides significantly better performance than earlier versions for some GemFire XD applications.
  • Use -server for JVMs that will start servers with the FabricServer API.
  • When starting up a GemFire XD server with gfxd, specify the -heap-size parameter to use GemFire XD's default JVM resource management settings. By default, the critical-heap-percentage threshold is set to 90% of the specified heap-size and the eviction-heap-percentage threshold is set to 80% of the critical-heap-percentage threshold. When using the FabricServer API, set -Xms equal to -Xmx and specify all other JVM options required for JVM resource management.
    Note: The -initial-heap and -max-heap parameters, used in the earlier SQLFire product, are no longer supported. Use -heap-size instead.
  • Set -XX:+UseConcMarkSweepGC to use the concurrent low-pause garbage collector and the parallel young generation collector. The low-pause collector sacrifices some throughput in order to minimize stop-the-world GC pauses for tenured collections. It does require more headroom in the heap, so increase the heap size to compensate. The gfxd script starts fabric servers with this collector by default.
  • Set -XX:+DisableExplicitGC to disable explicit garbage collection requests. This causes calls to System.gc() to be ignored, avoiding the long latencies of the full collections those calls would otherwise trigger.
  • Set -XX:CMSInitiatingOccupancyFraction=50 or even lower for high-throughput, latency-sensitive applications that generate large amounts of garbage in the tenured generation, such as those that have high rates of updates, deletes, or evictions. This setting tells the concurrent collector to start a collection when tenured occupancy reaches the given percentage. With the default setting, a high rate of tenured garbage creation can outpace the collector and result in OutOfMemoryError. Too low a setting can reduce throughput by triggering unnecessary collections, so test to determine the best setting. The gfxd script sets this for fabric servers by default.
  • If short-lived objects are being promoted to the tenured generation, set -XX:NewSize=<n>, where n is large enough to prevent this from occurring. Increasing this value tends to increase throughput but also latency, so test to find the optimum value, usually somewhere between 64m and 1024m.
  • Tune heap settings so that occupancy stays below 70%. This helps reduce latency.
  • The parallel compactor in JDK 6 is not available with the concurrent low-pause collector. Churn in the tenured generation causes fragmentation that can eventually cause stop-the-world compactions. You can postpone the issue by using the largest heap that fits into memory, after allowing for the operating system.
  • If heap space is an issue, note that -XX:+UseCompressedOops is turned on by default when you run a 64-bit JDK 1.6.0_24 or higher. This option can reduce heap usage by up to 40% by reducing managed pointers for certain objects to 32 bits. However, it can lower throughput and increase latency. It also limits the application to about four billion objects.
  • Consider moving your table row data to off-heap memory. You will typically benefit from using off-heap memory when data volume is high and you have at least 150GB of memory on each machine. See Storing Tables in Off-Heap Memory for more information.
  • Set conserve-sockets=false in the boot properties. This causes each server to use dedicated threads to send to and receive from each of its peers. This uses more system resources, but can improve performance by removing socket contention between threads and allowing GemFire XD to optimize certain operations. If your application has very large numbers of servers and/or peer clients, test to see which setting gives the best results. Peer clients that are read-heavy with very high throughput can benefit from conserving sockets (conserve-sockets=true) while the data stores keep conserve-sockets=false.
  • Set enable-time-statistics=false in the boot properties and set enable-timestats=false in all connections (including client connections) to turn off time statistics. This eliminates a large number of calls to gettimeofday.
  • For applications that will always use a single server, you can make it a "loner" by setting mcast-port=0 and configuring no locators. Knowing there will be no distribution allows GemFire XD to make a few additional optimizations. Thin clients must then connect directly to the server. Peer clients cannot be used in this case.
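
The recommendations above can be collected in one place for a JVM that embeds a server through the FabricServer API, where gfxd is not setting the defaults for you. The following is a minimal sketch that uses only the Java standard library. The class and method names (FabricServerTuningSketch, jvmOptions, bootProperties, defaultThresholdsMb) are hypothetical and are not part of GemFire XD; only the flag and property names come from the list above. It assembles the JVM flags, builds the boot properties, and works through the default threshold arithmetic as described (critical threshold at 90% of the heap, eviction threshold at 80% of the critical threshold).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper class; not part of the GemFire XD API.
class FabricServerTuningSketch {

    // Assemble the JVM flags named in the list above.
    // heapSize is a JVM size string such as "8g"; newSize may be null to omit -XX:NewSize.
    static List<String> jvmOptions(String heapSize, int cmsOccupancyPercent, String newSize) {
        List<String> opts = new ArrayList<String>();
        opts.add("-server");
        opts.add("-Xms" + heapSize); // initial heap ...
        opts.add("-Xmx" + heapSize); // ... set equal to the maximum heap
        opts.add("-XX:+UseConcMarkSweepGC");
        opts.add("-XX:+DisableExplicitGC");
        opts.add("-XX:CMSInitiatingOccupancyFraction=" + cmsOccupancyPercent);
        if (newSize != null) {
            opts.add("-XX:NewSize=" + newSize);
        }
        return opts;
    }

    // Build the boot properties named in the list above.
    // The intent is to hand these to the server at startup.
    static Properties bootProperties(boolean loner) {
        Properties props = new Properties();
        props.setProperty("conserve-sockets", "false");
        props.setProperty("enable-time-statistics", "false");
        if (loner) {
            props.setProperty("mcast-port", "0"); // single server, no locators
        }
        return props;
    }

    // Default thresholds as described above, in MB (integer arithmetic, rounded down):
    // index 0 = critical (90% of heap), index 1 = eviction (80% of critical).
    static long[] defaultThresholdsMb(long heapMb) {
        long critical = heapMb * 90 / 100;
        long eviction = critical * 80 / 100;
        return new long[] { critical, eviction };
    }

    public static void main(String[] args) {
        StringBuilder cmd = new StringBuilder("java");
        for (String opt : jvmOptions("8g", 50, "256m")) {
            cmd.append(' ').append(opt);
        }
        System.out.println(cmd);
        System.out.println(bootProperties(true));
        long[] t = defaultThresholdsMb(10000);
        System.out.println("critical=" + t[0] + "MB eviction=" + t[1] + "MB");
    }
}
```

Under the stated defaults the eviction threshold works out to 72% of the heap (0.90 × 0.80), which is close to the 70% occupancy target recommended above. The values 8g, 50, and 256m in main are placeholders; as the list notes, test to find the settings that suit your workload.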