CREATE QUERY EXECUTOR POOL

Syntax

CREATE QUERY EXECUTOR POOL pool_name ['(' executorArg [, ...] ')'] [IF bool_expr]

executorArg:
      MAX_EXECUTOR_POOL_SIZE ':' int_expr
    | MIN_THREAD_COUNT ':' int_expr
    | MAX_THREAD_COUNT ':' int_expr
    | KEEP_ALIVE_SECS ':' long_expr
    | THREADS_READ_RESULTS ':' bool_expr
    | COMPLETION_QUEUE_SIZE ':' int_expr

Description

Creates a named QueryExecutor pool.

  • A QueryExecutor provides a thread pool for concurrent query result processing.
  • A QueryExecutorPool is assigned using HConnection.setQueryExecutorPoolName(). Multiple connections can share a common QueryExecutorPool name.
  • QueryExecutorPool name assignments are reset when HConnection.close() is called. If an HConnection is taken from an HConnectionPool, the QueryExecutorPool name should therefore be assigned each time the HConnection is retrieved from the pool.
  • If a connection has a QueryExecutorPool name, each query will require a QueryExecutor from the pool. Thus the maximum number of concurrent queries is bounded by the number of QueryExecutors in the QueryExecutorPool.
  • Keys and key ranges correspond to gets and scans in HBase. QueryExecutors are used in queries such that each key or key range is processed by a different thread in the QueryExecutor thread pool. So, the size of a QueryExecutor thread pool determines the number of concurrent gets or scans that can take place.
  • QueryExecutors can process query data in one of two ways. The threads can make a Get or Scan request and return the ResultScanner objects to the main thread, which then iterates through the Result values. Alternatively, the threads can iterate through the ResultScanner results themselves and return the Result objects to the main thread (see the THREADS_READ_RESULTS option below).
  • A QueryExecutorPool request will block if all elements in the pool are taken. A pooled QueryExecutor is released back to a pool on query completion (when the ResultSet iterator is completed).
  • QueryExecutorPools work with both sync and async queries.
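The blocking behavior described above can be illustrated with a minimal sketch that uses only java.util.concurrent. It is not the HBql implementation: the class BoundedExecutorPoolSketch and its methods are hypothetical, and plain strings stand in for QueryExecutors. The sketch assumes a pool of fixed maximum size in which a request waits while every element is taken and a release makes one available again.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a bounded pool of QueryExecutors (not HBql code).
public class BoundedExecutorPoolSketch {

    // Idle executors; its capacity plays the role of MAX_EXECUTOR_POOL_SIZE.
    private final BlockingQueue<String> idle;

    public BoundedExecutorPoolSketch(int maxPoolSize) {
        idle = new ArrayBlockingQueue<>(maxPoolSize);
        for (int i = 0; i < maxPoolSize; i++)
            idle.add("executor-" + i);
    }

    // Waits up to timeoutMillis for a free executor.
    // Returns null if none is released in time or the wait is interrupted.
    public String acquire(long timeoutMillis) {
        try {
            return idle.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    // Returns an executor to the pool, analogous to a query completing.
    public void release(String executor) {
        idle.add(executor);
    }

    // Number of executors currently free.
    public int available() {
        return idle.size();
    }

    public static void main(String[] args) {
        BoundedExecutorPoolSketch pool = new BoundedExecutorPoolSketch(2);
        String first = pool.acquire(100);
        String second = pool.acquire(100);
        // Both executors are taken, so a third request times out.
        System.out.println("third request served: " + (pool.acquire(100) != null));
        pool.release(first);
        // After a release, the next request succeeds.
        System.out.println("request after release served: " + (pool.acquire(100) != null));
        pool.release(second);
    }
}
```

The sketch uses a timed wait so that the demonstration terminates; according to the description above, an actual QueryExecutorPool request simply blocks until a QueryExecutor is released.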

Parameters

  • A QueryExecutorPool has a maximum size specified by the MAX_EXECUTOR_POOL_SIZE parameter.
  • Each QueryExecutor has a thread pool specified by these parameters:
    • MIN_THREAD_COUNT - the number of threads to keep in the pool, even if they are idle.
    • MAX_THREAD_COUNT - the maximum number of threads to allow in the pool.
    • KEEP_ALIVE_SECS - when the number of threads is greater than the minimum thread count, KEEP_ALIVE_SECS is the maximum number of seconds that excess idle threads will wait for new tasks before terminating.
    • THREADS_READ_RESULTS - determines the thread processing behavior. If THREADS_READ_RESULTS is false, the order of Result values within a Scan will be preserved, but at the cost of less concurrency. If THREADS_READ_RESULTS is true, Result values across Scan results will be merged in the ResultSet iterator. There is no guarantee on the order in which the results for different keys and key ranges are returned.
    • COMPLETION_QUEUE_SIZE - the size of the queue used by the thread pool to return data to the main thread. If THREADS_READ_RESULTS is false, it is the size of the queue used to return ResultScanner objects; if THREADS_READ_RESULTS is true, it is the size of the queue used to return Result objects.
  • All the executorArg parameters are optional. If unspecified, the values default to:
    • QueryExecutorPoolManager.defaultMaxPoolSize (5)
    • QueryExecutorPoolManager.defaultMinThreadCount (1)
    • QueryExecutorPoolManager.defaultMaxThreadCount (10)
    • QueryExecutorPoolManager.defaultKeepAliveSecs (Long.MAX_VALUE)
    • QueryExecutorPoolManager.defaultThreadsReadResults (TRUE)
    • QueryExecutorPoolManager.defaultCompletionQueueSize (25)
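The difference between the two THREADS_READ_RESULTS settings can be sketched with standard java.util.concurrent classes. This is an illustration under stated assumptions, not HBql code: the class ReadModesSketch and its methods are hypothetical, lists of strings stand in for key ranges, and an Iterator stands in for a ResultScanner. In callerReads, workers only open the "scanner" and the calling thread iterates it. In workersRead, workers push individual values into a bounded queue whose capacity plays the role of COMPLETION_QUEUE_SIZE.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration of the two read modes (not HBql code).
public class ReadModesSketch {

    // Analogue of THREADS_READ_RESULTS false: each worker hands back a whole
    // "scanner" and the calling thread iterates it, one range at a time.
    public static List<String> callerReads(List<List<String>> ranges, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Iterator<String>>> scanners = new ArrayList<>();
            for (List<String> range : ranges) {
                Callable<Iterator<String>> openScanner = range::iterator;
                scanners.add(pool.submit(openScanner));
            }
            List<String> out = new ArrayList<>();
            for (Future<Iterator<String>> scanner : scanners) {
                Iterator<String> it = scanner.get();
                while (it.hasNext())
                    out.add(it.next());
            }
            return out;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    // Analogue of THREADS_READ_RESULTS true: workers iterate their own range
    // and push each value into a shared bounded queue, so values from
    // different ranges may interleave.
    public static List<String> workersRead(List<List<String>> ranges, int threads, int queueSize) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        BlockingQueue<String> completion = new ArrayBlockingQueue<>(queueSize);
        int total = 0;
        for (List<String> range : ranges) {
            total += range.size();
            pool.execute(() -> {
                try {
                    for (String value : range)
                        completion.put(value);   // blocks while the queue is full
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        List<String> out = new ArrayList<>();
        try {
            for (int i = 0; i < total; i++)
                out.add(completion.take());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdownNow();
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<String>> ranges = Arrays.asList(
                Arrays.asList("a1", "a2", "a3"),
                Arrays.asList("b1", "b2"));
        System.out.println("caller reads:  " + callerReads(ranges, 2));
        System.out.println("workers read:  " + workersRead(ranges, 2, 1));
    }
}
```

Two simplifications are worth noting. The sketch knows the total number of values in advance, which a real scan would not, and callerReads consumes ranges in submission order, whereas the parameter description above makes no guarantee about the order in which different keys and key ranges are returned.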

Example

        HConnection conn = HConnectionManager.newConnection();

        // Create Query Executor Pool named execPool if it doesn't already exist.
        conn.execute("CREATE QUERY EXECUTOR POOL execPool (MAX_EXECUTOR_POOL_SIZE: 5, MAX_THREAD_COUNT: 10) IF NOT queryExecutorPoolExists('execPool')");

        // Or, using the API
        if (!QueryExecutorPoolManager.queryExecutorPoolExists("execPool"))
            QueryExecutorPoolManager.newQueryExecutorPool("execPool", 5, 5, 10, Long.MAX_VALUE, true, 100);

        // Then assign the connection a query executor pool name to use for queries
        conn.setQueryExecutorPoolName("execPool");

        // Now use connection in a query.