BATCH and THREADS

The BATCH and THREADS statements provide the ability to control and to fine-tune the use of multiple threads when multiple CPU cores are available.  They work within SQL statements such as SELECT.   BATCH is used to fine-tune the performance of threads.   

 

Manual use of BATCH is not recommended, as the system does a much better job of optimization on the fly than most humans can do through manual use of BATCH.

 

THREADS <value>

 

Specifies the number of threads that should be used.  A typical use is THREADS SystemCpuCount(), to set the number of threads to the CPU cores reported by the system.

 

BATCH <value>

 

Specifies how many records each thread will fetch for processing when multiple threads are used.  With only a single thread (THREADS not used or setting THREADS to 1) the value of BATCH does not matter.   

 

BATCH is an instrument for fine-tuning threads. If we only have one thread (having no THREADS option or having it set to 1, for example), BATCH doesn't matter. If we have multiple threads, BATCH controls how many records each individual thread will fetch for processing.

 

Example:  Suppose we are using SQL to build buffers on a system with eight CPU cores. We would probably want to run one thread on each core using THREADS 8 to specify eight threads.

 

If we have a drawing with one thousand large objects we would like the batch size for each thread to be relatively small, significantly smaller than one thousand, to distribute the load between the CPU cores.  In that case we might use THREADS 8 BATCH 10 to specify eight threads and ten records per thread to be fetched at a time.  If the objects were really huge and we only had twenty of them then THREADS 8 BATCH 1 would be better.

 

In contrast, if we had a drawing with several million very small objects (for example, only a handful of coordinates each)  then we would like the batch size to be larger to minimize the number of times the threads have to synchronize when they fetch additional records on which to work or to report results. It would make sense in that case to use something like THREADS 8 BATCH 1000.

 

Although Manifold does include internal optimizations that adjust batch size automatically on the fly there are always edge cases where an explicit value specified using BATCH may work better than automatic optimization.  As optimizers have steadily improved, such edge cases have become so rare in real life that using BATCH is not recommended.

 

Notes

CPUs not GPUs - THREADS and BATCH control processing on multiple CPU cores, not on GPUs.  THREADS will only have an indirect effect on GPGPU optimization. It will be useful to have THREADS in queries which make heavy use of both GPGPUs and CPUs, but pragmas are the way to guide GPGPU use.

 

Before ORDER BY - THREADS and BATCH should appear in a query before ORDER BY: both of  these options continue working with ORDER BY. If the expressions in the SELECT list take a long time to compute then using more than one thread will very likely significantly shorten that time whether or not ORDER BY is used.

 

SELECTs and parallelism - If we run a query such as SELECT f(a) FROM t THREADS 8 and start reading its results the query engine creates a single sequence that reads data from t and then  uses this sequence to feed 8 threads computing f(a). There is no multi-threaded fetching in that case.

 

Theoretically the query engine could be written to parallelize fetching data in such cases but that would require at least an index plus some statistics on its values on t so that multiple threads could be started from different places to avoid re-reading the same values.  That would be a very brittle construct in that adding a single WHERE clause would wreck the statistics.

 

In contrast, if we use SELECT as a subquery and the parent query asks it for several pieces of data in parallel, for example, if the parent query is SELECT g(...) and g() invokes our SELECT, then in that case Manifold will launch a separate sequence reading data from t for each such request and those requests will be parallelized, that is, they will proceed in parallel.