Partitioner Summary

The parallel loop templates parallel_for and parallel_reduce take an optional partitioner argument, which specifies a strategy for executing the loop. The following table summarizes partitioners and their effect when used in conjunction with blocked_range.

Partitioners

| Partitioner | Description | When Used with blocked_range(i,j,g) |
|---|---|---|
| simple_partitioner | Chunksize bounded by grain size. | g/2 ≤ chunksize ≤ g |
| auto_partitioner (default)[4] | Automatic chunk size. | g/2 ≤ chunksize |
| affinity_partitioner | Automatic chunk size, cache affinity and uniform distribution of iterations. | g/2 ≤ chunksize |
| static_partitioner | Deterministic chunk size, cache affinity and uniform distribution of iterations without load balancing. | max(g/3, problem_size/num_of_resources) ≤ chunksize |
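The simple_partitioner bound g/2 ≤ chunksize ≤ g can be illustrated with a small stand-alone sketch. It is not library code: the function names split_chunks and simple_chunk_sizes are hypothetical, and the sketch assumes that a range is split at its midpoint for as long as its size exceeds the grain size g, which is how blocked_range is understood to behave.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of recursive splitting, NOT the library's implementation.
// Assumption: a piece is divisible while its size exceeds the grain size,
// and a divisible piece is cut at its midpoint.
void split_chunks(std::size_t begin, std::size_t end, std::size_t grain,
                  std::vector<std::size_t>& out) {
    assert(grain > 0 && begin <= end);
    const std::size_t size = end - begin;
    if (size <= grain) {          // no longer divisible: this is one chunk
        out.push_back(size);
        return;
    }
    const std::size_t mid = begin + size / 2;
    split_chunks(begin, mid, grain, out);
    split_chunks(mid, end, grain, out);
}

// Returns the chunk sizes this model produces for the range [i, j) with grain g.
std::vector<std::size_t> simple_chunk_sizes(std::size_t i, std::size_t j,
                                            std::size_t g) {
    std::vector<std::size_t> out;
    if (i < j) split_chunks(i, j, g, out);
    return out;
}
```

Under this model, a piece larger than g is always cut into halves of at least g/2 iterations, and cutting stops once a piece fits within g, so every chunk lands between g/2 and g. The lower bound presumes the whole range holds at least g/2 iterations.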

An auto_partitioner is used when no partitioner is specified. In general, the auto_partitioner or affinity_partitioner should be used, because these tailor the number of chunks to the available execution resources. The affinity_partitioner and static_partitioner may take advantage of a Range's ability to split in a given ratio (see "Advanced Topic: Other Kinds of Iteration Spaces") to distribute iterations in nearly equal chunks among computing resources.

simple_partitioner can be useful in the following situations:

[4] Prior to Intel® Threading Building Blocks 2.2, the default was simple_partitioner.