Intel® Math Kernel Library 2018 Developer Reference - C

cluster_sparse_solver iparm Parameter

The following table describes all individual components of the Parallel Direct Sparse Solver for Clusters Interface iparm parameter. Components which are not used must be initialized with 0. Default values are denoted with an asterisk (*).

Component Description

iparm[0]

input

Use default values.

0 iparm[1] - iparm[63] are filled with default values.
!=0 You must supply all values in components iparm[1] - iparm[63].
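
For example, a minimal C sketch (not taken from the library itself) of the two initialization styles; the explicit values shown are only illustrative:

    #include <string.h>
    #include "mkl.h"   /* MKL_INT */

    /* Fill iparm either with defaults (iparm[0] = 0) or explicitly (iparm[0] != 0). */
    static void init_iparm(MKL_INT iparm[64], int use_defaults)
    {
        memset(iparm, 0, 64 * sizeof(MKL_INT));
        if (use_defaults) {
            iparm[0] = 0;      /* iparm[1] - iparm[63] are filled with default values */
        } else {
            iparm[0] = 1;      /* all of iparm[1] - iparm[63] must be supplied        */
            iparm[1] = 2;      /* METIS nested dissection reordering                  */
            iparm[7] = 2;      /* at most two iterative refinement steps              */
            iparm[9] = 13;     /* pivot perturbation eps = 10^(-13)                   */
            iparm[34] = 1;     /* zero-based (C-style) indexing                       */
            /* components not set above remain 0 */
        }
    }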

iparm[1]

input

Fill-in reducing ordering for the input matrix.

2* The nested dissection algorithm from the METIS package [Karypis98].
3 The parallel version of the nested dissection algorithm. It can decrease the time of computations on multi-core computers, especially when Phase 1 takes significant time.
10 The MPI version of the nested dissection and symbolic factorization algorithms. The input matrix for the reordering must be distributed among different MPI processes without any intersection. Use iparm[40] and iparm[41] to set the bounds of the domain. During all of Phase 1, the entire matrix is not gathered on any one process, which can decrease computation time (especially when Phase 1 takes significant time) and decrease memory usage for each MPI process on the cluster.

Note

If you set iparm[1] = 10 and comm = -1 (the MPI communicator), and there is only one MPI process, the solver uses the OpenMP version of the nested dissection and symbolic factorization algorithms with optimization and full parallelization. This can decrease computation time on multi-core computers. In this case, set iparm[40] = 1 and iparm[41] = n for one-based indexing, or to 0 and n - 1, respectively, for zero-based indexing.
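
For example, a brief sketch of this single-MPI-process case, assuming n holds the matrix order and iparm is declared and initialized as in the sketch above (one-based indexing, iparm[34] = 0):

    iparm[1]  = 10;    /* MPI version of nested dissection and symbolic factorization */
    iparm[40] = 1;     /* beginning of the input domain                               */
    iparm[41] = n;     /* end of the input domain                                     */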

iparm[2]

Reserved. Set to zero.

iparm[5]

input

Write solution on x.

Note

The array x is always used.

0*

The array x contains the solution; right-hand side vector b is kept unchanged.

1

The solver stores the solution on the right-hand side b.

iparm[6]

output

Number of iterative refinement steps performed.

Reports the number of iterative refinement steps that were actually performed during the solve step.

iparm[7]

input

Iterative refinement step.

On entry to the solve and iterative refinement step, iparm[7] must be set to the maximum number of iterative refinement steps that the solver performs.

0*

The solver automatically performs two steps of iterative refinement when perturbed pivots are obtained during the numerical factorization.

>0

Maximum number of iterative refinement steps that the solver performs. The solver performs not more than the absolute value of iparm[7] steps of iterative refinement. The solver might stop the process before the maximum number of steps if

  • a satisfactory level of accuracy of the solution in terms of backward error is achieved,

  • or if it determines that the required accuracy cannot be reached. In this case Parallel Direct Sparse Solver for Clusters Interface returns -4 in the error parameter.

The number of executed iterations is reported in iparm[6].

<0

Same as above, but the accumulation of the residuum uses extended precision real and complex data types.

Perturbed pivots result in iterative refinement (independent of iparm[7]=0) and the number of executed iterations is reported in iparm[6].
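
For instance, a sketch of requesting at most five refinement steps and checking the outcome, assuming the usual cluster_sparse_solver setup with error declared as MKL_INT:

    iparm[7] = 5;      /* perform at most five iterative refinement steps */
    /* ... call cluster_sparse_solver with phase = 33 (solve) ... */
    if (error == -4) {
        /* the required accuracy could not be reached */
    }
    /* iparm[6] now reports the number of refinement steps actually performed */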

iparm[8]

Reserved. Set to zero.

iparm[9]

input

Pivoting perturbation.

This parameter instructs Parallel Direct Sparse Solver for Clusters Interface how to handle small pivots or zero pivots for nonsymmetric matrices (mtype = 11 or mtype = 13) and symmetric matrices (mtype = -2, mtype = -4, or mtype = 6). For these matrices the solver uses a complete supernode pivoting approach. When the factorization algorithm reaches a point where it cannot factor the supernodes with this pivoting strategy, it uses a pivoting perturbation strategy similar to [Li99], [Schenk04].

The magnitude of the potential pivot is tested against a constant threshold of

alpha = eps*||A2||_inf,

where eps = 10^(-iparm[9]), A2 = P*P_MPS*D_r*A*D_c*P, and ||A2||_inf is the infinity norm of the scaled and permuted matrix A. Any tiny pivots encountered during elimination are set to sign(l_ii)*eps*||A2||_inf, which trades off some numerical stability for the ability to keep pivots from getting too small. Small pivots are therefore perturbed with eps = 10^(-iparm[9]).

13*

The default value for nonsymmetric matrices (mtype = 11, mtype = 13): eps = 10^(-13).

8*

The default value for symmetric indefinite matrices (mtype = -2, mtype = -4, mtype = 6): eps = 10^(-8).

iparm[10]

input

Scaling vectors.

Parallel Direct Sparse Solver for Clusters Interface uses a maximum weight matching algorithm to permute large elements on the diagonal and to scale the matrix.

Use iparm[10] = 1 (scaling) and iparm[12] = 1 (matching) for highly indefinite symmetric matrices, for example, from interior point optimizations or saddle point problems. Note that in the analysis phase (phase=11) you must provide the numerical values of the matrix A in array a in case of scaling and symmetric weighted matching.

0*

Disable scaling. Default for symmetric indefinite matrices.

1*

Enable scaling. Default for nonsymmetric matrices.

Scale the matrix so that the diagonal elements are equal to 1 and the absolute values of the off-diagonal entries are less than or equal to 1. This scaling method is applied to nonsymmetric matrices (mtype = 11, mtype = 13). The scaling can also be used for symmetric indefinite matrices (mtype = -2, mtype = -4, mtype = 6) when symmetric weighted matching is applied (iparm[12] = 1).

Note that in the analysis phase (phase=11) you must provide the numerical values of the matrix A in case of scaling.

iparm[11]

Reserved. Set to zero.

iparm[12]

input

Improved accuracy using (non-) symmetric weighted matching.

Parallel Direct Sparse Solver for Clusters Interface can use a maximum weighted matching algorithm to permute large elements close to the diagonal. This strategy adds an additional level of reliability to the factorization methods and complements the alternative of using more complete pivoting techniques during the numerical factorization.

  

0*

Disable matching. Default for symmetric indefinite matrices.

1*

Enable matching. Default for nonsymmetric matrices.

Maximum weighted matching algorithm to permute large elements close to the diagonal.

It is recommended to use iparm[10] = 1 (scaling) and iparm[12] = 1 (matching) for highly indefinite symmetric matrices, for example from interior point optimizations or saddle point problems.

Note that in the analysis phase (phase=11) you must provide the numerical values of the matrix A in case of symmetric weighted matching.
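
For example, a sketch of this recommendation for a symmetric indefinite matrix (mtype = -2); pt, maxfct, mnum, mtype, n, a, ia, ja, perm, nrhs, msglvl, b, x, comm, and error are assumed to be set up as in a usual cluster_sparse_solver call:

    MKL_INT phase = 11;    /* analysis: a must already contain the numerical values */
    iparm[10] = 1;         /* scaling                                               */
    iparm[12] = 1;         /* symmetric weighted matching                           */
    cluster_sparse_solver(pt, &maxfct, &mnum, &mtype, &phase, &n,
                          a, ia, ja, perm, &nrhs, iparm, &msglvl,
                          b, x, &comm, &error);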

iparm[13] - iparm[19]

Reserved. Set to zero.

iparm[20]

input

Pivoting for symmetric indefinite matrices.

0

Apply 1x1 diagonal pivoting during the factorization process.

1*

Apply 1x1 and 2x2 Bunch-Kaufman pivoting during the factorization process. Bunch-Kaufman pivoting is available for matrices of mtype=-2, mtype=-4, or mtype=6.

iparm[21] - iparm[25]

Reserved. Set to zero.

iparm[26]

input

Matrix checker.

0*

Do not check the sparse matrix representation for errors.

1

Check integer arrays ia and ja. In particular, check whether the column indices are sorted in increasing order within each row.

iparm[27]

input

Single or double precision Parallel Direct Sparse Solver for Clusters Interface.

See iparm[7] for information on controlling the precision of the refinement steps.

0*

Input arrays (a, x and b) and all internal arrays must be presented in double precision.

1

Input arrays (a, x and b) must be presented in single precision.

In this case all internal computations are performed in single precision.
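
For example, a minimal sketch, assuming a, b, and x are declared as float arrays:

    iparm[27] = 1;    /* a, b, and x are single precision; all internal computations run in single precision */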

iparm[28] - iparm[33]

Reserved. Set to zero.

iparm[34]

input

One- or zero-based indexing of columns and rows.

0*

One-based indexing: column and row indexing in arrays ia, ja, and perm starts from 1 (Fortran-style indexing).

1

Zero-based indexing: column and row indexing in arrays ia, ja, and perm starts from 0 (C-style indexing).
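
For illustration, a small sketch of a 3-by-3 matrix stored in CSR format with zero-based indexing (iparm assumed declared as above):

    /*       | 1 0 2 |
         A = | 0 3 0 |
             | 4 0 5 |                                                  */
    MKL_INT ia[4] = { 0, 2, 3, 5 };               /* row pointers       */
    MKL_INT ja[5] = { 0, 2, 1, 0, 2 };            /* column indices     */
    double  a[5]  = { 1.0, 2.0, 3.0, 4.0, 5.0 };  /* nonzero values     */
    iparm[34] = 1;                                /* zero-based (C-style) indexing */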

iparm[36]

input

Format for matrix storage.

0*

Use CSR format (see Three Array Variation of CSR Format) for matrix storage.

> 0

Use BSR format (see Three Array Variation of BSR Format) for matrix storage with blocks of size iparm[36].

Note

Intel MKL does not support BSR format in these cases:

  • iparm[10] > 0 (scaling vectors)

  • iparm[12] > 0 (weighted matching)

  • iparm[30] > 0 (partial solution)

  • iparm[35] > 0 (Schur complement)

  • iparm[55] > 0 (pivoting control)

  • iparm[59] > 0 (OOC Intel MKL PARDISO)

iparm[37] - iparm[38]

Reserved. Set to zero.

iparm[39]

input

Matrix input format.

Note

Performance of the reordering step of the Parallel Direct Sparse Solver for Clusters Interface is slightly better for the assembled format (CSR, iparm[39] = 0) than for the distributed format (DCSR, iparm[39] > 0) for the same matrices, so if the matrix is assembled on one node, do not distribute it before calling cluster_sparse_solver.

0*

Provide the matrix in the usual centralized input format: the master MPI process (rank = 0) stores all data from matrix A.

1

Provide the matrix in distributed assembled matrix input format. In this case, each MPI process stores only a part (or domain) of the matrix A data. Set the bounds of the domain using iparm[40] and iparm[41]. The solution vector is placed on the master process.

2

Provide the matrix in distributed assembled matrix input format. In this case, each MPI process stores only a part (or domain) of the matrix A data. Set the bounds of the domain using iparm[40] and iparm[41]. The solution vector, A, and RHS elements are distributed between processes in the same manner.

3

Provide the matrix in distributed assembled matrix input format. In this case, each MPI process stores only a part (or domain) of the matrix A data. Set the bounds of the domain using iparm[40] and iparm[41]. The A and RHS elements are distributed between processes in the same manner, and the solution vector is the same on each process.

iparm[40]

input

Beginning of input domain.

The number of the matrix A row, RHS element, and, for iparm[39]=2, solution vector that begins the input domain belonging to this MPI process.

Only applicable to the distributed assembled matrix input format (iparm[39]> 0).

See Sparse Matrix Storage Formats for more details.

iparm[41]

input

End of input domain.

The number of the matrix A row, RHS element, and, for iparm[39]=2, solution vector that ends the input domain belonging to this MPI process.

Only applicable to the distributed assembled matrix input format (iparm[39]> 0).

See Sparse Matrix Storage Formats for more details.
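
As an illustration, a sketch of distributing a matrix of order 8 over two MPI processes in the distributed assembled format (iparm[39] = 1) with one-based indexing; rank is assumed to come from MPI_Comm_rank:

    iparm[39] = 1;            /* distributed assembled (DCSR) input format */
    if (rank == 0) {
        iparm[40] = 1;        /* this process holds rows 1 through 4 */
        iparm[41] = 4;
    } else {
        iparm[40] = 5;        /* this process holds rows 5 through 8 */
        iparm[41] = 8;
    }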

iparm[42] - iparm[63]

input

Reserved. Set to zero.

Note

Generally in sparse matrices, components which are equal to zero can be considered non-zero if necessary. For example, in order to make a matrix structurally symmetric, elements which are zero can be considered non-zero. See Sparse Matrix Storage Formats for an example.
