Intel® Math Kernel Library 2018 Developer Reference - C
The following tables describe all individual components of the Parallel Direct Sparse Solver for Clusters Interface iparm parameter. Components which are not used must be initialized with 0. Default values are denoted with an asterisk (*).

iparm[0] (input): Use default values.

Value | Description
---|---
0 | iparm[1] - iparm[63] are filled with default values.
!= 0 | You must supply all values in components iparm[1] - iparm[63].
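For illustration, a minimal sketch of preparing iparm so that the solver supplies its own defaults; the helper name is hypothetical and not part of the library:

```c
#include <string.h>
#include "mkl.h"   /* MKL_INT */

/* Hypothetical helper: zero all components (unused ones must be 0) and
   request default values via iparm[0] = 0. */
static void init_iparm_defaults(MKL_INT iparm[64])
{
    memset(iparm, 0, 64 * sizeof(MKL_INT));
    iparm[0] = 0;   /* solver fills iparm[1] - iparm[63] with defaults */
}
```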
iparm[1] (input): Fill-in reducing ordering for the input matrix.

Value | Description
---|---
2* | The nested dissection algorithm from the METIS package [Karypis98].
3 | The parallel version of the nested dissection algorithm. It can decrease the time of computations on multi-core computers, especially when Phase 1 takes significant time.
10 | The MPI version of the nested dissection and symbolic factorization algorithms. The input matrix for the reordering must be distributed among the MPI processes without any intersection. Use iparm[40] and iparm[41] to set the bounds of each domain. During all of Phase 1, the entire matrix is not gathered on any one process, which can decrease computation time (especially when Phase 1 takes significant time) and decrease memory usage for each MPI process on the cluster.

Note: If you set iparm[1] = 10, comm = -1 (MPI communicator), and there is one MPI process, optimization and full parallelization with the OpenMP version of the nested dissection and symbolic factorization algorithms proceeds. This can decrease computation time on multi-core computers. In this case, set iparm[40] = 1 and iparm[41] = n for one-based indexing, or to 0 and n - 1, respectively, for zero-based indexing.
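As a hedged sketch of the distributed reordering setup described above, assuming an even row split across processes, one-based indexing (iparm[34] = 0), and a matrix order n divisible by the process count; the helper name is hypothetical:

```c
#include <mpi.h>
#include "mkl.h"

/* Hypothetical helper: select the MPI version of the reordering
   (iparm[1] = 10) and set this process's domain bounds. */
static void set_mpi_reordering(MKL_INT iparm[64], MKL_INT n, MPI_Comm comm)
{
    int rank, size;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);
    iparm[1]  = 10;                               /* MPI nested dissection       */
    iparm[39] = 1;                                /* distributed assembled input */
    iparm[40] = (MKL_INT)rank * (n / size) + 1;   /* beginning of input domain   */
    iparm[41] = ((MKL_INT)rank + 1) * (n / size); /* end of input domain         */
}
```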
iparm[2]: Reserved. Set to zero.
iparm[5] (input): Write solution on x.

Note: The array x is always used.

Value | Description
---|---
0* | The array x contains the solution; the right-hand side vector b is kept unchanged.
1 | The solver stores the solution on the right-hand side b.
iparm[6] (output): Number of iterative refinement steps that were actually performed during the solve step.
iparm[7] (input): Iterative refinement step. On entry to the solve and iterative refinement step, iparm[7] must be set to the maximum number of iterative refinement steps that the solver performs.

Value | Description
---|---
0* | The solver automatically performs two steps of iterative refinement when perturbed pivots are obtained during the numerical factorization.
>0 | Maximum number of iterative refinement steps that the solver performs. The solver performs no more than the absolute value of iparm[7] steps of iterative refinement, and might stop the process before the maximum number of steps if the required accuracy is reached. The number of executed iterations is reported in iparm[6].
<0 | Same as above, but the accumulation of the residuum uses extended precision real and complex data types. Perturbed pivots result in iterative refinement (independent of iparm[7] = 0) and the number of executed iterations is reported in iparm[6].
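For example, a small illustrative fragment (the value 4 is arbitrary) that caps refinement before the solve phase and reads the count back afterwards:

```c
#include <stdio.h>

/* Before the solve phase: allow at most four refinement steps. */
iparm[7] = 4;
/* ... call cluster_sparse_solver with phase = 33 (solve) ... */
/* After the solve phase: iparm[6] reports the executed steps. */
printf("iterative refinement steps performed: %lld\n", (long long)iparm[6]);
```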
iparm[8]: Reserved. Set to zero.
iparm[9] (input): Pivoting perturbation. This parameter instructs the Parallel Direct Sparse Solver for Clusters Interface how to handle small pivots or zero pivots for nonsymmetric matrices (mtype = 11 or mtype = 13) and symmetric matrices (mtype = -2, mtype = -4, or mtype = 6). For these matrices the solver uses a complete supernode pivoting approach. When the factorization algorithm reaches a point where it cannot factor the supernodes with this pivoting strategy, it uses a pivoting perturbation strategy similar to [Li99], [Schenk04]. The magnitude of the potential pivot is tested against a constant threshold of alpha = eps*||A2||_inf, where eps = 10^(-iparm[9]), A2 = P*P_MPS*D_r*A*D_c*P, and ||A2||_inf is the infinity norm of the scaled and permuted matrix A. Any tiny pivots encountered during elimination are set to sign(l_II)*eps*||A2||_inf, which trades off some numerical stability for the ability to keep pivots from getting too small. Small pivots are therefore perturbed with eps = 10^(-iparm[9]).

Value | Description
---|---
13* | The default value for nonsymmetric matrices (mtype = 11, mtype = 13): eps = 10^(-13).
8* | The default value for symmetric indefinite matrices (mtype = -2, mtype = -4, mtype = 6): eps = 10^(-8).
iparm[10] (input): Scaling vectors. The Parallel Direct Sparse Solver for Clusters Interface uses a maximum weight matching algorithm to permute large elements onto the diagonal and to scale. Use iparm[10] = 1 (scaling) and iparm[12] = 1 (matching) for highly indefinite symmetric matrices, for example, from interior point optimizations or saddle point problems. Note that in the analysis phase (phase = 11) you must provide the numerical values of the matrix A in the array a in case of scaling and symmetric weighted matching.

Value | Description
---|---
0* | Disable scaling. Default for symmetric indefinite matrices.
1* | Enable scaling. Default for nonsymmetric matrices. Scale the matrix so that the diagonal elements are equal to 1 and the absolute values of the off-diagonal entries are less than or equal to 1. This scaling method is applied to nonsymmetric matrices (mtype = 11, mtype = 13). The scaling can also be used for symmetric indefinite matrices (mtype = -2, mtype = -4, mtype = 6) when symmetric weighted matching is applied (iparm[12] = 1). Note that in the analysis phase (phase = 11) you must provide the numerical values of the matrix A in case of scaling.
iparm[11]: Reserved. Set to zero.
iparm[12] (input): Improved accuracy using (non-)symmetric weighted matching. The Parallel Direct Sparse Solver for Clusters Interface can use a maximum weighted matching algorithm to permute large elements close to the diagonal. This strategy adds an additional level of reliability to the factorization methods and complements the alternative of using more complete pivoting techniques during the numerical factorization. A configuration sketch for the recommended combination with scaling follows this table.

Value | Description
---|---
0* | Disable matching. Default for symmetric indefinite matrices.
1* | Enable matching. Default for nonsymmetric matrices. Maximum weighted matching algorithm to permute large elements close to the diagonal. It is recommended to use iparm[10] = 1 (scaling) and iparm[12] = 1 (matching) for highly indefinite symmetric matrices, for example from interior point optimizations or saddle point problems. Note that in the analysis phase (phase = 11) you must provide the numerical values of the matrix A in case of symmetric weighted matching.
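A minimal sketch of the combination recommended above; the helper name is hypothetical:

```c
#include "mkl.h"

/* Hypothetical helper: enable scaling and weighted matching, as recommended
   for highly indefinite symmetric matrices (mtype = -2, -4, or 6). With
   these settings the analysis phase (phase = 11) must be given the
   numerical values of A in the array a. */
static void enable_scaling_and_matching(MKL_INT iparm[64])
{
    iparm[10] = 1;   /* scaling vectors             */
    iparm[12] = 1;   /* symmetric weighted matching */
}
```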
iparm[13] - iparm[19]: Reserved. Set to zero.
iparm[20] (input): Pivoting for symmetric indefinite matrices.

Value | Description
---|---
0 | Apply 1x1 diagonal pivoting during the factorization process.
1* | Apply 1x1 and 2x2 Bunch-Kaufman pivoting during the factorization process. Bunch-Kaufman pivoting is available for matrices of mtype = -2, mtype = -4, or mtype = 6.
iparm[21] - iparm[25]: Reserved. Set to zero.
iparm[26] (input): Matrix checker.

Value | Description
---|---
0* | Do not check the sparse matrix representation for errors.
1 | Check the integer arrays ia and ja. In particular, check whether the column indices are sorted in increasing order within each row.
iparm[27] (input): Single or double precision Parallel Direct Sparse Solver for Clusters Interface. See iparm[7] for information on controlling the precision of the refinement steps.

Value | Description
---|---
0* | Input arrays (a, x, and b) and all internal arrays must be presented in double precision.
1 | Input arrays (a, x, and b) must be presented in single precision. In this case all internal computations are performed in single precision.
iparm[28] - iparm[33]: Reserved. Set to zero.
iparm[34] (input): One- or zero-based indexing of columns and rows.

Value | Description
---|---
0* | One-based indexing: column and row indexing in arrays ia, ja, and perm starts from 1 (Fortran-style indexing).
1 | Zero-based indexing: column and row indexing in arrays ia, ja, and perm starts from 0 (C-style indexing).
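To illustrate the two conventions, a sketch of one small matrix in CSR arrays (the matrix values are arbitrary):

```c
#include "mkl.h"

/*      | 1 0 2 |
    A = | 0 3 0 |
        | 4 0 5 |
   Zero-based (C-style) CSR arrays, matching iparm[34] = 1: */
MKL_INT ia_c[] = { 0, 2, 3, 5 };              /* row pointers   */
MKL_INT ja_c[] = { 0, 2, 1, 0, 2 };           /* column indices */
double  a_c[]  = { 1.0, 2.0, 3.0, 4.0, 5.0 };

/* The same matrix with one-based (Fortran-style) indexing, iparm[34] = 0: */
MKL_INT ia_f[] = { 1, 3, 4, 6 };
MKL_INT ja_f[] = { 1, 3, 2, 1, 3 };
```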
iparm[36] (input): Format for matrix storage.

Value | Description
---|---
0* | Use CSR format (see Three Array Variation of CSR Format) for matrix storage.
>0 | Use BSR format (see Three Array Variation of BSR Format) for matrix storage with blocks of size iparm[36].

Note: Intel MKL does not support BSR format in these cases:
iparm[37] - iparm[38]: Reserved. Set to zero.
iparm[39] (input): Matrix input format.

Note: Performance of the reordering step of the Parallel Direct Sparse Solver for Clusters Interface is slightly better for the assembled format (CSR, iparm[39] = 0) than for the distributed format (DCSR, iparm[39] > 0) for the same matrices, so if the matrix is assembled on one node, do not distribute it before calling cluster_sparse_solver.

Value | Description
---|---
0* | Provide the matrix in the usual centralized input format: the master MPI process (rank = 0) stores all data of the matrix A.
1 | Provide the matrix in distributed assembled matrix input format. In this case, each MPI process stores only a part (or domain) of the matrix A data. Set the bounds of the domain using iparm[40] and iparm[41]. The solution vector is placed on the master process.
2 | Provide the matrix in distributed assembled matrix input format. In this case, each MPI process stores only a part (or domain) of the matrix A data. Set the bounds of the domain using iparm[40] and iparm[41]. The solution vector, A, and RHS elements are distributed between processes in the same manner.
3 | Provide the matrix in distributed assembled matrix input format. In this case, each MPI process stores only a part (or domain) of the matrix A data. Set the bounds of the domain using iparm[40] and iparm[41]. The A and RHS elements are distributed between processes in the same manner, and the solution vector is the same on each process.
iparm[40] (input): Beginning of the input domain. The number of the matrix A row, RHS element, and, for iparm[39] = 2, solution vector element that begins the input domain belonging to this MPI process. Only applicable to the distributed assembled matrix input format (iparm[39] > 0). See Sparse Matrix Storage Formats for more details. A call sketch using these bounds follows iparm[41] below.

iparm[41] (input): End of the input domain. The number of the matrix A row, RHS element, and, for iparm[39] = 2, solution vector element that ends the input domain belonging to this MPI process. Only applicable to the distributed assembled matrix input format (iparm[39] > 0). See Sparse Matrix Storage Formats for more details.
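As referenced above, a hedged sketch of an analysis call with a distributed assembled matrix; the function name, mtype choice, and dummy arguments are assumptions, and the comm argument is the Fortran handle of the MPI communicator:

```c
#include <mpi.h>
#include "mkl_cluster_sparse_solver.h"

/* Hypothetical sketch: analysis phase (phase = 11) for a real nonsymmetric
   matrix in distributed assembled format. The arrays a/ia/ja hold only this
   rank's rows, i.e. rows iparm[40] through iparm[41]. */
static MKL_INT analyze_distributed(void *pt[64], MKL_INT n, double *a,
                                   MKL_INT *ia, MKL_INT *ja, MKL_INT iparm[64])
{
    MKL_INT maxfct = 1, mnum = 1, mtype = 11, phase = 11;
    MKL_INT perm = 0, nrhs = 1, msglvl = 0, error = 0;
    double ddum = 0.0;                        /* b and x unused in phase 11  */
    int comm = MPI_Comm_c2f(MPI_COMM_WORLD);  /* Fortran communicator handle */
    cluster_sparse_solver(pt, &maxfct, &mnum, &mtype, &phase, &n, a, ia, ja,
                          &perm, &nrhs, iparm, &msglvl, &ddum, &ddum,
                          &comm, &error);
    return error;
}
```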
iparm[42] - iparm[63] (input): Reserved. Set to zero.
Generally in sparse matrices, elements that are equal to zero can be stored as nonzero entries if necessary. For example, in order to make a matrix structurally symmetric, elements that are zero can be stored explicitly. See Sparse Matrix Storage Formats for an example, and the sketch below.
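As a sketch of this technique, a 2x2 matrix whose pattern is made structurally symmetric by storing an explicit zero (zero-based CSR; the values are arbitrary):

```c
#include "mkl.h"

/*      | 1 7 |
    A = | 0 2 |
   A(0,1) = 7 has no stored counterpart at A(1,0), so an explicit 0.0 is
   stored there to make the pattern structurally symmetric: */
MKL_INT ia[] = { 0, 2, 4 };
MKL_INT ja[] = { 0, 1, 0, 1 };
double  a[]  = { 1.0, 7.0, 0.0, 2.0 };   /* 0.0 mirrors the (0,1) entry */
```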