Intel® Math Kernel Library 2018 Developer Reference - C
Computes a matrix-matrix product with general matrices but updates only the upper or lower triangular part of the result matrix.
void cblas_sgemmt (const CBLAS_LAYOUT Layout, const CBLAS_UPLO uplo, const CBLAS_TRANSPOSE transa, const CBLAS_TRANSPOSE transb, const MKL_INT n, const MKL_INT k, const float alpha, const float *a, const MKL_INT lda, const float *b, const MKL_INT ldb, const float beta, float *c, const MKL_INT ldc);
void cblas_dgemmt (const CBLAS_LAYOUT Layout, const CBLAS_UPLO uplo, const CBLAS_TRANSPOSE transa, const CBLAS_TRANSPOSE transb, const MKL_INT n, const MKL_INT k, const double alpha, const double *a, const MKL_INT lda, const double *b, const MKL_INT ldb, const double beta, double *c, const MKL_INT ldc);
void cblas_cgemmt (const CBLAS_LAYOUT Layout, const CBLAS_UPLO uplo, const CBLAS_TRANSPOSE transa, const CBLAS_TRANSPOSE transb, const MKL_INT n, const MKL_INT k, const void *alpha, const void *a, const MKL_INT lda, const void *b, const MKL_INT ldb, const void *beta, void *c, const MKL_INT ldc);
void cblas_zgemmt (const CBLAS_LAYOUT Layout, const CBLAS_UPLO uplo, const CBLAS_TRANSPOSE transa, const CBLAS_TRANSPOSE transb, const MKL_INT n, const MKL_INT k, const void *alpha, const void *a, const MKL_INT lda, const void *b, const MKL_INT ldb, const void *beta, void *c, const MKL_INT ldc);
The ?gemmt routines compute a scalar-matrix-matrix product with general matrices and add the result to the upper or lower part of a scalar-matrix product. These routines are similar to the ?gemm routines, but they only access and update a triangular part of the square result matrix (see Application Notes below).
The operation is defined as
C := alpha*op(A)*op(B) + beta*C,
where:
op(X) is one of op(X) = X, or op(X) = X^T, or op(X) = X^H,
alpha and beta are scalars,
A, B and C are matrices:
op(A) is an n-by-k matrix,
op(B) is a k-by-n matrix,
C is an n-by-n upper or lower triangular matrix.
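The definition above can be made concrete with a small reference routine. The following is a minimal sketch, not MKL code: `ref_dgemmt` is a hypothetical name, it covers only real double precision with row-major storage, and it takes plain integer flags in place of the CBLAS enumerations. It shows that only the selected triangle of C is read or written.

```c
#include <assert.h>

/* Hypothetical reference routine (not part of MKL). Computes
 * C := alpha*op(A)*op(B) + beta*C for row-major double arrays, touching only
 * the upper (upper != 0) or lower (upper == 0) triangle of C.
 * ta / tb are nonzero when op(A) / op(B) is the transpose. */
void ref_dgemmt(int upper, int ta, int tb, int n, int k,
                double alpha, const double *a, int lda,
                const double *b, int ldb,
                double beta, double *c, int ldc)
{
    assert(n >= 0 && k >= 0);
    for (int i = 0; i < n; ++i) {
        int j_begin = upper ? i : 0;       /* first column in the triangle */
        int j_end   = upper ? n : i + 1;   /* one past the last column */
        for (int j = j_begin; j < j_end; ++j) {
            double sum = 0.0;
            for (int p = 0; p < k; ++p) {
                double a_ip = ta ? a[p * lda + i] : a[i * lda + p]; /* op(A)(i,p) */
                double b_pj = tb ? b[j * ldb + p] : b[p * ldb + j]; /* op(B)(p,j) */
                sum += a_ip * b_pj;
            }
            /* With beta == 0 the old value of C is never read. */
            c[i * ldc + j] = (beta == 0.0) ? alpha * sum
                                           : alpha * sum + beta * c[i * ldc + j];
        }
    }
}
```

For example, with the 2-by-3 matrix A = {1, 2, 3; 4, 5, 6}, the 3-by-2 matrix B = {1, 0; 0, 1; 1, 1}, C = {10, 20; -1, 30}, alpha = beta = 1 and the upper triangle selected, the sketch leaves C = {14, 25; -1, 41}: the strictly lower entry -1 is untouched.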
Layout: Specifies whether two-dimensional array storage is row-major (CblasRowMajor) or column-major (CblasColMajor).
uplo: Specifies whether the upper or lower triangular part of the array c is used. If uplo = CblasUpper, then the upper triangular part of the array c is used. If uplo = CblasLower, then the lower triangular part of the array c is used.
transa: Specifies the form of op(A) used in the matrix multiplication:
if transa = CblasNoTrans, then op(A) = A;
if transa = CblasTrans, then op(A) = A^T;
if transa = CblasConjTrans, then op(A) = A^H.
transb: Specifies the form of op(B) used in the matrix multiplication:
if transb = CblasNoTrans, then op(B) = B;
if transb = CblasTrans, then op(B) = B^T;
if transb = CblasConjTrans, then op(B) = B^H.
n: Specifies the order of the matrix C. The value of n must be at least zero.
k: Specifies the number of columns of the matrix op(A) and the number of rows of the matrix op(B). The value of k must be at least zero.
alpha: Specifies the scalar alpha.
a: Array, size lda by ka, where ka is k when transa = CblasNoTrans, and is n otherwise. Before entry with transa = CblasNoTrans, the leading n-by-k part of the array a must contain the matrix A; otherwise the leading k-by-n part of the array a must contain the matrix A.
lda: Specifies the leading dimension of a as declared in the calling (sub)program. When transa = CblasNoTrans, lda must be at least max(1, n); otherwise lda must be at least max(1, k).
b: Array, size ldb by kb, where kb is n when transb = CblasNoTrans, and is k otherwise. Before entry with transb = CblasNoTrans, the leading k-by-n part of the array b must contain the matrix B; otherwise the leading n-by-k part of the array b must contain the matrix B.
ldb: Specifies the leading dimension of b as declared in the calling (sub)program. When transb = CblasNoTrans, ldb must be at least max(1, k); otherwise ldb must be at least max(1, n).
The bounds on lda and ldb above are written for column-major storage; under the usual CBLAS convention, the two cases swap when Layout = CblasRowMajor.
beta: Specifies the scalar beta. When beta is equal to zero, then c need not be set on input.
c: Array, size ldc by n.
Before entry with uplo = CblasUpper, the leading n-by-n upper triangular part of the array c must contain the upper triangular part of the matrix C and the strictly lower triangular part of c is not referenced.
Before entry with uplo = CblasLower, the leading n-by-n lower triangular part of the array c must contain the lower triangular part of the matrix C and the strictly upper triangular part of c is not referenced.
When beta is equal to zero, then c need not be set on input.
ldc: Specifies the leading dimension of c as declared in the calling (sub)program. The value of ldc must be at least max(1, n).
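The leading-dimension rules for a and b can be summarized in code. The helpers below are hypothetical (they are not MKL functions) and use integer flags instead of the CBLAS enumerations. The row-major branch is an assumption based on the usual CBLAS convention that the leading dimension is the stride between consecutive rows in row-major storage and between consecutive columns in column-major storage.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers (not MKL functions). */

/* Smallest valid leading dimension for an array holding a matrix that is
 * rows-by-cols as stored, before op() is applied. Column-major storage
 * strides between columns, so the bound is the row count; row-major storage
 * strides between rows, so the bound is the column count. */
int min_leading_dim(int row_major, int rows, int cols)
{
    int extent = row_major ? cols : rows;
    return extent > 1 ? extent : 1;   /* max(1, extent) */
}

/* Offset of element (i, j) in an array with leading dimension ld. */
size_t elem_offset(int row_major, int i, int j, int ld)
{
    assert(i >= 0 && j >= 0 && ld >= 1);
    return row_major ? (size_t)i * (size_t)ld + (size_t)j
                     : (size_t)i + (size_t)j * (size_t)ld;
}

/* Minimum lda: A is n-by-k as stored when not transposed, k-by-n otherwise. */
int min_lda(int row_major, int transposed, int n, int k)
{
    return transposed ? min_leading_dim(row_major, k, n)
                      : min_leading_dim(row_major, n, k);
}

/* Minimum ldb: B is k-by-n as stored when not transposed, n-by-k otherwise. */
int min_ldb(int row_major, int transposed, int n, int k)
{
    return transposed ? min_leading_dim(row_major, n, k)
                      : min_leading_dim(row_major, k, n);
}
```

With n = 4 and k = 3 in column-major storage, these helpers give a minimum lda of 4 without transposition and 3 with it, matching max(1, n) and max(1, k) in the parameter descriptions.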
c (output): When uplo = CblasUpper, the upper triangular part of the array c is overwritten by the upper triangular part of the updated matrix. When uplo = CblasLower, the lower triangular part of the array c is overwritten by the lower triangular part of the updated matrix.
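Because only one triangle of c is overwritten, code that needs the full matrix afterwards has to fill in the other triangle itself. The helper below is a hypothetical sketch (not an MKL routine) that assumes row-major double storage and a result known to be symmetric; it copies the computed triangle across the diagonal.

```c
#include <assert.h>

/* Hypothetical helper (not an MKL routine): copies the triangle selected by
 * `upper` onto the opposite one, so that a row-major n-by-n array holds the
 * full symmetric matrix. Only meaningful when the result is symmetric. */
void mirror_triangle(int upper, int n, double *c, int ldc)
{
    assert(n >= 0 && ldc >= (n > 1 ? n : 1));
    for (int i = 0; i < n; ++i) {
        for (int j = i + 1; j < n; ++j) {
            if (upper)
                c[j * ldc + i] = c[i * ldc + j];  /* lower(j,i) := upper(i,j) */
            else
                c[i * ldc + j] = c[j * ldc + i];  /* upper(i,j) := lower(j,i) */
        }
    }
}
```

The diagonal is left as is, since it belongs to both triangles.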
Application Notes: These routines only access and update the upper or lower triangular part of the result matrix. This can be useful when the result is known to be symmetric; for example, when computing a product of the form C := alpha*B*S*B^T + beta*C, where S and C are symmetric matrices and B is a general matrix. In this case, first compute A := B*S (which can be done using the corresponding ?symm routine), then compute C := alpha*A*B^T + beta*C using the ?gemmt routine.
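The two-step computation of alpha*B*S*B^T + beta*C can be sketched without any library dependency. The functions below are hypothetical stand-ins written with plain loops: `times_symmetric` plays the role of the ?symm call and `upper_update` plays the role of a ?gemmt call that updates the upper triangle. Both assume real double precision and densely packed row-major arrays. In a real program the corresponding MKL routines would presumably be cblas_dsymm and cblas_dgemmt.

```c
#include <assert.h>

/* Stand-in for ?symm: A := B*S, with B n-by-k and S symmetric k-by-k.
 * Reads only the upper triangle of S. */
void times_symmetric(int n, int k, const double *b, const double *s, double *a)
{
    assert(n >= 0 && k >= 0);
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < k; ++j) {
            double sum = 0.0;
            for (int p = 0; p < k; ++p) {
                /* S(p,j) taken from the upper triangle: S(p,j) == S(j,p) */
                double s_pj = (p <= j) ? s[p * k + j] : s[j * k + p];
                sum += b[i * k + p] * s_pj;
            }
            a[i * k + j] = sum;
        }
    }
}

/* Stand-in for ?gemmt on the upper triangle with op(A) = A and op(B) = B^T:
 * C := alpha*A*B^T + beta*C, with A and B n-by-k and C n-by-n. */
void upper_update(int n, int k, double alpha, const double *a,
                  const double *b, double beta, double *c)
{
    assert(n >= 0 && k >= 0);
    for (int i = 0; i < n; ++i) {
        for (int j = i; j < n; ++j) {
            double sum = 0.0;
            for (int p = 0; p < k; ++p)
                sum += a[i * k + p] * b[j * k + p];   /* A(i,p) * B^T(p,j) */
            /* With beta == 0 the old value of C is never read. */
            c[i * n + j] = (beta == 0.0) ? alpha * sum
                                         : alpha * sum + beta * c[i * n + j];
        }
    }
}
```

For B = {1, 2; 3, 4} and S = {2, 1; 1, 3}, the first step gives A = {4, 7; 10, 15}. With alpha = 1 and beta = 0, the second step writes 18, 40 and 90 into the upper triangle of C. The skipped entry C(1,0) would also be 40, which is why computing one triangle is enough.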