Developer Guide for Intel® Data Analytics Acceleration Library 2019 Update 1
Given a set X = {x_1 = (x_11, …, x_1p), ..., x_n = (x_n1, …, x_np)} of n p-dimensional feature vectors, or a p x p correlation matrix, and the number of principal components p_r, the problem is to compute the p_r principal directions (eigenvectors) of the data set. The library returns the transformation matrix T of size p_r x p, which contains the eigenvectors in row-major order, and a vector of the respective eigenvalues in descending order.
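The problem statement above can be illustrated with a short NumPy sketch. It is not the Intel DAAL API: the function name pca_correlation and its parameters are hypothetical, and the sketch assumes the correlation-matrix route, that is, an eigendecomposition of the p x p correlation matrix.

```python
import numpy as np

def pca_correlation(X, n_components):
    """Hypothetical helper (not a DAAL call): return (T, eigenvalues).

    T has shape (n_components, p); each row is one eigenvector.
    eigenvalues are sorted in descending order.
    """
    X = np.asarray(X, dtype=float)
    corr = np.corrcoef(X, rowvar=False)       # p x p correlation matrix
    eigvals, eigvecs = np.linalg.eigh(corr)   # ascending order, eigenvectors in columns
    order = np.argsort(eigvals)[::-1][:n_components]
    T = eigvecs[:, order].T                   # eigenvectors as rows
    return T, eigvals[order]

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))                 # n = 200 observations, p = 4 features
X[:, 1] += 2.0 * X[:, 0]                      # correlate two features
T, eigenvalues = pca_correlation(X, n_components=2)
```

Here T has p_r = 2 rows and p = 4 columns, matching the p_r x p layout described above.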
Intel DAAL provides two methods for running PCA:
Eigenvectors computed by PCA are not uniquely defined because of sign ambiguity. PCA supports the fast ad hoc "sign flip" technique described in [Bro07], which modifies the signs of the eigenvectors as shown below:
where T is the transformation matrix computed by PCA, T_i is the i-th row of the matrix, j is the column number, and sgn is the signum function: sgn(x) = -1 for x < 0, 0 for x = 0, and 1 for x > 0.
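As an illustration of how a sign flip removes the ambiguity, here is a minimal NumPy sketch. It assumes one common convention: each row T_i is multiplied by the sign of its largest-magnitude entry, so that entry becomes positive. Whether this matches the library's exact rule is an assumption, and sign_flip is a hypothetical helper, not a DAAL function.

```python
import numpy as np

def sign_flip(T):
    """Hypothetical helper: make the largest-magnitude entry of each row positive."""
    T = np.asarray(T, dtype=float)
    rows = np.arange(T.shape[0])
    j_max = np.argmax(np.abs(T), axis=1)   # column of the largest |T_ij| in each row
    signs = np.sign(T[rows, j_max])        # signum of that entry
    signs[signs == 0] = 1.0                # an all-zero row is left unchanged
    return T * signs[:, None]

T = np.array([[0.6, -0.8],
              [-0.8, -0.6]])
T_flipped = sign_flip(T)
```

Under this convention T and -T map to the same result, so two runs that differ only in eigenvector sign produce identical output.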
You can provide these types of input data to the PCA algorithms of the library:
Original, non-normalized data set
Normalized data set, where each feature has zero mean and unit variance
Correlation matrix
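The three input types carry the same information for correlation-based PCA. The NumPy sketch below (illustrative only, with made-up data and variable names, not library code) builds a normalized data set from raw data and checks that both lead to the same correlation matrix.

```python
import numpy as np

rng = np.random.default_rng(1)
# Original, non-normalized data: three features on very different scales
raw = rng.normal(loc=5.0, scale=[1.0, 10.0, 0.1], size=(500, 3))

# Normalized data: each feature shifted to zero mean and scaled to unit variance
normalized = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)

# Correlation matrix computed from the raw data ...
corr_from_raw = np.corrcoef(raw, rowvar=False)
# ... equals the sample covariance of the normalized data
corr_from_normalized = normalized.T @ normalized / (raw.shape[0] - 1)
```

Because the two matrices agree up to floating-point rounding, supplying the correlation matrix directly lets a caller skip the pass over the raw observations.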