Developer Guide for Intel® Data Analytics Acceleration Library 2019 Update 5
Given the input dataset $R = \{r_{ui}\}$ of size $m \times n$, where $m$ is the number of users and $n$ is the number of items, the problem is to train the Alternating Least Squares (ALS) model represented as two matrices: $X$ of size $m \times f$ and $Y$ of size $f \times n$, where $f$ is the number of factors. The matrices $X$ and $Y$ are the factors of a low-rank factorization of the matrix $R$:

$$R \approx X \cdot Y$$
Initialization of the matrix $Y$ can be done using the following method: for each $i = 1, \ldots, n$, the entries $y_{ki}$, $k = 2, \ldots, f$, are independent random numbers uniformly distributed on the interval $(0, 1)$; the first entry $y_{1i}$ is assigned by a separate rule.
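The random part of this initialization can be sketched in a few lines of plain Python. This is an illustrative sketch, not the library's implementation: the function name `init_item_factors` and its parameters are hypothetical, and the first row $y_{1i}$ is taken as a caller-supplied argument because the text here does not give its formula.

```python
import random


def _open_unit(rng):
    """Draw a uniform sample from the open interval (0, 1)."""
    x = rng.random()          # random() samples from [0.0, 1.0)
    while x == 0.0:           # reject the closed endpoint
        x = rng.random()
    return x


def init_item_factors(first_row, f, seed=None):
    """Return an f x n matrix Y as a list of f rows.

    first_row: values y_1i for i = 1..n, supplied by the caller
               (hypothetical input; its formula is not assumed here).
    Rows 2..f are independent uniform random numbers on (0, 1).
    """
    if f < 1:
        raise ValueError("f must be at least 1")
    rng = random.Random(seed)
    n = len(first_row)
    y = [list(first_row)]
    for _ in range(f - 1):
        y.append([_open_unit(rng) for _ in range(n)])
    return y
```

Passing a fixed `seed` makes the random rows reproducible, which is convenient when comparing training runs.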
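The ALS model is trained using the implicit ALS algorithm [Hu2008] by minimizing the following cost function over the user factor vectors $x_u$ (rows of $X$) and the item factor vectors $y_i$ (columns of $Y$):

$$\min_{x_*, y_*} \sum_{u,i} c_{ui} \left(p_{ui} - x_u^T y_i\right)^2 + \lambda \left(\sum_u n_{x_u} \|x_u\|^2 + \sum_i m_{y_i} \|y_i\|^2\right)$$
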
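where:
$p_{ui}$ indicates the preference of user $u$ for item $i$:
$$p_{ui} = \begin{cases} 0, & r_{ui} \le \epsilon \\ 1, & r_{ui} > \epsilon \end{cases}$$
$\epsilon$ is the threshold used to define the preference values. $\epsilon = 0$ is the only threshold value supported so far.
$c_{ui}$ measures the confidence in observing $p_{ui}$:
$$c_{ui} = 1 + \alpha r_{ui}$$
$\alpha$ is the rate of confidence
$r_{ui}$ is the element of the matrix $R$
$\lambda$ is the parameter of the regularization
$n_{x_u}$ and $m_{y_i}$ denote the number of ratings of user $u$ and item $i$, respectively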
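To make the roles of these quantities concrete, the following sketch evaluates the cost for small dense matrices in plain Python. It rests on stated assumptions: the preference is 1 when $r_{ui} > 0$ and 0 otherwise, the confidence is $1 + \alpha r_{ui}$ as in [Hu2008], the regularization term is weighted by the rating counts, and "number of ratings" is read as the number of non-zero entries in a row or column of $R$. The function names are hypothetical and are not part of the library.

```python
def preference(r, eps=0.0):
    """p_ui: 1 if the rating exceeds the threshold, else 0."""
    return 1.0 if r > eps else 0.0


def confidence(r, alpha):
    """c_ui = 1 + alpha * r_ui (form used in [Hu2008])."""
    return 1.0 + alpha * r


def implicit_als_cost(R, X, Y, alpha, lam):
    """Evaluate the implicit ALS cost for dense list-of-lists inputs.

    R is m x n, X is m x f, Y is f x n.
    """
    m, n, f = len(R), len(R[0]), len(Y)

    # Confidence-weighted squared error over all (user, item) pairs.
    fit = 0.0
    for u in range(m):
        for i in range(n):
            pred = sum(X[u][k] * Y[k][i] for k in range(f))
            err = preference(R[u][i]) - pred
            fit += confidence(R[u][i], alpha) * err * err

    # Regularization weighted by rating counts
    # (assumption: a "rating" is a non-zero entry of R).
    reg = 0.0
    for u in range(m):
        n_xu = sum(1 for i in range(n) if R[u][i] != 0)
        reg += n_xu * sum(v * v for v in X[u])
    for i in range(n):
        m_yi = sum(1 for u in range(m) if R[u][i] != 0)
        reg += m_yi * sum(Y[k][i] ** 2 for k in range(f))

    return fit + lam * reg
```

The double loop visits every pair, including those with $r_{ui} = 0$, because the implicit formulation treats unobserved pairs as low-confidence zero preferences rather than as missing values.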
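Given the trained ALS model and the matrix $D$ that describes for which pairs of factors $X$ and $Y$ the rating should be computed, the system calculates the matrix of recommended ratings $Res$:

$$res_{ui} = \sum_{j=1}^{f} x_{uj} y_{ji}, \quad \text{if } d_{ui} \neq 0, \quad u = 1, \ldots, m; \; i = 1, \ldots, n$$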
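The prediction step can be sketched as follows. The sketch assumes that $D$ is an $m \times n$ matrix whose non-zero entries mark the requested (user, item) pairs and that each predicted rating is the dot product of row $u$ of $X$ with column $i$ of $Y$. The function name `predict_ratings` is hypothetical, and leaving unrequested entries as `None` is a choice made for this example only.

```python
def predict_ratings(X, Y, D):
    """Compute ratings only for the pairs selected by D.

    X is m x f, Y is f x n, D is m x n (non-zero marks a requested pair).
    Entries that were not requested are left as None.
    """
    m, f, n = len(X), len(Y), len(Y[0])
    res = [[None] * n for _ in range(m)]
    for u in range(m):
        for i in range(n):
            if D[u][i] != 0:
                # Dot product of user row u with item column i.
                res[u][i] = sum(X[u][k] * Y[k][i] for k in range(f))
    return res
```

Computing only the requested pairs avoids materializing the full $m \times n$ product, which matters when the number of users and items is large.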