Developer Guide for Intel® Data Analytics Acceleration Library 2018
Given the input dataset $R = \{r_{ui}\}$ of size m x n, where m is the number of users and n is the number of items, the problem is to train the Alternating Least Squares (ALS) model represented as two matrices: X of size m x f, and Y of size f x n, where f is the number of factors. The matrices X and Y are the factors of the low-rank factorization of matrix R:

$$R \approx X \cdot Y$$
Initialization of the matrix Y can be done using the following method: for each $i = 1, \ldots, n$,

$$y_{1i} = \frac{1}{m} \sum_{u=1}^{m} r_{ui}$$

and $y_{ki}$ are independent random numbers uniformly distributed on the interval (0,1), $k = 2, \ldots, f$.
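The initialization can be sketched in a few lines of plain Python. This is an illustration only, not the library's implementation: the function name `init_item_factors`, the dense list-of-rows layout for R, and the `seed` parameter are hypothetical. It also assumes that the first row of Y holds the mean rating of each item over all m users and that the remaining f - 1 rows are uniform random draws.

```python
import random

def init_item_factors(R, f, seed=0):
    """Build an f x n matrix Y from a dense m x n ratings matrix R (list of rows)."""
    rng = random.Random(seed)
    m, n = len(R), len(R[0])
    # Assumed rule: row 1 is the mean rating of each item over all m users.
    first_row = [sum(R[u][i] for u in range(m)) / m for i in range(n)]
    # Rows 2..f: independent uniform draws. Note that random() samples [0, 1),
    # which differs from the open interval (0, 1) only at the endpoint 0.
    other_rows = [[rng.random() for _ in range(n)] for _ in range(f - 1)]
    return [first_row] + other_rows

R = [[5, 0, 1],
     [3, 0, 0]]
Y = init_item_factors(R, f=3)
```

Fixing the seed makes the sketch reproducible; a real system would normally use its own random number generator and a sparse layout for R.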
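The ALS model is trained using the implicit ALS algorithm [Hu2008] by minimizing the following cost function:

$$\min_{x_*, y_*} \sum_{u,i} c_{ui} \left( p_{ui} - x_u^T y_i \right)^2 + \lambda \left( \sum_u n_{x_u} \|x_u\|^2 + \sum_i m_{y_i} \|y_i\|^2 \right)$$

where:

$x_u$ is the u-th row of X and $y_i$ is the i-th column of Y, both taken as vectors of length f
$p_{ui}$ indicates the preference of user u for item i: $p_{ui} = 1$ if $r_{ui} > \epsilon$, and $p_{ui} = 0$ if $r_{ui} \le \epsilon$
$\epsilon$ is the threshold used to define the preference values. $\epsilon = 0$ is the only threshold value supported so far.
$c_{ui} = 1 + \alpha r_{ui}$ measures the confidence in observing $p_{ui}$
α is the rate of confidence
$r_{ui}$ is the element of the matrix R
λ is the parameter of the regularization
$n_{x_u}$ and $m_{y_i}$ denote the number of ratings of user u and item i respectively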
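To make the roles of the preference, confidence, and regularization terms concrete, the following sketch evaluates a cost of this kind for small dense matrices. It is a reading aid, not the library's training code. Several things are assumptions: the function name `implicit_als_cost`, the confidence rule c = 1 + α·r, the binary preference p = 1 when r exceeds the threshold, and counting a "rating" as any entry of R above the threshold.

```python
def implicit_als_cost(R, X, Y, alpha, lam, eps=0.0):
    """Evaluate a confidence-weighted squared error with count-weighted L2 penalty.

    R is m x n, X is m x f, Y is f x n, all dense lists of rows.
    """
    m, n, f = len(R), len(R[0]), len(Y)
    # Assumption: the "number of ratings" counts entries with r_ui > eps.
    n_x = [sum(1 for i in range(n) if R[u][i] > eps) for u in range(m)]
    m_y = [sum(1 for u in range(m) if R[u][i] > eps) for i in range(n)]

    fit = 0.0
    for u in range(m):
        for i in range(n):
            p = 1.0 if R[u][i] > eps else 0.0   # preference of user u for item i
            c = 1.0 + alpha * R[u][i]           # assumed confidence rule
            pred = sum(X[u][k] * Y[k][i] for k in range(f))
            fit += c * (p - pred) ** 2

    reg_x = sum(n_x[u] * sum(v * v for v in X[u]) for u in range(m))
    reg_y = sum(m_y[i] * sum(Y[k][i] ** 2 for k in range(f)) for i in range(n))
    return fit + lam * (reg_x + reg_y)

R = [[2, 0],
     [0, 1]]
identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]
cost_exact_fit = implicit_als_cost(R, identity, identity, alpha=1.0, lam=0.5)
cost_zero_model = implicit_als_cost(R, zeros, zeros, alpha=1.0, lam=0.5)
```

With identity factors the predictions match the preferences exactly, so only the regularization term contributes. With all-zero factors only the confidence-weighted error on the two observed entries remains.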
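Given the trained ALS model and the matrix D that describes for which pairs of factors X and Y the rating should be computed, the system calculates the matrix of recommended ratings Res:

$$res_{ui} = \sum_{j=1}^{f} x_{uj} \, y_{ji}, \quad \text{if } d_{ui} \ne 0, \quad u = 1, \ldots, m; \; i = 1, \ldots, n$$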
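The prediction step can be sketched as a masked matrix product. The function name `predict_ratings` and the dense layout are hypothetical. The sketch assumes that a nonzero entry of D marks a (user, item) pair to score. Writing 0.0 for pairs that were not requested is a choice made here for illustration, since the text does not say what those entries of Res contain.

```python
def predict_ratings(X, Y, D):
    """Compute Res[u][i] = sum_j X[u][j] * Y[j][i] wherever D[u][i] is nonzero."""
    m, f, n = len(X), len(Y), len(Y[0])
    res = []
    for u in range(m):
        row = []
        for i in range(n):
            if D[u][i] != 0:
                # Dot product of user u's factors with item i's factors.
                row.append(sum(X[u][j] * Y[j][i] for j in range(f)))
            else:
                # Assumption: pairs not requested in D are left at 0.0.
                row.append(0.0)
        res.append(row)
    return res

X = [[1.0, 2.0],
     [3.0, 4.0]]
Y = [[1.0, 0.0],
     [0.0, 1.0]]
D = [[1, 0],
     [0, 1]]
Res = predict_ratings(X, Y, D)
```

Scoring only the pairs selected by D avoids forming the full m x n product, which matters when m and n are large and only a few recommendations are needed.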