Cross-entropy Loss

Cross-entropy loss is the objective function minimized during logistic regression training when the dependent variable takes more than two values.

Given n feature vectors X = {x_1 = (x_11, …, x_1p), …, x_n = (x_n1, …, x_np)} of dimension p and a vector of class labels y = (y_1, …, y_n), where y_i ∈ {0, …, K − 1} indicates the class to which the feature vector x_i belongs, the optimization solver minimizes the cross-entropy loss objective function with respect to the argument θ, a matrix of size K × (p + 1). The cross-entropy loss objective function F(θ, X, y) has the following format:

F(θ, X, y) = −(1/n) Σ_{i=1..n} log p_{y_i}(x_i, θ),

where the probability of class k for a feature vector z is given by the softmax

p_k(z, θ) = exp(f_k(z, θ)) / Σ_{l=0..K−1} exp(f_l(z, θ)),    f_k(z, θ) = θ_{k0} + Σ_{j=1..p} θ_{kj} z_j.
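As a reading aid, here is a minimal NumPy sketch of this objective. The helper names are hypothetical, this is not the library's API, and it assumes the intercepts θ_{k0} are stored in the first column of θ:

import numpy as np

def softmax(scores):
    # Numerically stable softmax: shift each row by its maximum
    # before exponentiating so that exp never overflows.
    shifted = scores - scores.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy_loss(theta, X, y):
    # theta: (K, p + 1) array; column 0 holds the intercepts theta_k0.
    # X: (n, p) feature matrix; y: (n,) integer labels in {0, ..., K - 1}.
    n = X.shape[0]
    Xb = np.hstack([np.ones((n, 1)), X])   # prepend the bias column x_i0 = 1
    probs = softmax(Xb @ theta.T)          # (n, K) matrix of p_k(x_i, theta)
    return -np.log(probs[np.arange(n), y]).mean()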
For a given set of indices I = {i_1, i_2, …, i_m}, 1 ≤ i_r ≤ n, r ∈ {1, …, m}, the value and the gradient of the sum of functions in the argument X respectively have the format:

F_I(θ, X, y) = −(1/m) Σ_{i∈I} log p_{y_i}(x_i, θ),

∂F_I/∂θ_{kj} = (1/m) Σ_{i∈I} (p_k(x_i, θ) − [y_i = k]) x_{ij},    k = 0, …, K − 1,  j = 0, …, p,

where x_{i0} = 1 for every i and [·] is the indicator function.
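A sketch of the corresponding gradient under the same assumptions (I is an array of sample indices; softmax() is the helper defined above):

def cross_entropy_gradient(theta, X, y, I):
    # Gradient of F_I over the index set I = {i_1, ..., i_m}.
    m = len(I)
    Xb = np.hstack([np.ones((m, 1)), X[I]])  # (m, p + 1) with bias column
    probs = softmax(Xb @ theta.T)            # (m, K) matrix of p_k(x_i, theta)
    onehot = np.eye(theta.shape[0])[y[I]]    # (m, K) indicators [y_i = k]
    return (probs - onehot).T @ Xb / m       # (K, p + 1), same shape as theta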
The Hessian matrix is a symmetric matrix of size S × S, where S = K × (p + 1), with entries

∂²F_I/∂θ_{k1 j1} ∂θ_{k2 j2} = (1/m) Σ_{i∈I} x_{i j1} x_{i j2} p_{k1}(x_i, θ) ([k1 = k2] − p_{k2}(x_i, θ)),

again with the convention x_{i0} = 1.
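And a sketch of the Hessian assembly under the same assumptions (note that for cross-entropy loss the Hessian does not depend on the labels y):

def cross_entropy_hessian(theta, X, I):
    # Hessian of F_I; block (k1, k2) of the result is
    # (1/m) * sum_i p_k1 ([k1 = k2] - p_k2) * x_i x_i^T.
    K, p1 = theta.shape                      # p1 = p + 1
    m = len(I)
    Xb = np.hstack([np.ones((m, 1)), X[I]])
    probs = softmax(Xb @ theta.T)            # (m, K)
    H = np.zeros((K * p1, K * p1))           # S x S with S = K * (p + 1)
    for i in range(m):
        W = np.diag(probs[i]) - np.outer(probs[i], probs[i])  # (K, K)
        H += np.kron(W, np.outer(Xb[i], Xb[i]))
    return H / m

The per-sample Kronecker product mirrors the block structure of the formula above; a production implementation would exploit symmetry and vectorize the loop.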
For more details, see [Hastie2009].

The implementation of the p_k(z, θ) computation relies on the numerically stable version of the softmax function (Analysis > Math functions > Softmax).
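To see why the stable variant matters, a standalone NumPy illustration (not DAAL code):

import numpy as np

z = np.array([1000.0, 1001.0, 1002.0])

# Naive softmax overflows: exp(1000.0) is inf in double precision,
# so the ratio degenerates to nan.
naive = np.exp(z) / np.exp(z).sum()

# Subtracting max(z) changes nothing mathematically, because the common
# factor exp(-max(z)) cancels in the ratio, but it keeps exp() in range.
shifted = z - z.max()
stable = np.exp(shifted) / np.exp(shifted).sum()
print(naive)   # [nan nan nan] (with an overflow warning)
print(stable)  # [0.09003057 0.24472847 0.66524096]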