Developer Guide for Intel® Data Analytics Acceleration Library 2019 Update 4

Cross-entropy Loss

Cross-entropy loss is an objective function minimized in the process of logistic regression training when a dependent variable takes more than two values.

Given a set X = {x_1 = (x_{11}, \ldots, x_{1p}), \ldots, x_n = (x_{n1}, \ldots, x_{np})} of n p-dimensional feature vectors and a vector of class labels y = (y_1, \ldots, y_n), where y_i \in \{0, \ldots, T-1\} describes the class to which the feature vector x_i belongs and T is the number of classes, the optimization solver optimizes the cross-entropy loss objective function by the argument \theta, a matrix of size T \times (p + 1). The cross-entropy loss objective function K(\theta, X, y) has the following format:

K(\theta, X, y) = -\frac{1}{n} \sum_{i=1}^{n} \log p_{y_i}(x_i, \theta) + \lambda_1 \sum_{t=0}^{T-1} \sum_{j=1}^{p} |\theta_{t,j}| + \lambda_2 \sum_{t=0}^{T-1} \sum_{j=1}^{p} \theta_{t,j}^2,

where

p_t(z, \theta) = \frac{e^{f_t(z, \theta)}}{\sum_{k=0}^{T-1} e^{f_k(z, \theta)}}, \quad f_t(z, \theta) = \theta_{t,0} + \sum_{j=1}^{p} \theta_{t,j} z_j,

and \lambda_1, \lambda_2 are the L1 and L2 regularization coefficients, respectively.
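As a minimal sketch of this objective, the following NumPy code computes the regularized cross-entropy loss for a parameter matrix θ of size T × (p + 1), with the intercepts θ_{t,0} stored in column 0. The function names and the array layout are illustrative, not part of the library API.

```python
import numpy as np

def softmax(f):
    # Numerically stable softmax: shift by the row maximum before exponentiating.
    f = f - f.max(axis=-1, keepdims=True)
    e = np.exp(f)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy_loss(theta, X, y, l1=0.0, l2=0.0):
    """K(theta, X, y) for multinomial logistic regression.

    theta : (T, p + 1) matrix; column 0 holds the intercepts theta_{t,0}.
    X     : (n, p) feature matrix.
    y     : (n,) class labels in {0, ..., T - 1}.
    l1/l2 : L1 and L2 regularization coefficients (intercepts are not penalized).
    """
    n = X.shape[0]
    f = theta[:, 0] + X @ theta[:, 1:].T           # (n, T): f_t(x_i, theta)
    p = softmax(f)                                  # (n, T): p_t(x_i, theta)
    nll = -np.log(p[np.arange(n), y]).mean()        # -(1/n) sum_i log p_{y_i}
    return nll + l1 * np.abs(theta[:, 1:]).sum() + l2 * (theta[:, 1:] ** 2).sum()
```

With θ = 0 every class probability is 1/T, so the unregularized loss equals log T, which is a convenient sanity check.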

For a given set of indices I = \{i_1, \ldots, i_m\}, 1 \le i_l \le n, l = 1, \ldots, m, the value and the gradient of the sum of functions in the argument \theta respectively have the format:

K_I(\theta, X, y) = -\frac{1}{m} \sum_{i \in I} \log p_{y_i}(x_i, \theta) + \lambda_1 \sum_{t=0}^{T-1} \sum_{j=1}^{p} |\theta_{t,j}| + \lambda_2 \sum_{t=0}^{T-1} \sum_{j=1}^{p} \theta_{t,j}^2,

\nabla K_I(\theta, X, y) = \left[ \frac{\partial K_I}{\partial \theta_{0,0}}, \ldots, \frac{\partial K_I}{\partial \theta_{T-1,p}} \right],

where

\frac{\partial K_I}{\partial \theta_{t,0}} = \frac{1}{m} \sum_{i \in I} \left[ p_t(x_i, \theta) - \delta_{t,y_i} \right],

\frac{\partial K_I}{\partial \theta_{t,j}} = \frac{1}{m} \sum_{i \in I} x_{i,j} \left[ p_t(x_i, \theta) - \delta_{t,y_i} \right] + \lambda_1 \operatorname{sign}(\theta_{t,j}) + 2 \lambda_2 \theta_{t,j}, \quad j = 1, \ldots, p,

and \delta_{t,y_i} is the Kronecker delta: \delta_{t,y_i} = 1 if t = y_i, and 0 otherwise.
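The batch gradient can be sketched in NumPy as follows. This is an illustrative implementation under the usual layout assumptions (θ is T × (p + 1) with intercepts in column 0); the non-smooth L1 term is left to a separate proximal step, as is common for such solvers.

```python
import numpy as np

def softmax(f):
    # Numerically stable softmax over the last axis.
    f = f - f.max(axis=-1, keepdims=True)
    e = np.exp(f)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy_grad(theta, X, y, idx, l2=0.0):
    """Gradient of K_I over the batch idx, returned as a (T, p + 1) matrix.

    Column 0 is d/d theta_{t,0}; columns 1..p include the L2 term 2*l2*theta_{t,j}.
    The L1 subgradient is omitted here (handled by the proximal projection).
    """
    Xb, yb = X[idx], y[idx]
    m, T = len(idx), theta.shape[0]
    p = softmax(theta[:, 0] + Xb @ theta[:, 1:].T)   # (m, T): p_t(x_i, theta)
    delta = np.eye(T)[yb]                             # (m, T): Kronecker delta
    r = (p - delta) / m                               # scaled residuals
    grad = np.empty_like(theta)
    grad[:, 0] = r.sum(axis=0)                        # intercept column, j = 0
    grad[:, 1:] = r.T @ Xb + 2.0 * l2 * theta[:, 1:]  # feature columns, j >= 1
    return grad
```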

The Hessian matrix is a symmetric matrix of size S \times S, where S = T \times (p + 1):

\frac{\partial^2 K_I}{\partial \theta_{t,j} \, \partial \theta_{s,q}} = \frac{1}{m} \sum_{i \in I} x_{i,j} \, x_{i,q} \, p_t(x_i, \theta) \left[ \delta_{t,s} - p_s(x_i, \theta) \right],

with the convention x_{i,0} = 1 for the intercept terms; the L2 penalty adds 2\lambda_2 to the diagonal entries with t = s and j = q \ge 1.
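A sketch of the S × S Hessian, S = T(p + 1), assembled per sample via a Kronecker product (block (t, s) scaled entrywise by the outer product of the augmented feature vector with itself). The flattening order (t, j) with t outermost, and all function names, are illustrative assumptions.

```python
import numpy as np

def softmax(f):
    f = f - f.max(axis=-1, keepdims=True)
    e = np.exp(f)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy_hessian(theta, X, y, idx, l2=0.0):
    """Hessian of K_I as an (S, S) matrix, S = T * (p + 1).

    Entry ((t, j), (s, q)) is
        (1/m) * sum_i x_{i,j} x_{i,q} p_t(x_i)(delta_{t,s} - p_s(x_i)),
    with x_{i,0} := 1; the L2 term adds 2*l2 on the non-intercept diagonal.
    """
    Xb = X[idx]
    m, p_dim = Xb.shape
    T = theta.shape[0]
    Z = np.hstack([np.ones((m, 1)), Xb])              # prepend x_{i,0} = 1
    P = softmax(theta[:, 0] + Xb @ theta[:, 1:].T)    # (m, T)
    S = T * (p_dim + 1)
    H = np.zeros((S, S))
    for i in range(m):
        W = np.diag(P[i]) - np.outer(P[i], P[i])      # p_t (delta_{t,s} - p_s)
        H += np.kron(W, np.outer(Z[i], Z[i])) / m     # blocks over (t, s)
    reg = 2.0 * l2 * np.ones(p_dim + 1)
    reg[0] = 0.0                                      # intercepts not penalized
    H += np.diag(np.tile(reg, T))
    return H
```

Symmetry follows because both W and the outer products are symmetric.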

Because the L1 term is non-differentiable at zero, the solver handles it through the proximal projection

\operatorname{prox}_{\eta}^{\lambda_1}(\theta_{t,j}) =
\begin{cases}
\theta_{t,j} - \lambda_1 \eta, & \theta_{t,j} > \lambda_1 \eta \\
0, & |\theta_{t,j}| \le \lambda_1 \eta \\
\theta_{t,j} + \lambda_1 \eta, & \theta_{t,j} < -\lambda_1 \eta
\end{cases}

where \eta is the learning rate.
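The proximal projection for the L1 term is the familiar soft-thresholding operator; a minimal sketch (the function name is illustrative):

```python
def prox_l1(theta_tj, lam1, eta):
    """Soft-thresholding proximal step for the L1 penalty.

    theta_tj : a single coordinate of theta.
    lam1     : L1 regularization coefficient lambda_1.
    eta      : learning rate.
    """
    threshold = lam1 * eta
    if theta_tj > threshold:
        return theta_tj - threshold   # shrink positive values toward zero
    if theta_tj < -threshold:
        return theta_tj + threshold   # shrink negative values toward zero
    return 0.0                        # small values are zeroed out (sparsity)
```

Coordinates whose magnitude falls below λ₁η are set exactly to zero, which is how the L1 penalty produces sparse coefficients.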

For more details, see [Hastie2009].

The implementation of the p_t(z, \theta) computation relies on the numerically stable version of the softmax function (Analysis > Math functions > Softmax).