Developer Guide for Intel® Data Analytics Acceleration Library 2019 Update 1
Given $n$ feature vectors $X = \{x_1 = (x_{11}, \ldots, x_{1p}), \ldots, x_n = (x_{n1}, \ldots, x_{np})\}$ of dimension $p$ and a vector of dependent variables $y = (y_1, \ldots, y_n)$, the problem is to build a decision forest regression model that minimizes the Mean-Square Error (MSE) between the predicted and true values.
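For reference, the MSE minimized over the training set can be written as follows, where $\hat{y}_i$ denotes the model's predicted response for $x_i$ (a standard formulation, not specific to this library):

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2$$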
Decision forest regression follows the algorithmic framework of decision forest training with variance as the impurity metric. For a node holding a subset $D$ of the training samples, with $N = |D|$, the variance impurity is calculated as follows:

$$I_{\mathrm{var}}(D) = \frac{1}{N} \sum_{i=1}^{N} \left(y_i - \bar{y}\right)^2, \qquad \bar{y} = \frac{1}{N} \sum_{i=1}^{N} y_i$$
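As an illustration of how this metric drives split selection, here is a minimal NumPy sketch (not the DAAL API; `variance_impurity` and `split_impurity_decrease` are hypothetical helper names) that scores a candidate split by the decrease in weighted variance impurity:

```python
import numpy as np

def variance_impurity(y: np.ndarray) -> float:
    """Variance impurity of a node: mean squared deviation from the node mean."""
    return float(np.mean((y - y.mean()) ** 2)) if y.size else 0.0

def split_impurity_decrease(y: np.ndarray, left_mask: np.ndarray) -> float:
    """Decrease in weighted variance impurity achieved by a candidate split."""
    y_left, y_right = y[left_mask], y[~left_mask]
    n = y.size
    weighted = (y_left.size * variance_impurity(y_left)
                + y_right.size * variance_impurity(y_right)) / n
    return variance_impurity(y) - weighted

# Example: score the split x < 0.5 on a toy 1-D dataset
x = np.array([0.1, 0.2, 0.7, 0.9])
y = np.array([1.0, 1.2, 3.0, 3.4])
print(split_impurity_decrease(y, x < 0.5))
```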
Given a decision forest regression model and vectors $x_1, \ldots, x_r$, the problem is to calculate the responses for those vectors. To solve the problem, for each given query vector $x_i$ the algorithm traverses every tree in the forest to the leaf node containing $x_i$; the response of that tree is the mean of the dependent variables of the training samples assigned to that leaf. The forest predicts the response as the mean of the responses from all trees.
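A minimal sketch of this prediction rule, assuming hand-built trees with hypothetical `Leaf` and `Split` node types (illustrative Python, not the library's API):

```python
import numpy as np

class Leaf:
    """Leaf node storing the mean of the training responses that reached it."""
    def __init__(self, y): self.value = float(np.mean(y))
    def predict(self, x): return self.value

class Split:
    """Internal node: route on feature `j` at `threshold`."""
    def __init__(self, j, threshold, left, right):
        self.j, self.threshold, self.left, self.right = j, threshold, left, right
    def predict(self, x):
        child = self.left if x[self.j] < self.threshold else self.right
        return child.predict(x)

def forest_predict(trees, x):
    """Forest response: mean of the per-tree leaf means."""
    return float(np.mean([t.predict(x) for t in trees]))

# Two hand-built one-split trees over a single feature
t1 = Split(0, 0.5, Leaf([1.0, 1.2]), Leaf([3.0, 3.4]))
t2 = Split(0, 0.4, Leaf([1.1]), Leaf([3.1, 3.3]))
print(forest_predict([t1, t2], np.array([0.8])))  # mean of 3.2 and 3.2
```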
Decision forest regression follows the algorithmic framework for calculating the decision forest out-of-bag (OOB) error, where aggregation of the out-of-bag predictions in all trees and calculation of the OOB error of the decision forest is done as follows: the response for each vector $x_i$ in the dataset $X$ is predicted as the mean of the predictions from the trees for which $x_i$ is out-of-bag (that is, not included in the tree's bootstrap sample),

$$\hat{y}_i = \frac{1}{|B_i|} \sum_{b \in B_i} \hat{y}_i^{\,b},$$

where $B_i$ is the set of such trees and $\hat{y}_i^{\,b}$ is the prediction of tree $b$ for $x_i$. The OOB error of the forest is then the mean squared error of these predictions over the set $D'$ of vectors that are out-of-bag for at least one tree:

$$\mathrm{OOB\ error} = \frac{1}{|D'|} \sum_{x_i \in D'} \left(y_i - \hat{y}_i\right)^2$$
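A runnable sketch of the OOB computation, using degenerate depth-0 "trees" that simply predict the mean of their bootstrap sample (an assumption made to keep the example short; a real forest would grow full trees before aggregating):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(size=(20, 1))
y = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=20)

n, n_trees = len(y), 50
# Each "tree" here is degenerate (depth 0): it predicts the mean response
# of its bootstrap sample. Real trees would split on features first.
boot_idx = [rng.integers(0, n, size=n) for _ in range(n_trees)]
tree_pred = [float(y[idx].mean()) for idx in boot_idx]

oob_sum = np.zeros(n)
oob_cnt = np.zeros(n, dtype=int)
for idx, pred in zip(boot_idx, tree_pred):
    oob = np.setdiff1d(np.arange(n), idx)  # samples this tree never saw
    oob_sum[oob] += pred
    oob_cnt[oob] += 1

have_oob = oob_cnt > 0                     # D': out-of-bag for at least one tree
y_hat = oob_sum[have_oob] / oob_cnt[have_oob]
oob_error = float(np.mean((y[have_oob] - y_hat) ** 2))
print(oob_error)
```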