Developer Guide for Intel® Data Analytics Acceleration Library 2018 Update 2
You can use the distributed processing mode for neural network training. Approaches to neural network training in the distributed mode are based on the following kinds of parallelization:
The library supports data-based parallelization.
The data-based parallelization approach has the following features:
The library supports the following ways to update the parameters of the neural network model for data-based parallelization:
Synchronous.
The master node updates the model only after all local nodes deliver the local derivatives for a given iteration of the training.
Asynchronous.
The master node:
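As a rough illustration of the synchronous scheme, the following Python sketch leaves the model unchanged until every local node has delivered its derivatives for the current iteration, and only then applies a single update. It does not use the library API: the function name synchronous_update, the averaging of the local derivatives, and the plain gradient-descent step with a learning_rate parameter are all assumptions made for illustration. They stand in for whatever optimization solver the master node is configured with.

```python
from typing import Dict, List, Optional


def synchronous_update(
    weights: List[float],
    local_derivatives: Dict[int, List[float]],
    num_local_nodes: int,
    learning_rate: float = 0.1,
) -> Optional[List[float]]:
    """Return updated weights, or None while some local node has not reported.

    local_derivatives maps a local node id to the derivatives that node
    delivered for the current iteration (hypothetical data layout).
    """
    # Synchronous rule: no update until all local nodes have delivered.
    if len(local_derivatives) < num_local_nodes:
        return None

    # Assumption: the master averages the local derivatives element-wise.
    averaged = [
        sum(d[j] for d in local_derivatives.values()) / num_local_nodes
        for j in range(len(weights))
    ]

    # Assumption: a plain gradient-descent step replaces the real solver.
    return [w - learning_rate * g for w, g in zip(weights, averaged)]
```

Returning None instead of a partially updated model keeps the "wait for everyone" rule explicit: the caller can keep collecting derivatives and call the function again once the last local node has reported.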
The flow of neural network model training using data-based parallelization involves these steps:
1. Initialize the neural network model using the initialize() method on the master node and propagate the model to local nodes.
2. Run the training algorithm on local nodes as described in the Usage Model: Training and Prediction > Training section, with the following specifics of the distributed computation mode:
   - See the figure below to visualize an i-th iteration, corresponding to the i-th data block. After the computations for the i-th data block on a local node are finished, send the derivatives of local weights and biases to the master node.
   - The training algorithm on local nodes does not require an optimization solver.
3. Run the training algorithm on the master node by providing the local derivatives from all local nodes. The algorithm uses the optimization solver provided in its optimizationSolver parameter. For available algorithms, see Optimization Solvers. After the computations are completed, send the updated weights and biases of the model to all local nodes.
   You can get the latest version of the model by calling the finalizeCompute() method after each run of the training algorithm on the master or local node.
4. Repeat steps 2 and 3 for all data blocks. After the training process is completed, call the getPredictionModel() method of the trained model on the master node to get the model to be used for validation and prediction.
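The steps above can be sketched as a single-process simulation. This is a minimal sketch rather than library code. The classes LocalNode and MasterNode and the function train are hypothetical names, and the model is a toy linear model with one weight and one bias instead of a neural network. A plain gradient-descent step on averaged derivatives stands in for the optimizationSolver parameter, and the synchronous update scheme is assumed.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]    # one (x, y) training sample
Derivs = Tuple[float, float]   # (d_loss/d_weight, d_loss/d_bias)


class LocalNode:
    """Local node: keeps a model copy and computes derivatives only (no solver)."""

    def __init__(self, data_blocks: List[List[Point]]) -> None:
        self.data_blocks = data_blocks
        self.weight, self.bias = 0.0, 0.0

    def set_model(self, weight: float, bias: float) -> None:
        self.weight, self.bias = weight, bias

    def compute_derivatives(self, i: int) -> Derivs:
        """Derivatives of the mean squared error over the i-th data block."""
        block = self.data_blocks[i]
        errors = [(self.weight * x + self.bias - y, x) for x, y in block]
        d_weight = sum(2.0 * e * x for e, x in errors) / len(block)
        d_bias = sum(2.0 * e for e, _ in errors) / len(block)
        return d_weight, d_bias


class MasterNode:
    """Master node: owns the model and the (toy) optimization step."""

    def __init__(self, learning_rate: float) -> None:
        self.weight, self.bias = 0.0, 0.0   # step 1: initialize the model
        self.learning_rate = learning_rate

    def update(self, local_derivs: Sequence[Derivs]) -> Tuple[float, float]:
        """Step 3: average the local derivatives and take one descent step."""
        n = len(local_derivs)
        self.weight -= self.learning_rate * sum(d[0] for d in local_derivs) / n
        self.bias -= self.learning_rate * sum(d[1] for d in local_derivs) / n
        return self.weight, self.bias


def train(master: MasterNode, nodes: List[LocalNode], num_blocks: int) -> Tuple[float, float]:
    for node in nodes:                       # step 1: propagate the model
        node.set_model(master.weight, master.bias)
    for i in range(num_blocks):              # step 4: repeat for all data blocks
        # Step 2: each local node processes its i-th block and sends derivatives.
        derivs = [node.compute_derivatives(i) for node in nodes]
        # Step 3: the master updates the model and sends it back to all nodes.
        weight, bias = master.update(derivs)
        for node in nodes:
            node.set_model(weight, bias)
    return master.weight, master.bias        # final model, kept on the master


if __name__ == "__main__":
    # Samples of y = 2x + 1, split across two local nodes, 200 blocks each.
    node_a = LocalNode([[(0.0, 1.0), (1.0, 3.0)]] * 200)
    node_b = LocalNode([[(0.5, 2.0), (2.0, 5.0)]] * 200)
    print(train(MasterNode(learning_rate=0.2), [node_a, node_b], num_blocks=200))
```

In the sketch, only MasterNode holds a learning rate, which mirrors the point that local nodes do not need an optimization solver. In a real deployment the list comprehension in step 2 would be replaced by communication between processes, which is outside the scope of this illustration.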