Developer Guide for Intel® Data Analytics Acceleration Library 2019 Update 5

Distributed Processing

You can use the distributed processing mode for neural network training. Approaches to neural network training in the distributed mode differ in the kind of parallelization they employ.

The library supports data based parallelization.

Data Based Parallelization

In the data based parallelization approach, the training data set is split across local nodes: each node computes derivatives of the weights and biases on its portion of the data, and the derivatives from all nodes are combined to update the parameters of a single shared model.

For data based parallelization, the library supports both synchronous and asynchronous updates of the parameters of the neural network model.

Computation

Training a neural network model with data based parallelization involves these steps:

  1. Initialize the neural network model using the initialize() method on the master node and propagate the model to local nodes.

  2. Run the training algorithm on local nodes as described in the Usage Model: Training and Prediction > Training section with the following specifics of the distributed computation mode:

    • Provide each j-th local node with a local data set of size localDataSize_j.
    • Specify the required batchSize_j parameter on each node.
    • Split the data set on the j-th node into localDataSize_j/batchSize_j data blocks, each processed by the local algorithm separately.
    • The batchSize_j and localDataSize_j parameters must be the same on all local nodes for synchronous computations and may differ across nodes for asynchronous computations.

    See the figure below for a visualization of the i-th iteration, which corresponds to the i-th data block. After the computations for the i-th data block on a local node are finished, send the derivatives of the local weights and biases to the master node (see the local-node sketch at the end of this section).

    Note

    The training algorithm on local nodes does not require an optimization solver.

  3. Run the training algorithm on the master node, providing it with the derivatives received from all local nodes. The algorithm updates the weights and biases using the optimization solver supplied in its optimizationSolver parameter. For available algorithms, see Optimization Solvers. After the computations are completed, send the updated weights and biases of the model back to all local nodes (see the master-node sketch at the end of this section).

    You can get the latest version of the model by calling the finalizeCompute() method after each run of the training algorithm on the master or local node.

  4. Repeat steps 2 and 3 for all data blocks. After the training process is completed, call the getPredictionModel() method of the trained model on the master node to get the model to be used for validation and prediction.


Figure: Neural Network Training with Distributed Processing, i-th Iteration Workflow
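
The following sketch makes the local-node side of steps 1 and 2 concrete. It is a minimal C++ sketch under assumptions, not the library's verbatim API: only the compute() method and the batchSize parameter appear in this guide. The Distributed<step1Local> class and the training::data, training::groundTruth, and training::inputModel input IDs follow the library's usual conventions but should be verified against the distributed neural network training example shipped with the library, and the receive*/send* helpers are placeholders for your application's transport (for example, MPI).

    /* Sketch: local-node side of distributed neural network training
     * (steps 1 and 2). Transport helpers are application-supplied
     * placeholders; class and input ID names are assumptions to check
     * against the distributed example shipped with the library. */
    #include "daal.h"

    using namespace daal;
    using namespace daal::algorithms::neural_networks;
    using namespace daal::data_management;

    /* Placeholders for your transport layer (e.g., MPI): */
    training::ModelPtr receiveModelFromMaster();
    void sendDerivativesToMaster(const training::PartialResultPtr &derivatives);
    void receiveUpdatedParametersFromMaster(const training::ModelPtr &model);
    TensorPtr getDataBlock(size_t node, size_t block);
    TensorPtr getGroundTruthBlock(size_t node, size_t block);

    void trainLocalNode(size_t nodeId, size_t localDataSize, size_t batchSize)
    {
        /* Step 1: start from the model initialized on the master node. */
        training::ModelPtr model = receiveModelFromMaster();

        /* Local training algorithm; it needs no optimization solver,
           because the parameters are updated on the master node only. */
        training::Distributed<step1Local> local;
        local.parameter.batchSize = batchSize;        /* batchSize_j */
        local.input.set(training::inputModel, model); /* assumed input ID */

        /* Step 2: process localDataSize_j / batchSize_j data blocks. */
        const size_t nBlocks = localDataSize / batchSize;
        for (size_t i = 0; i < nBlocks; ++i)
        {
            local.input.set(training::data,        getDataBlock(nodeId, i));
            local.input.set(training::groundTruth, getGroundTruthBlock(nodeId, i));
            local.compute();

            /* Send the derivatives of the local weights and biases to the
               master node; in the synchronous scheme, wait for the updated
               parameters before processing the next data block. */
            sendDerivativesToMaster(local.getPartialResult());
            receiveUpdatedParametersFromMaster(model);
        }
    }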
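
A matching sketch covers the master-node side, steps 1, 3, and 4. The initialize(), finalizeCompute(), and getPredictionModel() methods and the optimizationSolver parameter come from this guide; the Distributed<step2Master> class, the partialResults input ID, the training::model result ID, the initialize() signature, and the SGD solver construction follow common library conventions but are assumptions to verify against the shipped example, and the broadcast/receive helpers again stand in for your transport.

    /* Sketch: master-node side of distributed neural network training
     * (steps 1, 3, and 4). Transport helpers are placeholders; class,
     * input, and result ID names are assumptions to verify against the
     * library's distributed example. */
    #include "daal.h"

    using namespace daal;
    using namespace daal::algorithms;
    using namespace daal::algorithms::neural_networks;
    using namespace daal::services;

    /* Placeholders for your transport layer (e.g., MPI): */
    void broadcastModelToLocalNodes(const training::ModelPtr &model);
    training::PartialResultPtr receiveDerivativesFromNode(size_t node);
    void broadcastUpdatedParameters(const training::ModelPtr &model);

    prediction::ModelPtr trainOnMaster(training::Topology &topology,
                                       const Collection<size_t> &sampleSize,
                                       size_t nNodes, size_t nBlocks)
    {
        training::Distributed<step2Master> master;

        /* The master updates the weights and biases with the solver
           supplied in its optimizationSolver parameter (here SGD; see
           Optimization Solvers for the available algorithms). */
        master.parameter.optimizationSolver.reset(
            new optimization_solver::sgd::Batch<float>());

        /* Step 1: initialize the model (signature assumed) and propagate
           it to the local nodes. */
        master.initialize(sampleSize, topology);
        broadcastModelToLocalNodes(master.getResult()->get(training::model));

        for (size_t i = 0; i < nBlocks; ++i)  /* step 4: all data blocks */
        {
            /* Step 3: gather the derivatives of the local weights and
               biases from every local node. */
            for (size_t node = 0; node < nNodes; ++node)
            {
                master.input.add(training::partialResults, node,
                                 receiveDerivativesFromNode(node)); /* assumed ID */
            }
            master.compute();  /* update the weights and biases */
            broadcastUpdatedParameters(master.getResult()->get(training::model));
        }

        /* Get the latest version of the trained model, then derive the
           model used for validation and prediction. */
        master.finalizeCompute();
        training::ModelPtr trained = master.getResult()->get(training::model);
        return trained->getPredictionModel<float>();
    }

This sketch shows the synchronous scheme, in which the master waits for derivatives from all nodes in every iteration; this is why batchSize_j and localDataSize_j must then match across nodes, while an asynchronous scheme, which updates as derivatives arrive, allows them to differ.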