Deep Learning with Delite and OptiML
This page describes how to train deep neural networks using OptiML. For instructions on setting up OptiML, see the getting started page. For help, contact firstname.lastname@example.org.
- Example 1: MNIST handwritten digit recognition (convolutional networks)
- Example 2: Stock Market Prediction (recurrent networks)
MNIST handwritten digit recognition (convolutional networks)
This section describes how to train a convolutional neural network on the MNIST dataset. More complicated datasets, such as ImageNet, can be trained in the same way.
Start from the apps/src/ directory described in the Examples page. From here, navigate to the NeuralNetwork directory. This directory contains the OptiML neural network library, as well as the scripts used to generate networks. First, get the MNIST dataset:
```
cd examples/mnist/
python get_mnist.py
```
This should take around one minute, and it formats the data as TSV files for OptiML to read. You can also visualize the digits using the script visualize.py, which by default displays images from the training set. For example, to visualize image #1000 in the training set, run (this step is optional):

```
python visualize.py 1000
display img_1000.png
```
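If you want to inspect the generated data directly, a short Python snippet suffices. Note the row layout assumed below (label in the first column, pixel values after it) is a guess; check get_mnist.py for the actual format it writes:

```python
# Sketch: inspect one row of an MNIST-style TSV file. The layout
# (label first, then pixel values) is an assumption; see get_mnist.py
# for the real format.
import csv

def load_tsv_row(path, index):
    """Return (label, pixels) for row `index` of a tab-separated file."""
    with open(path) as f:
        for i, row in enumerate(csv.reader(f, delimiter="\t")):
            if i == index:
                return int(float(row[0])), [float(v) for v in row[1:]]
    raise IndexError(f"row {index} not found in {path}")
```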
Now that the data is generated, return to the NeuralNetwork directory to generate a network.
Networks are specified in an XML file. For an example file, see cnn_example.xml, which shows all of the possible XML tags and their attributes. For MNIST, we will use mnist.xml, which contains two convolution layers, two pooling layers, a fully-connected layer, and a softmax layer.
Generate the network with the following command:
```
python generate_cnn.py mnist.xml
```
The output reports that an OptiML source file, mnist_tutorial.scala, has been created describing the network you specified, along with a number of parameter files. We will modify one of these parameter files to increase the number of training epochs from 1 to 10, i.e. 10 passes over the training data. The training set has 50,000 images, so with a mini-batch size of 100, 10 epochs correspond to 5,000 forward/backward passes through the network.
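The iteration count works out as a quick sanity check (the numbers come from the text above):

```python
# Sanity check: number of forward/backward passes for the MNIST run.
num_train = 50_000   # training images
batch_size = 100     # mini-batch size from global_params.txt
num_epochs = 10      # after editing the first line of global_params.txt

iters_per_epoch = num_train // batch_size   # 500 mini-batches per epoch
total_iters = iters_per_epoch * num_epochs  # 5,000 passes total
print(total_iters)
```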
Open the file mnist_tutorial/global_params.txt and modify the first line, replacing 1 with 10. The first few lines of the file should now look like this:
```
10 ; Num Epochs
100 ; Mini-Batch size
0 ; Read model from file
0 ; Read momentum from file
10 ; Num epochs between saving model (0 to disable)
10 ; Num epochs between testing on training set (0 to disable)
...
```
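The file uses a simple `value ; comment` layout, one parameter per line. A small helper like the following (hypothetical, not part of the OptiML tree) can read such a file, which is handy when scripting experiments:

```python
# Hypothetical helper for "value ; comment" parameter files such as
# mnist_tutorial/global_params.txt. Not part of OptiML itself.
def read_params(path):
    """Return a list of (value, comment) pairs, one per non-empty line."""
    params = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            value, _, comment = line.partition(";")
            params.append((value.strip(), comment.strip()))
    return params
```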
Now train the network. This is done by running the application specified by the generated mnist_tutorial.scala (see also the getting started instructions). Change to the published/OptiML directory and run:
```
sbt compile
bin/delitec mnist_tutorialCompiler --cuda
bin/delite mnist_tutorialCompiler --cuda 1
```
The --cuda flags are optional but accelerate training. Once this completes, the network is partially trained and the weights are saved. To continue training, modify line 3 in mnist_tutorial/global_params.txt, changing 0 to 1. This instructs the network to read in the weights from where it left off. Then run the network again using the bin/delite mnist_tutorialCompiler --cuda 1 command above. You can also increase the number of epochs and modify other parameters, for example to save checkpoints of the weights during training, or to decrease the learning rate over time. See the reference guide for more information.
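The learning-rate decay mentioned above is configured through the parameter files; as an illustration only (not OptiML's actual implementation), the common step-decay schedule it refers to looks like this:

```python
# Step decay: multiply the learning rate by `factor` every `step` epochs.
# The values here are illustrative, not OptiML defaults.
def step_decay(base_lr, epoch, step=10, factor=0.1):
    return base_lr * (factor ** (epoch // step))

# e.g. with base_lr=0.1: epochs 0-9 train at 0.1, epochs 10-19 at 0.01, ...
```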
The cifar10 dataset is also provided as an example; it can be trained by following the steps above, substituting that dataset for mnist.
Stock Market Prediction (recurrent networks)
You can generate recurrent neural networks in the same way as the convolutional networks in the MNIST example above, but using the script generate_rnn.py instead. Refer to the rnn_example.xml network for the required XML format.
To run the stock market example, first generate the dataset. Note: this script requires numpy/scipy to run. The data is randomly generated each time the script is run, and the script can optionally display and plot it. To generate the data without displaying or plotting it, run:
```
cd examples/stock_market
python make_dataset.py
```
If you have matplotlib, instead run:
```
cd examples/stock_market
python make_dataset.py show_data plot_data
```
This plots one of the stock prices and also shows the required action/class at each time step (see the script for the action-to-class mappings, e.g. 1 = "sell", 3 = "buy").
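The full mapping lives in make_dataset.py; as a toy illustration of how per-time-step actions become the integer class labels the RNN is trained on (only "sell" = 1 and "buy" = 3 come from the text, the other entries are assumptions):

```python
# Toy illustration of an action-to-class mapping for the stock example.
# "sell"=1 and "buy"=3 are from the text above; "hold" and "wait" are
# assumed placeholders -- see make_dataset.py for the real mapping.
ACTION_TO_CLASS = {"hold": 0, "sell": 1, "wait": 2, "buy": 3}
CLASS_TO_ACTION = {v: k for k, v in ACTION_TO_CLASS.items()}

def encode(actions):
    """Turn a sequence of action names into class labels for training."""
    return [ACTION_TO_CLASS[a] for a in actions]
```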
Next, generate the stock market example:

```
cd ../../
python generate_rnn.py rnn_example.xml
```
As before, the training parameters can be modified. Open the file RNNExample/global_params.txt and modify the first line (epochs), replacing 1 with 25.
Now train the network. Change to the published/OptiML directory and run:
```
sbt compile
bin/delitec RNNExampleCompiler
bin/delite RNNExampleCompiler
```
Once this completes, the network is partially trained and the weights are saved. To continue training, modify line 3 in RNNExample/global_params.txt, changing 0 to 1. This instructs the network to read in the weights from where it left off. Then run the network again using the bin/delite RNNExampleCompiler command above. You can also increase the number of epochs and modify other parameters, for example to save checkpoints of the weights during training, or to decrease the learning rate over time. See the reference guide for more information.
The examples above are the fastest way to become familiar with deep learning in OptiML. For complete documentation, see the reference guide.