Advice: Use the Deep Neural Network learner even for shallow networks

DH 2017-04-06

If you build neural network models in Pipeline Pilot or are interested in doing so, I have the following advice based on some calculations I did recently:

Always use the deep neural network (DNN) learner, Learn R Deep Neural Net Model, rather than Learn R Neural Net Model, even if you build "shallow" networks that contain only a single hidden layer. There are two reasons:

1. The DNN learner supports "dropout" (http://www.jmlr.org/papers/volume15/srivastava14a.old/source/srivastava14a.pdf), which allows you to build networks with much less risk of overfitting.
2. The DNN learner is much faster, as I recently discovered. When I trained a single-hidden-layer network with 8 hidden nodes, using the same number of training iterations in both cases, the DNN learner produced a model in just over 1 minute, while the NN learner took 21 minutes. The predictive performance of the two models was very similar.
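
As a rough illustration of what dropout does during training, here is a minimal Python sketch. It is not the code the DNN learner runs; the function name `dropout`, its parameters, and the example values are all assumptions made for illustration. It uses the common "inverted dropout" variant, which rescales the surviving activations during training instead of rescaling the weights at prediction time as described in the paper linked above.

```python
import random


def dropout(activations, rate, rng, training=True):
    """Illustrative inverted dropout on one layer's activations.

    Each activation is zeroed with probability `rate`; survivors are
    scaled by 1 / (1 - rate) so the expected value is unchanged.
    When `training` is False the activations pass through untouched.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


# Hypothetical example: a hidden layer with 8 nodes, as in the timing test.
rng = random.Random(0)
hidden = [0.5] * 8
print(dropout(hidden, 0.5, rng))                  # some nodes zeroed, others doubled
print(dropout(hidden, 0.5, rng, training=False))  # unchanged at prediction time
```

Because a different random subset of nodes is silenced on every training pass, the network cannot come to rely on any single node, which is where the reduced risk of overfitting comes from.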

I don't know the reason for the dramatic speed difference. I used the same laptop PC in both cases, without any specialized hardware such as a GPU that could give the DNN an advantage. I do know that the DNN learner uses the BLAS library for its matrix multiplications, which is very efficient. Perhaps the NN component does not. (I haven't looked at the R NN source code, so I don't know for sure.)

One other tip if you use the DNN learner: Consider replacing the default R BLAS library with an optimized one, such as OpenBLAS (https://sourceforge.net/projects/openblas/files/). On Windows, this simply requires renaming libopenblas.dll to Rblas.dll and dropping it into the R bin/x64 folder (after saving the default Rblas.dll in case you need to revert!). Depending on the data, I have found a speedup of up to a factor of 2 with this change.
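
The backup-and-replace step can be scripted so that reverting is always possible. The following Python sketch shows one way to do it under stated assumptions: the function names `swap_blas` and `restore_blas`, the backup file name `Rblas.dll.default`, and the paths in the usage comment are all hypothetical, so adjust them to your installation. R should be closed while the file is replaced, and writing to the R folder may require administrator rights.

```python
import shutil
from pathlib import Path

BACKUP_NAME = "Rblas.dll.default"  # hypothetical name for the saved original


def swap_blas(r_bin_dir, openblas_dll):
    """Back up the default Rblas.dll, then install OpenBLAS in its place."""
    r_bin = Path(r_bin_dir)
    target = r_bin / "Rblas.dll"
    backup = r_bin / BACKUP_NAME
    # Only back up once, so a second run never overwrites the true default.
    if not backup.exists():
        shutil.copy2(target, backup)
    # Copy libopenblas.dll under the name R expects.
    shutil.copy2(openblas_dll, target)
    return backup


def restore_blas(r_bin_dir):
    """Put the saved default Rblas.dll back."""
    r_bin = Path(r_bin_dir)
    shutil.copy2(r_bin / BACKUP_NAME, r_bin / "Rblas.dll")


# Example usage (hypothetical paths, shown as comments so nothing runs by accident):
# swap_blas(r"C:\path\to\R\bin\x64", r"C:\path\to\libopenblas.dll")
# restore_blas(r"C:\path\to\R\bin\x64")
```

Keeping the backup next to the replaced file, and refusing to overwrite it, means the default library can be restored even after the swap has been run more than once.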

Cheers,
Dana