Trained by Facebook using https://github.com/facebook/fb.resnet.torch
The model was converted to the nn backend, and the BatchNorm layers were folded into the convolutional layers, using this script:
https://github.com/szagoruyko/imagine-nn/blob/utils/utils.lua
gradWeight and gradBias were removed from convolutional layers.
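
For readers unfamiliar with BatchNorm folding, the arithmetic behind it can be sketched as follows. This is a minimal pure-Python illustration of the standard per-channel folding formula, not the code from the linked script; the function names (fold_batchnorm, conv_output, batchnorm), the flat-list weight layout, and the eps value of 1e-5 are assumptions made for the example. A convolution without a bias term can pass 0 as its bias.

```python
import math

def fold_batchnorm(weights, biases, gamma, beta, mean, var, eps=1e-5):
    """Fold per-channel BatchNorm parameters into conv weights and biases.

    weights: one flat list of kernel values per output channel (hypothetical layout)
    biases, gamma, beta, mean, var: one float per output channel
    """
    folded_w, folded_b = [], []
    for w, b, g, bt, m, v in zip(weights, biases, gamma, beta, mean, var):
        scale = g / math.sqrt(v + eps)             # per-channel BatchNorm scale
        folded_w.append([wi * scale for wi in w])  # w' = w * scale
        folded_b.append((b - m) * scale + bt)      # b' = (b - mean) * scale + beta
    return folded_w, folded_b

def conv_output(w, b, patch):
    """Output of one channel at one position: dot product plus bias."""
    return sum(wi * xi for wi, xi in zip(w, patch)) + b

def batchnorm(y, g, bt, m, v, eps=1e-5):
    """Inference-mode BatchNorm applied to a single activation."""
    return g * (y - m) / math.sqrt(v + eps) + bt

# Made-up numbers for a single output channel and a single input patch.
weights, biases = [[0.5, -1.0, 2.0]], [0.1]
gamma, beta, mean, var = [1.5], [-0.2], [0.3], [4.0]
patch = [1.0, 2.0, 3.0]

# Convolution followed by BatchNorm ...
reference = batchnorm(conv_output(weights[0], biases[0], patch),
                      gamma[0], beta[0], mean[0], var[0])
# ... should match the folded convolution on its own.
fw, fb = fold_batchnorm(weights, biases, gamma, beta, mean, var)
folded = conv_output(fw[0], fb[0], patch)
assert math.isclose(reference, folded, rel_tol=1e-9)
```

Because the folded layer reproduces the output of convolution followed by BatchNorm in inference mode, the separate BatchNorm layers are no longer needed at test time.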