- Gradient modifier layer added
- Structured_data_constant_reader added
- Error function layers accept an optional 3rd input layer, a mask
- ADAM training algorithm implemented: use "--momentum_type adam"; the learning rate should generally be much smaller than for other methods
- Changed default value for cuda_fixed_working_buffers_ratio to 0.4
I get a very nice 5.4 TFLOPS on the whole model when training VGG-A with cuDNN v4 RC.
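
As a rough illustration of the ADAM note above, here is a minimal sketch of one ADAM update in plain Python. It is not the library's implementation. The function name `adam_step` and the default hyperparameters (taken from the original ADAM paper) are assumptions made for this example. It shows one likely reason the rate should be smaller: on the first step the update magnitude is roughly equal to the rate itself, whatever the size of the gradient.

```python
import math


def adam_step(params, grads, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one ADAM update to flat lists of floats.

    Hypothetical helper for illustration only; defaults follow the ADAM paper,
    not any particular library. `t` is the 1-based step counter.
    Returns the new (params, m, v) lists.
    """
    new_params, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        # Exponential moving averages of the gradient and the squared gradient.
        m_i = beta1 * m_i + (1.0 - beta1) * g
        v_i = beta2 * v_i + (1.0 - beta2) * g * g
        # Bias correction compensates for the zero initialisation of m and v.
        m_hat = m_i / (1.0 - beta1 ** t)
        v_hat = v_i / (1.0 - beta2 ** t)
        # The step is normalised by the gradient's running magnitude.
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v


# First step from zero state: gradients of 2 and 200 move the weights by about lr each.
weights, m, v = adam_step([1.0, 1.0], [2.0, 200.0], [0.0, 0.0], [0.0, 0.0], t=1)
```

Because the step is normalised this way, a rate tuned for plain SGD with momentum can be far too large when reused unchanged with ADAM.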