- Convolutional layer:
  - Strides added
  - Option to run without bias added
- check_gradient command added
- Imagenet: reproduced ResNet50 result (7.5% Top5 single crop)
- Average subsampling layer allows specifying output size instead of subsampling window sizes
- Added profiling to CUDA backend
- Max subsampling layer:
  - round_up mode added
  - Strides added
- Step learning rate decay policy added
- Added update_bn_weights action (but calculating mean and invsigma during training works well)
- Spatial Transformer:
  - affine_grid_generator_layer added
  - linear_sampler layer added
- Utilizing cudnnFindConvolution*AlgorithmEx functions to get maximum perf (cuDNN v5 is required for that)
- Added strides to sparse convolution layer
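
As a rough illustration of what strides and a round-up mode mean for the output size of a subsampling or convolutional layer, here is a minimal sketch. It assumes no padding, and it assumes that round_up keeps a final window that only partially fits the input; the function name `subsampled_size` and its parameters are hypothetical and are not nnForge's API.

```python
def subsampled_size(input_size: int, window: int, stride: int,
                    round_up: bool = False) -> int:
    """Number of output positions along one dimension, without padding.

    Hypothetical helper: round_up=True is assumed to keep one extra,
    partially covered window at the end of the input.
    """
    if stride < 1 or window < 1:
        raise ValueError("window and stride must be positive")
    if input_size < window:
        raise ValueError("window is larger than the input")
    span = input_size - window
    full_steps, remainder = divmod(span, stride)
    count = full_steps + 1  # windows that fit entirely inside the input
    # The extra window is added only if it still starts inside the input.
    if round_up and remainder and count * stride < input_size:
        count += 1
    return count
```

For an input of 8 elements, a window of 3 and a stride of 2, this gives 3 positions by default and 4 with round_up enabled.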
Jul 5, 2016
nnForge v2.2.0
Hi, nnForge v2.2.0 is published!
Feb 21, 2016
nnForge v2.1.0
Two months have passed since the last release, and this one is pretty big: a number of layers were added and the functionality of existing layers was extended. Here is the full list of changes in nnForge v2.1.0:
- New layers added: Concat, Reshape, CDFMax, PrefixSum, Upsampling, Add (element-wise), CDF2PDF, EntryConvolution
- Average and Max subsampling layers are now capable of subsampling in feature map and entry directions
- MSE Layer reworked into generic LError layer (L2 by default)
- Max subsampling can do MIN as well
- Optional scale parameter for AverageSubsampling layer added
- Detailed info on layers in the schema dumped
- Dumping graph with layer configs in debug mode
- Added dumping data in CSV format
- Runtime layer replacement with data layers
- Bug fixes
Dec 20, 2015
nnForge v2.0.2
Here is a small release, nnForge v2.0.2:
- Gradient modifier layer added
- Structured_data_constant_reader added
- Error function layers accept the 3rd optional input layer - mask
- ADAM training algorithm implemented; use "--momentum_type adam". The learning rate should generally be much smaller than for other methods
- Changed default value for cuda_fixed_working_buffers_ratio to 0.4
I get a very nice 5.4 TFLOPS on the whole model when training VGG-A with cuDNN v4 RC.
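
To see why the learning rate for ADAM usually needs to be smaller, here is a minimal sketch of the standard ADAM update for a single scalar weight. The hyperparameter defaults are the commonly published ones and are an assumption here; nnForge's own defaults and implementation are not shown in this post.

```python
import math

def adam_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One ADAM update for a scalar weight; t is the 1-based step number."""
    m = beta1 * m + (1.0 - beta1) * grad          # running mean of gradients
    v = beta2 * v + (1.0 - beta2) * grad * grad   # running mean of squared gradients
    m_hat = m / (1.0 - beta1 ** t)                # bias correction
    v_hat = v / (1.0 - beta2 ** t)
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v

# The first step has a magnitude close to lr whether the gradient is tiny or huge.
w_small, _, _ = adam_step(0.0, 0.001, 0.0, 0.0, t=1)
w_large, _, _ = adam_step(0.0, 100.0, 0.0, 0.0, t=1)
```

Because the update is normalized by the gradient magnitude, the rate directly sets the step size, which is why a value that works for plain SGD with momentum tends to be too large here.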
Nov 24, 2015
nnForge v2.0.1

I recently improved the performance of the CUDA backend significantly in nnForge v2.0.1:
- Multiple improvements to reduce total buffer sizes, which allows running larger chunks (3x for ImageNet):
- Taking buffer sizes into account when coloring graph
- Maxout, ReLU, and MaxSubsampling layers consume much less memory in CUDA backend
- Action graph is optimized to exclude unnecessary concurrency - taking into account device width here
- Migrated to cuDNN v3
- Reusing CUDA streams
- Allocating chunk of mem for fixed working buffers - improves perf
- Few bug-fixes
See the buffer graph coloring for the optimized action graph of a VGG-A-like schema to the right. You can get this and other interesting graphs by specifying the "--debug_mode 1" option.
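
The idea of taking buffer sizes into account when coloring the graph can be sketched as a greedy assignment: buffers that are never alive at the same time may share one memory slot, and visiting the largest buffers first keeps each slot as small as possible. This is only an illustration under that assumption; the function `assign_buffers`, its inputs, and the example sizes are hypothetical and do not describe nnForge's actual algorithm.

```python
def assign_buffers(sizes, conflicts):
    """Greedy size-aware coloring.

    sizes: dict mapping buffer name to size in bytes.
    conflicts: set of frozenset({a, b}) pairs that must not share memory.
    Returns (assignment of name -> slot index, list of slot sizes).
    """
    assignment = {}
    slot_sizes = []
    slot_members = []
    # Largest first, so the first buffer placed in a slot defines its size.
    for name in sorted(sizes, key=lambda n: (-sizes[n], n)):
        for slot, members in enumerate(slot_members):
            if all(frozenset((name, other)) not in conflicts for other in members):
                members.append(name)
                assignment[name] = slot
                break
        else:
            slot_members.append([name])
            slot_sizes.append(sizes[name])
            assignment[name] = len(slot_sizes) - 1
    return assignment, slot_sizes

# Hypothetical chain of four buffers where only neighbours overlap in time.
example_sizes = {"a": 100, "b": 80, "c": 60, "d": 10}
example_conflicts = {frozenset(p) for p in [("a", "b"), ("b", "c"), ("c", "d")]}
example_assignment, example_slots = assign_buffers(example_sizes, example_conflicts)
```

In this example the four buffers need 250 bytes if allocated separately, while the two shared slots need 180 bytes.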
Nov 7, 2015
nnForge v2.0.0
Hi all,
Six months have passed since the last nnForge release, and there is a good reason for it: I have been working on a major framework redesign, and now it is out! See nnForge v2.0.0:
- The model is now an arbitrary DAG (directed acyclic graph)
- Running independent actions in multiple streams in the CUDA backend
- Memory buffers are heavily reused
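
To illustrate how a DAG exposes independent actions that could run concurrently, here is a minimal sketch that groups nodes into levels with Kahn's topological sort: all nodes in one level depend only on earlier levels, so they are candidates for separate streams. The function `concurrency_levels` and the layer names are hypothetical, and the real scheduler in nnForge is not described in this post.

```python
from collections import defaultdict

def concurrency_levels(nodes, edges):
    """Group DAG nodes into levels; nodes within a level are independent."""
    successors = defaultdict(list)
    indegree = {n: 0 for n in nodes}
    for src, dst in edges:
        successors[src].append(dst)
        indegree[dst] += 1
    levels = []
    done = 0
    ready = sorted(n for n in nodes if indegree[n] == 0)
    while ready:
        levels.append(ready)
        done += len(ready)
        next_ready = []
        for n in ready:
            for m in successors[n]:
                indegree[m] -= 1
                if indegree[m] == 0:
                    next_ready.append(m)
        ready = sorted(next_ready)
    if done != len(nodes):
        raise ValueError("graph has a cycle, so it is not a DAG")
    return levels

# Hypothetical schema with two parallel branches joined by a concat layer.
example_nodes = ["input", "conv_a", "conv_b", "concat", "loss"]
example_edges = [("input", "conv_a"), ("input", "conv_b"),
                 ("conv_a", "concat"), ("conv_b", "concat"), ("concat", "loss")]
example_levels = concurrency_levels(example_nodes, example_edges)
```

Here `conv_a` and `conv_b` end up in the same level, which is the kind of independence that multiple streams can exploit.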
The changes are so radical, I had to drop support for the old trained data storage format. Unfortunately this means you will have to re-train your models from scratch.
Expect more goodies in the near future!
Apr 30, 2015
nnForge v1.2.0
Hi, this is a pretty big release of nnForge. The most important improvement is that model schemas are now stored in Protobuf format. You now define the schema via a plain text file. Use the convert_schema action to convert from the old binary format to the new one. I also implemented Overfeat functionality, which allows running inference efficiently on large input data with fine-grained results.
All the changes are:
- Schema:
  - Model schema is now stored in Protobuf format. Use convert_schema to convert schemas in the old binary format to the new one
  - Input and output data normalizers are now stored in Protobuf format. Use convert_input_normalizer and convert_output_normalizer to convert existing binary normalizers to the new format
  - Schema and data are now compatible if non-empty layers match; empty-data layers no longer matter
- Training data:
  - Improvements in supervised_image_stream_reader
  - embed_data_transformer added
- Training:
  - Nesterov momentum added (see --momentum_type option)
  - uniform_intensity_data_transformer added
  - Momentum data is kept between epochs (it is saved and restored as well)
  - ROC result now outputs accuracy, precision, recall, and F-score (in addition to AUC)
- Visualization:
  - snapshot_invalid now saves images, including binary classifier case
- Inference:
  - Overfeat functionality added (see tiling option of max subsampling layer, and untile layer)
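
For reference, the additional ROC outputs listed above can be computed from the confusion counts at a single decision threshold. This is a minimal sketch that assumes the F-score is the balanced F1 variant; the function `binary_metrics` is hypothetical, and how nnForge picks the threshold is not stated here.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Accuracy, precision, recall and F1 from binary confusion counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f_score = 2.0 * precision * recall / (precision + recall)
    else:
        f_score = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_score": f_score}
```

Unlike AUC, these four values depend on the chosen threshold, so they describe one operating point of the classifier rather than the whole curve.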
Mar 26, 2015
nnForge v1.1.13
nnForge v1.1.13 is published with a number of improvements:
- Data transformers:
  - Stretch added to distort sampler transformer
  - Perspective distortions added to distort_2d transformer
  - reshape_data_transformer added
  - elastic_deformation_2d_data_transformer added
- Mixture of models:
  - Added --test_validate_save_output and --test_validate_load_output options
  - Running testing and validation from a mixture of output_values
- Readers:
  - supervised_shuffle_entries_data_reader is made deterministic
  - Deterministic image data reader is extended to sampler
- Layers:
  - Parametric ReLU added (with CPU and GPU backends)
  - Average subsampling is reverted to native implementation (3D and 4D support)
- Others:
  - Taking ReLUs into account when initializing weights
  - validate_progress_network_data_pusher is extended with frequency parameter
  - Quasi-random training data randomization is dropped
  - Memory consumption reduced during testing
  - Resume training (-R) can now be applied when training multiple ANNs (-N)
  - VS2013 projects and solution added (using CUDA 7.0)
  - Fixed fancy backprop for analyzer
  - Bug-fixes
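
As a reminder of what a Parametric ReLU layer computes, here is a minimal scalar sketch of its forward and backward passes, where the negative slope `a` is a learned parameter. The function names are hypothetical, and details such as sharing one slope per feature map are left out; this is not nnForge's implementation.

```python
def prelu_forward(x: float, a: float) -> float:
    """PReLU: identity for positive inputs, slope a for the rest."""
    return x if x > 0 else a * x

def prelu_backward(x: float, a: float, grad_out: float):
    """Return (gradient w.r.t. x, gradient w.r.t. the slope a)."""
    if x > 0:
        return grad_out, 0.0
    return a * grad_out, x * grad_out
```

With `a` fixed at 0 this reduces to the ordinary ReLU, so the extra cost is one learned slope and its gradient.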