Apr 19, 2014

nnForge v1.1.4

Here is the new v1.1.4 release of nnForge. It contains mostly performance improvements, but they are rather significant: the performance of the convolutional layer in the GPU (CUDA) backend improved from 1.15 TFLOPs to 2 TFLOPs for the Galaxy Zoo example when running on an NVIDIA GeForce GTX Titan, and whole-network performance improved from 1 TFLOPs to 1.55 TFLOPs. See the NSight VSE profiling screenshot:


2 TFLOPs on a Titan running at 784 MHz... how efficient is that? Let's see: 2 TFLOPs / (784 MHz * 14 SMs * 192 FMA/SM * 2 ops/FMA) = 47% of the theoretical peak, which I consider a pretty good number. And there is certainly room for improvement here. Training could also benefit from these improvements; I plan to port these changes to the hessian calculators and updaters soon.
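The arithmetic above can be checked in a few lines. This is just a back-of-the-envelope script reproducing the peak-throughput calculation from the post; the hardware numbers (clock, SM count, FMA units per SM) are the ones quoted above.

```python
# Back-of-the-envelope efficiency check for the numbers quoted in the post.
clock_hz = 784e6        # GTX Titan core clock, 784 MHz
sm_count = 14           # streaming multiprocessors on the Titan
fma_per_sm = 192        # single-precision FMA units per Kepler SM
ops_per_fma = 2         # one FMA counts as a multiply plus an add

peak_flops = clock_hz * sm_count * fma_per_sm * ops_per_fma
achieved = 2.0e12       # the measured ~2 TFLOPs

print(f"theoretical peak: {peak_flops / 1e12:.2f} TFLOPs")  # ~4.21 TFLOPs
print(f"efficiency: {achieved / peak_flops:.0%}")           # ~47%
```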

Here are all the changes in this release:
  • Limited C++11 support added: you can build everything except the CUDA backend - this is due to NVCC not yet supporting C++11
  • Improved testing and validating (feed-forward) performance of convolutional layers in the CUDA backend for Kepler, while greatly simplifying the code
  • Improved performance of the max subsampling 2D tester in the CUDA backend. The implementation is still far from optimal

Apr 12, 2014

Galaxy Zoo

I took second place in the Galaxy Zoo competition. The organizers requested a report from all prize winners; here is mine. Sander Dieleman won the challenge by a large margin. He used convolutional neural networks too, although his approach was more sophisticated. Team 6789, which took third place, used convnets as well!

Apr 5, 2014

nnForge v1.1.3

I labelled the latest changes in nnForge with v1.1.3:
  • Snapshot functionality is fully redesigned - it now does backpropagation; the feature is still in beta
  • Ability to define custom error functions is added
    • Cross-entropy error function is added; use with care - it is not tested yet
  • Galaxy Zoo example added - see the Galaxy Zoo challenge
  • cuda_max_global_memory_usage_ratio is set to 0.8 by default - this should help those running the code on a primary videocard
  • per_layer_mu mode is added - more robust training in some cases
  • Fixes:
    • Fixed crash when using output transformer
    • Fixed backprop for local_contrast_subtractive_2d_updater in CUDA backend
    • Fixed build with Boost 1.55

Feb 7, 2014

nnForge v1.1.2

I brushed up the parameters of the nnForge toolset. I also changed the default values for some of them; if you run GTSRB, you will probably need to update your config file. Here is the full change list:
  • Deterministic transformer added for testing and validating
  • Snapshots are made on ANNs from the batch directory
  • Toolset parameters changed:
    • learning_rate_decay_rate is exposed as a command line parameter
    • training_speed parameter renamed to learning_rate, training_speed_degradation is dropped
    • training_iteration_count renamed to training_epoch_count
    • The train command now does batch training; the batch_train command is removed
    • validate and test now work in batch mode; validate_batch and test_batch are removed
    • mu_increase_factor is set to 1.0 by default
    • max_mu set to 1.0 by default
  • Bug-fixes

Jan 11, 2014

nnForge v1.1.1

I've just published new nnForge release v1.1.1:
  • A space-filling curve is now used for all convolutional updaters, testers and hessians in the CUDA backend; performance of training large networks improved
  • Training and loading/processing of input data now run concurrently at all stages, with data loaded in a separate host thread, CUDA backend only
  • In-memory supervised data reader added
  • Added NVTX profiling for reading input data, CUDA backend only
  • Fixed:
    • Binding a texture to a too-large linear buffer
    • Incorrect average subsampling backprop in the CUDA backend for non-even configs
    • Poor performance on Windows with the WDDM driver
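The v1.1.1 notes above mention using a space-filling curve without naming which one. As an illustration of the general idea only (not of nnForge's actual implementation), here is a minimal sketch of one common such curve, the Z-order (Morton) curve, which maps 2D coordinates to a 1D index so that nearby points tend to stay close together; the helper `morton_2d` is hypothetical.

```python
# Illustrative sketch only: the release notes do not say which space-filling
# curve nnForge uses; Z-order (Morton) interleaving is shown here as a common
# choice for improving 2D memory locality.
def morton_2d(x: int, y: int, bits: int = 16) -> int:
    """Interleave the bits of x and y into a single Z-order index."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)       # x occupies even bit positions
        z |= ((y >> i) & 1) << (2 * i + 1)   # y occupies odd bit positions
    return z

# Traversing elements in Z-order keeps 2D neighbours close in the 1D order,
# which improves cache behaviour for blocked 2D access patterns.
print([morton_2d(x, y) for y in range(2) for x in range(2)])  # → [0, 1, 2, 3]
```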

Dec 27, 2013

Moved to blogger

I moved the blog from Zoho to the Blogger platform for a number of reasons, including better uptime, design, and the ability to edit posts in place. All posts have been copied to the new platform.

Nov 23, 2013

nnForge v1.1.0

I've just published new nnForge release v1.1.0, which has a lot of new functionality and fixes implemented:
  • Squared Hinge Loss error function added
  • Local contrast subtractive layer hessian and updater implementations added to both the CPU and GPU backends
  • Maxout layer added, with CPU and GPU backends implemented
  • Added tester functionality for the rgb_to_yuv_convert layer in the CUDA backend
  • Learning rate decay functionality for tail iterations is added
  • Fixed:
    • Functionality bug in L2 incoming weights regularizer
    • Functionality bug for rectangular local contrast subtractive
    • Recovered snapshot_invalid functionality
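The v1.1.0 notes mention learning rate decay for tail iterations, and the v1.1.2 notes expose learning_rate_decay_rate as a parameter, but the exact formula is not spelled out in these posts. The sketch below assumes one simple interpretation: the learning rate is left unchanged for most epochs and multiplied by a constant decay factor on each of the final "tail" epochs. Both the function name and the formula are assumptions for illustration, not nnForge's documented behaviour.

```python
# Hypothetical sketch of tail-epoch learning rate decay: the rate stays at
# base_rate until the last tail_epochs epochs, then shrinks geometrically.
def learning_rate_for_epoch(base_rate: float, epoch: int,
                            total_epochs: int, tail_epochs: int,
                            decay_rate: float) -> float:
    epochs_into_tail = epoch - (total_epochs - tail_epochs)
    if epochs_into_tail <= 0:
        return base_rate                      # before the tail: no decay
    return base_rate * decay_rate ** epochs_into_tail

# With 10 epochs total, a 3-epoch tail and decay 0.5, epochs 8, 9, 10
# get rates 0.05, 0.025 and 0.0125 from a base rate of 0.1.
print([learning_rate_for_epoch(0.1, e, 10, 3, 0.5) for e in (7, 8, 9, 10)])
```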