This release contains a single major feature: performance tuning for Kepler GK110 (GeForce Titan, Tesla K20). I have also improved performance on Fermi cards.
What about Kepler GK104 (Tesla K10, GeForce 680, 670, etc.)? Almost all the optimizations I applied for GK110 should also apply to GK104, but I don't have a GK104 card, so I have not tested them there or even run the code on one.
Initially I planned to add full support for 1D convolutional layers, but ended up adding it for testers and Hessian calculators only. The reason is simple: it is better to have an example on which I can test new functionality. Otherwise I might just add a lot of code that doesn't work.