NervanaSystems Neon Versions

Intel® Nervana™ reference deep learning framework committed to best performance on all hardware

v1.8.0

7 years ago

Skip Thought Vectors (http://arxiv.org/abs/1506.06726) example
Dilated convolution support
Nesterov Accelerated Gradient option to SGD optimizer
MultiMetric class to allow wrapping Metric classes
Support for serializing and deserializing encoder-decoder models
Allow specifying the number of time steps to evaluate during beam search
A new community-contributed Docker image
Improved error messages when a tensor is created with an invalid shape or reshaped to an incompatible size
Fix bugs in MultiCost support
Documentation fixes [#331]
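
The Nesterov Accelerated Gradient option listed above changes how the momentum term enters the weight update. The following is a minimal sketch of one common formulation, written in plain Python on lists of floats. The function name `sgd_momentum_step` and its parameters are hypothetical and are not taken from neon's optimizer API; neon's own implementation may differ in detail.

```python
def sgd_momentum_step(weights, grads, velocity, lr, momentum, nesterov=False):
    """Return (new_weights, new_velocity) for one SGD step on lists of floats.

    Hypothetical helper for illustration; not neon's API.
    """
    # The velocity update is shared by the classical and Nesterov variants.
    new_velocity = [momentum * v - lr * g for v, g in zip(velocity, grads)]
    if nesterov:
        # Look-ahead form: apply the momentum term once more on top of
        # the fresh gradient step.
        new_weights = [w + momentum * nv - lr * g
                       for w, nv, g in zip(weights, new_velocity, grads)]
    else:
        # Classical momentum: move by the new velocity only.
        new_weights = [w + nv for w, nv in zip(weights, new_velocity)]
    return new_weights, new_velocity
```

With `momentum=0.0` both variants reduce to plain gradient descent, which is a quick sanity check for either branch.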

v1.7.0

7 years ago

Update Data Loader to aeon (https://github.com/NervanaSystems/aeon) for flexible, multi-threaded data loading and transformations
Add Neural Machine Translation model
Remove Fast RCNN model (use Faster RCNN model instead)
Remove music_genres example
Fix super blocking for small N with 1D conv
Fix update-direct conv kernel for small N
Add gradient clipping to Adam optimizer
Documentation updates and bug fixes
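
The notes above do not say which form of gradient clipping was added to the Adam optimizer. As a rough illustration, the two forms most often meant by the term are sketched below in plain Python. The helper names `clip_by_value` and `clip_by_norm` are hypothetical and are not neon functions.

```python
import math


def clip_by_value(grads, limit):
    """Clamp every gradient element to the range [-limit, limit]."""
    return [max(-limit, min(limit, g)) for g in grads]


def clip_by_norm(grads, max_norm):
    """Rescale the gradient so its L2 norm does not exceed max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm or norm == 0.0:
        # Already within the limit: return an unchanged copy.
        return list(grads)
    scale = max_norm / norm
    return [g * scale for g in grads]
```

Clipping by value changes the gradient direction when only some elements are clamped, while clipping by norm preserves the direction and shrinks only the magnitude.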

v1.6.0

7 years ago

Faster RCNN model
Sequence to Sequence container and char_rae recurrent autoencoder model
Reshape Layer that reshapes the input [#221]
Pip requirements in requirements.txt updated to latest versions [#289]
Remove deprecated data loaders and update docs
Use the NEON_DATA_CACHE_DIR environment variable as the archive directory for data ingested by DataLoader
Eliminate type conversion for FP16 for CUDA compute capability >= 5.2
Use GEMV kernels for batch size 1
Alter delta buffers for nesting of merge-broadcast layers
Support for ncloud real-time logging
Add fast_style Makefile target
Fix Python 3 builds on Ubuntu 16.04
Run setup.py for sysinstall to generate version.py [#282]
Fix broken link in mnist docs
Fix conv/deconv tests for CPU execution and fix i32 data type
Fix for average pooling with batch size 1
Change default scale_min to allow random cropping if omitted
Fix yaml loading
Fix bug with image resize during ingest
Update references to the ModelZoo and neon examples to their new locations
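
The NEON_DATA_CACHE_DIR item above describes a directory chosen through an environment variable. A minimal sketch of that pattern follows, assuming a caller-supplied fallback. The function `resolve_cache_dir` and the fallback argument are hypothetical; how neon itself resolves the directory when the variable is unset is not stated in these notes.

```python
import os


def resolve_cache_dir(default_dir):
    """Return NEON_DATA_CACHE_DIR if it is set and non-empty, else default_dir.

    Hypothetical helper for illustration; not part of neon.
    """
    return os.environ.get("NEON_DATA_CACHE_DIR") or default_dir
```

Exporting the variable in the shell before launching a training script would then redirect the cache without any code change.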

v1.5.4

7 years ago

Python2/Python3 compatibility [#191]
Support for Pascal GPUs
Persistent RNN kernels [#262]
Implement Binarized Neural Networks from http://arxiv.org/pdf/1602.02830v3.pdf
Dataloader enhancements (audio loader with examples)
HDF5 file data iterator
Convolution kernel improvements
API documentation improvements [#234, #244, #263]
Cache directory cleanup
Reorganization of all unit tests
Bug fixes [#182, #183, #231, #241, #252, #253, #257, #259, #267, #268]
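
For readers unfamiliar with the Binarized Neural Networks item in this release, the core idea in the cited paper is deterministic sign binarization in the forward pass and a straight-through estimator in the backward pass. The sketch below shows both on plain Python lists. The function names are hypothetical and this is not neon's implementation.

```python
def binarize(values):
    """Deterministic binarization: +1 for values >= 0, otherwise -1."""
    return [1.0 if v >= 0 else -1.0 for v in values]


def straight_through_grad(values, grad_out, clip=1.0):
    """Straight-through estimator for the sign function.

    The incoming gradient passes through unchanged where |value| <= clip
    and is cancelled elsewhere.
    """
    return [g if abs(v) <= clip else 0.0 for v, g in zip(values, grad_out)]
```

The sign function has zero gradient almost everywhere, so the estimator substitutes a usable gradient for training while the forward pass stays binary.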

v1.5.3

7 years ago

Python2/Python3 compatibility [#191]
Support for Pascal GPUs
Persistent RNN kernels [#262]
Dataloader enhancements (audio loader with examples)
HDF5 file data iterator
Convolution kernel improvements
API documentation improvements [#234, #244, #263]
Cache directory cleanup
Reorganization of all unit tests
Bug fixes [#182, #183, #231, #241, #252, #253, #257, #259, #267]

v1.5.2

7 years ago

Python2/Python3 compatibility [#191]
Support for Pascal GPUs
Persistent RNN kernels [#262]
Dataloader enhancements (audio loader with examples)
HDF5 file data iterator
Convolution kernel improvements
API documentation improvements [#234, #244, #263]
Cache directory cleanup
Reorganization of all unit tests
Bug fixes [#182, #183, #231, #241, #252, #253, #257, #259]

v1.5.1

7 years ago

Python2/Python3 compatibility [#191]
Support for Pascal GPUs
Persistent RNN kernels [#262]
Dataloader enhancements (audio loader with examples)
HDF5 file data iterator
Convolution kernel improvements
API documentation improvements [#234, #244, #263]
Cache directory cleanup
Reorganization of all unit tests
Bug fixes [#182, #183, #231, #241, #252, #253, #257, #259]

v1.4.0

8 years ago

VGG16-based Fast R-CNN model using Winograd kernels
new, backward-compatible, generic data loader
C3D video loader model trained on UCF101 dataset
Deep Dream example
make conv layer printout more informative [#222]
fix some examples to use new arg override capability
improve performance for relu for small N
better support for arbitrary batch norm layer placement
documentation updates [#210, #213, #236]

v1.3.0

8 years ago

Winograd kernels and associated autotuning routines
benchmarking scripts
deprecation of deterministic argument for backend constructor
improve batch norm stability with fp16 backend
allow strided support for dimshuffle kernel
speed up zero momentum gradient descent
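
The Winograd kernels in this release rely on minimal filtering algorithms that trade multiplications for additions. The smallest textbook case, F(2,3), produces two outputs of a 3-tap 1D filter with four multiplications instead of six. The sketch below is only an illustration of that arithmetic in plain Python; it says nothing about how neon's GPU kernels are organized, and the function names are hypothetical.

```python
def direct_f23(d, g):
    """Two outputs of a 3-tap filter g over 4 inputs d: 6 multiplications."""
    return [d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
            d[1] * g[0] + d[2] * g[1] + d[3] * g[2]]


def winograd_f23(d, g):
    """Same two outputs via Winograd F(2,3): 4 multiplications."""
    # Filter-side terms; in practice these are precomputed once per filter.
    g_sum = (g[0] + g[1] + g[2]) / 2.0
    g_alt = (g[0] - g[1] + g[2]) / 2.0
    # The four element-wise products.
    m1 = (d[0] - d[2]) * g[0]
    m2 = (d[1] + d[2]) * g_sum
    m3 = (d[2] - d[1]) * g_alt
    m4 = (d[1] - d[3]) * g[2]
    return [m1 + m2 + m3, m2 - m3 - m4]
```

Comparing `winograd_f23(d, g)` against `direct_f23(d, g)` on a few inputs is an easy way to confirm that the two agree up to floating-point rounding.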

v1.2.2

8 years ago

benchmarking enhancements
fast dimshuffle, transpose, other kernel speedups and refactoring
batch norm states fix, deterministic updates
example fixes for fast rcnn and conv_autoencoder
image decoding rescaling method fix
deserialization fixes for RNNs, refactoring
Caffe compatibility fixes
documentation updates