NNI Versions

An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.

v2.3

2 years ago

Major Updates

Neural Architecture Search

Retiarii Framework (NNI NAS 2.0) Beta Release with new features:
- Support new high-level APIs: Repeat and Cell (#3481)
- Support pure-python execution engine (#3605)
- Support policy-based RL strategy (#3650)
- Support nested ModuleList (#3652)
- Improve documentation (#3785)
Note: more Retiarii features are planned for future releases; please refer to the Retiarii Roadmap for more information.
Add new NAS algorithm: Blockwise DNAS FBNet (#3532, thanks to external contributor @alibaba-yiwuyao)

Model Compression

Support Auto Compression Framework (#3631)
Support slim pruner in Tensorflow (#3614)
Support LSQ quantizer (#3503, thanks to external contributor @chenbohua3)
Improve APIs for iterative pruners (#3507 #3688)

Training Service & REST

Support 3rd-party training service (#3662 #3726)
Support setting prefix URL (#3625 #3674 #3672 #3643)
Improve NNI manager logging (#3624)
Remove outdated TensorBoard code on nnictl (#3613)

Hyper-Parameter Optimization

Add new tuner: DNGO (#3479 #3707)
Add benchmark for tuners (#3644 #3720 #3689)

WebUI

Improve search parameters on trial detail page (#3651 #3723 #3715)
Make selected trials consistent after auto-refresh in detail table (#3597)
Add trial stdout button on local mode (#3653 #3690)

Examples & Documentation

Convert all trial examples from config v1 to config v2 (#3721 #3733 #3711 #3600)
Add new jupyter notebook examples (#3599 #3700)

Dev Excellence

Upgrade dependencies in Dockerfile (#3713 #3722)
Replace ruamel.yaml with PyYAML (#3702)
Add pipelines for AML and hybrid training service and experiment config V2 (#3477 #3648)
Add pipeline badge in README (#3589)
Update issue bug report template (#3501)

Bug Fixes & Minor Updates

Fix syntax error on Windows (#3634)
Fix a logging related bug (#3705)
Fix a bug in GPU indices (#3721)
Fix a bug in FrameworkController (#3730)
Fix a bug in export_data_url format (#3665)
Report version check failure as a warning (#3654)
Fix bugs and lints in nnictl (#3712)
Fix bug of optimize_mode on WebUI (#3731)
Fix bug of useActiveGpu in AML v2 config (#3655)
Fix bug of experiment_working_directory in Retiarii config (#3607)
Fix a bug in mask conflict (#3629, thanks to external contributor @Davidxswang)
Fix a bug in model speedup shape inference (#3588, thanks to external contributor @Davidxswang)
Fix a bug in multithread on Windows (#3604, thanks to external contributor @Ivanfangsc)
Delete redundant code in training service (#3526, thanks to external contributor @maxsuren)
Fix typo in DoReFa compression doc (#3693, thanks to external contributor @Erfandarzi)
Update docstring in model compression (#3647, thanks to external contributor @ichejun)
Fix a bug when using Kubernetes container (#3719, thanks to external contributor @rmfan)

v2.2

3 years ago

Major updates

Neural Architecture Search

Improve NAS 2.0 (Retiarii) Framework (Alpha Release)
- Support local debug mode (#3476)
- Support nesting ValueChoice in LayerChoice (#3508)
- Support dict/list type in ValueChoice (#3508)
- Improve the format of export architectures (#3464)
- Refactor of NAS examples (#3513)
- Refer to https://github.com/microsoft/nni/issues/3301 for the Retiarii Roadmap

Model Compression

Support speedup for mixed precision quantization model (Experimental) (#3488 #3512)
Support model export for quantization algorithm (#3458 #3473)
Support model export in model compression for TensorFlow (#3487)
Improve documentation (#3482)

nnictl & nni.experiment

Add native support for experiment config V2 (#3466 #3540 #3552)
Add resume and view mode in Python API nni.experiment (#3490 #3524 #3545)

Training Service

Support umount for shared storage in remote training service (#3456)
Support Windows as the remote training service in reuse mode (#3500)
Remove duplicated env folder in remote training service (#3472)
Add log information for GPU metric collector (#3506)
Enable optional Pod Spec for FrameworkController platform (#3379, thanks to external contributor @mbu93)

WebUI

Support launching TensorBoard on WebUI (#3454 #3361 #3531)
Upgrade echarts-for-react to v5 (#3457)
Add wrap for dispatcher/nnimanager log monaco editor (#3461)

Bug Fixes

Fix bug of FLOPs counter (#3497)
Fix conflict between the hyper-parameter Add/Remove axes button and the table Add/Remove columns button (#3491)
Fix bug that monaco editor search text is not displayed completely (#3492)
Fix bug of Cream NAS (#3498, thanks to external contributor @AliCloud-PAI)
Fix typos in docs (#3448, thanks to external contributor @OliverShang)
Fix typo in NAS 1.0 (#3538, thanks to external contributor @ankitaggarwal23)

v2.1

3 years ago

Major updates

Neural architecture search

Improve NAS 2.0 (Retiarii) Framework (Improved Experimental)
- Improve the robustness of graph generation and code generation for PyTorch models (#3365)
- Support the inline mutation API ValueChoice (#3349 #3382)
- Improve the design and implementation of Model Evaluator (#3359 #3404)
- Support Random/Grid/Evolution exploration strategies (i.e., search algorithms) (#3377)
- Refer to the Retiarii Roadmap for more information

Training service

Support shared storage for reuse mode (#3354)
Support Windows as the local training service in hybrid mode (#3353)
Remove PAIYarn training service (#3327)
Add "recently-idle" scheduling algorithm (#3375)
Deprecate preCommand and enable pythonPath for remote training service (#3284 #3410)
Refactor reuse mode temp folder (#3374)

nnictl & nni.experiment

Migrate nnicli to new Python API nni.experiment (#3334)
Refactor the way of specifying tuner in experiment Python API (nni.experiment), more aligned with nnictl (#3419)

WebUI

Support showing the assigned training service of each trial in hybrid mode on WebUI (#3261 #3391)
Support multiple selection for filter status in experiments management page (#3351)
Improve overview page (#3316 #3317 #3352)
Support copy trial id in the table (#3378)

Documentation

Improve model compression examples and documentation (#3326 #3371)
Add Python API examples and documentation (#3396)
Add SECURITY doc (#3358)
Add 'What's NEW!' section in README (#3395)
Update English contributing doc (#3398, thanks external contributor @Yongxuanzhang)

Bug fixes

Fix AML outputs path and python process not killed (#3321)
Fix bug that an experiment launched from Python cannot be resumed by nnictl (#3309)
Fix import path of network morphism example (#3333)
Fix bug in the tuple unpack (#3340)
Fix security bug allowing arbitrary code execution (#3311, thanks external contributor @huntr-helper)
Fix NoneType error on jupyter notebook (#3337, thanks external contributor @tczhangzhi)
Fix bugs in Retiarii (#3339 #3341 #3357, thanks external contributor @tczhangzhi)
Fix bug in AdaptDL mode example (#3381, thanks external contributor @ZeyaWang)
Fix the spelling mistake of assessor (#3416, thanks external contributor @ByronCHAO)
Fix bug in ruamel import (#3430, thanks external contributor @rushtehrani)

v2.0

3 years ago

Major updates

Neural architecture search

Support an improved NAS framework: Retiarii (experimental)
Support a new NAS algorithm: Cream (#2705)
Add a new NAS benchmark for NLP model search (#3140)

Training service

Support hybrid training service (#3097 #3251 #3252)
Support AdlTrainingService, a new training service based on Kubernetes (#3022, thanks external contributors Petuum @pw2393)

Model compression

Support pruning schedule for fpgm pruning algorithm (#3110)
ModelSpeedup improvement: support torch v1.7 (updated graph_utils.py) (#3076)
Improve model compression utility: model flops counter (#3048 #3265)

WebUI & nnictl

Support experiments management on WebUI, add a web page for it (#3081 #3127)
Improve the layout of overview page (#3046 #3123)
Add navigation bar on the right for logs and configs; add expanded icons for table (#3069 #3103)

Others

Support launching an experiment from Python code (#3111 #3210 #3263)
Refactor builtin/customized tuner installation (#3134)
Support new experiment configuration V2 (#3138 #3248 #3251)
Reorganize source code directory hierarchy (#2962 #2987 #3037)
Change SIGKILL to SIGTERM in local mode when cancelling trial jobs (#3173)
Refactor Hyperband (#3040)
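
One practical consequence of receiving SIGTERM rather than SIGKILL is that a trial process can catch the signal and finish cleanly, which SIGKILL never allows. The following is a minimal sketch of that pattern for a Python trial on a POSIX system; the class `GracefulStop` and the function `train` are hypothetical names for illustration and are not part of NNI.

```python
import signal


class GracefulStop:
    """Record a termination request so a training loop can exit cleanly."""

    def __init__(self):
        self.requested = False

    def handler(self, signum, frame):
        # Only set a flag here; the loop decides when it is safe to stop.
        self.requested = True

    def install(self):
        # signal.signal must be called from the main thread.
        signal.signal(signal.SIGTERM, self.handler)


def train(stop, max_steps, on_step=None):
    """Run up to max_steps steps and return the number completed."""
    completed = 0
    for step in range(max_steps):
        if stop.requested:
            break  # a real trial would save a checkpoint here
        if on_step is not None:
            on_step(step)
        completed += 1
    return completed
```

A trial script would create one `GracefulStop`, call `install()` once at startup, and pass the object to its training loop.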

Documentation

Port markdown docs to reStructuredText docs and introduce githublink (#3107)
List related research and publications in doc (#3150)
Add tutorial of saving and loading quantized model (#3192)
Remove paiYarn doc and add description of reuse config in remote mode (#3253)
Update EfficientNet doc to clarify repo versions (#3158, thanks external contributor @ahundt)

Bug fixes

Fix exp-duration pause timing under NO_MORE_TRIAL status (#3043)
Fix bug in NAS SPOS trainer, apply_fixed_architecture (#3051, thanks external contributor @HeekangPark)
Fix _compute_hessian bug in NAS DARTS (PyTorch version) (#3058, thanks external contributor @hroken)
Fix bug of conv1d in the cdarts utils (#3073, thanks external contributor @athaker)
Fix the handling of unknown trials when resuming an experiment (#3096)
Fix bug of kill command under Windows (#3106)
Fix lazy logging (#3108, thanks external contributor @HarshCasper)
Fix checkpoint load and save issue in QAT quantizer (#3124, thanks external contributor @eedalong)
Fix quant grad function calculation error (#3160, thanks external contributor @eedalong)
Fix device assignment bug in quantization algorithm (#3212, thanks external contributor @eedalong)
Fix bug in ModelSpeedup and enhance UT for it (#3279)
and others

v1.9

3 years ago

Release 1.9 - 10/22/2020

Major updates

Neural architecture search

Support regularized evolution algorithm for NAS scenario (#2802)
Add NASBench201 in search space zoo (#2766)

Model compression

AMC pruner improvement: support resnet, support reproduction of the experiments (default parameters in our example code) in AMC paper (#2876 #2906)
Support constraint-aware pruning in some of our pruners to improve model compression efficiency (#2657)
Support "tf.keras.Sequential" in model compression for TensorFlow (#2887)
Support customized op in the model flops counter (#2795)
Support quantizing bias in QAT quantizer (#2914)

Training service

Support configuring python environment using "preCommand" in remote mode (#2875)
Support AML training service in Windows (#2882)
Support reuse mode for remote training service (#2923)

WebUI & nnictl

The "Overview" page on WebUI is redesigned with new layout (#2914)
Upgraded node, yarn and FabricUI, and enabled Eslint (#2894 #2873 #2744)
Add/Remove columns in hyper-parameter chart and trials table in "Trials detail" page (#2900)
Beautify the JSON format utility on WebUI (#2863)
Support nnictl command auto-completion (#2857)

UT & IT

Add integration test for experiment import and export (#2878)
Add integration test for user installed builtin tuner (#2859)
Add unit test for nnictl (#2912)

Documentation

Refactor of the document for model compression (#2919)

Bug fixes

Fix naïve evolution tuner to correctly handle trial failures (#2695)
Resolve the warning "WARNING (nni.protocol) IPC pipeline not exists, maybe you are importing tuner/assessor from trial code?" (#2864)
Fix search space issue in experiment save/load (#2886)
Fix bug in experiment import data (#2878)
Fix annotation in remote mode (python 3.8 ast update issue) (#2881)
Support boolean type for "choice" hyper-parameter when customizing trial configuration on WebUI (#3003)

v1.8

3 years ago

Release 1.8 - 8/27/2020

Major updates

Training service

Access trial log directly on WebUI (local mode only) (#2718)
Add OpenPAI trial job detail link (#2703)
Support GPU scheduler in reusable environment (#2627) (#2769)
Add timeout for web_channel in trial_runner (#2710)
Show environment error message in AzureML mode (#2724)
Add more log information when copying data in OpenPAI mode (#2702)

WebUI, nnictl and nnicli

Improve hyper-parameter parallel coordinates plot (#2691) (#2759)
Add pagination for trial job list (#2738) (#2773)
Enable panel close when clicking overlay region (#2734)
Remove support for Multiphase on WebUI (#2760)
Support save and restore experiments (#2750)
Add intermediate results in export result (#2706)
Add command to list trial results with highest/lowest metrics (#2747)
Improve the user experience of nnicli with examples (#2713)

Neural architecture search

Search space zoo: ENAS and DARTS (#2589)
API to query intermediate results in NAS benchmark (#2728)

Model compression

Support the List/Tuple Construct/Unpack operation for TorchModuleGraph (#2609)
Model speedup improvement: Add support of DenseNet and InceptionV3 (#2719)
Support the multiple successive tuple unpack operations (#2768)
Doc of comparing the performance of supported pruners (#2742)
New pruners: Sensitivity pruner (#2684) and AMC pruner (#2573) (#2786)
TensorFlow v2 support in model compression (#2755)

Backward incompatible changes

Update the default experiment folder from $HOME/nni/experiments to $HOME/nni-experiments. If you want to view the experiments created by previous NNI releases, you can move the experiments folders from $HOME/nni/experiments to $HOME/nni-experiments manually. (#2686) (#2753)
Dropped support for Python 3.5 and scikit-learn 0.20 (#2778) (#2777) (#2783) (#2787) (#2788) (#2790)
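
The manual move of experiment folders described above can also be scripted. The following is a minimal sketch, assuming both folders sit directly under the given home directory as stated in the note; `migrate_experiments` is a hypothetical helper and is not shipped with NNI.

```python
import shutil
from pathlib import Path
from typing import List


def migrate_experiments(home: Path) -> List[str]:
    """Move folders from <home>/nni/experiments to <home>/nni-experiments.

    Returns the names of the folders that were moved.
    """
    old_root = home / "nni" / "experiments"
    new_root = home / "nni-experiments"
    moved: List[str] = []
    if not old_root.is_dir():
        return moved  # nothing to migrate
    new_root.mkdir(parents=True, exist_ok=True)
    for entry in sorted(old_root.iterdir()):
        target = new_root / entry.name
        if target.exists():
            continue  # never overwrite an experiment that already exists
        shutil.move(str(entry), str(target))
        moved.append(entry.name)
    return moved
```

Calling `migrate_experiments(Path.home())` would apply it to the current user's home directory; existing folders with the same name in the new location are left untouched.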

Others

Upgrade TensorFlow version in Docker image (#2732) (#2735) (#2720)

Examples

Remove gpuNum in assessor examples (#2641)

Documentation

Improve customized tuner documentation (#2628)
Fix several typos and grammar mistakes in documentation (#2637 #2638, thanks @tomzx)
Improve AzureML training service documentation (#2631)
Improve CI of Chinese translation (#2654)
Improve OpenPAI training service documentation (#2685)
Improve documentation of community sharing (#2640)
Add tutorial of Colab support (#2700)
Improve documentation structure for model compression (#2676)

Bug fixes

Fix mkdir error in training service (#2673)
Fix bug when using chmod in remote training service (#2689)
Fix dependency issue by making _graph_utils imported inline (#2675)
Fix mask issue in SimulatedAnnealingPruner (#2736)
Fix intermediate graph zooming issue (#2738)
Fix issue when dict is unordered when querying NAS benchmark (#2728)
Fix import issue for gradient selector dataloader iterator (#2690)
Fix support of adding tens of machines in remote training service (#2725)
Fix several styling issues in WebUI (#2762 #2737)
Fix support of unusual types in metrics including NaN and Infinity (#2782)
Fix nnictl experiment delete (#2791)

v1.7.1

3 years ago

Release 1.7.1 - 8/1/2020

Bug Fixes

Fix pai training service error handling #2692
Fix pai training service codeDir copying issue #2673
Upgrade training service to support latest pai restful API #2722

v1.7

3 years ago

Release 1.7 - 7/8/2020

Major Features

Training Service

Support AML (Azure Machine Learning) platform as an NNI training service.
OpenPAI jobs can be reused. When a trial completes, the OpenPAI job does not stop and instead waits for the next trial. Refer to the reuse flag in the OpenPAI config.
Support ignoring files and folders in code directory with .nniignore when uploading code directory to training service.
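
To illustrate the idea behind an ignore file such as .nniignore, here is a minimal sketch of filtering a file list before upload. The pattern syntax NNI actually supports is not described in these notes; this sketch assumes one glob pattern per line, `#` for comments, and a match against either the whole relative path or any single path component. All function names are hypothetical.

```python
import fnmatch
from typing import Iterable, List


def parse_ignore(text: str) -> List[str]:
    """Return the non-empty, non-comment patterns from ignore-file text."""
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.append(line.rstrip("/"))
    return patterns


def is_ignored(rel_path: str, patterns: Iterable[str]) -> bool:
    """True if the path, or any component of it, matches a pattern."""
    parts = rel_path.split("/")
    for pattern in patterns:
        if fnmatch.fnmatch(rel_path, pattern):
            return True
        if any(fnmatch.fnmatch(part, pattern) for part in parts):
            return True
    return False


def files_to_upload(paths: Iterable[str], ignore_text: str) -> List[str]:
    """Keep only the paths that no ignore pattern matches."""
    patterns = parse_ignore(ignore_text)
    return [p for p in paths if not is_ignored(p, patterns)]
```

With an ignore file containing `data/` and `*.ckpt`, a dataset folder and checkpoint files would be skipped while source files are kept.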

Neural Architecture Search (NAS)

Model Compression

Improve Model Speedup: track more dependencies among layers and automatically resolve mask conflict, support the speedup of pruned resnet.
Added new pruners, including three auto model pruning algorithms: NetAdapt Pruner, SimulatedAnnealing Pruner, AutoCompress Pruner, and ADMM Pruner.
Added model sensitivity analysis tool to help users find the sensitivity of each layer to the pruning.
Easy flops calculation for model compression and NAS.
Update lottery ticket pruner to export winning ticket.

Examples

Automatically optimize tensor operators on NNI with a new customized tuner OpEvo.

Built-in tuners/assessors/advisors

Allow customized tuners/assessor/advisors to be installed as built-in algorithms.

WebUI

Support friendlier visualization of nested search spaces.
Show trial's dict keys in hyper-parameter graph.
Enhancements to trial duration display.

Others

Provide utility function to merge parameters received from NNI
Support setting paiStorageConfigName in pai mode
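
A parameter-merging utility of this kind overlays the hyper-parameters supplied for a trial onto the script's own defaults. The following is a minimal sketch of that idea, assuming defaults held in an `argparse.Namespace` and received values in a plain dict; `merge_params` is a hypothetical function and does not reproduce NNI's implementation.

```python
import argparse
from typing import Any, Dict


def merge_params(defaults: argparse.Namespace,
                 received: Dict[str, Any]) -> argparse.Namespace:
    """Return a copy of defaults with received values applied on top."""
    merged = argparse.Namespace(**vars(defaults))
    for key, value in received.items():
        if not hasattr(merged, key):
            raise KeyError(f"unexpected parameter: {key!r}")
        default = getattr(merged, key)
        # Cast to the default's type so that, e.g., 64.0 becomes 64.
        if default is not None and not isinstance(value, type(default)):
            value = type(default)(value)
        setattr(merged, key, value)
    return merged
```

Rejecting unknown keys is a design choice made here so that a misspelled search-space entry fails loudly instead of being silently ignored.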

Documentation

Improve documentation for model compression
Improve documentation and examples for NAS benchmarks.
Improve documentation for AzureML training service
Migrate homepage to Read the Docs.

Bug Fixes

Fix bug for model graph with shared nn.Module
Fix Node.js OOM during make build
Fix NASUI bugs
Fix duration and intermediate results pictures update issue.
Fix minor WebUI table style issues.

v1.6

3 years ago

Release 1.6 - 5/26/2020

Major Features

New Features and improvement

Support __version__ for SDK version
Support Windows dev install
Improve IPC limitation to 100W
Improve code storage upload logic among trials on non-local platforms

HPO Updates

Improve PBT on failure handling and support experiment resume for PBT

NAS Updates

NAS support for TensorFlow 2.0 (preview), with TF2.0 NAS examples
Use OrderedDict for LayerChoice
Prettify the format of export
Replace layer choice with selected module after applied fixed architecture

Model Compression Updates

Model compression PyTorch 1.4 support

Training Service Updates

Update pai yaml merge logic
Support Windows as remote machine in remote mode

WebUI new features and improvements

Show trial error message
Finalize homepage layout
Refactor overview's best trials module
Remove multiphase from WebUI
Add tooltip for trial concurrency in the overview page
Show top trials for hyper-parameter graph

Bug Fix

Fix dev install
Fix SPOS example crash when the checkpoints do not have state_dict
Fix table sort issue when experiment had failed trial
Support multiple Python environments (conda, pyenv, etc.)

v1.5

4 years ago

New Features and Documentation

Hyper-Parameter Optimizing

New tuner: Population Based Training (PBT)
Trials can now report infinity and NaN as result
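
Non-finite results need care wherever metrics are serialized, because strict JSON has no literal for NaN or Infinity. The sketch below shows how Python's standard `json` module treats such values; how NNI itself encodes them is not stated in these notes, so this illustrates the general issue only.

```python
import json
import math

metrics = {"loss": float("inf"), "accuracy": float("nan")}

# By default the encoder emits the non-standard tokens Infinity and NaN.
encoded = json.dumps(metrics)

# allow_nan=False enforces strict JSON and rejects non-finite floats.
try:
    json.dumps(metrics, allow_nan=False)
    strict_ok = True
except ValueError:
    strict_ok = False

# Python's decoder accepts the non-standard tokens and restores the floats.
decoded = json.loads(encoded)
```

Consumers written in other languages may reject the `Infinity` and `NaN` tokens, which is why non-number metrics can break a web front end unless they are handled explicitly.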

Neural Architecture Search

New NAS algorithm: TextNAS
ENAS and DARTS now support visualization through web UI.

Model Compression

New Pruner: GradientRankFilterPruner
Compressors will validate configuration by default
Refactor: Add optimizer as an input argument of pruner, for easy support of DataParallel and more efficient iterative pruning. This is a breaking change for the usage of iterative pruning algorithms.
Model compression examples are refactored and improved
Added documentation for implementing compression algorithms

Training Service

Kubeflow now supports pytorchjob crd v1 (thanks external contributor @jiapinai)
Experimental DLTS support

Overall Documentation Improvement

Documentation is significantly improved on grammar, spelling, and wording (thanks external contributor @AHartNtkn)

Fixed Bugs

ENAS cannot have more than one LSTM layer (thanks external contributor @marsggbo)
NNI manager's timers will never unsubscribe (thanks external contributor @guilhermehn)
NNI manager may exhaust heap memory (thanks external contributor @Sundrops)
Batch tuner does not support customized trials (#2075)
Experiment cannot be killed if it failed on start (#2080)
Non-number type metrics break web UI (#2278)
A bug in lottery ticket pruner
Other minor glitches