NNI Versions

An open-source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression, and hyper-parameter tuning.

v2.3

2 years ago

Major Updates

  • Retiarii Framework (NNI NAS 2.0) Beta Release with new features:

    • Support new high-level APIs: Repeat and Cell (#3481)
    • Support pure-python execution engine (#3605)
    • Support policy-based RL strategy (#3650)
    • Support nested ModuleList (#3652)
    • Improve documentation (#3785)

    Note: more Retiarii features are planned for future releases; please refer to the Retiarii Roadmap for more information.

  • Add new NAS algorithm: Blockwise DNAS FBNet (#3532, thanks to external contributor @alibaba-yiwuyao)

Model Compression

  • Support Auto Compression Framework (#3631)
  • Support slim pruner in Tensorflow (#3614)
  • Support LSQ quantizer (#3503, thanks to external contributor @chenbohua3)
  • Improve APIs for iterative pruners (#3507 #3688)

Training Service & REST

  • Support 3rd-party training service (#3662 #3726)
  • Support setting prefix URL (#3625 #3674 #3672 #3643)
  • Improve NNI manager logging (#3624)
  • Remove outdated TensorBoard code on nnictl (#3613)

Hyper-Parameter Optimization

  • Add new tuner: DNGO (#3479 #3707)
  • Add benchmark for tuners (#3644 #3720 #3689)

WebUI

  • Improve search parameters on trial detail page (#3651 #3723 #3715)
  • Make selected trials consistent after auto-refresh in detail table (#3597)
  • Add trial stdout button on local mode (#3653 #3690)

Examples & Documentation

  • Convert all trial examples from config v1 to config v2 (#3721 #3733 #3711 #3600)
  • Add new jupyter notebook examples (#3599 #3700)

Dev Excellence

  • Upgrade dependencies in Dockerfile (#3713 #3722)
  • Substitute PyYAML for ruamel.yaml (#3702)
  • Add pipelines for AML and hybrid training service and experiment config V2 (#3477 #3648)
  • Add pipeline badge in README (#3589)
  • Update issue bug report template (#3501)

Bug Fixes & Minor Updates

  • Fix syntax error on Windows (#3634)
  • Fix a logging related bug (#3705)
  • Fix a bug in GPU indices (#3721)
  • Fix a bug in FrameworkController (#3730)
  • Fix a bug in export_data_url format (#3665)
  • Report version check failure as a warning (#3654)
  • Fix bugs and lints in nnictl (#3712)
  • Fix bug of optimize_mode on WebUI (#3731)
  • Fix bug of useActiveGpu in AML v2 config (#3655)
  • Fix bug of experiment_working_directory in Retiarii config (#3607)
  • Fix a bug in mask conflict (#3629, thanks to external contributor @Davidxswang)
  • Fix a bug in model speedup shape inference (#3588, thanks to external contributor @Davidxswang)
  • Fix a bug in multithread on Windows (#3604, thanks to external contributor @Ivanfangsc)
  • Delete redundant code in training service (#3526, thanks to external contributor @maxsuren)
  • Fix typo in DoReFa compression doc (#3693, thanks to external contributor @Erfandarzi)
  • Update docstring in model compression (#3647, thanks to external contributor @ichejun)
  • Fix a bug when using Kubernetes container (#3719, thanks to external contributor @rmfan)

v2.2

3 years ago

Major updates

  • Improve NAS 2.0 (Retiarii) Framework (Alpha Release)

    • Support local debug mode (#3476)
    • Support nesting ValueChoice in LayerChoice (#3508)
    • Support dict/list type in ValueChoice (#3508)
    • Improve the format of export architectures (#3464)
    • Refactor of NAS examples (#3513)
    • Refer to the Retiarii Roadmap: https://github.com/microsoft/nni/issues/3301

Model Compression

  • Support speedup for mixed precision quantization model (Experimental) (#3488 #3512)
  • Support model export for quantization algorithm (#3458 #3473)
  • Support model export in model compression for TensorFlow (#3487)
  • Improve documentation (#3482)

nnictl & nni.experiment

  • Add native support for experiment config V2 (#3466 #3540 #3552)
  • Add resume and view mode in Python API nni.experiment (#3490 #3524 #3545)

Training Service

  • Support umount for shared storage in remote training service (#3456)
  • Support Windows as the remote training service in reuse mode (#3500)
  • Remove duplicated env folder in remote training service (#3472)
  • Add log information for GPU metric collector (#3506)
  • Enable optional Pod Spec for FrameworkController platform (#3379, thanks to external contributor @mbu93)

WebUI

  • Support launching TensorBoard on WebUI (#3454 #3361 #3531)
  • Upgrade echarts-for-react to v5 (#3457)
  • Add wrap for dispatcher/nnimanager log monaco editor (#3461)

Bug Fixes

  • Fix bug of FLOPs counter (#3497)
  • Fix bug of hyper-parameter Add/Remove axes and table Add/Remove columns button conflict (#3491)
  • Fix bug that monaco editor search text is not displayed completely (#3492)
  • Fix bug of Cream NAS (#3498, thanks to external contributor @AliCloud-PAI)
  • Fix typos in docs (#3448, thanks to external contributor @OliverShang)
  • Fix typo in NAS 1.0 (#3538, thanks to external contributor @ankitaggarwal23)

v2.1

3 years ago

Major updates

  • Improve NAS 2.0 (Retiarii) Framework (Improved Experimental)

    • Improve the robustness of graph generation and code generation for PyTorch models (#3365)
    • Support the inline mutation API ValueChoice (#3349 #3382)
    • Improve the design and implementation of Model Evaluator (#3359 #3404)
    • Support Random/Grid/Evolution exploration strategies (i.e., search algorithms) (#3377)
    • Refer to the Retiarii Roadmap

Training service

  • Support shared storage for reuse mode (#3354)
  • Support Windows as the local training service in hybrid mode (#3353)
  • Remove PAIYarn training service (#3327)
  • Add "recently-idle" scheduling algorithm (#3375)
  • Deprecate preCommand and enable pythonPath for remote training service (#3284 #3410)
  • Refactor reuse mode temp folder (#3374)

nnictl & nni.experiment

  • Migrate nnicli to new Python API nni.experiment (#3334)
  • Refactor the way of specifying tuner in experiment Python API (nni.experiment), more aligned with nnictl (#3419)

WebUI

  • Support showing the assigned training service of each trial in hybrid mode on WebUI (#3261 #3391)
  • Support multiple selection for filter status in experiments management page (#3351)
  • Improve overview page (#3316 #3317 #3352)
  • Support copy trial id in the table (#3378)

Documentation

  • Improve model compression examples and documentation (#3326 #3371)
  • Add Python API examples and documentation (#3396)
  • Add SECURITY doc (#3358)
  • Add 'What's NEW!' section in README (#3395)
  • Update English contributing doc (#3398, thanks external contributor @Yongxuanzhang)

Bug fixes

  • Fix AML outputs path and python process not killed (#3321)
  • Fix bug that an experiment launched from Python cannot be resumed by nnictl (#3309)
  • Fix import path of network morphism example (#3333)
  • Fix bug in the tuple unpack (#3340)
  • Fix bug of security for arbitrary code execution (#3311, thanks external contributor @huntr-helper)
  • Fix NoneType error on jupyter notebook (#3337, thanks external contributor @tczhangzhi)
  • Fix bugs in Retiarii (#3339 #3341 #3357, thanks external contributor @tczhangzhi)
  • Fix bug in AdaptDL mode example (#3381, thanks external contributor @ZeyaWang)
  • Fix the spelling mistake of assessor (#3416, thanks external contributor @ByronCHAO)
  • Fix bug in ruamel import (#3430, thanks external contributor @rushtehrani)

v2.0

3 years ago

Major updates

Training service

  • Support hybrid training service (#3097 #3251 #3252)
  • Support AdlTrainingService, a new training service based on Kubernetes (#3022, thanks external contributors Petuum @pw2393)

Model compression

  • Support pruning schedule for fpgm pruning algorithm (#3110)
  • ModelSpeedup improvement: support torch v1.7 (updated graph_utils.py) (#3076)
  • Improve model compression utility: model flops counter (#3048 #3265)

WebUI & nnictl

  • Support experiments management on WebUI, add a web page for it (#3081 #3127)
  • Improve the layout of overview page (#3046 #3123)
  • Add navigation bar on the right for logs and configs; add expanded icons for table (#3069 #3103)

Others

  • Support launching an experiment from Python code (#3111 #3210 #3263)
  • Refactor builtin/customized tuner installation (#3134)
  • Support new experiment configuration V2 (#3138 #3248 #3251)
  • Reorganize source code directory hierarchy (#2962 #2987 #3037)
  • Change SIGKILL to SIGTERM in local mode when cancelling trial jobs (#3173)
  • Refactor Hyperband (#3040)
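
Because local mode now cancels trial jobs with SIGTERM rather than SIGKILL, a trial process gets a chance to clean up before it exits. The following is a minimal sketch, assuming a POSIX system and code running in the main thread; the handler name and the `cleaned_up` list are hypothetical and not part of NNI:

```python
import os
import signal

cleaned_up = []  # records that the handler ran (illustration only)

def handle_sigterm(signum, frame):
    # Hypothetical cleanup hook: a real trial might flush checkpoints or close files here.
    cleaned_up.append(signum)

# SIGTERM can be caught by the process; SIGKILL cannot, so it allows no cleanup.
signal.signal(signal.SIGTERM, handle_sigterm)

# Simulate a cancellation by sending SIGTERM to this very process.
os.kill(os.getpid(), signal.SIGTERM)
```

A real trial would usually exit after cleaning up; this sketch only records the signal so that the effect is observable.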

Documentation

  • Port markdown docs to reStructuredText docs and introduce githublink (#3107)
  • List related research and publications in doc (#3150)
  • Add tutorial of saving and loading quantized model (#3192)
  • Remove paiYarn doc and add description of reuse config in remote mode (#3253)
  • Update EfficientNet doc to clarify repo versions (#3158, thanks external contributor @ahundt)

Bug fixes

  • Fix exp-duration pause timing under NO_MORE_TRIAL status (#3043)
  • Fix bug in NAS SPOS trainer, apply_fixed_architecture (#3051, thanks external contributor @HeekangPark)
  • Fix _compute_hessian bug in NAS DARTS (PyTorch version) (#3058, thanks external contributor @hroken)
  • Fix bug of conv1d in the cdarts utils (#3073, thanks external contributor @athaker)
  • Fix the handling of unknown trials when resuming an experiment (#3096)
  • Fix bug of kill command under Windows (#3106)
  • Fix lazy logging (#3108, thanks external contributor @HarshCasper)
  • Fix checkpoint load and save issue in QAT quantizer (#3124, thanks external contributor @eedalong)
  • Fix quant grad function calculation error (#3160, thanks external contributor @eedalong)
  • Fix device assignment bug in quantization algorithm (#3212, thanks external contributor @eedalong)
  • Fix bug in ModelSpeedup and enhance UT for it (#3279)
  • and others

v1.9

3 years ago

Release 1.9 - 10/22/2020

Major updates

  • Support regularized evolution algorithm for NAS scenario (#2802)
  • Add NASBench201 in search space zoo (#2766)
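
Regularized (aging) evolution, as it is commonly described, keeps a fixed-size population, samples a few members, mutates the best of that sample, and then discards the oldest member rather than the worst. The following is a minimal sketch on a toy bit-string "architecture"; the fitness function and all names are hypothetical and unrelated to NNI's implementation:

```python
import collections
import random

def fitness(arch):
    # Toy objective (assumption): the count of 1-bits stands in for accuracy.
    return sum(arch)

def mutate(arch, rng):
    # Flip one randomly chosen bit to produce a child architecture.
    child = list(arch)
    i = rng.randrange(len(child))
    child[i] = 1 - child[i]
    return child

def regularized_evolution(cycles=200, population_size=20, sample_size=5,
                          arch_len=8, seed=0):
    rng = random.Random(seed)
    population = collections.deque()
    history = []
    # Initialize the population with random architectures.
    while len(population) < population_size:
        arch = [rng.randint(0, 1) for _ in range(arch_len)]
        population.append(arch)
        history.append(arch)
    for _ in range(cycles):
        # Tournament selection: the best of a random sample becomes the parent.
        sample = rng.sample(list(population), sample_size)
        parent = max(sample, key=fitness)
        child = mutate(parent, rng)
        population.append(child)
        history.append(child)
        # Aging: remove the oldest member, not the worst one.
        population.popleft()
    # Report the best architecture seen at any point in the search.
    return max(history, key=fitness)

best = regularized_evolution()
```

Removing the oldest member is what distinguishes this scheme from plain tournament evolution: every architecture eventually leaves the population, however well it scored.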

Model compression

  • AMC pruner improvement: support resnet, support reproduction of the experiments (default parameters in our example code) in AMC paper (#2876 #2906)
  • Support constraint-aware on some of our pruners to improve model compression efficiency (#2657)
  • Support "tf.keras.Sequential" in model compression for TensorFlow (#2887)
  • Support customized op in the model flops counter (#2795)
  • Support quantizing bias in QAT quantizer (#2914)

Training service

  • Support configuring python environment using "preCommand" in remote mode (#2875)
  • Support AML training service in Windows (#2882)
  • Support reuse mode for remote training service (#2923)

WebUI & nnictl

  • The "Overview" page on WebUI is redesigned with new layout (#2914)
  • Upgraded node, yarn and FabricUI, and enabled Eslint (#2894 #2873 #2744)
  • Add/Remove columns in hyper-parameter chart and trials table in "Trials detail" page (#2900)
  • Beautify the JSON format utility on WebUI (#2863)
  • Support nnictl command auto-completion (#2857)

UT & IT

  • Add integration test for experiment import and export (#2878)
  • Add integration test for user installed builtin tuner (#2859)
  • Add unit test for nnictl (#2912)

Documentation

  • Refactor of the document for model compression (#2919)

Bug fixes

  • Fix naïve evolution tuner to correctly handle trial failures (#2695)
  • Resolve the warning "WARNING (nni.protocol) IPC pipeline not exists, maybe you are importing tuner/assessor from trial code?" (#2864)
  • Fix search space issue in experiment save/load (#2886)
  • Fix bug in experiment import data (#2878)
  • Fix annotation in remote mode (python 3.8 ast update issue) (#2881)
  • Support boolean type for "choice" hyper-parameter when customizing trial configuration on WebUI (#3003)

v1.8

3 years ago

Release 1.8 - 8/27/2020

Major updates

Training service

  • Access trial log directly on WebUI (local mode only) (#2718)
  • Add OpenPAI trial job detail link (#2703)
  • Support GPU scheduler in reusable environment (#2627) (#2769)
  • Add timeout for web_channel in trial_runner (#2710)
  • Show environment error message in AzureML mode (#2724)
  • Add more log information when copying data in OpenPAI mode (#2702)

WebUI, nnictl and nnicli

  • Improve hyper-parameter parallel coordinates plot (#2691) (#2759)
  • Add pagination for trial job list (#2738) (#2773)
  • Enable panel close when clicking overlay region (#2734)
  • Remove support for Multiphase on WebUI (#2760)
  • Support save and restore experiments (#2750)
  • Add intermediate results in export result (#2706)
  • Add command to list trial results with highest/lowest metrics (#2747)
  • Improve the user experience of nnicli with examples (#2713)

Model compression

Backward incompatible changes

  • Update the default experiment folder from $HOME/nni/experiments to $HOME/nni-experiments. If you want to view the experiments created by previous NNI releases, you can move the experiments folders from $HOME/nni/experiments to $HOME/nni-experiments manually. (#2686) (#2753)
  • Dropped support for Python 3.5 and scikit-learn 0.20 (#2778) (#2777) (#2783) (#2787) (#2788) (#2790)
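
The manual move of experiment folders from $HOME/nni/experiments to $HOME/nni-experiments can also be scripted. The following is a minimal sketch using only the Python standard library; the helper name `migrate_experiments` is hypothetical, and skipping experiments that already exist at the destination is an assumption of this sketch:

```python
import shutil
from pathlib import Path

def migrate_experiments(home=None):
    """Move experiment folders from <home>/nni/experiments to <home>/nni-experiments."""
    home = Path(home) if home is not None else Path.home()
    old_root = home / "nni" / "experiments"
    new_root = home / "nni-experiments"
    moved = []
    if not old_root.is_dir():
        return moved  # nothing to migrate
    new_root.mkdir(parents=True, exist_ok=True)
    for exp_dir in sorted(old_root.iterdir()):
        target = new_root / exp_dir.name
        if target.exists():
            continue  # do not overwrite an existing experiment
        shutil.move(str(exp_dir), str(target))
        moved.append(exp_dir.name)
    return moved

# Usage: call migrate_experiments() to operate on the current user's home directory.
```

The function returns the names of the folders it moved, so running it a second time returns an empty list.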

Others

  • Upgrade TensorFlow version in Docker image (#2732) (#2735) (#2720)

Examples

  • Remove gpuNum in assessor examples (#2641)

Documentation

  • Improve customized tuner documentation (#2628)
  • Fix several typos and grammar mistakes in documentation (#2637 #2638, thanks @tomzx)
  • Improve AzureML training service documentation (#2631)
  • Improve CI of Chinese translation (#2654)
  • Improve OpenPAI training service documentation (#2685)
  • Improve documentation of community sharing (#2640)
  • Add tutorial of Colab support (#2700)
  • Improve documentation structure for model compression (#2676)

Bug fixes

  • Fix mkdir error in training service (#2673)
  • Fix bug when using chmod in remote training service (#2689)
  • Fix dependency issue by making _graph_utils imported inline (#2675)
  • Fix mask issue in SimulatedAnnealingPruner (#2736)
  • Fix intermediate graph zooming issue (#2738)
  • Fix issue when dict is unordered when querying NAS benchmark (#2728)
  • Fix import issue for gradient selector dataloader iterator (#2690)
  • Fix support of adding tens of machines in remote training service (#2725)
  • Fix several styling issues in WebUI (#2762 #2737)
  • Fix support of unusual types in metrics including NaN and Infinity (#2782)
  • Fix nnictl experiment delete (#2791)
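
One likely reason metrics such as NaN and Infinity need special handling is that strict JSON has no literal for them. The following is a minimal sketch with Python's standard json module, independent of NNI's own serialization code; the metric names are hypothetical:

```python
import json
import math

metric = {"default": float("nan"), "loss": float("inf")}

# By default, Python emits the non-standard tokens NaN and Infinity.
encoded = json.dumps(metric)

# Python's own parser accepts those tokens and restores the float values.
decoded = json.loads(encoded)
round_trip_ok = math.isnan(decoded["default"]) and math.isinf(decoded["loss"])

# With allow_nan=False, the encoder rejects such values instead of emitting them.
try:
    json.dumps(metric, allow_nan=False)
    strict_ok = True
except ValueError:
    strict_ok = False
```

A consumer that uses a strict JSON parser may reject the default output, so both ends of a metrics pipeline need to agree on how such values are represented.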

v1.7.1

3 years ago

Release 1.7.1 - 8/1/2020

Bug Fixes

  • Fix pai training service error handling #2692
  • Fix pai training service codeDir copying issue #2673
  • Upgrade training service to support latest pai restful API #2722

v1.7

3 years ago

Release 1.7 - 7/8/2020

Major Features

Training Service

Neural Architecture Search (NAS)

Model Compression

Examples

Built-in tuners/assessors/advisors

WebUI

  • Support friendlier visualization of nested search spaces.
  • Show trial's dict keys in hyper-parameter graph.
  • Enhancements to trial duration display.

Others

  • Provide utility function to merge parameters received from NNI
  • Support setting paiStorageConfigName in pai mode
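
A utility for merging parameters typically overlays the hyper-parameters received from the tuner onto a trial's default arguments. The following is a minimal sketch of that idea; `merge_params` is a hypothetical stand-in, not NNI's own function, and rejecting unknown keys is an assumption of this sketch:

```python
import argparse

def merge_params(base, override):
    """Overlay tuner-supplied values onto default arguments (hypothetical helper)."""
    merged = argparse.Namespace(**vars(base))  # copy, so the defaults stay untouched
    for key, value in override.items():
        if not hasattr(merged, key):
            # Assumption: an unknown key signals a search-space typo, so fail loudly.
            raise KeyError(f"unexpected parameter: {key}")
        setattr(merged, key, value)
    return merged

defaults = argparse.Namespace(lr=0.01, batch_size=32)
tuned = {"lr": 0.001}  # e.g. values a tuner might propose
params = merge_params(defaults, tuned)
```

Here `params.lr` takes the tuned value while `params.batch_size` keeps its default.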

Documentation

Bug Fixes

  • Fix bug for model graph with shared nn.Module
  • Fix Node.js out-of-memory error when running make build
  • Fix NASUI bugs
  • Fix duration and intermediate results pictures update issue.
  • Fix minor WebUI table style issues.

v1.6

3 years ago

Release 1.6 - 5/26/2020

Major Features

New Features and improvement

  • Support __version__ for SDK version
  • Support Windows dev install
  • Raise the IPC limit to 100W (1,000,000)
  • Improve code storage upload logic among trials on non-local platforms

HPO Updates

  • Improve failure handling in PBT and support experiment resume for PBT

NAS Updates

  • NAS support for TensorFlow 2.0 (preview); see the TF2.0 NAS examples
  • Use OrderedDict for LayerChoice
  • Prettify the format of export
  • Replace layer choice with selected module after applied fixed architecture

Model Compression Updates

  • Model compression PyTorch 1.4 support

Training Service Updates

  • Update PAI YAML merge logic
  • Support Windows as a remote machine in remote mode (see Remote Mode)

WebUI New Features and Improvements

  • Show trial error message
  • Finalize homepage layout
  • Refactor overview's best trials module
  • Remove multiphase from WebUI
  • Add tooltip for trial concurrency in the overview page
  • Show top trials for hyper-parameter graph

Bug Fix

  • Fix dev install
  • Fix SPOS example crash when the checkpoints do not have state_dict
  • Fix table sort issue when experiment had failed trial
  • Support multiple Python environments (conda, pyenv, etc.)

v1.5

4 years ago

New Features and Documentation

Hyper-Parameter Optimizing

Model Compression

  • New Pruner: GradientRankFilterPruner
  • Compressors will validate configuration by default
  • Refactor: Add optimizer as an input argument of pruner, for easy support of DataParallel and more efficient iterative pruning. This is a breaking change for the usage of iterative pruning algorithms.
  • Model compression examples are refactored and improved
  • Added documentation for implementing compression algorithms

Training Service

  • Kubeflow now supports pytorchjob crd v1 (thanks external contributor @jiapinai)
  • Experimental DLTS support

Overall Documentation Improvement

  • Documentation is significantly improved on grammar, spelling, and wording (thanks external contributor @AHartNtkn)

Fixed Bugs

  • ENAS cannot have more than one LSTM layer (thanks external contributor @marsggbo)
  • NNI manager's timers will never unsubscribe (thanks external contributor @guilhermehn)
  • NNI manager may exhaust heap memory (thanks external contributor @Sundrops)
  • Batch tuner does not support customized trials (#2075)
  • Experiment cannot be killed if it failed on start (#2080)
  • Non-number type metrics break web UI (#2278)
  • A bug in lottery ticket pruner
  • Other minor glitches