PyTorch Lightning Versions

Pretrain, finetune, and deploy AI models on multiple GPUs or TPUs with zero code changes.

2.0.4

10 months ago

App

Fixed

  • Bumped several dependencies to address security vulnerabilities.

Fabric

Fixed

  • Fixed validation of parameters of plugins.precision.MixedPrecision (#17687)
  • Fixed an issue with HPU imports leading to performance degradation (#17788)

PyTorch

Changed

  • Changes to the NeptuneLogger (#16761):
    • It now supports neptune-client 0.16.16 and neptune >=1.0, and the log() method has been replaced with append() and extend().
    • It now accepts a namespace Handler as an alternative to Run for the run argument. This means you can call it like NeptuneLogger(run=run["some/namespace"]) to log everything to the some/namespace/ location of the run; see the sketch below.
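
A minimal sketch of the new namespace-handler form, assuming neptune >= 1.0 is installed and NEPTUNE_API_TOKEN / NEPTUNE_PROJECT are set in the environment:

    import neptune
    from lightning.pytorch.loggers import NeptuneLogger

    # init_run() reads the API token and project from the environment
    run = neptune.init_run()
    # passing the namespace handler logs everything under "training/" in the run
    logger = NeptuneLogger(run=run["training"])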

Fixed

  • Fixed validation of parameters of plugins.precision.MixedPrecisionPlugin (#17687)
  • Fixed deriving default map location in LightningModule.load_from_checkpoint when there is an extra state (#17812)

Contributors

@akreuzer, @awaelchli, @borda, @jerome-habana, @kshitij12345

If we forgot someone due to not matching commit email with GitHub account, let us know :]

2.0.3

11 months ago

App

Added

  • Added the property LightningWork.public_ip that exposes the public IP of the LightningWork instance (#17742); see the sketch below
  • Added the missing python-multipart dependency (#17244)
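
A hedged sketch of reading both addresses from inside a work; MyWork is a hypothetical subclass:

    from lightning.app import LightningWork

    class MyWork(LightningWork):
        def run(self):
            # public_ip is new in this release; internal_ip now returns the
            # private/internal address (see the fix below)
            print(f"public: {self.public_ip}, internal: {self.internal_ip}")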

Changed

  • Made type hints public (#17100)

Fixed

  • Fixed LightningWork.internal_ip, which was mistakenly exposing the public IP; it now returns the private/internal IP address (#17742)
  • Fixed resolution of the latest version in CLI (#17351)
  • Fixed a property being raised instead of returned (#17595)
  • Fixed retrieving the project (#17617, #17666)

Fabric

Added

  • Added support for Callback registration through entry points (#17756)
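
A sketch of how a third-party package might register a callback via entry points; the group name "lightning.fabric.callbacks" and all package names here are assumptions, so check the Fabric docs for the exact group introduced by #17756:

    # setup.py of a hypothetical plugin package
    from setuptools import setup

    setup(
        name="my-fabric-plugin",
        entry_points={
            # group name is an assumption; see the Fabric docs
            "lightning.fabric.callbacks": [
                "my_callback = my_fabric_plugin:MyCallback",
            ],
        },
    )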

Changed

  • Made type hints public (#17100)
  • Added support for compiling a module after it was set up by Fabric (#17529)
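
A minimal sketch of the now-supported order of operations, assuming torch >= 2.0:

    import torch
    from lightning.fabric import Fabric

    fabric = Fabric(accelerator="cpu", devices=1)
    fabric.launch()

    model = torch.nn.Linear(4, 4)
    model = fabric.setup(model)
    # compiling after setup is supported as of this release (#17529)
    compiled = torch.compile(model)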

Fixed

  • Fixed computing the next version folder in CSVLogger (#17139)
  • Fixed inconsistent settings for FSDP Precision (#17670)

PyTorch

Changed

  • Made type hints public (#17100)

Fixed

  • CombinedLoader only starts DataLoader workers when necessary when operating in sequential mode (#17639)
  • Fixed a potential bug with uploading model checkpoints to Neptune.ai by uploading files from stream (#17430)
  • Fixed signature inspection of decorated hooks (#17507)
  • The WandbLogger no longer flattens dictionaries in the hyperparameters logged to the dashboard (#17574)
  • Fixed computing the next version folder in CSVLogger (#17139)
  • Fixed a formatting issue when the filename in ModelCheckpoint contained metrics that were substrings of each other (#17610); see the sketch after this list
  • Fixed WandbLogger ignoring the WANDB_PROJECT environment variable (#16222)
  • Fixed inconsistent settings for FSDP Precision (#17670)
  • Fixed an edge case causing overlapping samples in DDP when no global seed is set (#17713)
  • Fall back to a module-availability check for mlflow (#17467)
  • Fixed the learning rate finder's handling of the maximum number of validation batches (#17636)
  • Fixed checkpoint loading when invoked from multiple threads (#17678)
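
For the ModelCheckpoint formatting fix above, a small sketch of the kind of filename template that was affected ("loss" is a substring of "val_loss"):

    from lightning.pytorch.callbacks import ModelCheckpoint

    # templates mixing metrics that are substrings of each other now format
    # correctly instead of one metric clobbering the other
    checkpoint = ModelCheckpoint(filename="{epoch}-{loss:.2f}-{val_loss:.2f}")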

Contributors

@adamjstewart, @AleksanderWWW, @awaelchli, @baskrahmer, @bkiat1123, @borda, @carmocca, @ethanwharris, @leng-yue, @lightningforever, @manangoel99, @mukhery, @Quasar-Kim, @water-vapor, @yurijmikhalevich

If we forgot someone due to not matching commit email with GitHub account, let us know :]

2.0.2

1 year ago

App

Fixed

  • Fixed Lightning App when used with remote storage (#17426)
  • Fixed AppState and the Streamlit example (#17452)

Fabric

Changed

  • Enabled precision autocast for LightningModule step methods in Fabric (#17439)
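
A hedged sketch of the pattern this enables; it assumes a CUDA device and uses a hypothetical LitModel:

    import torch
    from lightning.fabric import Fabric
    from lightning.pytorch import LightningModule

    class LitModel(LightningModule):
        def __init__(self):
            super().__init__()
            self.layer = torch.nn.Linear(4, 4)

        def training_step(self, batch, batch_idx):
            # with #17439 this runs inside Fabric's autocast context
            # when mixed precision is enabled
            return self.layer(batch).sum()

    fabric = Fabric(precision="16-mixed", accelerator="cuda", devices=1)
    fabric.launch()
    model = fabric.setup(LitModel())
    loss = model.training_step(torch.randn(2, 4, device=fabric.device), 0)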

Fixed

  • Fixed an issue with LightningModule.*_step methods bypassing the DDP/FSDP wrapper (#17424)
  • Fixed device handling in Fabric.setup() when the model has no parameters (#17441)

PyTorch

Fixed

  • Fixed an issue where Model.load_from_checkpoint("checkpoint.ckpt", map_location=map_location) would always return a model on the CPU (#17308); see the sketch after this list
  • Fixed syncing of module states during non-fit stages (#17370)
  • Fixed an issue that caused num_nodes not to be set correctly for FSDPStrategy (#17438)
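
A minimal sketch of the map_location fix above; LitModel is a hypothetical LightningModule subclass:

    import torch
    from my_project import LitModel  # hypothetical

    # with the fix, the returned model lives on the requested device
    # instead of always being moved to CPU
    model = LitModel.load_from_checkpoint(
        "checkpoint.ckpt",
        map_location=torch.device("cuda:0"),
    )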

Contributors

@awaelchli, @borda, @carmocca, @ethanwharris, @ryan597, @tchaton

If we forgot someone due to not matching commit email with GitHub account, let us know :]

1.9.5

1 year ago

App

Changed

  • Added healthz endpoint to plugin server (#16882)
  • System customizations are now synced for job runs (#16932)

Fabric

Changed

  • Let TorchCollective work on the torch.distributed WORLD process group by default (#16995)

Fixed

  • Fixed handling of _cuda_clearCublasWorkspaces on teardown (#16907)
  • Improved the error message for installing tensorboard or tensorboardx (#17053)

PyTorch

Changed

  • Changes to the NeptuneLogger (#16761):
    • It now supports neptune-client 0.16.16 and neptune >=1.0, and the log() method has been replaced with append() and extend().
    • It now accepts a namespace Handler as an alternative to Run for the run argument. This means you can call it like NeptuneLogger(run=run["some/namespace"]) to log everything to the some/namespace/ location of the run.
  • Allowed sys.argv and args in LightningCLI (#16808)
  • Moved the HPU broadcast override to the HPU strategy file (#17011)

Removed

  • Removed registration of ShardedTensor state dict hooks in LightningModule.__init__ with torch>=2.1 (#16892)
  • Removed the lightning.pytorch.core.saving.ModelIO class interface (#16974)

Fixed

  • Fixed num_nodes not being set for DDPFullyShardedNativeStrategy (#17160)
  • Fixed parsing the precision config for inference in DeepSpeedStrategy (#16973)
  • Fixed the availability check for rich that prevented Lightning from being imported in Google Colab (#17156)
  • Fixed handling of _cuda_clearCublasWorkspaces on teardown (#16907)
  • The psutil package is now required for CPU monitoring (#17010)
  • Improved the error message for installing tensorboard or tensorboardx (#17053)

Contributors

@awaelchli, @belerico, @carmocca, @colehawkins, @dmitsf, @Erotemic, @ethanwharris, @kshitij12345, @borda

If we forgot someone due to not matching commit email with GitHub account, let us know :]

2.0.1.post0

1 year ago

App

Fixed

  • Fixed frontend hosts when running with multi-process in the cloud (#17324)

Fabric

No changes.


PyTorch

Fixed

  • Made the is_picklable function more robust (#17270)

Contributors

@leng-yue, @ethanwharris, @Borda, @awaelchli, @carmocca

If we forgot someone due to not matching commit email with GitHub account, let us know :]

2.0.1

1 year ago

App

No changes.


Fabric

Changed

  • Generalized Optimizer validation to accommodate both FSDP 1.x and 2.x (#16733)

PyTorch

Changed

  • Pickling the LightningModule no longer pickles the Trainer (#17133)
  • Generalized Optimizer validation to accommodate both FSDP 1.x and 2.x (#16733)
  • Disabled torch.inference_mode when running with torch.compile in PyTorch 2.0 (#17215)

Fixed

  • Fixed issue where pickling the module instance would fail with a DataLoader error (#17130)
  • Fixed WandbLogger not showing "best" aliases for model checkpoints when ModelCheckpoint(save_top_k>0) is used (#17121); see the sketch after this list
  • Fixed the availability check for rich that prevented Lightning from being imported in Google Colab (#17156)
  • Fixed parsing the precision config for inference in DeepSpeedStrategy (#16973)
  • Fixed issue where torch.compile would fail when logging to WandB (#17216)
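
A sketch of the setup affected by the "best" alias fix above; the project name is a placeholder:

    from lightning.pytorch import Trainer
    from lightning.pytorch.callbacks import ModelCheckpoint
    from lightning.pytorch.loggers import WandbLogger

    # log_model="all" uploads checkpoints as W&B artifacts; with the fix, the
    # best-scoring checkpoint carries the "best" alias again when save_top_k > 0
    logger = WandbLogger(project="my-project", log_model="all")
    checkpoint = ModelCheckpoint(monitor="val_loss", save_top_k=2)
    trainer = Trainer(logger=logger, callbacks=[checkpoint])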

Contributors

@Borda, @williamFalcon, @lightningforever, @adamjstewart, @carmocca, @tshu-w, @saryazdi, @parambharat, @awaelchli, @colehawkins, @woqidaideshi, @md-121, @yhl48, @gkroiz, @idc9, @speediedan

If we forgot someone due to not matching commit email with GitHub account, let us know :]

2.0.0

1 year ago

1.9.4

1 year ago

App

Removed

  • Removed implicit ui testing with testing.run_app_in_cloud in favor of headless login and app selection (#16741)

Fabric

Added

  • Added Fabric(strategy="auto") support (#16916)

Fixed

  • Fixed edge cases in parsing device ids using NVML (#16795)
  • Fixed DDP spawn hang on TPU Pods (#16844)
  • Fixed an error when passing find_usable_cuda_devices(num_devices=-1) (#16866)
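
A one-liner showing the call fixed above:

    from lightning.fabric.accelerators import find_usable_cuda_devices

    # num_devices=-1 requests all usable CUDA devices (the case fixed in #16866)
    devices = find_usable_cuda_devices(num_devices=-1)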

PyTorch

Added

  • Added Fabric(strategy="auto") support. It will choose DDP over DDP-spawn, contrary to strategy=None (default) (#16916)

Fixed

  • Fixed DDP spawn hang on TPU Pods (#16844)
  • Fixed edge cases in parsing device ids using NVML (#16795)
  • Fixed backwards compatibility for lightning.pytorch.utilities.parsing.get_init_args (#16851)

Contributors

@ethanwharris, @carmocca, @awaelchli, @justusschock, @dtuit, @Liyang90

If we forgot someone due to not matching commit email with GitHub account, let us know :]

1.9.3

1 year ago

App

Fixed

  • Fixed the lightning open command and improved redirects (#16794)

Fabric

Fixed

  • Fixed an issue causing a wrong environment plugin to be selected when accelerator=tpu and devices > 1 (#16806)
  • Fixed parsing of defaults for --accelerator and --precision in Fabric CLI when accelerator and precision are set to non-default values in the code (#16818)

PyTorch

Fixed

  • Fixed an issue causing a wrong environment plugin to be selected when accelerator=tpu and devices > 1 (#16806)

Contributors

@ethanwharris, @carmocca, @awaelchli, @borda, @tchaton, @yurijmikhalevich

If we forgot someone due to not matching commit email with GitHub account, let us know :]