Pretrain, finetune, and deploy AI models on multiple GPUs and TPUs with zero code changes.
- Renamed `plugins.precision.MixedPrecisionPlugin` to `plugins.precision.MixedPrecision` (#17687)
- Changes to the `NeptuneLogger` (#16761):
  - Replaced the `log()` method with `append()` and `extend()`.
  - It now accepts a namespace `Handler` as an alternative to `Run` for the `run` argument. This means that you can call it like `NeptuneLogger(run=run["some/namespace"])` to log everything to the `some/namespace/` location of the run.
- Deprecated `plugins.precision.MixedPrecisionPlugin` in favor of `plugins.precision.MixedPrecision` (#17687)
- Fixed deriving the default map location in `LightningModule.load_from_checkpoint` when there is an extra state (#17812)

Contributors: @akreuzer, @awaelchli, @borda, @jerome-habana, @kshitij12345
If we forgot someone due to not matching commit email with GitHub account, let us know :]
- Added `LightningWork.public_ip` that exposes the public IP of the `LightningWork` instance (#17742)
- Fixed `LightningWork.internal_ip` that was mistakenly exposing the public IP instead; it now exposes the private/internal IP address (#17742)
- Added support for `Callback` registration through entry points (#17756)
- Updates to the `CSVLogger` (#17139)
- `CombinedLoader` only starts DataLoader workers when necessary when operating in sequential mode (#17639)
- `WandbLogger` no longer flattens dictionaries in the hyperparameters logged to the dashboard (#17574)
- Fixed an issue when `ModelCheckpoint` contained metrics that were substrings of each other (#17610)
- Fixed `WandbLogger` ignoring the `WANDB_PROJECT` environment variable (#16222)

Contributors: @adamjstewart, @AleksanderWWW, @awaelchli, @baskrahmer, @bkiat1123, @borda, @carmocca, @ethanwharris, @leng-yue, @lightningforever, @manangoel99, @mukhery, @Quasar-Kim, @water-vapor, @yurijmikhalevich
If we forgot someone due to not matching commit email with GitHub account, let us know :]
- Updates to `AppState` and the streamlit example (#17452)
- Fixed `LightningModule.*_step` methods bypassing the DDP/FSDP wrapper (#17424)
- Fixed `Fabric.setup()` when the model has no parameters (#17441)
- Fixed an issue where `Model.load_from_checkpoint("checkpoint.ckpt", map_location=map_location)` would always return the model on CPU (#17308)
- Fixed an issue causing `num_nodes` not to be set correctly for `FSDPStrategy` (#17438)

Contributors: @awaelchli, @borda, @carmocca, @ethanwharris, @ryan597, @tchaton
If we forgot someone due to not matching commit email with GitHub account, let us know :]
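The `map_location` fix above concerns where checkpointed tensors end up after loading. A minimal sketch with plain `torch.save`/`torch.load` (not Lightning itself) showing the argument that `load_from_checkpoint` forwards through to `torch.load`:

```python
import os
import tempfile

import torch

# Save a tiny checkpoint to a temporary location.
path = os.path.join(tempfile.mkdtemp(), "demo.ckpt")
torch.save({"weight": torch.zeros(2)}, path)

# map_location controls the device the checkpointed tensors are restored to;
# Lightning passes the same argument from Model.load_from_checkpoint onward.
state = torch.load(path, map_location="cpu")
print(state["weight"].device)  # cpu
```

Passing `map_location="cuda:0"` (on a machine with a GPU) would instead restore the tensors directly onto that device.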
- Added a `healthz` endpoint to the plugin server (#16882)
- `TorchCollective` now works on the `torch.distributed` WORLD process group by default (#16995)
- `_cuda_clearCublasWorkspaces` is now called on teardown (#16907)
- Fixed handling of `sys.argv` and args in `LightningCLI` (#16808)
- No longer registers the `ShardedTensor` state dict hooks in `LightningModule.__init__` with `torch>=2.1` (#16892)
- Removed the `lightning.pytorch.core.saving.ModelIO` class interface (#16974)
- Fixed `num_nodes` not being set for `DDPFullyShardedNativeStrategy` (#17160)
- Fixed an issue with `DeepSpeedStrategy` (#16973)
- Fixed an issue with `rich` that prevented Lightning from being imported in Google Colab (#17156)
- The `psutil` package is now required for CPU monitoring (#17010)

Contributors: @awaelchli, @belerico, @carmocca, @colehawkins, @dmitsf, @Erotemic, @ethanwharris, @kshitij12345, @borda
If we forgot someone due to not matching commit email with GitHub account, let us know :]
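For the `TorchCollective` change above: in `torch.distributed`, the default process group created by `init_process_group` is WORLD, and collectives fall back to it when no explicit group is passed. A single-process sketch using raw `torch.distributed` (gloo backend assumed available), not `TorchCollective` itself:

```python
import tempfile

import torch.distributed as dist

# Initialize the default (WORLD) process group for a single process,
# using a file-based rendezvous so no network setup is needed.
init_file = tempfile.NamedTemporaryFile(suffix=".init", delete=False)
dist.init_process_group(
    backend="gloo",
    init_method=f"file://{init_file.name}",
    rank=0,
    world_size=1,
)

# With no group argument, queries and collectives use WORLD by default.
size = dist.get_world_size()
print(size)  # 1

dist.destroy_process_group()
```

Defaulting to WORLD means callers only need to name a group when they have deliberately created a sub-group.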
No changes.

- Made the `is_picklable` function more robust (#17270)

Contributors: @leng-yue @ethanwharris @Borda @awaelchli @carmocca
If we forgot someone due to not matching commit email with GitHub account, let us know :]
No changes.

- Generalized `Optimizer` validation to accommodate both FSDP 1.x and 2.x (#16733)
- Pickling the `LightningModule` no longer pickles the `Trainer` (#17133)
- Disabled `torch.inference_mode` with `torch.compile` in PyTorch 2.0 (#17215)
- Fixed an issue when `ModelCheckpoint(save_top_k>0)` is used (#17121)
- Fixed an issue with `rich` that prevented Lightning from being imported in Google Colab (#17156)
- Fixed an issue with `DeepSpeedStrategy` (#16973)
- Fixed an issue where using `torch.compile` would fail when logging to WandB (#17216)

Contributors: @Borda @williamFalcon @lightningforever @adamjstewart @carmocca @tshu-w @saryazdi @parambharat @awaelchli @colehawkins @woqidaideshi @md-121 @yhl48 @gkroiz @idc9 @speediedan
If we forgot someone due to not matching commit email with GitHub account, let us know :]
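The pickling change above follows a common pattern: exclude a heavy back-reference from `__getstate__` so that serializing the module does not drag its owner along. A standard-library sketch of the pattern (simplified stand-ins, not Lightning's actual classes):

```python
import pickle


class Module:
    """Stand-in for a LightningModule holding a back-reference to its trainer."""

    def __init__(self):
        self.weights = [1.0, 2.0]
        self.trainer = lambda: None  # stand-in back-reference; not picklable

    def __getstate__(self):
        state = self.__dict__.copy()
        state.pop("trainer", None)  # drop the back-reference from the pickle
        return state


clone = pickle.loads(pickle.dumps(Module()))
print(hasattr(clone, "trainer"))  # False
```

Without the `__getstate__` override, `pickle.dumps` would fail on the lambda; with it, the module round-trips cleanly and only the trainer reference is lost.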
- Removed implicit UI testing with `testing.run_app_in_cloud` in favor of headless login and app selection (#16741)
- Added `find_usable_cuda_devices(num_devices=-1)` (#16866)
- Added `Fabric(strategy="auto")` support. It will choose DDP over DDP-spawn, contrary to `strategy=None` (default) (#16916)
- Updates to `lightning.pytorch.utilities.parsing.get_init_args` (#16851)

Contributors: @ethanwharris, @carmocca, @awaelchli, @justusschock, @dtuit, @Liyang90
If we forgot someone due to not matching commit email with GitHub account, let us know :]
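`find_usable_cuda_devices(num_devices=-1)` above asks for every CUDA device that can actually be used, not just every visible one. A rough sketch of the idea with plain `torch` (an allocation probe per visible device; an illustration of the concept, not Lightning's implementation):

```python
import torch


def usable_cuda_devices() -> list:
    """Return indices of visible CUDA devices that accept a tiny allocation."""
    usable = []
    for index in range(torch.cuda.device_count()):
        try:
            torch.tensor(0, device=f"cuda:{index}")  # cheap allocation probe
            usable.append(index)
        except RuntimeError:
            pass  # device is visible but not usable (e.g. held in exclusive mode)
    return usable


print(usable_cuda_devices())  # [] on a CPU-only machine
```

The probe matters on shared machines, where a device can be visible yet refuse allocations because another process holds it.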
- Added the `lightning open` command and improved redirects (#16794)
- Fixed an issue when using `accelerator="tpu"` and `devices > 1` (#16806)
- Fixed handling of `--accelerator` and `--precision` in the Fabric CLI when `accelerator` and `precision` are set to non-default values in the code (#16818)

Contributors: @ethanwharris, @carmocca, @awaelchli, @borda, @tchaton, @yurijmikhalevich
If we forgot someone due to not matching commit email with GitHub account, let us know :]