A scikit-learn compatible neural network library that wraps PyTorch
This is a smaller release, but it still contains changes which will be interesting to some of you.
We added the possibility to store weights using safetensors. This can have several advantages, listed here. When calling net.save_params and net.load_params, just pass use_safetensors=True to use safetensors instead of pickle.
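A minimal sketch of how this looks in practice (toy module; the file name is arbitrary, and the safetensors package needs to be installed):

```python
from torch import nn
from skorch import NeuralNetClassifier

net = NeuralNetClassifier(nn.Sequential(nn.Linear(20, 2), nn.LogSoftmax(dim=-1)))
net.initialize()  # or net.fit(X, y)

# store the module weights in the safetensors format instead of pickle
net.save_params(f_params="model.safetensors", use_safetensors=True)

# later: load them back into a freshly initialized net
new_net = NeuralNetClassifier(nn.Sequential(nn.Linear(20, 2), nn.LogSoftmax(dim=-1)))
new_net.initialize()
new_net.load_params(f_params="model.safetensors", use_safetensors=True)
```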
Moreover, there is a new argument on NeuralNet: you can now pass use_caching=False or True to disable or enable caching for all callbacks at once. This is useful if you have a lot of scoring callbacks and don't want to toggle caching on each of them individually.
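For instance, here is a sketch with two arbitrarily chosen scoring callbacks whose caching is switched off in one place:

```python
from torch import nn
from skorch import NeuralNetClassifier
from skorch.callbacks import EpochScoring

net = NeuralNetClassifier(
    nn.Sequential(nn.Linear(20, 2), nn.LogSoftmax(dim=-1)),
    callbacks=[
        EpochScoring("f1", lower_is_better=False),
        EpochScoring("recall", lower_is_better=False),
    ],
    use_caching=False,  # toggles caching for all callbacks at once
)
```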
Finally, we fixed a few issues related to using skorch with accelerate.
Thanks to Zach Mueller (@muellerzr) for his first contribution to skorch.
Find the full list of changes here: https://github.com/skorch-dev/skorch/compare/v0.14.0...v0.15.0
This release offers a new interface for scikit-learn to do zero-shot and few-shot classification using open source large language models (Jump right into the example notebook).
skorch.llm.ZeroShotClassifier and skorch.llm.FewShotClassifier allow the user to do classification using open-source language models that are compatible with the Hugging Face generation interface. This lets you do all sorts of interesting things in your pipelines, from simply plugging an LLM into your classification pipeline to get preliminary results quickly, to using these classifiers to generate training data candidates for downstream models. This is a first draft of the interface, so it may well change a bit in the future; please let us know about any issues you encounter.
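A minimal sketch of the zero-shot case; the model checkpoint and labels are just examples, and since the interface is still young, the exact calls may shift (see the example notebook for the authoritative version):

```python
from skorch.llm import ZeroShotClassifier

# any model compatible with the Hugging Face generation interface should do
clf = ZeroShotClassifier("bigscience/bloomz-1b1")

# zero-shot: no training data needed, only the candidate labels
clf.fit(X=None, y=["negative", "positive"])

X = ["A masterpiece, instant classic, 5 stars out of 5"]
print(clf.predict(X))        # predicted label for each sample
print(clf.predict_proba(X))  # probabilities over the candidate labels
```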
Other items of this release are:
- NeptuneLogger now logs the skorch version, thanks to @AleksanderWWW
- NeuralNetRegressor can now be fitted with 1-dimensional y, which is necessary in some specific circumstances (e.g. in conjunction with sklearn's BaggingRegressor, see #972); for this to work correctly, the output of the PyTorch module should also be 1-dimensional; the existing default, i.e. having y and y_pred be 2-dimensional, remains the recommended way of using NeuralNetRegressor (see the sketch below)
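Here is a minimal sketch of the 1-dimensional case mentioned above, with made-up data and a toy module whose output is squeezed to one dimension:

```python
import numpy as np
from torch import nn
from sklearn.ensemble import BaggingRegressor
from skorch import NeuralNetRegressor

X = np.random.rand(100, 20).astype(np.float32)
y = np.random.rand(100).astype(np.float32)  # note: y is 1-dimensional

class FlatModule(nn.Module):
    """Module whose output is 1-dimensional, matching the 1-dimensional y."""
    def __init__(self):
        super().__init__()
        self.lin = nn.Linear(20, 1)

    def forward(self, X):
        return self.lin(X).squeeze(-1)

net = NeuralNetRegressor(FlatModule, max_epochs=3, train_split=None)
# this is what e.g. BaggingRegressor relies on, since it passes y through unchanged
BaggingRegressor(net, n_estimators=2).fit(X, y)
```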
Full Changelog: https://github.com/skorch-dev/skorch/compare/v0.13.0...v0.14.0
The new skorch release is here and it has some changes that will be exciting for some users.
- Support for torch.compile: pass compile=True when initializing the net to enable compilation (see the sketch below).
- The accelerate package integration has been improved by fixing some bugs and providing a dedicated history class. Our documentation contains more information on what to consider when training on multiple GPUs.
- The new SkorchDoctor class will simplify the diagnosis of underlying issues. Take a look at the accompanying notebook.

Apart from that, a few bugs have been fixed and the included notebooks have been updated to properly install requirements on Google Colab.
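A quick sketch of enabling compilation (requires PyTorch >= 2.0; the module here is just a placeholder):

```python
from torch import nn
from skorch import NeuralNetClassifier

net = NeuralNetClassifier(
    nn.Sequential(nn.Linear(20, 2), nn.LogSoftmax(dim=-1)),
    compile=True,           # enables torch.compile for the module(s)
    compile__dynamic=True,  # further arguments are forwarded via dunder notation
)
net.initialize()
```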
We are grateful for external contributors, many thanks to:
Find the list of all changes since v0.12.1 below:
- Added support for the torch.compile function, introduced in the PyTorch 2.0 release, which can greatly improve performance on new GPU architectures; to use it, initialize your net with the compile=True argument; further compilation arguments can be specified using the dunder notation, e.g. compile__dynamic=True
- Added DistributedHistory, which should be used when training in a multi-GPU setting (#955)
- SkorchDoctor: a helper class that assists in understanding and debugging the neural net training, see this notebook (#912)
- When using AccelerateMixin, it is now possible to prevent unwrapping of the modules by setting unwrap_after_train=True (#963)
- Fixed a bug when using AccelerateMixin in a multi-GPU setup (#947)
- _get_param_names returns a list instead of a generator so that subsequent error messages return useful information instead of a generator repr string (#925)
- Fixed an issue with AccelerateMixin that could prevent nets from being pickleable (#963)

This is a small release which consists mostly of a couple of bug fixes. The standout feature here is the update of the NeptuneLogger, which makes it work with the latest Neptune client versions and adds many useful features, check it out. Big thanks to @twolodzko and colleagues for this update.
Here is the list of all changes:
- iterator_valid__shuffle=False #908

We're pleased to announce a new skorch release, bringing new features that might interest you.
The main changes relate to better integration with the Hugging Face ecosystem:
- Integration with the accelerate library through AccelerateMixin.
- Working with Hugging Face tokenizers through HuggingfaceTokenizer and HuggingfacePretrainedTokenizer; you can even put Hugging Face tokenizers into an sklearn Pipeline and perform a grid search to find the best tokenizer hyperparameters.
- Storing models on the Hugging Face Hub through HfHubStorage.

But this is not all. We have added the possibility to load the best model parameters at the end of training when using the EarlyStopping callback. We also added the possibility to remove unneeded attributes from the net after training, when it is intended to be used only for prediction, by calling the trim_for_prediction method. Moreover, we now show how to use skorch with PyTorch Geometric in this notebook.
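A small sketch of the two training conveniences mentioned above, using a toy module and random data:

```python
import numpy as np
from torch import nn
from skorch import NeuralNetClassifier
from skorch.callbacks import EarlyStopping

X = np.random.rand(200, 20).astype(np.float32)
y = np.random.randint(0, 2, size=200)

net = NeuralNetClassifier(
    nn.Sequential(nn.Linear(20, 2), nn.LogSoftmax(dim=-1)),
    # when early stopping kicks in, restore the best parameters seen so far
    callbacks=[EarlyStopping(patience=5, load_best=True)],
    max_epochs=50,
)
net.fit(X, y)

# strip attributes that are only needed for training; prediction keeps working
net.trim_for_prediction()
y_pred = net.predict(X)
```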
As always, this release was made possible by outside contributors. Many thanks to:
Find below the list of all changes:
- Added the load_best attribute to the EarlyStopping callback to automatically load the module weights of the best result at the end of training
- Added trim_for_prediction on the net classes, which trims the net from everything not required for using it for prediction; call this after fitting to reduce the size of the net
- Use skorch.hf.HuggingfaceTokenizer to train a Huggingface tokenizer on your custom data; use skorch.hf.HuggingfacePretrainedTokenizer to load a pre-trained Huggingface tokenizer
- Added HfHubStorage
- Fixed the use of np.asarray with SliceDatasets (#858)
- Fixed a bug in SliceDataset that prevented it from being used with to_numpy (#858)

We are happy to announce the new skorch 0.11 release:
Two basic but very useful features have been added to our collection of callbacks. First, by setting load_best=True on the Checkpoint callback, the snapshot of the network with the best score will be loaded automatically when training ends. Second, we added a callback, InputShapeSetter, that automatically adjusts your input layer to have the size of your input data (useful e.g. when that size is not known beforehand).
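A sketch of both callbacks in action, assuming InputShapeSetter's default behavior of setting the module's input_dim parameter from the data:

```python
import numpy as np
from torch import nn
from skorch import NeuralNetClassifier
from skorch.callbacks import Checkpoint, InputShapeSetter

X = np.random.rand(200, 20).astype(np.float32)
y = np.random.randint(0, 2, size=200)

class MyModule(nn.Module):
    def __init__(self, input_dim=1):  # input_dim is set automatically below
        super().__init__()
        self.lin = nn.Linear(input_dim, 2)

    def forward(self, X):
        return nn.functional.log_softmax(self.lin(X), dim=-1)

net = NeuralNetClassifier(
    MyModule,
    callbacks=[
        Checkpoint(load_best=True),  # load the best snapshot when training ends
        InputShapeSetter(),          # sets module__input_dim from X.shape[1]
    ],
    max_epochs=10,
)
net.fit(X, y)
```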
When it comes to integrations, the MlflowLogger now allows automatic logging to MLflow. Thanks to a contributor, some regressions in net.history have been fixed and it even runs faster now.
On top of that, skorch now offers a new module, skorch.probabilistic. It contains new classes to work with Gaussian Processes using the familiar skorch API. This is made possible by the fantastic GPyTorch library, which skorch uses for this. So if you want to get started with Gaussian Processes in skorch, check out the documentation and this notebook. Since we're still learning, it's possible that we will change the API in the future, so please be aware of that.
Moreover, we introduced some changes to make skorch more customizable. First of all, we changed the signature of some methods so that they no longer assume the dataset to always return exactly 2 values. This way, it's easier to work with custom datasets that return e.g. 3 values. Normal users should not notice any difference, but if you often create custom nets, take a look at the migration guide.
And finally, we made a change to how custom modules, criteria, and optimizers are handled. They are now "first class citizens" in skorch land, which means: if you add a second module to your custom net, it is treated exactly the same as the normal module. E.g., skorch takes care of moving it to CUDA if needed and of switching it to train or eval mode. This way, customizing your network architectures with skorch is easier than ever. Check the docs for more details.
Since these are some big changes, it's possible that you encounter issues. If that's the case, please check our issue page or create a new one.
As always, this release was made possible by outside contributors. Many thanks to:
Find below the list of all changes:
- Added load_best attribute to the Checkpoint callback to automatically load the state of the best result at the end of training
- Added the get_all_learnable_params method to retrieve the named parameters of all PyTorch modules defined on the net, including those of criteria if applicable
- Added MlflowLogger callback for logging to Mlflow (#769)
- Added InputShapeSetter callback for automatically setting the input dimension of the PyTorch module
- Changed the signature of validation_step, train_step_single, train_step, evaluation_step, on_batch_begin, and on_batch_end such that instead of receiving X and y, they receive the whole batch; this makes it easier to deal with datasets that don't strictly return an (X, y) tuple, which is true for quite a few PyTorch datasets; please refer to the migration guide if you encounter problems (#699)
- Checking of arguments to NeuralNet now happens during .initialize(), not during __init__, to avoid raising false positives for yet unknown module or optimizer attributes
- CVSplit is renamed to ValidSplit to avoid confusion (#752)
- Fixed regressions and improved the speed of the net.history implementation (#776)
- Fixed a bug in TrainEndCheckpoint that prevented it from being unpickled (#773)

This one is a smaller release, but we have some bigger additions waiting for the next one.
First we added support for Sacred to help you better organize your experiments. The CLI helper now also works with non-skorch estimators, as long as they are sklearn compatible. Some issues related to learning rate scheduling have been solved.
Another big topic this time was performance. First of all, we added a performance section to the docs. Furthermore, we facilitated switching off callbacks completely if performance is absolutely critical. Finally, we improved the speed of some internals (history logging). In sum, this means that skorch should be much faster for small network architectures.
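For the callback switch, a sketch; the exact value is an assumption based on the performance section of the docs, where callbacks="disable" turns off all callbacks, including the default ones:

```python
from torch import nn
from skorch import NeuralNetClassifier

# Assumption: callbacks="disable" switches off all callbacks, including the
# default ones (loss logging, printing, timing), for maximum throughput.
net = NeuralNetClassifier(
    nn.Sequential(nn.Linear(20, 2), nn.LogSoftmax(dim=-1)),
    callbacks="disable",
)
```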
We are grateful to the contributors, new and recurring:
This release of skorch contains a few minor improvements and some nice additions. As always, we fixed a few bugs and improved the documentation. Our learning rate scheduler now optionally logs learning rate changes to the history; moreover, it now allows the user to choose whether an update step should be made after each batch or each epoch.
If you always longed for a metric that would just use whatever is defined by your criterion, look no further than loss_scoring. Also, skorch now allows you to easily change the kind of nonlinearity to apply to the module's output when predict and predict_proba are called, by passing the predict_nonlinearity argument.
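A sketch of both features together; the loss_scoring signature is assumed to be loss_scoring(net, X, y) based on the description in the changelog below, and the data and module are toys. Note that with CrossEntropyLoss, the default predict_nonlinearity='auto' would already pick softmax; passing a callable just makes the choice explicit:

```python
import numpy as np
import torch
from torch import nn
from skorch import NeuralNetClassifier
from skorch.scoring import loss_scoring

X = np.random.rand(100, 20).astype(np.float32)
y = np.random.randint(0, 2, size=100)

net = NeuralNetClassifier(
    nn.Sequential(nn.Linear(20, 2)),  # module outputs raw logits
    criterion=nn.CrossEntropyLoss,    # works directly on logits
    # callable applied to the module output in predict/predict_proba
    predict_nonlinearity=lambda logits: torch.softmax(logits, dim=-1),
    max_epochs=3,
)
net.fit(X, y)

proba = net.predict_proba(X)        # softmax already applied
val_loss = loss_scoring(net, X, y)  # the criterion's loss as a score
```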
Besides these changes, we improved the customization potential of skorch. First of all, the criterion is now set to train or valid, depending on the phase -- this is useful if the criterion should act differently during training and validation. Next we made it easier to add custom modules, optimizers, and criteria to your neural net; this should facilitate implementing architectures like GANs. Consult the docs for more on this. Conveniently, net.save_params can now persist arbitrary attributes, including those custom modules.
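For example, a fitted net can now persist more than just the module weights; the file names below are arbitrary, and the f_criterion keyword is an assumption based on the criterion now being saveable:

```python
import numpy as np
from torch import nn
from skorch import NeuralNetClassifier

X = np.random.rand(100, 20).astype(np.float32)
y = np.random.randint(0, 2, size=100)

net = NeuralNetClassifier(
    nn.Sequential(nn.Linear(20, 2), nn.LogSoftmax(dim=-1)), max_epochs=2)
net.fit(X, y)

net.save_params(
    f_params="model.pt",         # module weights
    f_optimizer="optimizer.pt",  # optimizer state
    f_criterion="criterion.pt",  # criterion state (assumed keyword, see above)
    f_history="history.json",    # training history
)
```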
As always, these improvements wouldn't have been possible without the community. Please keep asking questions, raising issues, and proposing new features. We are especially grateful to those community members, old and new, who contributed via PRs:
Aaron Berk
guybuk
kqf
Michał Słapek
Scott Sievert
Yann Dubois
Zhao Meng
Here is the full list of all changes:
- Added the event_name argument for LRScheduler for optional recording of LR changes inside net.history. NOTE: supported only in PyTorch >= 1.4
- Added the step_every argument for LRScheduler to set whether the scheduler step should be taken on every epoch or on every batch.
- Added the scoring module with the loss_scoring function, which computes the net's loss (using get_loss) on provided input data.
- Added predict_nonlinearity to NeuralNet, which allows users to control the nonlinearity to be applied to the module output when calling predict and predict_proba (#637, #661)
- Added the possibility to save the criterion with save_params and with checkpoint callbacks
- Added the possibility to save custom modules with save_params and with checkpoint callbacks
- Removed support for schedulers with a batch_step() method in LRScheduler.
- A FutureWarning is now raised in CVSplit when random_state is not used; this will raise an exception in a future release (#620)
- The behavior of net.get_params changed to make it more consistent with sklearn: it will no longer return "learned" attributes like module_; therefore, functions like sklearn.base.clone, when called with a fitted net, will no longer return a fitted net but instead an uninitialized net; if you want a copy of a fitted net, use copy.deepcopy instead; net.get_params is used under the hood by many sklearn functions and classes, such as GridSearchCV, whose behavior may thus be affected by the change. (#521, #527)
- A FutureWarning is now raised when using the CyclicLR scheduler, because the default behavior has changed from taking a step every batch to taking a step every epoch. (#626)
- Passing y=None to NeuralNet.train_split is now supported, to enable the direct use of split functions without positional y in their signatures. This is useful when working with unsupervised data (#605).
- to_numpy is now able to unpack dicts and lists/tuples (#657, #658)
- When using CrossEntropyLoss, softmax is now automatically applied to the output when calling predict or predict_proba
- Fixed a bug where the CyclicLR scheduler would update during both training and validation rather than just during training.
- Fixed a bug where the optimizer.zero_grad() call was made outside of the train step function, making it incompatible with LBFGS and other optimizers that call the train step several times per batch (#636)
- Fixed an issue with the ProgressBar callback (#656)

This release contains improvements on the callback side of things. Thanks to new contributors, skorch now integrates with neptune through NeptuneLogger and Weights & Biases through WandbLogger. We also added PassthroughScoring, which automatically creates epoch level scores based on computed batch level scores.
If you want skorch not to meddle with moving modules and data to certain devices, you can now pass device=None and thus have full control. And if you would like to pass pandas DataFrames as input data but were unhappy with how skorch currently handles them, take a look at DataFrameTransformer. Moreover, we cleaned up duplicate code in the fit loop, which should make it easier for users to make their own changes to it. Finally, we improved skorch compatibility with sklearn 0.22 and added minor performance improvements.
As always, we're very thankful for everyone who opened issues and asked questions on diverse channels; all forms of feedback and questions are welcome. We're also very grateful for all contributors, some old but many new:
Alexander Kolb
Benjamin Ajayi-Obe
Boris Dayma
Jakub Czakon
Riccardo Di Maio
Thomas Fan
Yann Dubois
Here is a list of all the changes and their corresponding ticket numbers in detail:
- Added NeptuneLogger callback for logging experiment metadata to neptune.ai (#586)
- Added DataFrameTransformer, an sklearn compatible transformer that helps working with pandas DataFrames by transforming the DataFrame into a representation that works well with neural networks (#507)
- Added WandbLogger callback for logging to Weights & Biases (#607)
- Added the None option to device, which leaves the device(s) unmodified (#600)
- Added PassthroughScoring, a scoring callback that just calculates the average score of a metric determined at batch level and then writes it to the epoch level (#595)
- Removed duplicate code in the fit_loop (#564)
- Fixed a bug that occurred when an attribute settable via set_params was added to NeuralNet whose name starts the same as an existing attribute's name (#590)

Notable additions are TensorBoard support through a callback and several improvements to the NeuralNetClassifier and NeuralNetBinaryClassifier to make them more compatible with sklearn metrics and packages by adding support for class inference, among other things. We are actively pursuing some bigger topics which did not fit in this release, such as scoring caching improvements (#557), a DataFrameTransformer (#507), and improvements to the training loop layout (#564), which we hope to bring to the next release.
WARNING: In a future release, the behavior of method net.get_params will change to make it more consistent with sklearn: it will no longer return "learned" attributes like module_. Therefore, functions like sklearn.base.clone, when called with a fitted net, will no longer return a fitted net but instead an uninitialized net. If you want a copy of a fitted net, use copy.deepcopy instead. Note that net.get_params is used under the hood by many sklearn functions and classes, such as GridSearchCV, whose behavior may thus be affected by the change. (#521, #527)
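In code, the difference will look roughly like this:

```python
import copy
import numpy as np
from sklearn.base import clone
from torch import nn
from skorch import NeuralNetClassifier

X = np.random.rand(100, 20).astype(np.float32)
y = np.random.randint(0, 2, size=100)

net = NeuralNetClassifier(
    nn.Sequential(nn.Linear(20, 2), nn.LogSoftmax(dim=-1)), max_epochs=2)
net.fit(X, y)

# after the change, clone() re-creates the net from its init parameters only,
# so the result is uninitialized and needs fit() before it can predict
fresh_net = clone(net)

# to keep the learned weights, take a deep copy instead
trained_copy = copy.deepcopy(net)
trained_copy.predict(X)
```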
We had an influx of new contributors and users whom we thank for their support by adding pull requests and filing issues! Most notably, thanks to the individual contributors that made this release possible:
Here is a list of all the changes and their corresponding ticket numbers in detail:
- NeuralNet (#500)
- Added TensorBoard callback for automatic logging to tensorboard
- Made NeuralNetBinaryClassifier work with sklearn.calibration.CalibratedClassifierCV
- Improved NeuralNetBinaryClassifier compatibility with certain sklearn metrics (#515)
- NeuralNetBinaryClassifier automatically squeezes module output if necessary (#515)
- NeuralNetClassifier now has a classes_ attribute after fit is called, which is inferred from y by default (#465, #486)
- NeuralNet.load_params with a checkpoint now initializes when needed (#497)
- NLLLoss in NeuralNetClassifier (#491)
- NeuralNetBinaryClassifier.predict_proba now returns a 2-dim array; to access the "old" y_proba, take y_proba[:, 1] (#515)
- net.history is now a property that accesses net.history_, which stores the History object (#527)
- Removed skorch.callbacks.CyclicLR; use torch.optim.lr_scheduler.CyclicLR instead
- WARNING: in a future release, the behavior of net.get_params will change to be more consistent with sklearn; see the warning above for details (#521, #527)
- Fixed a bug that caused LoadInitState not to work with TrainEndCheckpoint (#528)
- Fixed NeuralNetBinaryClassifier wrongly squeezing the batch dimension when using batch_size = 1 (#558)