Weakly Supervised End-to-End Learning (NeurIPS 2021)
This is a PyTorch Lightning framework, based on our End-to-End Weak Supervision paper (NeurIPS 2021), that allows you to train your favorite neural network for weakly-supervised classification¹
¹ This includes learning from crowdsourced labels or annotations!
² LFs (labeling functions) are labeling heuristics that output noisy labels for (subsets of) the training data (e.g. crowdworkers or keyword detectors).
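To make the notion of an LF concrete, here is a minimal, illustrative sketch of keyword-detector LFs applied to a few toy sentences. It is plain Python and does not use Weasel's API: the function names (`lf_contains_great`, `apply_lfs`, `coverage`), the toy texts, and the use of `-1` as the "abstain" value (a convention borrowed from Snorkel) are all assumptions made for illustration.

```python
from typing import Callable, List

# Assumption: -1 marks "this LF does not vote on this example" (Snorkel-style convention).
ABSTAIN = -1
NEGATIVE, POSITIVE = 0, 1


def lf_contains_great(text: str) -> int:
    """Hypothetical keyword LF: vote POSITIVE if the text mentions 'great'."""
    return POSITIVE if "great" in text.lower() else ABSTAIN


def lf_contains_terrible(text: str) -> int:
    """Hypothetical keyword LF: vote NEGATIVE if the text mentions 'terrible'."""
    return NEGATIVE if "terrible" in text.lower() else ABSTAIN


def lf_exclamation(text: str) -> int:
    """Hypothetical (and noisy) heuristic: a trailing '!' suggests POSITIVE."""
    return POSITIVE if text.rstrip().endswith("!") else ABSTAIN


def apply_lfs(texts: List[str], lfs: List[Callable[[str], int]]) -> List[List[int]]:
    """Build the label matrix: one row per example, one column per LF."""
    return [[lf(text) for lf in lfs] for text in texts]


def coverage(label_matrix: List[List[int]]) -> List[float]:
    """Fraction of examples on which each LF casts a vote (does not abstain)."""
    n_examples = len(label_matrix)
    n_lfs = len(label_matrix[0])
    return [
        sum(row[j] != ABSTAIN for row in label_matrix) / n_examples
        for j in range(n_lfs)
    ]


texts = ["A great movie!", "Terrible plot, terrible acting.", "It was fine."]
lfs = [lf_contains_great, lf_contains_terrible, lf_exclamation]

label_matrix = apply_lfs(texts, lfs)
for text, votes in zip(texts, label_matrix):
    print(f"{votes}  <- {text}")
print("coverage per LF:", coverage(label_matrix))
```

Each LF only labels the subset of examples it fires on, and the last sentence receives no vote at all. A noisy, partially filled label matrix of this kind, rather than ground-truth labels, is the training signal in multi-source weak supervision.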
If you use this code, please consider citing our work:
End-to-End Weak Supervision
Salva Rühling Cachay, Benedikt Boecking, and Artur Dubrawski
Advances in Neural Information Processing Systems (NeurIPS), 2021
arXiv:2107.02233v3
Credits
The following template was extremely useful as a source of inspiration and for getting started with the PL+Hydra implementation: ashleve/lightning-hydra-template
Weasel image credits go to Rohan Chang for this Unsplash-licensed image
This library assumes familiarity with (multi-source) weak supervision. If that's not the case, you may want to first learn its basics, e.g. from these overview slides from Stanford or this Snorkel tutorial.
That being said, have a look at our examples and the notebooks therein, which show you how to use Weasel for your own dataset, LF set, or end-model. For example:
A high-level starter tutorial with little code and many explanations, which includes Snorkel as a baseline (so that if you are familiar with Snorkel, you can see the similarities and differences to Weasel).
See how the whole WeaSEL pipeline works, with all details, necessary steps and definitions for a new dataset and a custom end-model. This notebook will probably teach you the most about WeaSEL and how to apply it to your own problem.
A realistic ML experiment script with everything that is part of an ML pipeline, including logging to Weights & Biases, arbitrary callbacks, and eventually retrieving your fully trained end-model.
Please have a look at the research code branch, which operates on pure PyTorch.
conda create --name weasel python=3.9
conda activate weasel
python -m pip install git+https://github.com/autonlab/weasel#egg=weasel[all]
git clone https://github.com/autonlab/weasel.git
cd weasel
pip install -e .[all]
Minimal dependencies
Minimal dependencies, in particular not using Hydra, can be installed with
python -m pip install git+https://github.com/autonlab/weasel
The needed environment corresponds to conda env create -f env_gpu_minimal.yml.
If you choose to use this variant, you won't be able to run some of the examples. You may want to have a look at this notebook, which walks you through how to use Weasel without Hydra as the config manager.
Note: Weasel is under active development, so some uncovered edge cases might exist. Any feedback is very welcome!
Optional: This template config will help you get started with your own application. An analogous config is used in this tutorial script, which you may want to check out.
Please have a look at the detailed instructions in this Readme.
@article{cachay2021endtoend,
author={R{\"u}hling Cachay, Salva and Boecking, Benedikt and Dubrawski, Artur},
journal={Advances in Neural Information Processing Systems},
title={End-to-End Weak Supervision},
year={2021}
}