Image-to-image regression with uncertainty quantification in PyTorch. Take any dataset and train a model to regress images to images with rigorous, distribution-free uncertainty quantification.
This repository provides a convenient way to train deep-learning models in PyTorch for image-to-image regression (any task where the input and output are both images), along with rigorous uncertainty quantification. The uncertainty quantification takes the form of an interval for each pixel. These intervals are guaranteed to contain most true pixel values with high probability, no matter which model or dataset is used (they form a risk-controlling prediction set). The training pipeline is already built to handle more than one GPU, and all training and calibration should run automatically.
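To make the per-pixel guarantee concrete, here is a minimal sketch of how a risk-controlling prediction set could be calibrated. It is not the repository's calibration code. The pixel tuple layout, the function names, the grid of lam values, and the use of a plain Hoeffding bound are all assumptions made for illustration.

```python
import math


def interval(pred, lower_w, upper_w, lam):
    """Nested per-pixel interval: it only grows as lam grows."""
    return pred - lam * lower_w, pred + lam * upper_w


def image_loss(image, lam):
    """Fraction of pixels in one image whose true value lies outside its interval.

    `image` is a list of (pred, lower_w, upper_w, truth) tuples, one per pixel
    (a hypothetical layout chosen for this sketch).
    """
    missed = 0
    for pred, lower_w, upper_w, truth in image:
        lo, hi = interval(pred, lower_w, upper_w, lam)
        missed += not (lo <= truth <= hi)
    return missed / len(image)


def calibrate_lam(calib_images, alpha=0.1, delta=0.1, lam_grid=None):
    """Pick the smallest lam whose risk upper bound stays below alpha.

    Scans lam from large to small and stops as soon as the Hoeffding upper
    confidence bound on the miscoverage risk reaches alpha. Returns None if
    even the largest lam in the grid cannot be certified.
    """
    if lam_grid is None:
        lam_grid = [i / 10 for i in range(0, 101)]  # 0.0, 0.1, ..., 10.0
    n = len(calib_images)
    # Hoeffding slack for losses bounded in [0, 1], holding w.p. >= 1 - delta.
    slack = math.sqrt(math.log(1 / delta) / (2 * n))
    chosen = None
    for lam in sorted(lam_grid, reverse=True):
        risk = sum(image_loss(img, lam) for img in calib_images) / n
        if risk + slack >= alpha:
            break
        chosen = lam
    return chosen


# Toy calibration set: 200 identical 4-pixel images with unit heuristic widths.
toy_image = [(0.0, 1.0, 1.0, t) for t in (0.5, -1.0, 1.5, 2.0)]
lam_hat = calibrate_lam([toy_image] * 200)
```

With the calibrated value, at most an alpha fraction of pixels is expected to fall outside the intervals on new images, and this holds with probability at least 1 - delta over the calibration data. Very small calibration sets cannot be certified under this bound, which is why the function may return None.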
The basic workflow is:
1. Define your dataset in core/datasets/.
2. Create a folder for your experiment, experiments/new_experiment, along with a file experiments/new_experiment/config.yml defining the model architecture, hyperparameters, and method of uncertainty quantification. You can use experiments/fastmri_test/config.yml as a template.
3. Edit core/scripts/router.py to point to your data directory.
4. Run wandb sweep experiments/new_experiment/config.yml, and run the resulting sweep.
5. After training, checkpoints will be saved in experiments/new_experiment/checkpoints, the metrics will be printed to the terminal, and outputs will be in experiments/new_experiment/output/. See experiments/fastmri_test/plot.py for an example of how to make plots from the raw outputs.

Following this procedure will train one or more models (depending on config.yml) that perform image-to-image regression with rigorous uncertainty quantification.
There are two pre-baked examples that you can run on your own after downloading the open-source data: experiments/fastmri_test/config.yml and experiments/temca_test/config.yml.
The third pre-baked example, experiments/bsbcm_test/config.yml, relies on data collected at Berkeley that has not yet been publicly released (but will be soon).
@article{angelopoulos2022image,
title={Image-to-Image Regression with Distribution-Free Uncertainty Quantification and Applications in Imaging},
author={Angelopoulos, Anastasios N and Kohli, Amit P and Bates, Stephen and Jordan, Michael I and Malik, Jitendra and Alshaabi, Thayer and Upadhyayula, Srigokul and Romano, Yaniv},
journal={arXiv preprint arXiv:2202.05265},
year={2022}
}
You will need to execute
conda env create -f environment.yml
conda activate im2im-uq
You will also need to go through the Weights and Biases setup process that initiates when you run your first sweep. You may need to make an account on their website.
To run the FastMRI example (experiments/fastmri_test):
1. Download the knee_singlecoil_train dataset.
2. Edit core/scripts/router.py to point to your local dataset.
3. Run wandb sweep experiments/fastmri_test/config.yml.
4. Run experiments/fastmri_test/plot.py to plot the results.

To run the TEMCA example (experiments/temca_test), once its data is downloaded:
1. Edit core/scripts/router.py to point to your local dataset.
2. Run wandb sweep experiments/temca_test/config.yml.
3. Run experiments/temca_test/plot.py to plot the results.

If you want to extend this code to a new experiment, you will need to write some code compatible with our infrastructure. If you are adding a new dataset, you will need to write a valid PyTorch dataset object; if you are adding a new model architecture, you will need to specify it; and so on.
Usually, you will want to start by creating a folder experiments/new_experiment along with a config file experiments/new_experiment/config.yml.
The easiest way is to start from an existing config, like experiments/fastmri_test/config.yml.
To add a new dataset, use the following procedure.
1. In core/datasets, make a new folder for your dataset, core/datasets/new_dataset.
2. Write a Dataset class for your new dataset. The most critical part is writing a __getitem__ method that returns an image-image pair in CxHxW order; see core/datasets/bsbcm/BSBCMDataset.py for a simple example.
3. Create core/datasets/new_dataset/__init__.py and export your dataset by adding the line from .NewDataset import NewDatasetClass (substituting in your filename and classname appropriately).
4. Edit core/scripts/router.py to load your new dataset, near Line 64, following the pattern therein. You will also need to import your dataset object.
5. Update experiments/new_experiment/config.yml with the correct directories and experiment name.
6. Run wandb sweep experiments/new_experiment/config.yml and proceed as normal!

In our system, there are two parts to a model: the base architecture, which we call a trunk (e.g. a U-Net), and the final layer.
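For the dataset procedure above, a Dataset class might look like the sketch below. PyTorch's indexing hook is named __getitem__. The class name NewDatasetClass follows the placeholder used in the procedure, and the in-memory tensors are an assumption for illustration; a real dataset would usually read image files from disk, as core/datasets/bsbcm/BSBCMDataset.py presumably does.

```python
import torch
from torch.utils.data import Dataset


class NewDatasetClass(Dataset):
    """Hypothetical dataset of (input, target) image pairs in CxHxW order."""

    def __init__(self, inputs, targets):
        # inputs and targets are tensors of shape (N, C, H, W), held in memory
        # purely to keep this sketch self-contained.
        if inputs.shape[0] != targets.shape[0]:
            raise ValueError("inputs and targets must contain the same number of images")
        self.inputs = inputs
        self.targets = targets

    def __len__(self):
        return self.inputs.shape[0]

    def __getitem__(self, idx):
        # Each item is one image-image pair, both in CxHxW order.
        return self.inputs[idx], self.targets[idx]


# Usage with random single-channel 32x32 images.
dataset = NewDatasetClass(torch.rand(8, 1, 32, 32), torch.rand(8, 1, 32, 32))
x, y = dataset[0]
```

Any object of this form can be wrapped in a torch.utils.data.DataLoader for batching.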
Defining a trunk is as simple as writing a regular PyTorch nn.Module and adding it near Line 87 of core/scripts/router.py (you will also need to import it); see core/models/trunks/unet.py for an example.
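As a rough illustration of what a trunk can look like, here is a deliberately tiny convolutional module. The class name TinyTrunk, its arguments, and the choice to output a feature map at the input resolution are assumptions for this sketch; the interface the repository actually expects is best checked against core/models/trunks/unet.py.

```python
import torch
from torch import nn


class TinyTrunk(nn.Module):
    """Hypothetical trunk: two 3x3 convolutions that preserve spatial size."""

    def __init__(self, in_channels=1, features=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, features, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(features, features, kernel_size=3, padding=1),
            nn.ReLU(),
        )

    def forward(self, x):
        # x has shape (batch, in_channels, H, W); output has `features` channels.
        return self.net(x)


trunk = TinyTrunk()
feature_map = trunk(torch.rand(2, 1, 32, 32))
```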
The process for adding a final layer is a bit more involved.
The final layer is simply a PyTorch nn.Module, but it must also come with two functions: a loss function and a nested prediction set function.
See core/models/finallayers/quantile_layer.py for an example.
The steps are:
1. Write the final layer as an nn.Module object. The final layer should also have a heuristic notion of uncertainty built in, like quantile outputs.
2. Write a nested prediction set function that depends on a parameter lam, which will later be calibrated. The function should have the same prototype as that on Line 34 of core/models/finallayers/quantile_layer.py.
3. Add your final layer to core/models/add_uncertainty.py, as in Line 59.
4. Update experiments/new_experiment/config.yml to include your new final layer, and run the sweep as normal!
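The three ingredients of a final layer (the module, its loss, and its nested prediction sets) can be sketched as follows. Everything here is hypothetical: the names QuantileHead, pinball_loss, quantile_head_loss, and nested_sets, the quantile levels, and the function signatures are assumptions for illustration and do not reproduce the prototype in core/models/finallayers/quantile_layer.py.

```python
import torch
from torch import nn


class QuantileHead(nn.Module):
    """Hypothetical final layer: lower quantile, point prediction, upper quantile."""

    def __init__(self, features=16, out_channels=1):
        super().__init__()
        self.lower = nn.Conv2d(features, out_channels, kernel_size=1)
        self.pred = nn.Conv2d(features, out_channels, kernel_size=1)
        self.upper = nn.Conv2d(features, out_channels, kernel_size=1)

    def forward(self, feats):
        return self.lower(feats), self.pred(feats), self.upper(feats)


def pinball_loss(estimate, target, q):
    """Quantile (pinball) loss at level q, averaged over all pixels."""
    diff = target - estimate
    return torch.maximum(q * diff, (q - 1) * diff).mean()


def quantile_head_loss(outputs, target, q_lo=0.05, q_hi=0.95):
    """Training loss: pinball terms for the quantiles plus MSE for the prediction."""
    lower, pred, upper = outputs
    return (
        pinball_loss(lower, target, q_lo)
        + pinball_loss(upper, target, q_hi)
        + nn.functional.mse_loss(pred, target)
    )


def nested_sets(outputs, lam=1.0):
    """Per-pixel intervals that grow with lam, so the sets are nested in lam."""
    lower, pred, upper = outputs
    lower_width = (pred - lower).clamp(min=0)  # guard against crossed quantiles
    upper_width = (upper - pred).clamp(min=0)
    return pred - lam * lower_width, pred + lam * upper_width


# Usage on random features, as might come out of a trunk with 16 channels.
head = QuantileHead(features=16)
outputs = head(torch.rand(2, 16, 8, 8))
loss = quantile_head_loss(outputs, torch.rand(2, 1, 8, 8))
lo, hi = nested_sets(outputs, lam=2.0)
```

Scaling the heuristic widths by lam is one simple way to obtain nested sets: a larger lam can only widen each interval, which is the property calibration relies on when it searches for the smallest acceptable lam.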