Self-Supervised Vision Transformers Learn Visual Concepts in Histopathology (LMRL Workshop, NeurIPS 2021)
NOTE: Please see our follow-up work in CVPR 2022, which further extends this repository.
@article{chen2022self,
title={Self-Supervised Vision Transformers Learn Visual Concepts in Histopathology},
author={Chen, Richard J and Krishnan, Rahul G},
journal={Learning Meaningful Representations of Life, NeurIPS 2021},
year={2021}
}
Summary / Main Findings:
We use Git LFS to version-control large files in this repository (e.g., images, embeddings, checkpoints). After installing Git LFS, pull these large files by running:
git lfs pull
SimCLR and DINO models were trained for 100 epochs using the vanilla training recipes from their respective papers. These models were developed on 2,055,742 patches (256 × 256 resolution at 20× magnification) extracted from diagnostic slides in the TCGA-BRCA dataset, and evaluated via K-NN on patch-level datasets in histopathology.
Note: Results should be interpreted with respect to the dataset size and the duration of training (number of epochs). Ideally, longer training with larger batch sizes would demonstrate larger gains in SSL performance.
| Arch | SSL Method | Dataset | Epochs | Dim | K-NN | Download |
|---|---|---|---|---|---|---|
| ResNet-50 | Transfer | ImageNet | N/A | 1024 | 0.935 | N/A |
| ResNet-50 | SimCLR | TCGA-BRCA | 100 | 2048 | 0.938 | Backbone |
| ViT-S/16 | DINO | TCGA-BRCA | 100 | 384 | 0.941 | Backbone |
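The K-NN column above reports classification performance of a K-nearest-neighbors classifier fit on frozen embeddings. A minimal pure-Python sketch of this style of evaluation (the distance metric, `k`, and the majority-vote rule here are illustrative, not necessarily the paper's exact protocol):

```python
from collections import Counter
import math

def knn_predict(train_feats, train_labels, query, k=5):
    """Predict the label of `query` by majority vote among its k nearest
    training embeddings under Euclidean distance."""
    dists = sorted(
        (math.dist(query, feat), label)
        for feat, label in zip(train_feats, train_labels)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

def knn_accuracy(train_feats, train_labels, test_feats, test_labels, k=5):
    """Fraction of test embeddings whose K-NN prediction matches the label."""
    correct = sum(
        knn_predict(train_feats, train_labels, query, k) == label
        for query, label in zip(test_feats, test_labels)
    )
    return correct / len(test_labels)
```

In practice the same evaluation is usually run with `sklearn.neighbors.KNeighborsClassifier` over the pre-extracted embedding matrices.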
For CRC-100K and BreastPathQ, pre-extracted embeddings are already available and processed in ./embeddings_patch_library. See patch_extraction_utils.py for how these patch datasets were processed.
Additional Datasets + Custom Implementation: This codebase is flexible for feature extraction on a variety of patch datasets. To extend this work, simply modify patch_extraction_utils.py with a custom dataset loader for your dataset. As an example, we include BCSS (results not yet updated in this work).
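A custom loader typically follows the map-style `torch.utils.data.Dataset` pattern (`__len__` / `__getitem__`). The class name, folder layout, and transform hook below are hypothetical, shown only to illustrate the shape of such a loader (torch itself is not imported so the sketch stays self-contained):

```python
import os

class CustomPatchDataset:
    """Map-style dataset over a flat folder of image patches.

    Mirrors the torch.utils.data.Dataset interface without importing
    torch; the folder layout and transform hook are illustrative
    assumptions, not the repository's actual loader.
    """

    def __init__(self, root, transform=None, exts=(".png", ".jpg")):
        self.paths = sorted(
            os.path.join(root, fname)
            for fname in os.listdir(root)
            if fname.lower().endswith(exts)
        )
        self.transform = transform

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        path = self.paths[idx]
        # A real loader would decode the patch here, e.g.:
        #   img = Image.open(path).convert("RGB")
        img = path  # placeholder: this sketch returns the file path
        if self.transform is not None:
            img = self.transform(img)
        return img
```

Registering such a class in patch_extraction_utils.py lets the extraction notebook iterate over your patches the same way it does for the bundled datasets.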
Run the notebook patch_extraction.ipynb, followed by patch_evaluation.ipynb. The evaluation notebook should run "out-of-the-box" with Git LFS.
Install the CLAM package, then use the 10-fold cross-validation splits made available in ./slide_evaluation/10foldcv_subtype/tcga_brca. Tensorboard train + validation logs can be visualized via:
tensorboard --logdir ./slide_evaluation/results/
Install umap-learn (which can be tricky if you have incompatible dependencies), then use the code snippet in patch_extraction_utils.py, which is used in patch_extraction.ipynb to create Figure 4.
Attention visualizations (reproducing Figure 3) can be performed by walking through the notebook attention_visualization_256.ipynb.
@inproceedings{chen2022scaling,
title={Scaling Vision Transformers to Gigapixel Images via Hierarchical Self-Supervised Learning},
author={Chen, Richard J and Chen, Chengkuan and Li, Yicong and Chen, Tiffany Y and Trister, Andrew D and Krishnan, Rahul G and Mahmood, Faisal},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year={2022}
}
© This code is made available under the GPLv3 License for non-commercial academic purposes.