This repository contains all code and implementations used in our paper:

Revisiting Training Strategies and Generalization Performance in Deep Metric Learning (ICML 2020)
Link: https://arxiv.org/abs/2002.08473
The code is meant to serve as a research starting point in Deep Metric Learning. By implementing key baselines under a consistent setting and logging a vast set of metrics, it should be easier to ensure that method gains are not due to implementational variations, and to better understand the factors driving performance.
It is set up in a modular way to allow for fast and detailed prototyping, but with key elements written in a way that allows the code to be directly copied into other pipelines. In addition, multiple training and test metrics are logged in W&B to allow for easy and large-scale evaluation.
Finally, please find a public W&B repo with key runs performed in the paper here: https://app.wandb.ai/confusezius/RevisitDML.
Contact: Karsten Roth, [email protected]
Suggestions are always welcome!
If you use this code in your research, please cite:
@misc{roth2020revisiting,
title={Revisiting Training Strategies and Generalization Performance in Deep Metric Learning},
author={Karsten Roth and Timo Milbich and Samarth Sinha and Prateek Gupta and Björn Ommer and Joseph Paul Cohen},
year={2020},
eprint={2002.08473},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
This repository contains (in parts) code that has been adapted from:
Make sure to also check out the following repo with a great plug-and-play implementation of DML methods:
All implemented methods and metrics are listed at the bottom!
Paper-related information:

- The exact training runs used in the paper are provided in Revisit_Runs.sh.
- Result_Evaluations.py downloads the logged metrics and allows for introspection of metric relations. It also converts results directly into LaTeX-table format with means and standard deviations.
- To use the data samplers studied in the paper, set --data_sampler to the method of choice. Allowed flags are listed in datasampler/__init__.py.
- To use the rho-distance miner, set --batch_mining rho_distance with flip probability --miner_rho_distance_cp, e.g. 0.2.
- The toy experiments are contained in toy_experiments.
- Note: There may be small deviations in results based on the hardware (e.g. between P100 and RTX GPUs) and software (different PyTorch/CUDA versions) used to run these experiments, but they should be covered by the standard deviations reported in the paper.
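The flip mechanic behind --miner_rho_distance_cp can be sketched framework-free. This is a hypothetical helper illustrating label-noise injection, under the assumption that the flip probability toggles a pair's positive/negative role; it is not the repository's miner implementation:

```python
import random

def flip_labels(pair_indicators, cp, rng=None):
    """With probability cp, flip each 0/1 pair indicator (1 = positive pair).

    Sketches the label-noise idea behind a flip probability such as
    --miner_rho_distance_cp 0.2; hypothetical helper, not repo code.
    """
    rng = rng or random.Random(0)
    return [1 - lab if rng.random() < cp else lab for lab in pair_indicators]
```

With cp=0 nothing changes; with cp=1 every indicator is inverted; intermediate values inject partial noise.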
An exemplary setup of a virtual environment containing everything needed:
(1) wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
(2) bash Miniconda3-latest-Linux-x86_64.sh (say yes to append path to bashrc)
(3) source ~/.bashrc
(4) conda create -n DL python=3.6
(5) conda activate DL
(6) conda install matplotlib scipy scikit-learn scikit-image tqdm pandas pillow
(7) conda install pytorch torchvision faiss-gpu cudatoolkit=10.0 -c pytorch
(8) pip install wandb pretrainedmodels
(9) Run the scripts!
Data for the benchmark datasets (cub200, cars196, online_products) can be downloaded either from the respective project sites or directly via Dropbox. The latter ensures that the folder structure is already consistent with this pipeline and the dataloaders.
Otherwise, please make sure that the datasets have the following internal structure:
```
cub200/cars196
└───images
|    └───001.Black_footed_Albatross
|    |    │   Black_Footed_Albatross_0001_796111.jpg
|    |    │   ...
|    ...
```

```
online_products
└───images
|    └───bicycle_final
|    |    │   111085122871_0.jpg
|    ...
|
└───Info_Files
|    │   bicycle.txt
|    │   ...
```
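A quick stdlib check that a downloaded dataset follows this layout (hypothetical helper, not part of the pipeline; the function name and signature are assumptions):

```python
import os

def has_expected_layout(root, needs_info_files=False):
    """Check the dataset layout described above: an 'images' folder with
    class subfolders, and for online_products an extra 'Info_Files' folder."""
    image_dir = os.path.join(root, "images")
    if not os.path.isdir(image_dir):
        return False
    class_dirs = [d for d in os.listdir(image_dir)
                  if os.path.isdir(os.path.join(image_dir, d))]
    if not class_dirs:
        return False
    if needs_info_files and not os.path.isdir(os.path.join(root, "Info_Files")):
        return False
    return True
```

For cub200/cars196 call it with the defaults; for online_products pass needs_info_files=True.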
Assuming your folder is placed in e.g. <$datapath/cub200>, pass $datapath as input to --source.
Training is done by using main.py and setting the respective flags, all of which are listed and explained in parameters.py. A vast set of exemplary runs is provided in Revisit_Runs.sh.
[I.] A basic sample run using default parameters would look like this:
```shell
python main.py --loss margin --batch_mining distance --log_online \
--project DML_Project --group Margin_with_Distance --seed 0 \
--gpu 0 --bs 112 --data_sampler class_random --samples_per_class 2 \
--arch resnet50_frozen_normalize --source $datapath --n_epochs 150 \
--lr 0.00001 --embed_dim 128 --evaluate_on_gpu
```
The purpose of each flag explained:

- --loss <loss_name>: Name of the training objective used. See the folder criteria for implementations of these methods.
- --batch_mining <batchminer_name>: Name of the batch-miner to use (for tuple-based ranking methods). See the folder batch_mining for implementations of these methods.
- --log_online: Log metrics online via either W&B (default) or CometML. Regardless, plots, weights and parameters are all stored offline as well.
- --project, --group: Project name as well as the name of the run. Different seeds will be logged into the same --group online. The group as well as the used seed also define the local save name.
- --seed, --gpu, --source: Basic parameters setting the training seed, the GPU to use and the path to the parent folder containing the respective datasets.
- --arch: The utilized backbone, e.g. ResNet50. You can append _frozen and _normalize to the name to ensure that BatchNorm layers are frozen and embeddings are normalized, respectively.
- --data_sampler, --samples_per_class: How to construct a batch. The default method, class_random, selects classes at random and places <samples_per_class> samples into the batch until the batch is filled.
- --lr, --n_epochs, --bs, --embed_dim: Learning rate, number of training epochs, batch size and embedding dimensionality.
- --evaluate_on_gpu: If set, all metrics are computed on the GPU; this requires Faiss-GPU and may need additional GPU memory.

Some notes:

- Metrics listed in --evaluation_metrics will be logged for both training and validation/test sets. If you do not care about detailed training metric logging, simply set the flag --no_train_metrics. A checkpoint is saved for improvements in metrics listed in --storage_metrics on training, validation or test sets. Detailed information regarding the available metrics can be found at the bottom of this README.
- A training/validation split can be created via --use_tv_split and --tv_split_perc <train/val split percentage>.
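Assuming the split operates on the class level, the effect of --tv_split_perc can be sketched with a minimal stand-in (hypothetical helper, not the pipeline's implementation):

```python
import random

def train_val_split(classes, split_perc, seed=0):
    """Split a list of class names into train/val partitions, keeping
    `split_perc` of the classes for training (class-level assumption)."""
    classes = sorted(classes)          # deterministic base order
    random.Random(seed).shuffle(classes)
    n_train = int(len(classes) * split_perc)
    return classes[:n_train], classes[n_train:]
```

For example, split_perc=0.5 over four classes leaves two for training and two for validation, with the partition fixed by the seed.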
[II.] Advanced Runs:

```shell
python main.py --loss margin --batch_mining distance --loss_margin_beta 0.6 --miner_distance_lower_cutoff 0.5 ... (basic parameters)
```

Loss-specific parameters follow the naming scheme --loss_<lossname>_<parameter_name>; see parameters.py for the full list.
Here some information on using W&B (highly encouraged!):

- Create a (free) W&B account and store your API key in parameters.py under --wandb_key.
- To make sure that W&B data can be stored, run wandb on in the folder pointed to by --save_path.
- After a run is completed, Result_Evaluations.py can be used to download all data, create named metric and correlation plots, and output a summary in the form of a LaTeX-ready table with means and standard deviations of all metrics. This ensures that there are no errors between computed and reported results.
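The mean-and-standard-deviation aggregation behind such a LaTeX summary can be imitated for a handful of seed results with a stdlib-only sketch (hypothetical formatting; the actual output of Result_Evaluations.py will differ):

```python
import statistics

def latex_row(metric_name, seed_values):
    """Aggregate one metric over several seeds into a LaTeX table row
    of the form 'name & $mean \\pm std$ \\\\'."""
    mean = statistics.mean(seed_values)
    std = statistics.stdev(seed_values) if len(seed_values) > 1 else 0.0
    return f"{metric_name} & ${mean:.2f} \\pm {std:.2f}$ \\\\"
```

For example, three seeds scoring 64.0, 65.0 and 66.0 on Recall@1 yield the row "e_recall@1 & $65.00 \pm 1.00$ \\".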
Create custom loss: Simply take a look at e.g. criteria/margin.py and ensure that the method has the following properties:

- It inherits from torch.nn.Module and defines a custom forward() function.
- Use self.lr to set the learning rate of the loss-specific parameters, or set self.optim_dict_list, which is a list containing optimization dictionaries passed to the optimizer (see e.g. criteria/proxynca.py). If both are set, self.optim_dict_list has priority.
- Set ALLOWED_MINING_OPS = None or a list of allowed mining operations, REQUIRES_BATCHMINER = False or True, and REQUIRES_OPTIM = False or True to denote whether the method needs a batchminer or optimization of internal parameters.

Create custom batchminer: Simply take a look at e.g. batch_mining/distance.py
- The miner needs to be a class with a defined __call__() function, taking in a batch and labels and returning e.g. a list of triplets.
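Following that interface, a minimal random triplet miner might look like this. It is a sketch, not the distance-weighted miner shipped in batch_mining/distance.py, and the class name is made up:

```python
import random

class RandomTripletMiner:
    """Minimal batchminer sketch: a callable taking (batch, labels) and
    returning (anchor, positive, negative) index triplets sampled at random."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)

    def __call__(self, batch, labels):
        # `batch` (the embeddings) is unused here but kept for interface parity.
        triplets = []
        for anchor, a_lab in enumerate(labels):
            positives = [i for i, lab in enumerate(labels)
                         if lab == a_lab and i != anchor]
            negatives = [i for i, lab in enumerate(labels) if lab != a_lab]
            if positives and negatives:
                triplets.append((anchor,
                                 self.rng.choice(positives),
                                 self.rng.choice(negatives)))
        return triplets
```

A real miner would use the batch embeddings (e.g. pairwise distances) to choose positives and negatives instead of sampling uniformly.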
Create custom datasamplers: Simply take a look at e.g. datasampler/class_random_sampler.py. The sampler needs to inherit from torch.utils.data.sampler.Sampler and has to provide a __iter__() and a __len__() function. It has to yield a set of indices that are used to create the batch.
For a detailed explanation of everything, please refer to the supplementary of our paper!
Implemented objectives:

--loss angular
--loss arcface
--loss contrastive
--loss lifted
--loss histogram
--loss margin
--loss multisimilarity
--loss npair
--loss proxynca
--loss quadruplet
--loss snr
--loss softtriplet
--loss softmax
--loss triplet
Implemented batchminers:

--batch_mining random
--batch_mining semihard
--batch_mining softhard
--batch_mining distance
--batch_mining rho_distance
--batch_mining parametric
Implemented architectures:

--arch resnet50_frozen_normalize
--arch bninception_normalize_frozen
--arch googlenet

Available datasets:

--dataset cub200
--dataset cars196
--dataset online_products
Metrics based on Euclidean Distances:

- Recall@k: include e_recall@1 in the list of evaluation metrics --evaluation_metrics.
- Normalized Mutual Information: nmi.
- F1 score: f1.
- mAP: mAP_lim. You may also include mAP_1000 for mAP limited to Recall@1000, and mAP_c for mAP limited to Recall@Max_Num_Samples_Per_Class. Note that all of these are heavily correlated.

Metrics based on Cosine Similarities (not included by default):
- Recall@k: include c_recall@k in --evaluation_metrics.
- Normalized Mutual Information: c_nmi.
- F1 score: c_f1.
- mAP: c_mAP_lim. You may also include c_mAP_1000 for mAP limited to Recall@1000, and c_mAP_c for mAP limited to Recall@Max_Num_Samples_Per_Class.

Embedding Space Metrics:
- Spectral decay: rho_spectrum@1. To exclude the k largest spectral values for a more robust estimate, simply include rho_spectrum@k+1. Adding rho_spectrum@0 logs the whole singular value distribution, and rho_spectrum@-1 computes KL(q,p) instead of KL(p,q).
- Mean intra-class distance: dists@intra.
- Mean inter-class distance: dists@inter.
- Ratio of intra- to inter-class distances: dists@intra_over_inter.
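For intuition, e_recall@1 and the dists@ metrics can be computed by brute force on toy embeddings. This stdlib-only sketch is illustrative and not the pipeline's evaluation code (which can use Faiss):

```python
import math
from itertools import combinations

def recall_at_1(embeddings, labels):
    """Fraction of samples whose nearest neighbour shares their label
    (brute-force e_recall@1)."""
    hits = 0
    for i, emb in enumerate(embeddings):
        nearest = min((j for j in range(len(embeddings)) if j != i),
                      key=lambda j: math.dist(emb, embeddings[j]))
        hits += labels[i] == labels[nearest]
    return hits / len(embeddings)

def intra_inter(embeddings, labels):
    """Mean intra-class (dists@intra) and inter-class (dists@inter)
    pairwise Euclidean distances."""
    intra, inter = [], []
    for i, j in combinations(range(len(embeddings)), 2):
        dist = math.dist(embeddings[i], embeddings[j])
        (intra if labels[i] == labels[j] else inter).append(dist)
    return sum(intra) / len(intra), sum(inter) / len(inter)
```

On two well-separated 2D clusters, recall_at_1 returns 1.0 and the intra-class distance is much smaller than the inter-class one; dists@intra_over_inter is simply the ratio of the two returned values.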