The easiest way to use deep metric learning in your application. Modular, flexible, and extensible. Written in PyTorch.
Thanks @mkmenta!
This is identical to v2.4.0, but includes the LICENSE file which was missing from v2.4.0.
Added a symmetric flag to SelfSupervisedLoss. If True, then the embeddings in both embeddings and ref_emb are used as anchors. If False, then only the embeddings in embeddings are used as anchors. The previous behavior was equivalent to symmetric=False. Now the default is symmetric=True, because this is usually what is done in self-supervised papers (e.g. SimCLR).
You don't have to create labels for self-supervised learning anymore:
from pytorch_metric_learning.losses import SelfSupervisedLoss
loss_func = SelfSupervisedLoss(TripletMarginLoss())
embeddings = model(data)
augmented = model(augmented_data)
loss = loss_func(embeddings, augmented)
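Conceptually, SelfSupervisedLoss treats embeddings[i] and ref_emb[i] as a positive pair by giving both the same pseudo-label, and the symmetric flag controls which of those entries act as anchors. Here is a rough mental model in plain Python; this is a sketch of the idea, not the library's actual implementation:

```python
# Rough mental model of SelfSupervisedLoss (NOT the library's real code):
# embeddings[i] and ref_emb[i] form a positive pair, so both get the
# same pseudo-label i. The symmetric flag controls which entries of the
# concatenation [embeddings; ref_emb] act as anchors.
def ssl_pseudo_labels(n, symmetric=True):
    # Pseudo-labels for the concatenation [embeddings; ref_emb].
    labels = list(range(n)) + list(range(n))
    # Anchor indices into that concatenation.
    anchors = list(range(2 * n)) if symmetric else list(range(n))
    return labels, anchors

labels, anchors = ssl_pseudo_labels(3)  # symmetric=True (the new default)
# labels  -> [0, 1, 2, 0, 1, 2]
# anchors -> [0, 1, 2, 3, 4, 5]
```

With symmetric=False, only the first n indices (the original embeddings) would be anchors, matching the previous behavior.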
Thanks @cwkeam!
The order and naming of arguments have changed.
Before:
get_accuracy(
    query,
    reference,
    query_labels,
    reference_labels,
    embeddings_come_from_same_source=False,
)
Now:
get_accuracy(
    query,
    query_labels,
    reference=None,
    reference_labels=None,
    ref_includes_query=False,
)
The benefits of this change are:
- If query is reference, then you only need to pass in query, query_labels.
- ref_includes_query is shorter and clearer in meaning than embeddings_come_from_same_source.
Some example usage of the new format:
# Accuracy of a query set, where the query set is also the reference set:
get_accuracy(query, query_labels)
# Accuracy of a query set with a separate reference set:
get_accuracy(query, query_labels, ref, ref_labels)
# Accuracy of a query set with a reference set that includes the query set:
get_accuracy(query, query_labels, ref, ref_labels, ref_includes_query=True)
BaseMiner instead of BaseTupleMiner
Miners must extend BaseMiner, because BaseTupleMiner no longer exists.
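For context on what a miner produces, here is a standalone sketch of exhaustive pair mining in plain Python. This is only an illustration of the tuple format (anchors, positives, anchors, negatives), not the actual BaseMiner API; a real subclass would implement equivalent logic in its mining method (see the library docs for the exact signature):

```python
# Standalone sketch of pair-mining logic (plain Python, not the real
# BaseMiner API). Given per-sample labels, return (anchor, positive)
# and (anchor, negative) index pairs.
def mine_all_pairs(labels):
    a1, p, a2, n = [], [], [], []
    for i, li in enumerate(labels):
        for j, lj in enumerate(labels):
            if i == j:
                continue  # a sample is never paired with itself
            if li == lj:
                a1.append(i)  # anchor of a positive pair
                p.append(j)   # its positive
            else:
                a2.append(i)  # anchor of a negative pair
                n.append(j)   # its negative
    return a1, p, a2, n

a1, p, a2, n = mine_all_pairs([0, 0, 1])
# a1, p -> [0, 1], [1, 0]        (both orderings of the same-label pair)
# a2, n -> [0, 1, 2, 2], [2, 2, 0, 1]
```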
enqueue_idx is now enqueue_mask
Before, enqueue_idx specified the indices of embeddings that should be added to the memory bank. Now, enqueue_mask[i] should be True if embeddings[i] should be added to the memory bank.
The benefit of this change is that it fixed an issue in distributed training.
Here's an example of the new usage:
# enqueue the second half of a batch
enqueue_mask = torch.zeros(batch_size, dtype=torch.bool)
enqueue_mask[batch_size // 2:] = True
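If existing code builds a list of enqueue_idx indices, a small migration helper can convert it to the new boolean mask. Note that idx_to_mask is a name invented here for illustration, not part of the library:

```python
# Hypothetical migration helper (not part of the library): convert the
# old enqueue_idx list of indices into the new boolean enqueue_mask.
def idx_to_mask(enqueue_idx, batch_size):
    mask = [False] * batch_size
    for i in enqueue_idx:
        mask[i] = True  # embeddings[i] should be added to the memory bank
    return mask

idx_to_mask([2, 3], 4)
# -> [False, False, True, True]  (wrap in torch.tensor(...) to get a mask)
```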
Before:
loss_fn = VICRegLoss()
loss_fn(emb, ref_emb)
Now:
loss_fn = VICRegLoss()
loss_fn(emb, ref_emb=ref_emb)
The reason is that VICRegLoss now uses the forward method of BaseMetricLossFunction, to allow for possible generalizations in the future without causing more breaking changes.
mining_funcs and dataset have swapped order
This is to allow mining_funcs to be optional.
Before, if you didn't want to use miners:
MetricLossOnly(
    models,
    optimizers,
    batch_size,
    loss_funcs,
    mining_funcs={},
    dataset=dataset,
)
Now:
MetricLossOnly(
    models,
    optimizers,
    batch_size,
    loss_funcs,
    dataset,
)
The following classes/functions were removed:
- losses.CentroidTripletLoss (it contained a bug that I don't have time to figure out)
- miners.BaseTupleMiner (use miners.BaseMiner instead)
- miners.BaseSubsetBatchMiner (rarely used)
- miners.MaximumLossMiner (rarely used)
- trainers.UnsupervisedEmbeddingsUsingAugmentations (rarely used)
- utils.common_functions.Identity (use torch.nn.Identity instead)