pytorchvideo.models.memory_bank

class pytorchvideo.models.memory_bank.MemoryBank(backbone, mlp=None, neg_size=4096, temperature=0.07, bank_size=1280000, dim=2048, mmt=0.999)

Performs Non-Parametric Instance Discrimination for self-supervised learning on video. A memory bank stores the historical feature embedding of each training instance, updates it over time, and supplies those embeddings for contrastive learning.

The original paper is: Unsupervised Feature Learning via Non-Parametric Instance Discrimination https://arxiv.org/pdf/1805.01978.pdf

More details can be found in the memory bank discussion of the following paper: Momentum Contrast for Unsupervised Visual Representation Learning https://arxiv.org/pdf/1911.05722.pdf

__init__(backbone, mlp=None, neg_size=4096, temperature=0.07, bank_size=1280000, dim=2048, mmt=0.999)

Parameters

backbone (nn.Module) – backbone used to forward the input.
mlp (nn.Module) – multi-layer perceptron used in the memory bank instance discrimination model.
neg_size (int) – number of negative samples per instance.
temperature (float) – temperature to use for contrastive learning.
bank_size (int) – size of the memory bank, expected to be the same size as the training set.
dim (int) – dimension of the feature embedding stored in the memory bank.
mmt (float) – momentum used when updating memory bank entries.

Return type

None
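To illustrate how mmt and the memory bank interact, here is a minimal sketch of a momentum update on one bank entry. It is written in plain Python so that it runs without dependencies, and it is not the library's implementation: the helper names l2_normalize and momentum_update are hypothetical, and re-normalizing the blended entry to unit length is an assumption borrowed from the instance discrimination paper.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit L2 norm (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def momentum_update(bank, index, embedding, mmt=0.999):
    """Blend the stored entry with a new embedding, then re-normalize.

    Hypothetical helper: `bank` is a list of vectors, one per training
    instance, and `index` is the dataset index of the instance.
    """
    old = bank[index]
    blended = [mmt * o + (1.0 - mmt) * e for o, e in zip(old, embedding)]
    bank[index] = l2_normalize(blended)
    return bank[index]


# Toy bank with two instances and 2-dimensional embeddings.
bank = [l2_normalize([1.0, 0.0]), l2_normalize([0.0, 1.0])]
# A small momentum is used here only to make the change visible.
updated = momentum_update(bank, 0, l2_normalize([0.0, 1.0]), mmt=0.5)
```

With the default mmt=0.999, each update moves an entry only slightly toward the newest embedding, so the bank changes slowly between epochs.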

forward(x, x_ind)

Performs contrastive learning with negative instances sampled at random from the memory bank. During training, the memory bank is updated with the latest feature embeddings.

Parameters

x (torch.tensor) – a batch of augmented inputs. The tensor shape must be one that the backbone accepts.
x_ind (torch.tensor) – the dataset indices of the samples in x. Expected shape is (B,), where B is the batch size.

Return type

torch.Tensor
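The contrastive objective described above can be sketched for a single instance as follows. This is an illustrative approximation in plain Python, not the library's code: the function name instance_discrimination_loss is hypothetical, and it assumes the loss is a cross-entropy over temperature-scaled dot products, with the positive (the instance's own bank entry) placed in slot 0 and neg_size negatives drawn uniformly from the bank.

```python
import math
import random


def instance_discrimination_loss(embedding, bank, index, neg_size,
                                 temperature=0.07, rng=random):
    """Cross-entropy of one embedding against its bank entry and negatives.

    Hypothetical helper. Negatives are drawn uniformly from the whole bank,
    so a draw can occasionally coincide with the positive index.
    """
    # Slot 0 is the positive; the remaining slots are random negatives.
    ids = [index] + [rng.randrange(len(bank)) for _ in range(neg_size)]
    logits = [
        sum(b * e for b, e in zip(bank[i], embedding)) / temperature
        for i in ids
    ]
    # Numerically stable log-sum-exp, then cross-entropy with target 0.
    peak = max(logits)
    log_z = peak + math.log(sum(math.exp(l - peak) for l in logits))
    return log_z - logits[0]


# Toy bank of unit vectors; the query embedding matches entry 0.
bank = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
rng = random.Random(0)
loss_match = instance_discrimination_loss([1.0, 0.0], bank, 0, 2, rng=rng)
loss_mismatch = instance_discrimination_loss([1.0, 0.0], bank, 2, 2, rng=rng)
```

The loss is lower when the embedding agrees with its own bank entry than when it points away from it, which is what drives each instance toward its stored representation. A lower temperature sharpens this contrast.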