An official implementation for "CLIP4Clip: An Empirical Study of CLIP fo...
An official implementation for "UniVL: A Unified Video and Language Pre...
PyTorch code for Language Models with Image Descriptors are Strong Few-S...
A PyTorch implementation of state-of-the-art video captioning models fro...
Source code for Semantics-Assisted Video Captioning Model Trained with S...
An end-to-end masked contrastive video-and-language pre-training framework