An official implementation for "UniVL: A Unified Video and Language Pre...
[CVPR2022] Official Implementation of ReferFormer
[CVPR2023] All in One: Exploring Unified Video-Language Pre-training
[NeurIPS2022] Egocentric Video-Language Pretraining
Align and Prompt: Video-and-Language Pre-training with Entity Prompts
PyTorch code for Language Models with Image Descriptors are Strong Few-S...
[CVPR2021] Visual Semantic Role Labeling for Video Understanding (https://...
An end-to-end masked contrastive video-and-language pre-training framework