An official implementation for "CLIP4Clip: An Empirical Study of CLIP fo...
A VideoQA dataset based on the videos from ActivityNet
Adversarial Background-Aware Loss for Weakly-supervised Temporal Activit...
An end-to-end masked contrastive video-and-language pre-training framework