Video Foundation Models & Data for Multimodal Understanding
Reproducible scaling laws for contrastive language-image learning (https...