Top 79 Vision-and-Language Open Source Projects

[CVPR 2022 Oral] TubeDETR: Spatio-Temporal Video Grounding with Transfor...

This repository contains my solutions to the assignments for Stanford's ...

Code of the CVPR 2021 Oral paper: A Recurrent Vision-and-Language BERT f...

[NeurIPS 2022] Zero-Shot Video Question Answering via Frozen Bidirection...

OFASys: A Multi-Modal Multi-Task Learning System for Building Generalist...

An ever-growing playground of notebooks showcasing CLIP's impressive zer...

[CVPR 2022] Pseudo-Q: Generating Pseudo Language Queries for Visual Grou...

Code for the ACL paper "No Metrics Are Perfect: Adversarial Reward Learn...

A PyTorch implementation of VIOLET

DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Ima...

PyTorch code for CVPR 2019 paper: The Regretful Agent: Heuristic-Aided N...

Evaluating Vision & Language Pretraining Models with Objects, Attributes...

Research Code for NeurIPS 2020 Spotlight paper "Large-Scale Adversarial ...

PyTorch code for “TVLT: Textless Vision-Language Transformer” (NeurIPS 2...

[ICCV 2021 Oral + TPAMI] Just Ask: Learning to Answer Questions from Mil...