A modular framework for vision & language multimodal research from Faceb...
InternGPT (iGPT) is an open-source demo platform where you can easily sh...
Bottom-up attention model for image captioning and VQA, based on Faster ...
The implementation of "Prismer: A Vision-Language Model with Multi-Task ...
Oscar and VinVL
An efficient PyTorch implementation of the winning entry of the 2017 VQA...
[ICCV 2021 Oral] Official PyTorch implementation for Generic Attention-...
Visual Question Answering in PyTorch
A curated list of Visual Question Answering (VQA) (Image/Video Question An...
Implementation for the paper "Compositional Attention Networks for Machi...
Visual Q&A reading list
PyTorch implementation for the Neuro-Symbolic Concept Learner (NS-CL).
Open-source evaluation toolkit of large vision-language models (LVLMs), ...
PyTorch implementation of "Transparency by Design: Closing the Gap Betwe...