LAVIS - A One-stop Library for Language-Vision Intelligence
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Uni...
InternGPT (iGPT) is an open-source demo platform where you can easily sh...
Show, Attend, and Tell | a PyTorch Tutorial to Image Captioning
Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectur...
Caption-Anything is a versatile tool combining image segmentation, visua...
Bottom-up attention model for image captioning and VQA, based on Faster ...
Simple Swift class to provide all the configurations you need to create ...
The implementation of "Prismer: A Vision-Language Model with Multi-Task ...
Oscar and VinVL
X-modaler is a versatile and high-performance codebase for cross-modal a...
Unofficial PyTorch implementation for Self-critical Sequence Training fo...
TensorFlow Implementation of "Show, Attend and Tell"
[CVPR 2021] VirTex: Learning Visual Representations from Textual Annotat...