Best 6 Vision Language Transformer Open Source Projects

LAVIS - A One-stop Library for Language-Vision Intelligence

Official implementation of the paper "Grounding DINO: Marrying DINO with...

PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Uni...

[ICCV2021 & TPAMI2023] Vision-Language Transformer and Query Generation ...

[ICML 2023] UPop: Unified and Progressive Pruning for Compressing Vision...

Instruction Following Agents with Multimodal Transformers