PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Uni...
Chinese version of CLIP, which achieves Chinese cross-modal retrieval and...
[AAAI 2021] The code of “Similarity Reasoning and Filtration for Image-Te...
Official implementation of the ICASSP-2022 paper "Text2Poster: Laying Ou...
PyTorch code for BagFormer: Better Cross-Modal Retrieval via bag-wise in...
[ICML 2023] UPop: Unified and Progressive Pruning for Compressing Vision...
mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal S...
ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and...
Official implementation of our EMNLP 2022 paper "CPL: Counterfactual Pro...