Best 16 Multi-Modal Learning Open Source Projects

An open source implementation of CLIP.

Chinese version of CLIP which achieves Chinese cross-modal retrieval and...

Macaw-LLM: Multi-Modal Language Modeling with Image, Video, Audio, and T...

The implementation of "Prismer: A Vision-Language Model with Multi-Task ...

A concise but complete implementation of CLIP with various experimental ...

A curated list of Visual Question Answering (VQA) (Image/Video Question An...

[CVPR 2024] EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Tow...

Knowledge Graphs Meet Multi-Modal Learning: A Comprehensive Survey

[CVPR 2020] Unsupervised Multi-Modal Image Registration via Geometry Pres...

A detection/segmentation dataset with labels characterized by intricate ...

[CVPR 2024] Official PyTorch Code for "PromptKD: Unsupervised Prompt Dis...

PyTorch version of the HyperDenseNet deep neural network for multi-modal...

[ICCV 2021] Official implementation of the paper "TRAR: Routing the Atte...

Code for the IEEE Signal Processing Letters 2022 paper "UAVM: Towards Un...

[CVPR 2024] Magic Tokens: Select Diverse Tokens for Multi-modal Object R...