[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Languag...
ChatGPT's surge in popularity marks a key step toward AGI. This project aims to collect the open-source...
Port of MiniGPT4 in C++ (4bit, 5bit, 6bit, 8bit, 16bit CPU inference wit...
Open-source evaluation toolkit of large vision-language models (LVLMs), ...
Paddle Multimodal Integration and eXploration, supporting mainstream mul...
[ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robus...