🤖 Multi-modal GPT

Train a multi-modal chatbot with visual and language instructions!

Based on the open-source multi-modal model OpenFlamingo, we construct various visual instruction data from open datasets, including VQA, Image Captioning, Visual Reasoning, Text OCR, and Visual Dialogue. We also train the language-model component of OpenFlamingo with language-only instruction data.

Joint training on visual and language instructions effectively improves the model's performance! For more details, please refer to our technical report.
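As a rough illustration of what joint training on mixed instruction data can look like, here is a minimal, hypothetical sketch of sampling batches that interleave visual and language-only instruction examples. The dataset entries, field names, and mixing ratio below are assumptions for illustration only, not the project's actual pipeline:

```python
# Illustrative sketch (NOT the project's actual code): jointly sampling
# visual and language-only instruction data into one training stream.
# Dataset entries, field names, and visual_ratio are assumed for clarity.
import random

visual_instructions = [
    {"image": "coco_000001.jpg",
     "instruction": "What is in the image?",
     "response": "A dog running on a beach."},
]
language_instructions = [
    {"image": None,
     "instruction": "Explain photosynthesis briefly.",
     "response": "Plants convert light into chemical energy."},
]

def sample_batch(batch_size=4, visual_ratio=0.5):
    """Draw a mixed batch; visual_ratio sets the visual/language split."""
    batch = []
    for _ in range(batch_size):
        pool = (visual_instructions if random.random() < visual_ratio
                else language_instructions)
        batch.append(random.choice(pool))
    return batch

if __name__ == "__main__":
    for sample in sample_batch():
        modality = "visual" if sample["image"] else "language-only"
        print(f"[{modality}] {sample['instruction']} -> {sample['response']}")
```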

You are welcome to join us!
