Best 55 Multi-Modal Open Source Projects

ModelScope: bring the notion of Model-as-a-Service to life.

Implementation / replication of DALL-E, OpenAI's Text to Image Transform...

A state-of-the-art open visual language model | multimodal pretrained model

Open Source Routing Engine for OpenStreetMap

Chinese version of CLIP which achieves Chinese cross-modal retrieval and...

[EMNLP 2022] An Open Toolkit for Knowledge Graph Extraction and Construc...

Represent, send, store and search multimodal data

Video-LLaVA: Learning United Visual Representation by Alignment Before P...

A C#/.NET library to run LLM models (🦙LLaMA/LLaVA) on your local device...

A one-stop data processing system to make data higher-quality, juicier, ...

Project Page for "LISA: Reasoning Segmentation via Large Language Model"

Recent Transformer-based CV and related works.

[NeurIPS 2023] MotionGPT: Human Motion as a Foreign Language, a unified ...

Use PEFT or full-parameter training to fine-tune LLMs or MLLMs

SALMONN: Speech Audio Language Music Open Neural Network