Best 173 Multimodal Open Source Projects

Foundation Architecture for (M)LLMs

Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model

Represent, send, store and search multimodal data

SDK for interacting with stability.ai APIs (e.g. Stable Diffusion infere...

Generative AI SDK for Web to build AI Assistants for apps built with Jav...

Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectur...

Easily compute CLIP embeddings and build a CLIP retrieval system with them

Conversational AI SDK for iOS to enable text and voice conversations wit...

mPLUG-Owl & mPLUG-Owl2: Modularized Multimodal Large Language Model

(ෆ`꒳´ෆ) A Survey on Text-to-Image Generation/Synthesis.

Conversational AI SDK for Android to enable text and voice conversations...

Mobile-Agent: Autonomous Multi-Modal Mobile Device Agent with Visual Per...

Actionable AI SDK for Flutter to enable text and voice conversations wit...

Actionable AI SDK for Ionic to enable text and voice conversations with ...

Meta-Transformer for Unified Multimodal Learning