Best 35 Multimodality Open Source Projects

A simple command line tool for text to image generation, using OpenAI's ...

Effective prompting for Large Multimodal Models like GPT-4 Vision, LLaVA...

An official implementation for "CLIP4Clip: An Empirical Study of CLIP fo...

[TPAMI 2023] Multimodal Image Synthesis and Editing: The Generative AI Era

Automated modeling and machine learning framework FEDOT

Collecting awesome papers of RAG for AIGC. We propose a taxonomy of RAG...

✨✨Woodpecker: Hallucination Correction for Multimodal Large Language M...

The Cradle framework is a first attempt at General Computer Control (GCC...

A CLI tool/Python module for generating images from text using guided di...

GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest

A knowledge base construction engine for richly formatted data

Sequence-to-Sequence Framework in PyTorch

An official implementation for "UniVL: A Unified Video and Language Pre...