Best 26 Multi-Modality Open Source Projects

☁️ Build multimodal AI applications with cloud-native stack

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V...

🏄 Scalable embedding, reasoning, ranking for images and sentences with ...

✨✨ Latest Papers and Datasets on Multimodal Large Langu...

Simple command line tool for text to image generation using OpenAI's CLI...

Feed PDFs, URLs, Slides, YouTube, and more into Vision-Language models w...

Algorithms and Publications on 3D Object Tracking

Collaborative Diffusion (CVPR 2023)

Effortless plug-and-play optimizer to cut model training costs by 50%...

[CVPR'23] MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint ...

[ICCV2019] Robust Multi-Modality Multi-Object Tracking

Unifying Voxel-based Representation with Transformer for 3D Object Detec...

This repo contains the official code of our work SAM-SLR which won the C...

Seed, Code, Harvest: Grow Your Own App with Tree of Thoughts!

Official code for the NeurIPS 2023 paper: CoDA: Collaborative Novel Box Disco...