Best 57 Multimodal Deep Learning Open Source Projects

LAVIS - A One-stop Library for Language-Vision Intelligence

The most flexible way to serve AI/ML models in production - Build Model ...

(ෆ`꒳´ෆ) A Survey on Text-to-Image Generation/Synthesis.

Implementation of "BitNet: Scaling 1-bit Transformers for Large Language...

A flexible package for multimodal-deep-learning to combine tabular data ...

Recent Advances in Vision and Language PreTrained Models (VL-PTMs)

awesome grounding: A curated list of research papers in visual grounding

[TPAMI 2023] Multimodal Image Synthesis and Editing: The Generative AI Era

[ICLR 2024] Official implementation of " 🦙 Time-LLM: Time Series Foreca...

NSMusicS, Multi-platform Multi-mode Music Software, Electron

This repository contains various models targeting multimodal representa...

A collection of parameter-efficient transfer learning papers focusing on...

Reference mapping for single-cell genomics

A collection of the latest ECCV results, including papers, code, and demo videos. Recommendations are welcome!

List of academic resources on Multimodal ML for Music