Best 52 Multimodal Learning Open Source Projects

Papers and resources on Controllable Generation using Diffusion Models, ...

List of academic resources on Multimodal ML for Music

This repo contains evaluation code for the paper "MMMU: A Massive Multi-...

ICASSP 2023-2024 Papers: A complete collection of influential and exciti...

[CVPR 2022] Show Me What and Tell Me How: Video Synthesis via Multimodal...

[CVPR 2022 Oral] TubeDETR: Spatio-Temporal Video Grounding with Transfor...

A PyTorch implementation of "Multimodal Generative Models for Scalable W...

[NeurIPS 2022] Zero-Shot Video Question Answering via Frozen Bidirection...

OFASys: A Multi-Modal Multi-Task Learning System for Building Generalist...

Multimodal Prompting with Missing Modalities for Visual Recognition, CVP...

Interface for easier topic modelling.

Implementation of PALI3 from the paper PaLI-3 Vision Language Models: Sm...

[ICCV 2021 Oral + TPAMI] Just Ask: Learning to Answer Questions from Mil...

[AAAI 2018] Memory Fusion Network for Multi-view Sequential Learning

Implementation of CVPR 2020 paper "MMTM: Multimodal Transfer Module for ...