Code accompanying our ECCV 2020 paper on 3D Neural Listeners.
Part of the HAKE project (HAKE-3D). Code for our CVPR 2020 paper "Detail...
A collection of 3D vision and language (e.g., 3D Visual Grounding, 3D Qu...
Robust multimodal integration method implemented in PyTorch and TensorFlow
Referring Video Object Segmentation / Multi-Object Tracking Repo
Compose multimodal datasets 🎹
Source code for "MusCaps: Generating Captions for Music Audio" (IJCNN 2021)
Code on selecting an action based on multimodal inputs. Here in this cas...
Official implementation of the paper "MSAF: Multimodal Split Attention Fusion"
This repository provides a comprehensive collection of research papers f...
Code for the Interspeech 2023 paper: MMER: Multimodal Multi-task learnin...
Detecting Hate Speech in Memes Using Multimodal Deep Learning Approaches...
Code repository for the EMNLP 2021 paper "Vision Guided Generative Pre-t...
A two-stage multi-modal loss model along with rigid body transformations...
Learning Cross-Modal Retrieval with Noisy Labels (CVPR 2021, PyTorch Code)