DANCE: a deep learning library and benchmark platform for single-cell an...
An open source implementation of "Scaling Autoregressive Multi-Modal Mod...
Attention-based multimodal fusion for sentiment analysis
Towards Generalist Biomedical AI
This repo contains evaluation code for the paper "MMMU: A Massive Multi-...
PyTorch implementation of the CVPR 2020 paper “VectorNet: Encoding HD Maps an...
This repository contains code and metadata for the How2 dataset
Implementation of PALI3 from the paper PALI-3 VISION LANGUAGE MODELS: SM...
A deep learning framework for building multimodal multi-task learning sy...
This repo contains evaluation code for the paper "Are We on the Right Wa...
Implementation of 🌻 Mirasol, SOTA Multimodal Autoregressive model out o...
[CVPR2024] Generative Region-Language Pretraining for Open-Ended Object ...
Code for selecting an action based on multimodal inputs. Here in this cas...
A library of transformer models for computer vision and multi-modality r...
Multimodal BERT for the Chinese domain