PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-trai...
This repo lists relevant papers summarized in our survey paper: A Syste...
This repository contains examples of fine-tuning Harmonized Landsat and ...
This repo contains evaluation code for the paper "MMMU: A Massive Multi-...
Platform for General Robot Intelligence Development
Official Code for "Language Models as Zero-Shot Planners: Extracting Act...
[ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robus...
Official Implementation of "GiT: Towards Generalist Vision Transformer t...
[CVPR2024] ViP-LLaVA: Making Large Multimodal Models Understand Arbitrar...
RS5M: a large-scale vision-language dataset for remote sensing
PyTorch implementation of BEVT (CVPR 2022) https://arxiv.org/abs/2112.01529
[NeurIPS-2023] Annual Conference on Neural Information Processing Systems
Modular and scalable computational imaging in Python with GPU/out-of-cor...
The official repo for "MTP: Advancing Remote Sensing Foundation Model vi...
A Large Short-video Recommendation Dataset with Raw Text/Audio/Image/Vid...