[MICCAI 2019] [MEDIA 2020] Models Genesis
A professional list on Large (Language) Models and Foundation Models (LL...
[ICLR 2024] Official PyTorch implementation of FasterViT: Fast Vision Tr...
Emu: An Open Multimodal Generalist
[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-it...
Official implementation for HyenaDNA, a long-range genomic foundation mo...
PyTorch Implementation of EmerNeRF: Emergent Spatial-Temporal Scene Deco...
Tokenize Anything via Prompting
[ICLR'24 Spotlight] Uni3D: 3D Visual Representation from BAAI
[arXiv 2023] PointLLM: Empowering Large Language Models to Understand Po...
Vision AI Solution Accelerator
VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Languag...
Awesome extensions, projects, and repos related to Segment Anything.
Grounded Multimodal Large Language Model with Localized Visual Tokenization
PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-trai...