ModelScope: bring the notion of Model-as-a-Service to life.
Implementation / replication of DALL-E, OpenAI's Text to Image Transform...
A state-of-the-art open visual language model | multimodal pretrained model
Open Source Routing Engine for OpenStreetMap
Chinese version of CLIP which achieves Chinese cross-modal retrieval and...
[EMNLP 2022] An Open Toolkit for Knowledge Graph Extraction and Construc...
Represent, send, store and search multimodal data
Video-LLaVA: Learning United Visual Representation by Alignment Before P...
A C#/.NET library to run LLM models (🦙LLaMA/LLaVA) on your local device...
A one-stop data processing system to make data higher-quality, juicier, ...
Project Page for "LISA: Reasoning Segmentation via Large Language Model"
Recent Transformer-based CV and related works.
[NeurIPS 2023] MotionGPT: Human Motion as a Foreign Language, a unified ...
Use PEFT or full-parameter training to fine-tune LLMs or MLLMs
SALMONN: Speech Audio Language Music Open Neural Network