ModelScope: bring the notion of Model-as-a-Service to life.
Implementation / replication of DALL-E, OpenAI's Text to Image Transform...
A state-of-the-art-level open visual language model | Multimodal pretrained model
Open Source Routing Engine for OpenStreetMap
Chinese version of CLIP which achieves Chinese cross-modal retrieval and...
[EMNLP 2022] An Open Toolkit for Knowledge Graph Extraction and Construc...
Represent, send, store and search multimodal data
Video-LLaVA: Learning United Visual Representation by Alignment Before P...
Start building LLM-empowered multi-agent applications in an easier way.
A C#/.NET library to run LLM models (🦙LLaMA/LLaVA) on your local device...
A one-stop data processing system to make data higher-quality, juicier, ...
ms-swift: Use PEFT or full-parameter training to fine-tune 200+ LLMs or 15+ MLLMs
Project Page for "LISA: Reasoning Segmentation via Large Language Model"
Recent Transformer-based CV and related works.
[NeurIPS 2023] MotionGPT: Human Motion as a Foreign Language, a unified ...