DeepSpeed is a deep learning optimization library that makes distributed...
Unify Efficient Fine-Tuning of 100+ LLMs
Run Mixtral-8x7B models in Colab or consumer desktops
Decentralized deep learning in PyTorch. Built to train models on thousan...
A TensorFlow Keras implementation of "Modeling Task Relationships in Mul...
Surrogate Modeling Toolbox
Chinese Mixtral mixture-of-experts large language models (Chinese Mixtral MoE LLMs)
From scratch implementation of a sparse mixture of experts language mode...
A library for easily merging multiple LLM experts, and efficiently train...
GMoE could be the next backbone model for many kinds of generalization t...
Inferflow is an efficient and highly configurable inference engine for l...
Implementation of Soft MoE, proposed by Brain's Vision team, in PyTorch
Fast Inference of MoE Models with CPU-GPU Orchestration
A curated reading list of research in Adaptive Computation, Dynamic Comp...
RealCompo: Dynamic Equilibrium between Realism and Compositionality Impr...