DeepSpeed is a deep learning optimization library that makes distributed...
Unify Efficient Fine-Tuning of 100+ LLMs
Run Mixtral-8x7B models in Colab or consumer desktops
Decentralized deep learning in PyTorch. Built to train models on thousan...
A TensorFlow Keras implementation of "Modeling Task Relationships in Mul...
Surrogate Modeling Toolbox
From scratch implementation of a sparse mixture of experts language mode...
Chinese Mixtral mixture-of-experts large language models (Chinese Mixtral MoE LLMs)
GMoE could be the next backbone model for many kinds of generalization t...
Inferflow is an efficient and highly configurable inference engine for l...
A library for easily merging multiple LLM experts, and efficiently train...
Implementation of Soft MoE, proposed by Brain's Vision team, in PyTorch
Fast Inference of MoE Models with CPU-GPU Orchestration
A curated reading list of research in Adaptive Computation, Dynamic Comp...
RealCompo: Dynamic Equilibrium between Realism and Compositionality Impr...
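Several of the projects above build on sparse Mixture-of-Experts routing: a gating network scores the experts, only the top-k are run, and their outputs are combined with renormalized gate weights. A minimal dependency-free sketch of that idea (all function names here are hypothetical, not taken from any listed repo):

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of gate logits.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def top_k_route(gate_logits, k=2):
    """Pick the k highest-scoring experts and renormalize their weights."""
    probs = softmax(gate_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in top)
    return [(i, probs[i] / total) for i in top]

def moe_forward(x, experts, gate_weights, k=2):
    """Run only the selected experts and mix their outputs by gate weight."""
    # Gate logits: one linear score per expert (gate_weights is a matrix
    # with one row per expert).
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in gate_weights]
    routed = top_k_route(logits, k)
    out = [0.0] * len(x)
    for idx, weight in routed:
        y = experts[idx](x)  # only k of the experts are ever evaluated
        out = [o + weight * yi for o, yi in zip(out, y)]
    return out
```

With k much smaller than the number of experts, the per-token compute stays roughly constant while total parameter count grows with the expert count, which is the trade-off the Mixtral-style models in this list exploit.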