📖 A curated list of Awesome LLM Inference papers with code, TensorRT-LLM...
Shush is an app that deploys a WhisperV3 model with Flash Attention v2 o...