Pure JavaScript, manually written 👌 implementation of BLAS, Man...
Kokkos C++ Performance Portability Programming Ecosystem: Math Kernels -...
The HPC toolbox: fused matrix multiplication, convolution, data-parallel...
Mir (backports): Sparse tensors, Hoffman
💥 Fast matrix-multiplication as a self-contained Python library – no sy...
Stretching GPU performance for GEMMs and tensor contractions.
monolish: MONOlithic LInear equation Solvers for Highly-parallel archite...
A library of Fortran modules and routines for scientific calculations (*...
Sparse matrix formats for linear algebra supporting scientific and machi...
DBCSR: Distributed Block Compressed Sparse Row matrix library
Divide and Conquer Linear Algebra
A compile-time linear algebra system for C++
Single file libraries for C/C++
[Experimental] LLVM-accelerated Generic Linear Algebra Subprograms
ROCm BLAS marshalling library