In the StructuredMatrix interface, construct_from_elements now takes
point geometry for faster HSS compression
In the StructuredMatrix interface, BLR compress_and_factor is now
supported
v7.1.4
10 months ago
Memory leak fix from MPI_Datatypes
Set RPATH in strumpack library
Add counting of subnormal numbers in the sparse factors
Change the BLR compression tolerances in the sparse solver to
use a scaled absolute tolerance
Fixes in the sparse solver when using MAGMA: reset error codes
that show up for larger systems but are not actual errors.
Set SOVERSION
Update CMake for HIP, using enable_language(HIP), requires CMake 3.21
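A minimal standalone sketch of what counting subnormal numbers means for the v7.1.4 entry above. This is not STRUMPACK's implementation: the helper name count_subnormals and the use of a std::vector are assumptions made for illustration only.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper, not part of the STRUMPACK API.
// Counts entries that are subnormal: nonzero, but smaller in magnitude
// than the smallest normal value of the floating-point type T.
template <typename T>
std::size_t count_subnormals(const std::vector<T>& values) {
  std::size_t count = 0;
  for (const T& v : values) {
    // std::fpclassify separates zero, normal, subnormal, inf and NaN.
    if (std::fpclassify(v) == FP_SUBNORMAL) ++count;
  }
  return count;
}
```

Calling count_subnormals on a vector holding 1.0, 0.0 and std::numeric_limits<double>::denorm_min() would report one subnormal entry, since exact zeros are classified separately. Compiler flags that flush subnormals to zero (for example fast-math modes) can change the result.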
v7.1.3
1 year ago
Workaround for SLATE <= 20220700
v7.1.2
1 year ago
Small bugfix
v7.1.1
1 year ago
ROCm compilation fix
v7.1.0
1 year ago
Bugfix in matrix equilibration code
Several bugfixes, especially for SLATE and large problems on GPU
Sparse triangular solve on the GPU when using MAGMA
Other MAGMA fixes for the sparse direct solver (MAGMA still optional)
New HSS random sketching operators based on sparse Johnson-Lindenstrauss transforms
Fix for HODLRMatrix construction from elements or blocks
Compilation fixes for NVHPC compiler
SYCL updates
Add lapmr routines, which are not available in Mac LAPACK implementations
Support newer (>= 1.0) ZFP versions
Fixes for clang 15
Add NDBFS GPU matrix ordering code
v7.0.1
1 year ago
v7.0.0
1 year ago
Many bugfixes and general improvements.
Important fixes in the GPU code, and in the usage of SLATE
(GPU capable ScaLAPACK replacement).
The default ordering now uses METIS_NodeND, instead of the
(undocumented) METIS_NodeNDP routine. This can impact performance,
or for some problems lead to stack overflow, but for others it
drastically reduces memory usage. The old behavior can be restored
with --sp_enable_METIS_NodeNDP.
v6.3.1
2 years ago
Fix for setting the CUDA/HIP device when there are multiple devices but MPI is not initialized
Memory leak fix in distributed memory GPU code
Fixed small memory leaks from MPI datatypes
Change in BLR algorithm selection options
Changed default blocksize for 2D block cyclic distribution when using SLATE to 512
Add 64bit support in the matching (MC64)
Fix installation of Fortran modules
v6.3.0
2 years ago
Change default sparsity reducing ordering to use METIS_NodeNDP
(from METIS_NodeND)
Significant performance improvements in the GPU code for the
direct solver, from the NERSC December 2021 Hackathon
Performance fix in symbolic phase
(affecting only some MPI implementations)
Bump minimum CMake version to 3.17
CMake fix for Perlmutter
Compilation fix for GCC 8
Add support for single precision HODLR/Butterfly
(now requiring ButterflyPACK >= 2.1.0)