RAPIDS Memory Manager

v24.04.00

1 month ago

🚨 Breaking Changes

  • Accept stream argument in DeviceMemoryResource allocate/deallocate (#1494) @wence-
  • Replace all internal usage of get_upstream with get_upstream_resource (#1491) @miscco
  • Deprecate rmm::mr::device_memory_resource::supports_streams() (#1452) @harrism
  • Remove deprecated rmm::detail::available_device_memory (#1438) @harrism
  • Make device_memory_resource::supports_streams() not pure virtual. Remove derived implementations and calls in RMM (#1437) @harrism
  • Deprecate rmm::mr::device_memory_resource::get_mem_info() and supports_get_mem_info(). (#1436) @harrism

🐛 Bug Fixes

  • Fix search path for torch allocator in editable installs and ensure CUDA support is available (#1498) @vyasr
  • Accept stream argument in DeviceMemoryResource allocate/deallocate (#1494) @wence-
  • Run STATISTICS_TEST and TRACKING_TEST in serial to avoid OOM errors. (#1487) @bdice

📖 Documentation

  • Pin to recent breathe, to prevent getting an unsupported sphinx version. (#1495) @bdice

🚀 New Features

  • Replace all internal usage of get_upstream with get_upstream_resource (#1491) @miscco
  • Add complete set of resource ref aliases (#1479) @nvdbaranec
  • Automate include grouping using clang-format (#1463) @harrism
  • Add get_upstream_resource to resource adaptors (#1456) @miscco
  • Deprecate rmm::mr::device_memory_resource::supports_streams() (#1452) @harrism
  • Remove duplicated memory_resource_tests (#1451) @miscco
  • Change rmm::exec_policy to take async_resource_ref (#1449) @miscco
  • Change device_scalar to take async_resource_ref (#1447) @miscco
  • Add device_async_resource_ref convenience alias (#1441) @harrism
  • Remove deprecated rmm::detail::available_device_memory (#1438) @harrism
  • Make device_memory_resource::supports_streams() not pure virtual. Remove derived implementations and calls in RMM (#1437) @harrism
  • Deprecate rmm::mr::device_memory_resource::get_mem_info() and supports_get_mem_info(). (#1436) @harrism
  • Support CUDA 12.2 (#1419) @jameslamb

🛠️ Improvements

  • Use conda env create --yes instead of --force (#1509) @bdice
  • Add upper bound to prevent usage of NumPy 2 (#1501) @bdice
  • Remove hard-coding of RAPIDS version where possible (#1496) @KyleFromNVIDIA
  • Require NumPy 1.23+ (#1488) @jakirkham
  • Use rmm::device_async_resource_ref in multi_stream_allocation benchmark (#1482) @miscco
  • Update devcontainers to CUDA Toolkit 12.2 (#1470) @trxcllnt
  • Add support for Python 3.11 (#1469) @jameslamb
  • Target branch-24.04 for GitHub Actions workflows (#1468) @jameslamb
  • Use std::optional instead of thrust::optional (#1464) @miscco
  • Add environment-agnostic scripts for running ctests and pytests (#1462) @trxcllnt
  • Ensure that ctest is called with --no-tests=error. (#1460) @bdice
  • Update ops-bot.yaml (#1458) @AyodeAwe
  • Adopt the rmm::device_async_resource_ref alias (#1454) @miscco
  • Refactor error.hpp out of detail (#1439) @lamarrr

v24.06.00a

1 month ago

🚨 Breaking Changes

  • Remove deprecated cuda_async_memory_resource constructor that takes thrust::optional parameters (#1535) @harrism
  • Remove deprecated supports_streams and get_mem_info methods. (#1519) @harrism

🐛 Bug Fixes

  • Explicitly use the current device resource in DeviceBuffer (#1514) @wence-

📖 Documentation

  • Update multi-gpu discussion for device_buffer and device_vector dtors (#1524) @wence-
  • Fix ordering / heading levels in README.md and python example in guide.md (#1513) @harrism

🚀 New Features

  • Always use a static gtest (#1532) @robertmaynard
  • Remove deprecated supports_streams and get_mem_info methods. (#1519) @harrism

🛠️ Improvements

  • Remove deprecated cuda_async_memory_resource constructor that takes thrust::optional parameters (#1535) @harrism
  • Make thrust_allocator deallocate safe in multi-device setting (#1533) @wence-
  • Move rmm Python package to subdirectory (#1526) @vyasr
  • Remove a file not being used (#1521) @galipremsagar
  • Enable all tests for arm arch (#1510) @galipremsagar

v24.02.00

3 months ago

🚨 Breaking Changes

  • Make device_memory_resource::do_get_mem_info() and supports_get_mem_info() not pure virtual. Remove derived implementations and calls in RMM (#1430) @harrism
  • Deprecate detail::available_device_memory, most detail/aligned.hpp utilities, and optional pool_memory_resource initial size (#1424) @harrism
  • Require explicit pool size in pool_memory_resource and move some things out of detail namespace (#1417) @harrism
  • Remove HTML builds of librmm (#1415) @vyasr
  • Update to CCCL 2.2.0. (#1404) @bdice
  • Switch to scikit-build-core (#1287) @vyasr

🐛 Bug Fixes

  • Exclude tests from builds (#1459) @vyasr
  • Update CODEOWNERS (#1410) @raydouglass
  • Correct signatures for torch allocator plug in (#1407) @wence-
  • Fix Arena MR to support simultaneous access by PTDS and other streams (#1395) @tgravescs
  • Fix else-after-throw clang tidy error (#1391) @harrism

📖 Documentation

  • Remove references to setup.py in docs (#1420) @jameslamb
  • Remove HTML builds of librmm (#1415) @vyasr
  • Update GPU support docs to drop Pascal (#1413) @harrism

🚀 New Features

  • Make device_memory_resource::do_get_mem_info() and supports_get_mem_info() not pure virtual. Remove derived implementations and calls in RMM (#1430) @harrism
  • Deprecate detail::available_device_memory, most detail/aligned.hpp utilities, and optional pool_memory_resource initial size (#1424) @harrism
  • Add a host-pinned memory resource that can be used as upstream for pool_memory_resource. (#1392) @harrism

🛠️ Improvements

  • Remove usages of rapids-env-update (#1423) @KyleFromNVIDIA
  • Refactor CUDA versions in dependencies.yaml. (#1422) @bdice
  • Require explicit pool size in pool_memory_resource and move some things out of detail namespace (#1417) @harrism
  • Update dependencies.yaml to support CUDA 12.*. (#1414) @bdice
  • Define python dependency range as a matrix fallback. (#1409) @bdice
  • Use latest cuda-python within CUDA major version. (#1406) @bdice
  • Update to CCCL 2.2.0. (#1404) @bdice
  • Remove RMM_BUILD_WHEELS and standardize Python builds (#1401) @vyasr
  • Update to fmt 10.1.1 and spdlog 1.12.0. (#1374) @bdice
  • Switch to scikit-build-core (#1287) @vyasr

v24.04.00a

3 months ago

🚨 Breaking Changes

  • Accept stream argument in DeviceMemoryResource allocate/deallocate (#1494) @wence-
  • Replace all internal usage of get_upstream with get_upstream_resource (#1491) @miscco
  • Deprecate rmm::mr::device_memory_resource::supports_streams() (#1452) @harrism
  • Remove deprecated rmm::detail::available_device_memory (#1438) @harrism
  • Make device_memory_resource::supports_streams() not pure virtual. Remove derived implementations and calls in RMM (#1437) @harrism
  • Deprecate rmm::mr::device_memory_resource::get_mem_info() and supports_get_mem_info(). (#1436) @harrism

🐛 Bug Fixes

  • Fix search path for torch allocator in editable installs and ensure CUDA support is available (#1498) @vyasr
  • Accept stream argument in DeviceMemoryResource allocate/deallocate (#1494) @wence-
  • Run STATISTICS_TEST and TRACKING_TEST in serial to avoid OOM errors. (#1487) @bdice

📖 Documentation

  • Pin to recent breathe, to prevent getting an unsupported sphinx version. (#1495) @bdice

🚀 New Features

  • Replace all internal usage of get_upstream with get_upstream_resource (#1491) @miscco
  • Add complete set of resource ref aliases (#1479) @nvdbaranec
  • Automate include grouping using clang-format (#1463) @harrism
  • Add get_upstream_resource to resource adaptors (#1456) @miscco
  • Deprecate rmm::mr::device_memory_resource::supports_streams() (#1452) @harrism
  • Remove duplicated memory_resource_tests (#1451) @miscco
  • Change rmm::exec_policy to take async_resource_ref (#1449) @miscco
  • Change device_scalar to take async_resource_ref (#1447) @miscco
  • Add device_async_resource_ref convenience alias (#1441) @harrism
  • Remove deprecated rmm::detail::available_device_memory (#1438) @harrism
  • Make device_memory_resource::supports_streams() not pure virtual. Remove derived implementations and calls in RMM (#1437) @harrism
  • Deprecate rmm::mr::device_memory_resource::get_mem_info() and supports_get_mem_info(). (#1436) @harrism
  • Support CUDA 12.2 (#1419) @jameslamb

🛠️ Improvements

  • Use conda env create --yes instead of --force (#1509) @bdice
  • Add upper bound to prevent usage of NumPy 2 (#1501) @bdice
  • Remove hard-coding of RAPIDS version where possible (#1496) @KyleFromNVIDIA
  • Require NumPy 1.23+ (#1488) @jakirkham
  • Use rmm::device_async_resource_ref in multi_stream_allocation benchmark (#1482) @miscco
  • Update devcontainers to CUDA Toolkit 12.2 (#1470) @trxcllnt
  • Add support for Python 3.11 (#1469) @jameslamb
  • Target branch-24.04 for GitHub Actions workflows (#1468) @jameslamb
  • Use std::optional instead of thrust::optional (#1464) @miscco
  • Add environment-agnostic scripts for running ctests and pytests (#1462) @trxcllnt
  • Ensure that ctest is called with --no-tests=error. (#1460) @bdice
  • Update ops-bot.yaml (#1458) @AyodeAwe
  • Adopt the rmm::device_async_resource_ref alias (#1454) @miscco
  • Refactor error.hpp out of detail (#1439) @lamarrr

v23.12.00

5 months ago

🚨 Breaking Changes

  • Document minimum CUDA version of 11.4 (#1385) @harrism
  • Store and set the correct CUDA device in device_buffer (#1370) @harrism
  • Use cuda::mr::memory_resource instead of raw device_memory_resource (#1095) @miscco

🐛 Bug Fixes

  • Update actions/labeler to v4 (#1397) @raydouglass
  • Backport arena MR fix for simultaneous access by PTDS and other streams (#1396) @bdice
  • Deliberately leak PTDS thread_local events in stream ordered mr (#1375) @wence-
  • Add missing CUDA 12 dependencies and fix dlopen library names (#1366) @vyasr

📖 Documentation

  • Document minimum CUDA version of 11.4 (#1385) @harrism
  • Fix more doxygen issues (#1367) @vyasr
  • Add groups to the doxygen docs (#1358) @vyasr
  • Enable doxygen XML and fix issues (#1348) @vyasr

🚀 New Features

  • Make internally stored default argument values public (#1373) @vyasr
  • Store and set the correct CUDA device in device_buffer (#1370) @harrism
  • Update rapids-cmake functions to non-deprecated signatures (#1357) @robertmaynard
  • Generate unified Python/C++ docs (#1324) @vyasr
  • Use cuda::mr::memory_resource instead of raw device_memory_resource (#1095) @miscco

🛠️ Improvements

  • Silence false gcc warning (#1381) @miscco
  • Build concurrency for nightly and merge triggers (#1380) @bdice
  • Update shared-action-workflows references (#1363) @AyodeAwe
  • Use branch-23.12 workflows. (#1360) @bdice
  • Update devcontainers to 23.12 (#1355) @raydouglass
  • Generate proper, consistent nightly versions for pip and conda packages (#1347) @vyasr
  • RMM: Build CUDA 12.0 ARM conda packages. (#1330) @bdice

v24.02.00a

5 months ago

🚨 Breaking Changes

  • Make device_memory_resource::do_get_mem_info() and supports_get_mem_info() not pure virtual. Remove derived implementations and calls in RMM (#1430) @harrism
  • Deprecate detail::available_device_memory, most detail/aligned.hpp utilities, and optional pool_memory_resource initial size (#1424) @harrism
  • Require explicit pool size in pool_memory_resource and move some things out of detail namespace (#1417) @harrism
  • Remove HTML builds of librmm (#1415) @vyasr
  • Update to CCCL 2.2.0. (#1404) @bdice
  • Switch to scikit-build-core (#1287) @vyasr

🐛 Bug Fixes

  • Update CODEOWNERS (#1410) @raydouglass
  • Correct signatures for torch allocator plug in (#1407) @wence-
  • Fix Arena MR to support simultaneous access by PTDS and other streams (#1395) @tgravescs
  • Fix else-after-throw clang tidy error (#1391) @harrism

📖 Documentation

  • Remove references to setup.py in docs (#1420) @jameslamb
  • Remove HTML builds of librmm (#1415) @vyasr
  • Update GPU support docs to drop Pascal (#1413) @harrism

🚀 New Features

  • Make device_memory_resource::do_get_mem_info() and supports_get_mem_info() not pure virtual. Remove derived implementations and calls in RMM (#1430) @harrism
  • Deprecate detail::available_device_memory, most detail/aligned.hpp utilities, and optional pool_memory_resource initial size (#1424) @harrism
  • Add a host-pinned memory resource that can be used as upstream for pool_memory_resource. (#1392) @harrism

🛠️ Improvements

  • Remove usages of rapids-env-update (#1423) @KyleFromNVIDIA
  • Refactor CUDA versions in dependencies.yaml. (#1422) @bdice
  • Require explicit pool size in pool_memory_resource and move some things out of detail namespace (#1417) @harrism
  • Update dependencies.yaml to support CUDA 12.*. (#1414) @bdice
  • Define python dependency range as a matrix fallback. (#1409) @bdice
  • Use latest cuda-python within CUDA major version. (#1406) @bdice
  • Update to CCCL 2.2.0. (#1404) @bdice
  • Remove RMM_BUILD_WHEELS and standardize Python builds (#1401) @vyasr
  • Update to fmt 10.1.1 and spdlog 1.12.0. (#1374) @bdice
  • Switch to scikit-build-core (#1287) @vyasr

v23.10.00

7 months ago

🚨 Breaking Changes

  • Update to Cython 3.0.0 (#1313) @vyasr

🐛 Bug Fixes

  • Compile cdef public functions from torch_allocator with C ABI (#1350) @wence-
  • Make doxygen only a conda dependency. (#1344) @bdice
  • Use conda mambabuild not mamba mambabuild (#1338) @wence-
  • Fix stream_ordered_memory_resource attempt to record event in stream from another device (#1333) @harrism

📖 Documentation

  • Clean up headers in CMakeLists.txt. (#1341) @bdice
  • Add pre-commit hook to validate doxygen (#1334) @vyasr
  • Fix doxygen warnings (#1317) @vyasr
  • Treat warnings as errors in Python documentation (#1316) @vyasr

🚀 New Features

  • Enable RMM Debug Logging via Python (#1339) @harrism

🛠️ Improvements

  • Update image names (#1346) @AyodeAwe
  • Update to clang 16.0.6. (#1343) @bdice
  • Update doxygen to 1.9.1 (#1337) @vyasr
  • Simplify wheel build scripts and allow alphas of RAPIDS dependencies (#1335) @divyegala
  • Use copy-pr-bot (#1329) @ajschmidt8
  • Add RMM devcontainers (#1328) @trxcllnt
  • Add Python bindings for limiting_resource_adaptor (#1327) @pentschev
  • Fix missing jQuery error in docs (#1321) @AyodeAwe
  • Use fetch_rapids.cmake. (#1319) @bdice
  • Update to Cython 3.0.0 (#1313) @vyasr
  • Branch 23.10 merge 23.08 (#1312) @vyasr
  • Branch 23.10 merge 23.08 (#1309) @vyasr

v23.12.00a

7 months ago

🚨 Breaking Changes

  • Store and set the correct CUDA device in device_buffer (#1370) @harrism
  • Use cuda::mr::memory_resource instead of raw device_memory_resource (#1095) @miscco

🐛 Bug Fixes

  • Deliberately leak PTDS thread_local events in stream ordered mr (#1375) @wence-
  • Add missing CUDA 12 dependencies and fix dlopen library names (#1366) @vyasr

📖 Documentation

  • Fix more doxygen issues (#1367) @vyasr
  • Add groups to the doxygen docs (#1358) @vyasr
  • Enable doxygen XML and fix issues (#1348) @vyasr

🚀 New Features

  • Make internally stored default argument values public (#1373) @vyasr
  • Store and set the correct CUDA device in device_buffer (#1370) @harrism
  • Update rapids-cmake functions to non-deprecated signatures (#1357) @robertmaynard
  • Generate unified Python/C++ docs (#1324) @vyasr
  • Use cuda::mr::memory_resource instead of raw device_memory_resource (#1095) @miscco

🛠️ Improvements

  • Silence false gcc warning (#1381) @miscco
  • Build concurrency for nightly and merge triggers (#1380) @bdice
  • Update shared-action-workflows references (#1363) @AyodeAwe
  • Use branch-23.12 workflows. (#1360) @bdice
  • Update devcontainers to 23.12 (#1355) @raydouglass
  • Generate proper, consistent nightly versions for pip and conda packages (#1347) @vyasr
  • RMM: Build CUDA 12.0 ARM conda packages. (#1330) @bdice

v23.08.00

9 months ago

🚨 Breaking Changes

  • Stop invoking setup.py (#1300) @vyasr
  • Remove now-deprecated top-level allocator functions (#1281) @wence-
  • Remove padding from device_memory_resource (#1278) @vyasr

🐛 Bug Fixes

  • Fix typo in wheels-test.yaml. (#1310) @bdice
  • Add a missing '#include <array>' in logger.hpp (#1295) @valgur
  • Use gbench thread_index() accessor to fix replay bench compilation (#1293) @harrism
  • Ensure logger tests don't generate temp directories in build dir (#1289) @robertmaynard

🚀 New Features

  • Remove now-deprecated top-level allocator functions (#1281) @wence-

🛠️ Improvements

  • Switch to new CI wheel building pipeline (#1305) @vyasr
  • Revert CUDA 12.0 CI workflows to branch-23.08. (#1303) @bdice
  • Update linters: remove flake8, add ruff, update cython-lint (#1302) @vyasr
  • Add minimum version requirement for identify (#1301) @hyperbolic2346
  • Stop invoking setup.py (#1300) @vyasr
  • Use cuda-version to constrain cudatoolkit. (#1296) @bdice
  • Update to CMake 3.26.4 (#1291) @vyasr
  • Use rapids-upload-docs script (#1288) @AyodeAwe
  • Reorder parameters in RMM_EXPECTS (#1286) @vyasr
  • Remove documentation build scripts for Jenkins (#1285) @ajschmidt8
  • Remove padding from device_memory_resource (#1278) @vyasr
  • Unpin scikit-build upper bound (#1275) @vyasr
  • RMM: Build CUDA 12 packages (#1223) @bdice

v23.10.00a

9 months ago

🚨 Breaking Changes

  • Update to Cython 3.0.0 (#1313) @vyasr

📖 Documentation

  • Treat warnings as errors in Python documentation (#1316) @vyasr

🛠️ Improvements

  • Add Python bindings for limiting_resource_adaptor (#1327) @pentschev
  • Fix missing jQuery error in docs (#1321) @AyodeAwe
  • Use fetch_rapids.cmake. (#1319) @bdice
  • Update to Cython 3.0.0 (#1313) @vyasr
  • Branch 23.10 merge 23.08 (#1312) @vyasr
  • Branch 23.10 merge 23.08 (#1309) @vyasr