Thrust

[ARCHIVED] The C++ parallel algorithms library. See https://github.com/NVIDIA/cccl

2.1.0

1 year ago

New Features

  • NVIDIA/thrust#1805: Add default constructors to transform_output_iterator and transform_input_output_iterator. Thanks to Mark Harris (@harrism) for this contribution.
  • NVIDIA/thrust#1836: Enable construction of vectors from std::initializer_list.

Bug Fixes

  • NVIDIA/thrust#1768: Fix type conversion warning in the thrust::complex utilities. Thanks to Zishi Wu (@zishiwu123) for this contribution.
  • NVIDIA/thrust#1809: Fix some warnings about usage of __host__ functions in __device__ code.
  • NVIDIA/thrust#1825: Fix Thrust’s CMake install rules. Thanks to Robert Maynard (@robertmaynard) for this contribution.
  • NVIDIA/thrust#1827: Fix thrust::reduce_by_key when using non-default-initializable iterators.
  • NVIDIA/thrust#1832: Fix bug in device-side CDP thrust::reduce when using a large number of inputs.

Other Enhancements

  • NVIDIA/thrust#1815: Update Thrust’s libcu++ git submodule to version 1.8.1.
  • NVIDIA/thrust#1841: Fix invalid code in execution policy documentation example. Thanks to Raphaël Frantz (@Eren121) for this contribution.
  • NVIDIA/thrust#1848: Improve error messages when attempting to launch a kernel on a device that is not supported by compiled PTX versions. Thanks to Zahra Khatami (@zkhatami) for this contribution.
  • NVIDIA/thrust#1855: Remove usage of deprecated CUDA error codes.

2.0.1

1 year ago

Other Enhancements

  • Disable CDP parallelization of device-side invocations of Thrust algorithms on SM90+. The removal of device-side synchronization support in recent architectures makes Thrust’s fork-join model unimplementable on device, so a serial implementation will be used instead. Host-side invocations of Thrust algorithms are not affected.

1.17.2

1 year ago

Summary

Thrust 1.17.2 is a minor bugfix release that provides an updated version of CUB.

2.0.0

1 year ago

Summary

The Thrust 2.0.0 major release adds a dependency on libcu++ and contains several breaking changes. These include new diagnostics when inspecting device-only lambdas from the host, removal of the cub symlink in the Thrust repository root, and removal of the deprecated THRUST_*_BACKEND macros. It also includes several minor bugfixes and cleanups.

Breaking Changes

  • NVIDIA/thrust#1605: Add libcu++ dependency.
    • A suitable version of libcu++ is provided through the ${THRUST_ROOT}/dependencies/libcudacxx/ submodule.
    • Non-CMake users may need to add the libcu++ include path to their builds (-I ${THRUST_ROOT}/dependencies/libcudacxx/include/).
    • The Thrust CMake packages have been updated to add this include path.
  • NVIDIA/thrust#1605: The following macros are no longer defined by default. They can be re-enabled by defining THRUST_PROVIDE_LEGACY_ARCH_MACROS. These will be removed completely in a future release.
    • THRUST_IS_HOST_CODE: Replace with NV_IF_TARGET.
    • THRUST_IS_DEVICE_CODE: Replace with NV_IF_TARGET.
    • THRUST_INCLUDE_HOST_CODE: Replace with NV_IF_TARGET.
    • THRUST_INCLUDE_DEVICE_CODE: Replace with NV_IF_TARGET.
    • THRUST_DEVICE_CODE: Replace with NV_IF_TARGET.
  • NVIDIA/thrust#1661: Thrust’s CUDA Runtime support macros have been updated to support NV_IF_TARGET. They are now defined consistently across all host/device compilation passes. This should not affect most usages of these macros, but may require changes for some edge cases.
    • THRUST_RUNTIME_FUNCTION: Execution space annotations for functions that invoke CUDA Runtime APIs.
      • Old behavior:
        • RDC enabled: Defined to __host__ __device__
        • RDC not enabled:
          • NVCC host pass: Defined to __host__ __device__
          • NVCC device pass: Defined to __host__
      • New behavior:
        • RDC enabled: Defined to __host__ __device__
        • RDC not enabled: Defined to __host__
    • __THRUST_HAS_CUDART__: No change in behavior, but no longer used in Thrust. Provided for legacy support only. Legacy behavior:
      • RDC enabled: Defined to 1.
      • RDC not enabled:
        • NVCC host pass: Defined to 1.
        • NVCC device pass: Defined to 0.
    • THRUST_RDC_ENABLED: New macro, may be combined with NV_IF_TARGET to replace most usages of __THRUST_HAS_CUDART__. Behavior:
      • RDC enabled: Macro is defined.
      • RDC not enabled: Macro is not defined.
  • NVIDIA/thrust#1701: Remove the cub symlink from the root of the Thrust repository.
    • This symlink caused issues in certain build environments (e.g. NVIDIA/thrust#1328).
    • Builds that relied on this symlink will need to add the full CUB include path (-I ${THRUST_ROOT}/dependencies/cub).
    • CMake builds that use the Thrust packages via CPM, add_subdirectory, or find_package are not affected.
  • NVIDIA/thrust#1760: A compile-time error is now emitted when a __device__-only lambda’s return type is queried from host code (requires libcu++ ≥ 1.9.0).
    • Due to limitations in the CUDA programming model, the result of this query is unreliable and can silently be incorrect, which leads to errors that are difficult to debug.
    • When using libcu++ 1.9.0 or later, an error is emitted with information about work-arounds:
      • Use a named function object with a __device__-only implementation of operator().
      • Use a __host__ __device__ lambda.
      • Use cuda::proclaim_return_type (Added in libcu++ 1.9.0)
  • NVIDIA/thrust#1761: Removed support for deprecated THRUST_DEVICE_BACKEND and THRUST_HOST_BACKEND macros. The THRUST_DEVICE_SYSTEM and THRUST_HOST_SYSTEM macros should be used instead.
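
As a rough illustration of the first work-around listed under NVIDIA/thrust#1760, the sketch below uses a named function object whose return type can be queried from host code with std::invoke_result_t (C++17). The names SquareOp, SquareResult, and SKETCH_DEVICE are hypothetical. The guard macro is an assumption added so that the snippet also builds with a plain host compiler; under NVCC it expands to __device__.

```cpp
#include <type_traits>

// Hypothetical guard macro (not part of Thrust): expands to __device__ under
// NVCC and to nothing otherwise, so the sketch also builds as plain C++.
#if defined(__CUDACC__)
#  define SKETCH_DEVICE __device__
#else
#  define SKETCH_DEVICE
#endif

// Named function object. Its call operator has an ordinary declaration, so
// its return type is visible to host-side type queries.
struct SquareOp
{
  SKETCH_DEVICE int operator()(int x) const { return x * x; }
};

// Host-side query of the return type; resolves to int.
using SquareResult = std::invoke_result_t<SquareOp, int>;
static_assert(std::is_same<SquareResult, int>::value,
              "SquareOp is expected to return int");
```

Unlike a __device__-only lambda, a named type like this gives the compiler a regular declaration to inspect. When a lambda is preferred, the notes above point to a __host__ __device__ lambda or cuda::proclaim_return_type instead.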

Bug Fixes

  • NVIDIA/thrust#1605: Fix some execution space warnings in the allocator library.
  • NVIDIA/thrust#1683: Fix bug in iterator_category_to_traversal metafunctions.
  • NVIDIA/thrust#1715: Add missing __thrust_exec_check_disable__ annotation to thrust::make_zip_function. Thanks to @mfbalin for this contribution.
  • NVIDIA/thrust#1722: Remove CUDA-specific error handler from code that may be executed on non-CUDA backends. Thanks to @dkolsen-pgi for this contribution.
  • NVIDIA/thrust#1756: Fix copy_if for output iterators that don’t support copy assignment. Thanks to @mfbalin for this contribution.

Other Enhancements

  • NVIDIA/thrust#1605: Removed special case code for unsupported CUDA architectures.
  • NVIDIA/thrust#1605: Replace several usages of __CUDA_ARCH__ with <nv/target> to handle host/device code divergence.
  • NVIDIA/thrust#1752: Remove a leftover merge conflict from a documentation file. Thanks to @tabedzki for this contribution.

2.0.0-rc2

1 year ago

Summary

The Thrust 2.0.0 major release adds a dependency on libcu++ and contains several breaking changes. These include new diagnostics when inspecting device-only lambdas from the host, removal of the cub symlink in the Thrust repository root, and removal of the deprecated THRUST_*_BACKEND macros. It also includes several minor bugfixes and cleanups.

Breaking Changes

  • NVIDIA/thrust#1605: Add libcu++ dependency.
    • A suitable version of libcu++ is provided through the ${THRUST_ROOT}/dependencies/libcudacxx/ submodule.
    • Non-CMake users may need to add the libcu++ include path to their builds (-I ${THRUST_ROOT}/dependencies/libcudacxx/include/).
    • The Thrust CMake packages have been updated to add this include path.
  • NVIDIA/thrust#1605: The following macros are no longer defined by default. They can be re-enabled by defining THRUST_PROVIDE_LEGACY_ARCH_MACROS. These will be removed completely in a future release.
    • THRUST_IS_HOST_CODE: Replace with NV_IF_TARGET.
    • THRUST_IS_DEVICE_CODE: Replace with NV_IF_TARGET.
    • THRUST_INCLUDE_HOST_CODE: Replace with NV_IF_TARGET.
    • THRUST_INCLUDE_DEVICE_CODE: Replace with NV_IF_TARGET.
    • THRUST_DEVICE_CODE: Replace with NV_IF_TARGET.
  • NVIDIA/thrust#1661: Thrust’s CUDA Runtime support macros have been updated to support NV_IF_TARGET. They are now defined consistently across all host/device compilation passes. This should not affect most usages of these macros, but may require changes for some edge cases.
    • THRUST_RUNTIME_FUNCTION: Execution space annotations for functions that invoke CUDA Runtime APIs.
      • Old behavior:
        • RDC enabled: Defined to __host__ __device__
        • RDC not enabled:
          • NVCC host pass: Defined to __host__ __device__
          • NVCC device pass: Defined to __host__
      • New behavior:
        • RDC enabled: Defined to __host__ __device__
        • RDC not enabled: Defined to __host__
    • __THRUST_HAS_CUDART__: No change in behavior, but no longer used in Thrust. Provided for legacy support only. Legacy behavior:
      • RDC enabled: Defined to 1.
      • RDC not enabled:
        • NVCC host pass: Defined to 1.
        • NVCC device pass: Defined to 0.
    • THRUST_RDC_ENABLED: New macro, may be combined with NV_IF_TARGET to replace most usages of __THRUST_HAS_CUDART__. Behavior:
      • RDC enabled: Macro is defined.
      • RDC not enabled: Macro is not defined.
  • NVIDIA/thrust#1701: Remove the cub symlink from the root of the Thrust repository.
    • This symlink caused issues in certain build environments (e.g. NVIDIA/thrust#1328).
    • Builds that relied on this symlink will need to add the full CUB include path (-I ${THRUST_ROOT}/dependencies/cub).
    • CMake builds that use the Thrust packages via CPM, add_subdirectory, or find_package are not affected.
  • NVIDIA/thrust#1760: A compile-time error is now emitted when a __device__-only lambda’s return type is queried from host code (requires libcu++ ≥ 1.9.0).
    • Due to limitations in the CUDA programming model, the result of this query is unreliable and can silently be incorrect, which leads to errors that are difficult to debug.
    • When using libcu++ 1.9.0 or later, an error is emitted with information about work-arounds:
      • Use a named function object with a __device__-only implementation of operator().
      • Use a __host__ __device__ lambda.
      • Use cuda::proclaim_return_type (Added in libcu++ 1.9.0)
  • NVIDIA/thrust#1761: Removed support for deprecated THRUST_DEVICE_BACKEND and THRUST_HOST_BACKEND macros. The THRUST_DEVICE_SYSTEM and THRUST_HOST_SYSTEM macros should be used instead.

Bug Fixes

  • NVIDIA/thrust#1605: Fix some execution space warnings in the allocator library.
  • NVIDIA/thrust#1683: Fix bug in iterator_category_to_traversal metafunctions.
  • NVIDIA/thrust#1715: Add missing __thrust_exec_check_disable__ annotation to thrust::make_zip_function. Thanks to @mfbalin for this contribution.
  • NVIDIA/thrust#1722: Remove CUDA-specific error handler from code that may be executed on non-CUDA backends. Thanks to @dkolsen-pgi for this contribution.
  • NVIDIA/thrust#1756: Fix copy_if for output iterators that don’t support copy assignment. Thanks to @mfbalin for this contribution.

Other Enhancements

  • NVIDIA/thrust#1605: Removed special case code for unsupported CUDA architectures.
  • NVIDIA/thrust#1605: Replace several usages of __CUDA_ARCH__ with <nv/target> to handle host/device code divergence.
  • NVIDIA/thrust#1752: Remove a leftover merge conflict from a documentation file. Thanks to @tabedzki for this contribution.

2.0.0-rc0

1 year ago

Summary

The Thrust 2.0.0 major release adds a dependency on libcu++ and contains several breaking changes. These include new diagnostics when inspecting device-only lambdas from the host, removal of the cub symlink in the Thrust repository root, and removal of the deprecated THRUST_*_BACKEND macros. It also includes several minor bugfixes and cleanups.

Breaking Changes

  • NVIDIA/thrust#1605: Add libcu++ dependency.
    • A suitable version of libcu++ is provided through the ${THRUST_ROOT}/dependencies/libcudacxx/ submodule.
    • Non-CMake users may need to add the libcu++ include path to their builds (-I ${THRUST_ROOT}/dependencies/libcudacxx/include/).
    • The Thrust CMake packages have been updated to add this include path.
  • NVIDIA/thrust#1605: The following macros are no longer defined by default. They can be re-enabled by defining THRUST_PROVIDE_LEGACY_ARCH_MACROS. These will be removed completely in a future release.
    • THRUST_IS_HOST_CODE: Replace with NV_IF_TARGET.
    • THRUST_IS_DEVICE_CODE: Replace with NV_IF_TARGET.
    • THRUST_INCLUDE_HOST_CODE: Replace with NV_IF_TARGET.
    • THRUST_INCLUDE_DEVICE_CODE: Replace with NV_IF_TARGET.
    • THRUST_DEVICE_CODE: Replace with NV_IF_TARGET.
  • NVIDIA/thrust#1661: Thrust’s CUDA Runtime support macros have been updated to support NV_IF_TARGET. They are now defined consistently across all host/device compilation passes. This should not affect most usages of these macros, but may require changes for some edge cases.
    • THRUST_RUNTIME_FUNCTION: Execution space annotations for functions that invoke CUDA Runtime APIs.
      • Old behavior:
        • RDC enabled: Defined to __host__ __device__
        • RDC not enabled:
          • NVCC host pass: Defined to __host__ __device__
          • NVCC device pass: Defined to __host__
      • New behavior:
        • RDC enabled: Defined to __host__ __device__
        • RDC not enabled: Defined to __host__
    • __THRUST_HAS_CUDART__: No change in behavior, but no longer used in Thrust. Provided for legacy support only. Legacy behavior:
      • RDC enabled: Defined to 1.
      • RDC not enabled:
        • NVCC host pass: Defined to 1.
        • NVCC device pass: Defined to 0.
    • THRUST_RDC_ENABLED: New macro, may be combined with NV_IF_TARGET to replace most usages of __THRUST_HAS_CUDART__. Behavior:
      • RDC enabled: Macro is defined.
      • RDC not enabled: Macro is not defined.
  • NVIDIA/thrust#1701: Remove the cub symlink from the root of the Thrust repository.
    • This symlink caused issues in certain build environments (e.g. NVIDIA/thrust#1328).
    • Builds that relied on this symlink will need to add the full CUB include path (-I ${THRUST_ROOT}/dependencies/cub).
    • CMake builds that use the Thrust packages via CPM, add_subdirectory, or find_package are not affected.
  • NVIDIA/thrust#1760: A compile-time error is now emitted when a __device__-only lambda’s return type is queried from host code (requires libcu++ ≥ 1.9.0).
    • Due to limitations in the CUDA programming model, the result of this query is unreliable and can silently be incorrect, which leads to errors that are difficult to debug.
    • When using libcu++ 1.9.0 or later, an error is emitted with information about work-arounds:
      • Use a named function object with a __device__-only implementation of operator().
      • Use a __host__ __device__ lambda.
      • Use cuda::proclaim_return_type (Added in libcu++ 1.9.0)
  • NVIDIA/thrust#1761: Removed support for deprecated THRUST_DEVICE_BACKEND and THRUST_HOST_BACKEND macros. The THRUST_DEVICE_SYSTEM and THRUST_HOST_SYSTEM macros should be used instead.

Bug Fixes

  • NVIDIA/thrust#1605: Fix some execution space warnings in the allocator library.
  • NVIDIA/thrust#1683: Fix bug in iterator_category_to_traversal metafunctions.
  • NVIDIA/thrust#1715: Add missing __thrust_exec_check_disable__ annotation to thrust::make_zip_function. Thanks to @mfbalin for this contribution.
  • NVIDIA/thrust#1722: Remove CUDA-specific error handler from code that may be executed on non-CUDA backends. Thanks to @dkolsen-pgi for this contribution.
  • NVIDIA/thrust#1756: Fix copy_if for output iterators that don’t support copy assignment. Thanks to @mfbalin for this contribution.

Other Enhancements

  • NVIDIA/thrust#1605: Removed special case code for unsupported CUDA architectures.
  • NVIDIA/thrust#1605: Replace several usages of __CUDA_ARCH__ with <nv/target> to handle host/device code divergence.
  • NVIDIA/thrust#1752: Remove a leftover merge conflict from a documentation file. Thanks to @tabedzki for this contribution.

1.17.1

1 year ago

Summary

Thrust 1.17.1 is a minor bugfix release that provides an updated version of CUB.

1.17.0-rc2

1 year ago

Thrust 1.17.0

Summary

Thrust 1.17.0 is the final minor release of the 1.X series. This release provides GDB pretty-printers for device vectors/references, a new unique_count algorithm, and an easier way to create tagged Thrust iterators. Several documentation fixes are included, which can be found on the new Thrust documentation site at https://nvidia.github.io/thrust. We’ll be migrating existing documentation sources to this new location over the next few months.

New Features

  • NVIDIA/thrust#1586: Add new thrust::make_tagged_iterator convenience function. Thanks to @karthikeyann for this contribution.
  • NVIDIA/thrust#1619: Add unique_count algorithm. Thanks to @upsj for this contribution.
  • NVIDIA/thrust#1631: Add GDB pretty-printers for device vectors/references to scripts/gdb-pretty-printers.py. Thanks to @upsj for this contribution.

Bug Fixes

  • NVIDIA/thrust#1671: Fixed reduce_by_key when called with 2^31 elements.

Other Enhancements

  • NVIDIA/thrust#1512: Use CUB to implement adjacent_difference.
  • NVIDIA/thrust#1555: Use CUB to implement scan_by_key.
  • NVIDIA/thrust#1611: Add new doxybook-based Thrust documentation at https://nvidia.github.io/thrust.
  • NVIDIA/thrust#1639: Fixed broken link in documentation. Thanks to @jrhemstad for this contribution.
  • NVIDIA/thrust#1644: Increase contrast of search input text in new doc site. Thanks to @bdice for this contribution.
  • NVIDIA/thrust#1647: Add __forceinline__ annotations to a functor wrapper. Thanks to @mkuron for this contribution.
  • NVIDIA/thrust#1660: Fixed typo in documentation example for permutation_iterator.
  • NVIDIA/thrust#1669: Add a new explicit_cuda_stream.cu example that shows how to use explicit CUDA streams and par/par_nosync execution policies.

1.17.0

1 year ago

Thrust 1.17.0

Summary

Thrust 1.17.0 is the final minor release of the 1.X series. This release provides GDB pretty-printers for device vectors/references, a new unique_count algorithm, and an easier way to create tagged Thrust iterators. Several documentation fixes are included, which can be found on the new Thrust documentation site at https://nvidia.github.io/thrust. We’ll be migrating existing documentation sources to this new location over the next few months.

New Features

  • NVIDIA/thrust#1586: Add new thrust::make_tagged_iterator convenience function. Thanks to @karthikeyann for this contribution.
  • NVIDIA/thrust#1619: Add unique_count algorithm. Thanks to @upsj for this contribution.
  • NVIDIA/thrust#1631: Add GDB pretty-printers for device vectors/references to scripts/gdb-pretty-printers.py. Thanks to @upsj for this contribution.
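
To make the semantics of the new unique_count algorithm concrete, here is a host-only reference sketch in standard C++. It is not Thrust's implementation. The function name count_runs is hypothetical, and the sketch assumes that unique_count returns the number of runs of consecutive equal elements, as its documentation describes.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical host-only reference: counts runs of consecutive equal
// elements, i.e. how many elements would survive a std::unique pass.
template <typename T>
std::size_t count_runs(const std::vector<T>& values)
{
  if (values.empty()) { return 0; }
  std::size_t runs = 1; // the first element always starts a run
  for (std::size_t i = 1; i < values.size(); ++i)
  {
    if (!(values[i] == values[i - 1])) { ++runs; } // a new run starts here
  }
  return runs;
}
```

For the input {1, 1, 2, 2, 2, 3, 1} this sketch returns 4, because the trailing 1 starts a new run rather than merging with the leading ones. With Thrust, the corresponding call would be thrust::unique_count(v.begin(), v.end()) on a host or device vector.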

Bug Fixes

  • NVIDIA/thrust#1671: Fixed reduce_by_key when called with 2^31 elements.

Other Enhancements

  • NVIDIA/thrust#1512: Use CUB to implement adjacent_difference.
  • NVIDIA/thrust#1555: Use CUB to implement scan_by_key.
  • NVIDIA/thrust#1611: Add new doxybook-based Thrust documentation at https://nvidia.github.io/thrust.
  • NVIDIA/thrust#1639: Fixed broken link in documentation. Thanks to @jrhemstad for this contribution.
  • NVIDIA/thrust#1644: Increase contrast of search input text in new doc site. Thanks to @bdice for this contribution.
  • NVIDIA/thrust#1647: Add __forceinline__ annotations to a functor wrapper. Thanks to @mkuron for this contribution.
  • NVIDIA/thrust#1660: Fixed typo in documentation example for permutation_iterator.
  • NVIDIA/thrust#1669: Add a new explicit_cuda_stream.cu example that shows how to use explicit CUDA streams and par/par_nosync execution policies.

1.17.0-rc0

1 year ago

Thrust 1.17.0

Summary

Thrust 1.17.0 is the final minor release of the 1.X series. This release provides GDB pretty-printers for device vectors/references, a new unique_count algorithm, and an easier way to create tagged Thrust iterators. Several documentation fixes are included, which can be found on the new Thrust documentation site at https://nvidia.github.io/thrust. We’ll be migrating existing documentation sources to this new location over the next few months.

New Features

  • NVIDIA/thrust#1586: Add new thrust::make_tagged_iterator convenience function. Thanks to @karthikeyann for this contribution.
  • NVIDIA/thrust#1619: Add unique_count algorithm. Thanks to @upsj for this contribution.
  • NVIDIA/thrust#1631: Add GDB pretty-printers for device vectors/references to scripts/gdb-pretty-printers.py. Thanks to @upsj for this contribution.

Bug Fixes

  • NVIDIA/thrust#1671: Fixed reduce_by_key when called with 2^31 elements.

Other Enhancements

  • NVIDIA/thrust#1512: Use CUB to implement adjacent_difference.
  • NVIDIA/thrust#1555: Use CUB to implement scan_by_key.
  • NVIDIA/thrust#1611: Add new doxybook-based Thrust documentation at https://nvidia.github.io/thrust.
  • NVIDIA/thrust#1639: Fixed broken link in documentation. Thanks to @jrhemstad for this contribution.
  • NVIDIA/thrust#1644: Increase contrast of search input text in new doc site. Thanks to @bdice for this contribution.
  • NVIDIA/thrust#1647: Add __forceinline__ annotations to a functor wrapper. Thanks to @mkuron for this contribution.
  • NVIDIA/thrust#1660: Fixed typo in documentation example for permutation_iterator.
  • NVIDIA/thrust#1669: Add a new explicit_cuda_stream.cu example that shows how to use explicit CUDA streams and par/par_nosync execution policies.