OneDPL Versions

oneAPI DPC++ Library (oneDPL) https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/dpc-library.html

oneDPL-2022.5.0-rc1

1 month ago

New Features

  • Added new histogram algorithms for generating a histogram from an input sequence into an output sequence representing either equally spaced or user-defined bins. These algorithms are currently only available for device execution policies.
  • Added support for zip_iterator in the transform algorithm.
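
As an illustration of the equally spaced case, the following serial C++ sketch counts values into evenly sized bins. It is a plain standard-library reference, not the oneDPL API: the function name histogram_even, its parameters, the half-open range [lo, hi), and the choice to skip out-of-range values are all assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Serial reference for a histogram with equally spaced bins over [lo, hi).
// Hypothetical helper for illustration only; not the oneDPL signature.
std::vector<std::size_t> histogram_even(const std::vector<double>& input,
                                        std::size_t num_bins,
                                        double lo, double hi)
{
    std::vector<std::size_t> bins(num_bins, 0);
    if (num_bins == 0 || !(lo < hi)) {
        return bins;  // nothing to count into
    }
    const double width = (hi - lo) / static_cast<double>(num_bins);
    for (double x : input) {
        if (x < lo || x >= hi) {
            continue;  // assumption: values outside [lo, hi) are skipped
        }
        std::size_t b = static_cast<std::size_t>((x - lo) / width);
        if (b >= num_bins) {
            b = num_bins - 1;  // guard against rounding at the upper edge
        }
        ++bins[b];
    }
    return bins;
}
```

The user-defined-bin variant would replace the width computation with a search over a sorted list of bin boundaries; it is not shown here.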

Fixed Issues

  • Fixed handling of permutation_iterator as input to oneDPL algorithms for a variety of source iterator and permutation types that previously caused issues.
  • Fixed zip_iterator to be SYCL device copyable for trivially copyable source iterator types.
  • Added a workaround for reduction algorithm failures with 64-bit data types. Define the ONEDPL_WORKAROUND_FOR_IGPU_64BIT_REDUCTION macro to 1 before including oneDPL header files.
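
The macro workaround above could be applied as in the following fragment. The macro name and value come from the note; the two headers are only examples of oneDPL headers a program might include, and the key point is the ordering.

```cpp
// Must appear before any oneDPL header is included.
#define ONEDPL_WORKAROUND_FOR_IGPU_64BIT_REDUCTION 1

// Example oneDPL headers; include whichever ones your program actually needs.
#include <oneapi/dpl/execution>
#include <oneapi/dpl/numeric>
```

Passing the definition on the compiler command line (for example with a -D option) should have the same effect, assuming the build system applies it to every translation unit that includes oneDPL.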

New Known Issues and Limitations

  • Crashes or incorrect results may occur when using oneapi::dpl::reverse_iterator or std::reverse_iterator as input to oneDPL algorithms with device execution policies.

oneDPL-2022.4.0-rc1

2 months ago

New Features

  • Added experimental radix_sort and radix_sort_by_key algorithms residing in the oneapi::dpl::experimental::kt::esimd namespace. These algorithms are the first in a family of kernel templates that allow configuring a variety of parameters, including the number of elements processed by a work item and the size of a workgroup. The algorithms work only with Intel® Data Center GPU Max Series.
  • Added new transform_if algorithm for applying a transform function conditionally based on a predicate, with overloads provided for one and two input sequences that use correspondingly unary and binary operations and predicates.
  • Optimizations used with Intel® oneAPI DPC++/C++ Compiler are expanded to the open source oneAPI DPC++ compiler.
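
The single-input form of transform_if described above can be pictured with this serial sketch in standard C++. The name transform_if_sketch is hypothetical, and the behavior for elements that fail the predicate (the output slot is left untouched) is an assumption of this example rather than something stated in these notes.

```cpp
#include <cassert>
#include <vector>

// Serial sketch of a conditional transform over one input sequence.
// Assumption: output positions whose input fails the predicate are not written.
template <typename InputIt, typename OutputIt, typename UnaryOp, typename Pred>
OutputIt transform_if_sketch(InputIt first, InputIt last, OutputIt out,
                             UnaryOp op, Pred pred)
{
    for (; first != last; ++first, ++out) {
        if (pred(*first)) {
            *out = op(*first);  // transform only the elements that pass
        }
    }
    return out;
}
```

The two-input overload would follow the same pattern with a binary operation and a binary predicate applied to corresponding elements of both sequences.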

New Known Issues and Limitations

  • esimd::radix_sort and esimd::radix_sort_by_key kernel templates fail to compile when a program is built with the -g, -O0, or -O1 compiler option.
  • esimd::radix_sort_by_key kernel template produces wrong results with the following combinations of kernel_param and types of keys and values:
    • sizeof(key_type) + sizeof(val_type) == 12, kernel_param::workgroup_size == 64, and kernel_param::data_per_workitem == 96
    • sizeof(key_type) + sizeof(val_type) == 16, kernel_param::workgroup_size == 64, and kernel_param::data_per_workitem == 64
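
A program that lets users tune these parameters might guard against the two combinations listed above. The helper below is a hypothetical sketch that simply encodes those conditions; its name and parameter list are not part of oneDPL.

```cpp
#include <cassert>
#include <cstddef>

// Returns true when the parameters match one of the two combinations
// reported to produce wrong results in esimd::radix_sort_by_key.
// Hypothetical helper, written only from the conditions in the release note.
bool hits_known_radix_sort_by_key_issue(std::size_t key_size,
                                        std::size_t val_size,
                                        std::size_t workgroup_size,
                                        std::size_t data_per_workitem)
{
    const std::size_t pair_size = key_size + val_size;
    if (workgroup_size != 64) {
        return false;  // both reported cases use a workgroup size of 64
    }
    return (pair_size == 12 && data_per_workitem == 96) ||
           (pair_size == 16 && data_per_workitem == 64);
}
```

For example, an 8-byte key with an 8-byte value, a workgroup size of 64, and 64 elements per work item falls into the second case.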

oneDPL-2022.3.0-rc1

5 months ago

New Features

  • Added an experimental feature to dynamically select an execution context, e.g., a SYCL queue. The feature provides selection functions such as select, submit and submit_and_wait, and several selection policies: fixed_resource_policy, round_robin_policy, dynamic_load_policy, and auto_tune_policy.
  • unseq and par_unseq policies now also enable vectorization for Intel® oneAPI DPC++/C++ Compiler.
  • Added support for passing zip iterators as segment value data in reduce_by_segment, exclusive_scan_by_segment, and inclusive_scan_by_segment.
  • Improved performance of the merge, sort, stable_sort, sort_by_key, reduce, min_element, max_element, minmax_element, is_partitioned, and lexicographical_compare algorithms with DPC++ execution policies.
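
To give a feel for what a selection policy does, here is a minimal standard C++ sketch of round-robin selection over a fixed set of resources. It is not the oneDPL implementation or API: the class name round_robin_sketch and its select member are hypothetical, the resource type is generic rather than a SYCL queue, and the sketch is not thread-safe.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Illustrative round-robin selector: hands out resources in cyclic order.
// Hypothetical class; not thread-safe and unrelated to oneDPL internals.
template <typename Resource>
class round_robin_sketch {
public:
    explicit round_robin_sketch(std::vector<Resource> resources)
        : resources_(std::move(resources))
    {
        if (resources_.empty()) {
            throw std::invalid_argument("at least one resource is required");
        }
    }

    // Returns the next resource and advances the cursor, wrapping at the end.
    const Resource& select()
    {
        const Resource& chosen = resources_[next_];
        next_ = (next_ + 1) % resources_.size();
        return chosen;
    }

private:
    std::vector<Resource> resources_;
    std::size_t next_ = 0;
};
```

A fixed-resource selector would always return the same element, while load-based or auto-tuning selectors would need runtime feedback that this sketch does not model.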

Fixed Issues

  • Fixed the reduce_async function to not ignore the provided binary operation.

New Known Issues and Limitations

  • When compiled with -fsycl-pstl-offload option of Intel® oneAPI DPC++/C++ compiler and with libstdc++ version 8 or libc++, oneapi::dpl::execution::par_unseq offloads standard parallel algorithms to the SYCL device similarly to std::execution::par_unseq in accordance with the -fsycl-pstl-offload option value.
  • When using the dpl modulefile to initialize the user's environment and compiling with the -fsycl-pstl-offload option of Intel® oneAPI DPC++/C++ compiler, a linking issue or program crash may be encountered because the directory containing libpstloffload.so is not included in the search path. Use env/vars.sh to configure the working environment to avoid the issue.
  • Compilation issues may be encountered when passing zip iterators to exclusive_scan_by_segment on Windows.
  • Incorrect results may be produced by set_intersection with a DPC++ execution policy, where elements are copied from the second input range rather than the first input range.
  • For transform_exclusive_scan and exclusive_scan to run in-place (that is, with the same data used for both input and destination) and with an execution policy of unseq or par_unseq, it is required that the provided input and destination iterators are equality comparable. Furthermore, the equality comparison of the input and destination iterator must evaluate to true. If these conditions are not met, the result of these algorithm calls is undefined.
  • sort, stable_sort, sort_by_key, and partial_sort_copy algorithms may work incorrectly or cause a segmentation fault when used with a DPC++ execution policy for a CPU device and built on Linux with Intel® oneAPI DPC++/C++ Compiler and the -O0 -g compiler options. To avoid the issue, pass the -fsycl-device-code-split=per_kernel option to the compiler.
  • Incorrect results may be produced by exclusive_scan, inclusive_scan, transform_exclusive_scan, transform_inclusive_scan, exclusive_scan_by_segment, inclusive_scan_by_segment, and reduce_by_segment with the unseq or par_unseq policy when compiled by Intel® oneAPI DPC++/C++ Compiler with the -fiopenmp, -fiopenmp-simd, -qopenmp, or -qopenmp-simd option on Linux. To avoid the issue, pass the -fopenmp or -fopenmp-simd option instead.
  • Incorrect results may be produced by reduce and transform_reduce with 64-bit types and std::multiplies, sycl::multiplies operations when compiled by Intel® C++ Compiler 2021.3 and newer and executed on GPU devices.
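
The in-place scan requirement noted in this list is easier to appreciate with a serial sketch. When input and destination are the same storage, each element has to be read before its slot is overwritten. The function below is a hand-written standard C++ illustration with a hypothetical name; it does not show how oneDPL detects or handles the in-place case.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Serial in-place exclusive scan:
// data[i] becomes init + data[0] + ... + data[i-1] (original values).
// Hypothetical helper for illustration only.
template <typename T>
void exclusive_scan_in_place(std::vector<T>& data, T init)
{
    T running = init;
    for (std::size_t i = 0; i < data.size(); ++i) {
        const T current = data[i];  // read the input before the slot is overwritten
        data[i] = running;
        running = running + current;
    }
}
```

If the write happened before the read, every output after the first would be computed from already overwritten data, which is the kind of hazard an in-place code path has to avoid.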

oneDPL-2022.2.0-rc1

9 months ago

New Features

  • Added sort_by_key algorithm for key-value sorting.
  • Improved performance of the reduce, min_element, max_element, minmax_element, is_partitioned, and lexicographical_compare algorithms with DPC++ execution policies.
  • Improved performance of the reduce_by_segment, inclusive_scan_by_segment, and exclusive_scan_by_segment algorithms for binary operators with known identities when using DPC++ execution policies.
  • Added value_type to all views in oneapi::dpl::experimental::ranges.
  • Extended oneapi::dpl::experimental::ranges::sort to support projections applied to the range elements prior to comparison.
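
Key-value sorting, as provided by sort_by_key, reorders a value sequence according to the sorted order of a separate key sequence. The following serial sketch shows that idea with the standard library only. The name sort_by_key_sketch and the vector-based interface are assumptions of this example, and it makes no claim about the stability or the signature of the oneDPL algorithm.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Sorts keys ascending and applies the same permutation to values.
// Hypothetical helper; std::sort is not stable, so the relative order
// of values with equal keys is unspecified in this sketch.
template <typename K, typename V>
void sort_by_key_sketch(std::vector<K>& keys, std::vector<V>& values)
{
    assert(keys.size() == values.size());

    // Build the permutation that sorts the keys.
    std::vector<std::size_t> order(keys.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::sort(order.begin(), order.end(),
              [&keys](std::size_t a, std::size_t b) { return keys[a] < keys[b]; });

    // Apply the permutation to both sequences.
    std::vector<K> sorted_keys;
    std::vector<V> sorted_values;
    sorted_keys.reserve(keys.size());
    sorted_values.reserve(values.size());
    for (std::size_t i : order) {
        sorted_keys.push_back(keys[i]);
        sorted_values.push_back(values[i]);
    }
    keys.swap(sorted_keys);
    values.swap(sorted_values);
}
```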

Fixed Issues

  • The minimally required CMake version is raised to 3.11 on Linux and 3.20 on Windows.
  • Added new CMake package oneDPLIntelLLVMConfig.cmake to resolve issues using CMake 3.20+ on Windows for icx and icx-cl.
  • Fixed an error in the sort and stable_sort algorithms when performing a descending sort on signed numeric types with negative values.
  • Fixed an error in reduce_by_segment algorithm when a non-commutative predicate is used.
  • Fixed an error in sort and stable_sort algorithms for integral types wider than 4 bytes.
  • Fixed an error for some compilers where OpenMP or SYCL backend was selected by CMake scripts without full compiler support.

New Known Issues and Limitations

  • Incorrect results may be produced with in-place scans using unseq and par_unseq policies on CPUs with the Intel® C++ Compiler 2021.8.

This release also includes the following changes from oneDPL 2022.1.1

New Features

  • Improved sort algorithm performance for the arithmetic data types with std::less or std::greater comparison operator and DPC++ policy.

Fixed Issues

  • Fixed an error that caused segmentation faults in transform_reduce, minmax_element, and related algorithms when run on CPU devices.
  • Fixed a compilation error in transform_reduce, minmax_element, and related algorithms on FPGAs.
  • Fixed permutation_iterator to support C-style array as a permutation map.
  • Fixed a radix-sort issue with 64-bit signed integer types.

oneDPL-2022.1.0-rc3

1 year ago

New Features

  • Added generate, generate_n, transform algorithms to Tested Standard C++ API.
  • Improved performance of inclusive_scan, exclusive_scan, reduce and max_element algorithms with DPC++ execution policies.

Fixed Issues

  • Added a workaround for the TBB headers not found issue occurring with libstdc++ version 9 when oneTBB headers are not present in the environment. The workaround requires inclusion of the oneDPL headers before the libstdc++ headers.
  • When possible, oneDPL CMake scripts now enforce C++17 as the minimally required language version. Inspired by Daniel Simon (https://github.com/oneapi-src/oneDPL/pull/739).
  • Fixed an error in the exclusive_scan algorithm when the output iterator is equal to the input iterator (in-place scan).

oneDPL-2022.0.0-release

1 year ago

New Features

  • Added the functionality from <complex> and more APIs from <cmath> and <limits> standard headers to Tested Standard C++ API.
  • Improved performance of sort and stable_sort algorithms on GPU devices when using Radix sort*.

Fixed Issues

  • Fixed permutation_iterator to work with C++ lambda functions for index permutation.
  • Fixed an error in oneapi::dpl::experimental::ranges::guard_view and oneapi::dpl::experimental::ranges::zip_view when using operator[] with an index exceeding the limits of a 32-bit integer type.
  • Fixed errors when data size is 0 in upper_bound, lower_bound and binary_search algorithms.

Changes affecting backward compatibility

  • Removed support of C++11 and C++14.

  • Changed the size and the layout of the discard_block_engine class template.

    For further details, please refer to 2022.0 Changes

*The sorting algorithms in oneDPL use Radix sort for arithmetic data types compared with std::less or std::greater, otherwise Merge sort.
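
The selection rule in the footnote above can be written as a compile-time check. This is only a sketch of the stated rule in standard C++: the trait name uses_radix_sort_sketch is hypothetical, and whether other comparator forms (such as the transparent std::less<>) qualify is not stated in these notes, so the sketch treats only std::less<T> and std::greater<T> as matching.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <type_traits>

// True when T is arithmetic and Compare is std::less<T> or std::greater<T>,
// mirroring the rule in the footnote. Hypothetical trait, not oneDPL code.
template <typename T, typename Compare>
struct uses_radix_sort_sketch
    : std::integral_constant<bool,
          std::is_arithmetic<T>::value &&
          (std::is_same<Compare, std::less<T>>::value ||
           std::is_same<Compare, std::greater<T>>::value)> {};
```

Under this rule, sorting int with std::greater<int> would select Radix sort, while sorting std::string or using a custom comparator would fall back to Merge sort.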

oneDPL-2021.7.1-release

1 year ago

New Features

  • Added the ability to construct a zip_iterator from a std::tuple of iterators.
  • Added 9 more serial-based versions of algorithms: is_heap, is_heap_until, make_heap, push_heap, pop_heap, is_sorted, is_sorted_until, partial_sort, partial_sort_copy. Please refer to Tested Standard C++ API Reference.

Fixed Issues

  • Added namespace alias dpl = oneapi::dpl into all public headers.
  • Fixed an error in the reduce_by_segment algorithm.
  • Fixed errors when data size is 0 in upper_bound, lower_bound and binary_search algorithms.
  • Fixed incorrect results in algorithm calls that use a permutation iterator.

oneDPL-2021.7.0-release

1 year ago

Deprecation Notice

  • Deprecated support of C++11 for Parallel API with host execution policies (seq, unseq, par, par_unseq). C++17 is the minimal required version going forward.

Fixed Issues

  • Fixed a kernel name definition error in range-based algorithms and reduce_by_segment used with a device_policy object that has no explicit kernel name.

oneDPL-2021.6.1-release

2 years ago

Fixed Issues

  • Fixed compilation errors with C++20.
  • Fixed CL_OUT_OF_RESOURCES issue for Radix sort algorithm executed on CPU devices.
  • Fixed crashes in exclusive_scan_by_segment, inclusive_scan_by_segment, reduce_by_segment algorithms applied to device-allocated USM.

oneDPL-2021.6.0-release

2 years ago