oneAPI DPC++ Library (oneDPL) https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/dpc-library.html

The oneAPI DPC++ Library (oneDPL) works with the oneAPI DPC++ Compiler to provide high-productivity APIs that minimize DPC++ programming effort across devices for high-performance parallel applications. oneDPL consists of the following components:

  • Parallel STL for DPC++
  • An additional set of library classes and functions (referred to below as the "Extension API")
  • Tested standard C++ APIs

Prerequisites

Install the Intel(R) oneAPI Base Toolkit (Base Kit) to use oneDPL, and refer to the System Requirements.

Release Information

See the latest Release Notes.

License

oneDPL is licensed under Apache License Version 2.0 with LLVM exceptions. Refer to the "LICENSE" file for the full license text and copyright notice.

Security

See Intel's Security Center for information on how to report a potential security issue or vulnerability. See also: Security Policy

Contributing

See CONTRIBUTING.md for details.

Documentation

See the oneDPL Library Guide.

Samples

You can find oneDPL samples in Samples.

Support and contribution

Please report issues and suggestions via GitHub issues.


Intel and the Intel logo are trademarks of Intel Corporation or its subsidiaries in the U.S. and/or other countries.

* Other names and brands may be claimed as the property of others.

Comments
  • Migrated build process from Makefiles to CMake

    Hi,

    I've migrated the current build process from raw Makefiles to CMake. This allows for easier configurable releases and the standard configure/make/make install process across platforms, for easier packaging across Linux distributions and otherwise.

    Additionally, it also exports projectConfig.cmake files so that other projects can detect pstl and specify it as a dependency!
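A downstream project could then consume the exported config roughly like this (a sketch; the package and target names are assumptions based on the Parallel STL packaging, not taken from this PR):

```cmake
# Hypothetical consumer CMakeLists.txt.
cmake_minimum_required(VERSION 3.10)
project(consumer CXX)

# Locates the exported <Package>Config.cmake via CMAKE_PREFIX_PATH.
find_package(ParallelSTL REQUIRED)

add_executable(app main.cpp)
target_link_libraries(app PRIVATE pstl::ParallelSTL)
```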

  • Error with TBB CMake on OS X

    Hello,

    I've been using parallelstl for a few months, and it is really a fantastic library.

    I've just upgraded to the latest release, which includes CMake support. But, when I add the parallelstl subdirectory in CMake, I get the following error:

    CMake Error at /usr/local/lib/cmake/TBB/TBBConfig.cmake:77 (message):
      Missed required Intel TBB component: tbb
    Call Stack (most recent call first):
      third_party/parallelstl-20180619/CMakeLists.txt:39 (find_package)
    
    
    -- Configuring incomplete, errors occurred!
    

    To be clear, I definitely have TBB installed. I'm on OS X and I installed it via brew. Its header files and libraries are sitting in /usr/local/include and /usr/local/lib, respectively. If I skip the parallelstl CMake and add things manually, everything is fine.

    I was hoping there would be some variable somewhere, like TBB_ROOT or something, that I could set that would make the TBB CMake happy, but so far nothing has worked. Any thoughts?

    Thanks!

  • Executing algorithms on subdevices broken with Intel LLVM trunk

    I'm not sure whether this is a bug in oneDPL or in DPC++, but launching oneDPL algorithms on SYCL subdevices (e.g. one tile of a PVC) seems to be broken in the most recent LLVM trunk.

    I encountered this bug with both 2021.7.0 and the main branch of oneDPL. My DPC++ is an open-source build from commit 77b1e9dfef554d8e686a373553d3898d6833f5d8 built with all default options and NVIDIA support enabled.

    The following minimal example demonstrates the bug. I expect reduce to run successfully on a single device as well as subdevice, but it crashes when oneDPL reduce is called with an execution policy on a subdevice with the following output:

    (base) [email protected]:~/src/oneDPL_subdevice$ ./very_minimized 
    Reducing on device...
    Allocating memory on subdevice...
    Result is 0
    Reducing on subdevice...
    Allocating memory on subdevice...
    terminate called after throwing an instance of 'sycl::_V1::exception'
      what():  Not all devices are associated with the context or vector of devices is empty
    Aborted (core dumped)
    

    There is only a single device given to oneDPL's execution policy, so the error doesn't make a ton of sense to me.

    #include <CL/sycl.hpp>
    #include <oneapi/dpl/execution>
    #include <oneapi/dpl/async>
    #include <oneapi/dpl/numeric>
    #include <cstdio>
    
    void test_device(sycl::context context, sycl::device device) {
      size_t size = 100;
      int* a;
    
      printf("Allocating memory on subdevice...\n");
      a = sycl::malloc_device<int>(size, device, context);
      oneapi::dpl::execution::device_policy policy(device);
    
      int result = oneapi::dpl::reduce(policy, a, a+100, float(0), std::plus());
    
      printf("Result is %d\n", result);
    }
    
    int main(int argc, char** argv) {
      sycl::gpu_selector g;
      sycl::device device = sycl::device(g);
      auto subdevices = device.create_sub_devices<sycl::info::partition_property::partition_by_affinity_domain>(sycl::info::partition_affinity_domain::numa);
    
      printf("Reducing on device...\n");
      sycl::context device_context(device);
      test_device(device_context, device);
    
      printf("Reducing on subdevice...\n");
      sycl::context subdevice_context(subdevices);
      test_device(subdevice_context, subdevices[0]);
    
      return 0;
    }
    
  • Create install target that copies include files and oneDPLConfig.cmake

    Currently, the CMake setup doesn't define an install target that allows oneDPL to be installed in a user-provided directory properly. This pull request provides that by copying the include files and invoking generate_config.cmake.

  • Modify Jenkinsfiles to use good compiler in the Last Good OneDPL link file

    The change switches our Jenkins CI tests to use the good compiler from the Last Good oneDPL link file. Verified manually with:

    http://icl-jenkins2.sc.intel.com:8080/job/Tools_SH/job/test_jobs/job/onedpl_test/job/RHEL_Test/19/console
    http://icl-jenkins2.sc.intel.com:8080/job/Tools_SH/job/test_jobs/job/onedpl_test/job/UB20_Test/8/console
    http://icl-jenkins2.sc.intel.com:8080/job/Tools_SH/job/test_jobs/job/onedpl_test/job/UB18_Test/5/console
    http://icl-jenkins2.sc.intel.com:8080/blue/organizations/jenkins/Tools_SH%2Ftest_jobs%2Fonedpl_test%2FWin_test/detail/Win_test/46/pipeline

  • Error on configuring parallelstl using CMake on Windows

    CMake complains that a TBB CMake directory is missing, even though the TBB build does not provide a CMake configuration:

    CMake Error at CMakeLists.txt:39 (find_package):
      By not providing "FindTBB.cmake" in CMAKE_MODULE_PATH this project has
      asked CMake to find a package configuration file provided by "TBB", but
      CMake did not find one.

      Could not find a package configuration file provided by "TBB" (requested
      version 2018) with any of the following names:

        TBBConfig.cmake
        tbb-config.cmake

      Add the installation prefix of "TBB" to CMAKE_PREFIX_PATH or set
      "TBB_DIR" to a directory containing one of the above files.  If "TBB"
      provides a separate development package or SDK, be sure it has been
      installed.
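The error's own suggestion can be applied on the configure command line; the paths below are purely illustrative and depend on where TBB's TBBConfig.cmake actually lives:

```shell
# Illustrative: point find_package(TBB) at a directory containing TBBConfig.cmake.
cmake -DTBB_DIR=C:/path/to/tbb/cmake ..
# Or add the TBB install prefix to CMake's package search path:
cmake -DCMAKE_PREFIX_PATH=C:/path/to/tbb ..
```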

  • help with getting transform_reduce to work for device array

    Hello, can you please help with finding the issue in the following code? Specifically, I am trying to apply std::transform_reduce to a device pointer (allocated using malloc_device and populated with memcpy). I want to first apply abs, and then take the max. I understand that std::transform, std::reduce, and std::transform_reduce each need an iterator, and a USM-allocated array/buffer should work (not 100% sure). It does work for std::transform (without any explicit conversion of the device pointer to an iterator), but it fails for both std::reduce and std::transform_reduce.

    #include <oneapi/dpl/execution>
    #include <oneapi/dpl/algorithm>
    #include <CL/sycl.hpp>
    
    #include <vector>
    #include <iostream>
    
    struct FunctionalAbs {
        float operator()(const float& x) const {
            return sycl::fabs((float)x);
        }
    };
    
    constexpr int N = 16;
    
    int main()
    {
        // create queue and policy
        sycl::queue myQueue = sycl::queue();
        auto policy = oneapi::dpl::execution::make_device_policy(myQueue);
    
        // fill up host vector
        std::vector<float> values_h;
        for (int i = 0; i < N; ++i) {
            values_h.push_back(-(float)N/2 + i);
        }
    
        // allocate and fill up device pointer
        auto values_d = sycl::malloc_device<float>(N, myQueue);
        myQueue.memcpy(values_d, &values_h[0], N * sizeof(float)).wait();
    
        /* THIS WORKS */
        // transform and reduce host vector: takes abs followed by max
        float max_h = std::transform_reduce(    // works
            values_h.begin(),
            values_h.end(),
            0.0f,
            oneapi::dpl::maximum<float>(),
            FunctionalAbs());
    
        /* THIS IS NOT WORKING */
        float max_d = std::transform_reduce(    // does not work
            policy,
            values_d,       // how to get an iterator here
            values_d + N,   // how to get an iterator here
            0.0f,
            oneapi::dpl::maximum<float>(),
            FunctionalAbs());
        
        /* THIS WORKS */
        std::transform(                         // works
            policy,
            values_d,       // does not require iterator here
            values_d + N,   // does not require iterator here
            values_d,
            FunctionalAbs());
    
        std::cout << "max: " << max_h << std::endl;
    
        return 0;
    }
    

    I get this error when trying to compile with dpcpp test_transform_reduce.cpp:

    test_transform_reduce.cpp:42:19: error: no matching function for call to 'transform_reduce'
        float max_d = std::transform_reduce(    // does not work
                      ^~~~~~~~~~~~~~~~~~~~~
    /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/numeric:338:5: note: candidate template ignored: deduced conflicting types for parameter '_InputIterator1' ('oneapi::dpl::execution::device_policy<>' vs. 'float *')
        transform_reduce(_InputIterator1 __first1, _InputIterator1 __last1,
        ^
    /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/pstl/glue_numeric_impl.h:81:1: note: candidate template ignored: requirement '__pstl::execution::is_execution_policy<oneapi::dpl::execution::device_policy<oneapi::dpl::execution::DefaultKernelName>>::value' was not satisfied [with _ExecutionPolicy = oneapi::dpl::execution::device_policy<> &, _ForwardIterator = float *, _Tp = float, _BinaryOperation = oneapi::dpl::maximum<float>, _UnaryOperation = FunctionalAbs]
        transform_reduce(_ExecutionPolicy&& __exec, _ForwardIterator __first, _ForwardIterator __last, _Tp __init,
        ^
    /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/numeric:405:5: note: candidate function template not viable: requires 5 arguments, but 6 were provided
        transform_reduce(_InputIterator __first, _InputIterator __last, _Tp __init,
        ^
    /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/pstl/glue_numeric_impl.h:54:1: note: candidate function template not viable: requires 5 arguments, but 6 were provided
        transform_reduce(_ExecutionPolicy&& __exec, _ForwardIterator1 __first1, _ForwardIterator1 __last1,
        ^
    /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/pstl/glue_numeric_impl.h:69:1: note: candidate function template not viable: requires 7 arguments, but 6 were provided
        transform_reduce(_ExecutionPolicy&& __exec, _ForwardIterator1 __first1, _ForwardIterator1 __last1,
        ^
    /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/numeric:380:5: note: candidate function template not viable: requires 4 arguments, but 6 were provided
        transform_reduce(_InputIterator1 __first1, _InputIterator1 __last1,
        ^
    1 error generated.

    Thanks, Golam

  • Issues with reduce_by_segment with zip_iterators

    Hi,

    I was trying to put together a test case that uses dpl::reduce_by_segment with zip_iterators (tuples) and was facing some difficulty compiling it.

    Can someone please comment on whether there is something wrong with the way the test case is set up, or otherwise?

    #define PSTL_USE_PARALLEL_POLICIES 0
    #define _GLIBCXX_USE_TBB_PAR_BACKEND 0
    
    #include <CL/sycl.hpp>
    #include <oneapi/dpl/execution>
    #include <oneapi/dpl/algorithm>
    #include <oneapi/dpl/iterator>
    #include <oneapi/dpl/functional>
    
    #include <functional>
    #include <iostream>
    #include <vector>
    
    int main()
    {
        sycl::queue q(sycl::gpu_selector{});
    
        std::vector<int> keys1{11, 11, 21, 20, 21, 21, 21, 37, 37};
        std::vector<int> keys2{11, 11, 20, 20, 20, 21, 21, 37, 37};
        std::vector<int> values{0, 1, 2, 3, 4, 5, 6, 7, 8};
        std::vector<int> output_keys1(keys1.size());
        std::vector<int> output_keys2(keys2.size());    
        std::vector<int> output_values(values.size());
    
        int* d_keys1         = sycl::malloc_device<int>(9, q);
        int* d_keys2         = sycl::malloc_device<int>(9, q);
        int* d_values        = sycl::malloc_device<int>(9, q);
        int* d_output_keys1  = sycl::malloc_device<int>(9, q);
        int* d_output_keys2  = sycl::malloc_device<int>(9, q);
        int* d_output_values = sycl::malloc_device<int>(9, q);
    
        q.memcpy(d_keys1, keys1.data(), sizeof(int)*9);
        q.memcpy(d_keys2, keys2.data(), sizeof(int)*9);
        q.memcpy(d_values, values.data(), sizeof(int)*9);
    
        auto begin_keys_in = oneapi::dpl::make_zip_iterator(d_keys1, d_keys2);
        auto end_keys_in   = oneapi::dpl::make_zip_iterator(d_keys1 + 9, d_keys2 + 9);
        auto begin_keys_out= oneapi::dpl::make_zip_iterator(d_output_keys1, d_output_keys2);
    
    auto new_last = oneapi::dpl::reduce_by_segment(oneapi::dpl::execution::make_device_policy(q),
                                                   begin_keys_in, end_keys_in, d_values, begin_keys_out, d_output_values);
    
        q.memcpy(output_keys1.data(), d_output_keys1, sizeof(int)*9);
        q.memcpy(output_keys2.data(), d_output_keys2, sizeof(int)*9);    
        q.memcpy(output_values.data(), d_output_values, sizeof(int)*9);
        q.wait();
    
        // Expected output
        // {11, 11}: 1
        // {21, 20}: 2
        // {20, 20}: 3
        // {21, 20}: 4
        // {21, 21}: 11
        // {37, 37}: 15
        for(int i=0; i<9; i++) {
      std::cout << "{" << output_keys1[i] << ", " << output_keys2[i] << "}: " << output_values[i] << std::endl;
        }
    }
    

    Environment:
      Target device and vendor: Intel GPUs
      DPC++ version: Intel(R) oneAPI DPC++/C++ Compiler 2021.2.0 (2021.x.0.20210323)

  • Replace invoke_if with if constexpr

    This PR replaces the internal helper functions oneapi::dpl::__internal::__invoke_if, oneapi::dpl::__internal::__invoke_if_not and oneapi::dpl::__internal::__invoke_if_else with if constexpr now that oneDPL is free to use C++17 constructs.

  • question about Random Number Engines in oneDPL and Random Number Engines in oneMKL

    In oneDPL, the names of the engines are:

    https://docs.oneapi.io/versions/latest/onedpl/random.html

    In oneMKL, the names of the engines are: https://www.intel.com/content/www/us/en/develop/documentation/oneapi-mkl-dpcpp-developer-reference/top/random-number-generators/random-number-generators-device-routines/device-engines-basic-random-number-generators.html

    Can you please explain the relationship between the engines from the two components? Thanks.

  • oneDPL/include/oneapi/dpl/pstl/./omp/parallel_merge.h:87: undefined reference to `omp_in_parallel'

    I'm trying to build 67383db9ac3223c825f4b8783b38bb9ab01aa757 like this:

    cmake .. -DONEDPL_BACKEND=dpcpp_only -DONEDPL_DEVICE_TYPE=GPU -DONEDPL_DEVICE_BACKEND=level_zero -DONEDPL_USE_UNNAMED_LAMBDA=TRUE -DCMAKE_CXX_COMPILER=dpcpp -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_CXX_STANDARD=17
    

    It fails because OpenMP isn't available. I know how to tell it where OpenMP is, but I shouldn't have to do that, since I am building a DPC++-only backend for the GPU.

    [ 16%] Linking CXX executable merge.pass
    /usr/bin/ld: /tmp/merge-bd963e.o: in function `_ZN6oneapi3dpl13__omp_backend16__parallel_mergeIRKNS0_9execution2v115parallel_policyEN9__gnu_cxx17__normal_iteratorIPiSt6vectorIiSaIiEEEENS9_IPKiSD_EESE_NS0_10__internal11__pstl_lessEZNSI_15__pattern_mergeIS7_SE_SH_SE_SJ_St17integral_constantIbLb0EEEENSt9enable_ifIXsr6oneapi3dpl10__internal26__is_host_execution_policyINSt5decayIT_E4typeEEE5valueET2_E4typeEOSP_T0_SW_T1_SX_SS_T3_T4_SL_IbLb1EEEUlSE_SE_SH_SH_SE_SJ_E_EEvSV_SW_SW_SX_SX_SS_SY_SZ_':
    /tmp/oneDPL/include/oneapi/dpl/pstl/./omp/parallel_merge.h:87: undefined reference to `omp_in_parallel'
    /usr/bin/ld: /tmp/merge-bd963e.o: in function `_ZN6oneapi3dpl13__omp_backend16__parallel_mergeIRKNS0_9execution2v115parallel_policyEN9__gnu_cxx17__normal_iteratorIPdSt6vectorIdSaIdEEEENS9_IPKdSD_EESE_NS0_10__internal11__pstl_lessEZNSI_15__pattern_mergeIS7_SE_SH_SE_SJ_St17integral_constantIbLb0EEEENSt9enable_ifIXsr6oneapi3dpl10__internal26__is_host_execution_policyINSt5decayIT_E4typeEEE5valueET2_E4typeEOSP_T0_SW_T1_SX_SS_T3_T4_SL_IbLb1EEEUlSE_SE_SH_SH_SE_SJ_E_EEvSV_SW_SW_SX_SX_SS_SY_SZ_':
    /tmp/oneDPL/include/oneapi/dpl/pstl/./omp/parallel_merge.h:87: undefined reference to `omp_in_parallel'
    /usr/bin/ld: /tmp/merge-bd963e.o: in function `_ZN6oneapi3dpl13__omp_backend16__parallel_mergeIRKNS0_9execution2v115parallel_policyESt16reverse_iteratorIN9__gnu_cxx17__normal_iteratorIPiSt6vectorIiSaIiEEEEES8_INSA_IPKiSE_EEESG_St7greaterIiEZNS0_10__internal15__pattern_mergeIS7_SG_SK_SG_SM_St17integral_constantIbLb0EEEENSt9enable_ifIXsr6oneapi3dpl10__internal26__is_host_execution_policyINSt5decayIT_E4typeEEE5valueET2_E4typeEOST_T0_S10_T1_S11_SW_T3_T4_SP_IbLb1EEEUlSG_SG_SK_SK_SG_SM_E_EEvSZ_S10_S10_S11_S11_SW_S12_S13_':
    /tmp/oneDPL/include/oneapi/dpl/pstl/./omp/parallel_merge.h:87: undefined reference to `omp_in_parallel'
    /usr/bin/ld: /tmp/merge-bd963e.o: in function `_ZN6oneapi3dpl13__omp_backend16__parallel_mergeIRKNS0_9execution2v127parallel_unsequenced_policyESt16reverse_iteratorIN9__gnu_cxx17__normal_iteratorIPiSt6vectorIiSaIiEEEEES8_INSA_IPKiSE_EEESG_St7greaterIiEZNS0_10__internal15__pattern_mergeIS7_SG_SK_SG_SM_St17integral_constantIbLb1EEEENSt9enable_ifIXsr6oneapi3dpl10__internal26__is_host_execution_policyINSt5decayIT_E4typeEEE5valueET2_E4typeEOST_T0_S10_T1_S11_SW_T3_T4_SQ_EUlSG_SG_SK_SK_SG_SM_E_EEvSZ_S10_S10_S11_S11_SW_S12_S13_':
    /tmp/oneDPL/include/oneapi/dpl/pstl/./omp/parallel_merge.h:87: undefined reference to `omp_in_parallel'
    /usr/bin/ld: /tmp/merge-bd963e.o: in function `_ZN6oneapi3dpl13__omp_backend16__parallel_mergeIRKNS0_9execution2v115parallel_policyESt16reverse_iteratorIN9__gnu_cxx17__normal_iteratorIPdSt6vectorIdSaIdEEEEES8_INSA_IPKdSE_EEESG_St7greaterIdEZNS0_10__internal15__pattern_mergeIS7_SG_SK_SG_SM_St17integral_constantIbLb0EEEENSt9enable_ifIXsr6oneapi3dpl10__internal26__is_host_execution_policyINSt5decayIT_E4typeEEE5valueET2_E4typeEOST_T0_S10_T1_S11_SW_T3_T4_SP_IbLb1EEEUlSG_SG_SK_SK_SG_SM_E_EEvSZ_S10_S10_S11_S11_SW_S12_S13_':
    /tmp/oneDPL/include/oneapi/dpl/pstl/./omp/parallel_merge.h:87: undefined reference to `omp_in_parallel'
    /usr/bin/ld: /tmp/merge-bd963e.o:/tmp/oneDPL/include/oneapi/dpl/pstl/./omp/parallel_merge.h:87: more undefined references to `omp_in_parallel' follow
    
  • Replace instances of __invoke_if/__invoke_if_not in glue_memory_impl.h (#725 part 1)

    Splitting https://github.com/oneapi-src/oneDPL/pull/725 into multiple PRs.

    This PR replaces the internal helper functions oneapi::dpl::__internal::__invoke_if, oneapi::dpl::__internal::__invoke_if_not and oneapi::dpl::__internal::__invoke_if_else with if constexpr now that oneDPL is free to use C++17 constructs.

  • Replace instances of __invoke_if/__invoke_if_not (#725 part 1)

    Splitting #725 into multiple PRs.

    This PR replaces the internal helper functions oneapi::dpl::__internal::__invoke_if, oneapi::dpl::__internal::__invoke_if_not and oneapi::dpl::__internal::__invoke_if_else with if constexpr now that oneDPL is free to use C++17 constructs.

  • Fix warning in transform_binary.pass.cpp

    In this PR we fix a warning in transform_binary.pass.cpp:

    /oneDPL/test/parallel_api/algorithm/alg.modifying.operations/transform_binary.pass.cpp:52:85: warning: floating-point comparison is always true; constant cannot be represented exactly in type 'float' [-Wliteral-range]
                EXPECT_TRUE((expected > actual ? expected - actual : actual - expected) < 1e-7,
                            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^ ~~~~
    /oneDPL/test/support/utils.h:59:67: note: expanded from macro 'EXPECT_TRUE'
    #define EXPECT_TRUE(condition, message) ::TestUtils::expect(true, condition, __FILE__, __LINE__, message)
                                                                      ^~~~~~~~~
    /oneDPL/test/parallel_api/algorithm/alg.modifying.operations/transform_binary.pass.cpp:74:9: note: in instantiation of function template specialization 'check_and_reset<__gnu_cxx::__normal_iterator<float *, std::vector<float>>, __gnu_cxx::__normal_iterator<float *, std::vector<float>>, __gnu_cxx::__normal_iterator<float *, std::vector<float>>>' requested here
            check_and_reset(first1, last1, first2, out_first);
            ^
    /oneDPL/test/support/iterator_utils.h:260:9: note: in instantiation of function template specialization 'test_one_policy<float, float, float>::operator()<const oneapi::dpl::execution::sequenced_policy &, __gnu_cxx::__normal_iterator<float *, std::vector<float>>, __gnu_cxx::__normal_iterator<float *, std::vector<float>>, __gnu_cxx::__normal_iterator<float *, std::vector<float>>, TheOperation<float, float, float>>' requested here
            op(::std::forward<Rest>(rest)...);
            ^
    /oneDPL/test/support/iterator_utils.h:411:9: note: in instantiation of function template specialization 'TestUtils::invoke_if_<std::integral_constant<bool, false>, std::integral_constant<bool, false>>::operator()<test_one_policy<float, float, float>, const oneapi::dpl::execution::sequenced_policy &, __gnu_cxx::__normal_iterator<float *, std::vector<float>>, __gnu_cxx::__normal_iterator<float *, std::vector<float>>, __gnu_cxx::__normal_iterator<float *, std::vector<float>>, __gnu_cxx::__normal_iterator<float *, std::vector<float>>, __gnu_cxx::__normal_iterator<float *, std::vector<float>>, __gnu_cxx::__normal_iterator<float *, std::vector<float>>, TheOperation<float, float, float> &>' requested here
            invoke_if<InputIterator1>()(
            ^
    /oneDPL/test/support/iterator_utils.h:542:9: note: in instantiation of function template specialization 'TestUtils::iterator_invoker<std::random_access_iterator_tag, std::integral_constant<bool, false>>::operator()<const oneapi::dpl::execution::sequenced_policy &, test_one_policy<float, float, float>, __gnu_cxx::__normal_iterator<float *, std::vector<float>>, __gnu_cxx::__normal_iterator<float *, std::vector<float>>, __gnu_cxx::__normal_iterator<float *, std::vector<float>>, TheOperation<float, float, float> &>' requested here
            iterator_invoker<::std::random_access_iterator_tag, IsReverse>()(::std::forward<Rest>(rest)...);
            ^
    /oneDPL/test/support/iterator_utils.h:558:9: note: in instantiation of function template specialization 'TestUtils::reverse_invoker<std::integral_constant<bool, false>>::operator()<const oneapi::dpl::execution::sequenced_policy &, test_one_policy<float, float, float> &, __gnu_cxx::__normal_iterator<float *, std::vector<float>>, __gnu_cxx::__normal_iterator<float *, std::vector<float>>, __gnu_cxx::__normal_iterator<float *, std::vector<float>>, __gnu_cxx::__normal_iterator<float *, std::vector<float>>, __gnu_cxx::__normal_iterator<float *, std::vector<float>>, __gnu_cxx::__normal_iterator<float *, std::vector<float>>, TheOperation<float, float, float> &>' requested here
            reverse_invoker</* IsReverse = */ ::std::false_type>()(::std::forward<Rest>(rest)...);
            ^
    /oneDPL/test/support/utils_invoke.h:54:9: note: in instantiation of function template specialization 'TestUtils::invoke_on_all_iterator_types::operator()<const oneapi::dpl::execution::sequenced_policy &, test_one_policy<float, float, float> &, __gnu_cxx::__normal_iterator<float *, std::vector<float>>, __gnu_cxx::__normal_iterator<float *, std::vector<float>>, __gnu_cxx::__normal_iterator<float *, std::vector<float>>, __gnu_cxx::__normal_iterator<float *, std::vector<float>>, __gnu_cxx::__normal_iterator<float *, std::vector<float>>, __gnu_cxx::__normal_iterator<float *, std::vector<float>>, TheOperation<float, float, float> &>' requested here
            invoke_on_all_iterator_types()(seq,       op, ::std::forward<T>(rest)...);
            ^
    /oneDPL/test/support/utils_invoke.h:165:9: note: in instantiation of function template specialization 'TestUtils::invoke_on_all_host_policies::operator()<test_one_policy<float, float, float>, __gnu_cxx::__normal_iterator<float *, std::vector<float>>, __gnu_cxx::__normal_iterator<float *, std::vector<float>>, __gnu_cxx::__normal_iterator<float *, std::vector<float>>, __gnu_cxx::__normal_iterator<float *, std::vector<float>>, __gnu_cxx::__normal_iterator<float *, std::vector<float>>, __gnu_cxx::__normal_iterator<float *, std::vector<float>>, TheOperation<float, float, float> &>' requested here
            invoke_on_all_host_policies()(op, ::std::forward<T>(rest)...);
            ^
    /oneDPL/test/parallel_api/algorithm/alg.modifying.operations/transform_binary.pass.cpp:89:9: note: in instantiation of function template specialization 'TestUtils::invoke_on_all_policies<0>::operator()<test_one_policy<float, float, float>, __gnu_cxx::__normal_iterator<float *, std::vector<float>>, __gnu_cxx::__normal_iterator<float *, std::vector<float>>, __gnu_cxx::__normal_iterator<float *, std::vector<float>>, __gnu_cxx::__normal_iterator<float *, std::vector<float>>, __gnu_cxx::__normal_iterator<float *, std::vector<float>>, __gnu_cxx::__normal_iterator<float *, std::vector<float>>, TheOperation<float, float, float> &>' requested here
            invoke_on_all_policies<0>()(test_one_policy<In1, In2, Out>(), in1.begin(), in1.end(), in2.begin(), in2.end(),
            ^
    /oneDPL/test/parallel_api/algorithm/alg.modifying.operations/transform_binary.pass.cpp:115:5: note: in instantiation of function template specialization 'test<float, float, float, TheOperation<float, float, float>>' requested here
        test<float32_t, float32_t, float32_t>(TheOperation<float32_t, float32_t, float32_t>(1.5));
        ^
    
  • Add C++17 requirement to installed oneDPL target

    Add the C++17 compile-feature requirement to the oneDPL target so that consumers also use C++17. This is still overridden by the CXX_STANDARD target property, so there can still be sharp edges for users.

  • Avoid double type in tests of std::complex<T>

    In this PR we fix cases in tests of std::complex that use constants like 1.5, which have type double: if the double type is unavailable on the device, a runtime error (exception) occurs:

    • we change 1.5 -> 1.5f, etc.
    • some pieces of the tests are moved inside the IF_DOUBLE_SUPPORT check.