Portable and vendor-neutral framework for parallel programming on heterogeneous platforms.
[4199d8f7a6c2f1767262b8dfbfa21669e5fc4914] Removed occa::cuda::getMappedPtr and occa::opencl::getMappedPtr and replaced them with occa::memory::ptr("mapped: true")
[4199d8f7a6c2f1767262b8dfbfa21669e5fc4914] Allocating mapped/pinned memory (CUDA, OpenCL)
Passing a mode-specific prop was too verbose and not flexible enough:
cuda: { mapped: true }
opencl: { mapped: true }
The prop is now the same for both CUDA and OpenCL:
mapped: true
[4199d8f7a6c2f1767262b8dfbfa21669e5fc4914] Allocating unified memory (CUDA)
The driver API uses the method cuMemAllocManaged, so the prop was named accordingly:
cuda: { managed: true }
However, most users know this feature as unified memory, so we're switching the prop name to unified.
Similar to mapped allocation, it has been shortened to:
unified: true
occaMemoryPtr(occaMemory) → occaMemoryPtr(occaMemory, occaProperties)
[4199d8f7a6c2f1767262b8dfbfa21669e5fc4914] Added occa::memory::ptr(occa::properties)
[abc3bea7f4a4b414c5921766bdd84f629b225aec] Added #pragma occa attributes option
#pragma occa attributes @kernel
void addVectors(const int entries,
                const float *a,
                const float *b,
                float *ab) {
  #pragma occa attributes @tile(16, @outer, @inner)
  for (int i = 0; i < entries; ++i) {
    ab[i] = a[i] + b[i];
  }
}
↓
@kernel void addVectors(const int entries,
                        const float *a,
                        const float *b,
                        float *ab) {
  for (int i = 0; i < entries; ++i; @tile(16, @outer, @inner)) {
    ab[i] = a[i] + b[i];
  }
}
occa::getKernelProperties()
occa::free → occa::freeUvaPtr
occaDeviceUmalloc → occaDeviceUMalloc
occaWaitFor → occaWaitForTag
occaDeviceWaitFor → occaDeviceWaitForTag
occaTimeBetween → occaTimeBetweenTags
occaDeviceTimeBetween → occaDeviceTimeBetweenTags
Part of code | % Coverage Change | LOC Coverage Change |
---|---|---|
Headers | 74.7% → 91.5% (+16.8%) | 714 → 1374 (+660) |
C API | 24.1% → 99.4% (+75.3%) | 139 → 655 (+516) |
C++ API | 58.0% → 67.0% (+9.0%) | 8347 → 10374 (+2027) |
IO Tooling | 62.2% → 97.3% (+35.1%) | 225 → 326 (+101) |
General Tooling | 51.0% → 63.7% (+12.7%) | 1249 → 1524 (+275) |
OKL Parser | 61.4% → 63.7% (+2.3%) | 6479 → 7063 (+584) |
occa::exception
occaPropertiesHas
occaFreeUvaPtr
occaUndefined and occaIsUndefined
occaIsDefault
[e529137738bee329413a39dea6f8f0160e8f5a1a][#154] CLI options that take arguments can be passed as: -Dfoo=1 → -D foo=3
[3707f9ee4b0e61ba1eefc24bfb82405c075dbe67] Examples have arg parsing to make them more interactive
[dae53084bed303c2fe7999f18f0026aae6bb97f0] occa::sys::rmrf cannot delete any path that has fewer than 2 parent directories (e.g. / or /usr/bin) without:
occa::settings()["options/safe-rmrf"] = false;
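The guard above can be pictured with a small standalone sketch. It is only one reading of the rule, not OCCA's implementation: it assumes a normalized absolute path, counts the directories above the target (excluding the root), and refuses removal when there are fewer than two. The helper names countPathComponents and canRemovePath are hypothetical.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper, not part of OCCA: counts the non-empty
// components of a '/'-separated path ("/usr/bin" has 2, "/" has 0).
std::size_t countPathComponents(const std::string &path) {
  std::size_t count = 0;
  std::stringstream ss(path);
  std::string part;
  while (std::getline(ss, part, '/')) {
    if (!part.empty() && part != ".") {
      ++count;
    }
  }
  return count;
}

// Hypothetical check mirroring the "safe-rmrf" option: when the option
// is on, the target needs at least 2 parent directories above it.
// Assumes the path is absolute and already normalized (no "..").
bool canRemovePath(const std::string &path, const bool safeRmrf = true) {
  if (!safeRmrf) {
    return true;
  }
  const std::size_t components = countPathComponents(path);
  const std::size_t parents = (components > 0) ? (components - 1) : 0;
  return parents >= 2;
}
```

Under this reading, / and /usr/bin are refused while a deeper path such as /home/user/cache is allowed, and passing safeRmrf = false stands in for disabling the setting shown above.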
occa::opencl::getCLMappedPtr → occa::opencl::getMappedPtr
getMappedPtr for OpenCL and CUDA
popUp
@noelchalmers @jdahm
[d1fd6e0, 8691434] To standardize key names in properties, we're moving to snake_case, which yields valid JSON5 identifiers for JSON object keys. That way short-hand notations such as
{ mode: 'CUDA', device_id: 0 }
are still valid.
Changes:
deviceID → device_id
platformID → platform_id
threadCount → threads
pinnedCores → pinned_cores
compilerFlags → compiler_flags
compilerEnvScripts → compiler_env_scripts
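When updating existing code, the renames listed above can be captured in a small lookup table. The sketch below is a hypothetical migration helper, not part of OCCA; the function name migratePropertyKey is made up for illustration, and only the key pairs come from the list above.

```cpp
#include <map>
#include <string>

// Hypothetical helper, not part of OCCA: returns the new snake_case
// name for a renamed property key and leaves any other key untouched.
std::string migratePropertyKey(const std::string &key) {
  static const std::map<std::string, std::string> renamed = {
    {"deviceID", "device_id"},
    {"platformID", "platform_id"},
    {"threadCount", "threads"},
    {"pinnedCores", "pinned_cores"},
    {"compilerFlags", "compiler_flags"},
    {"compilerEnvScripts", "compiler_env_scripts"}
  };
  const auto it = renamed.find(key);
  return (it != renamed.end()) ? it->second : key;
}
```

Note that threadCount does not follow the mechanical camelCase-to-snake_case pattern (it becomes threads), which is why a table is used here rather than a string transformation.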
[7055d92] Added -I/--include-path and -D/--define to occa transform and occa compile
[a7c578c] Added -v/--verbose to add transform information in comments
[80b9972] oklForStatements check the iterator's base type
@pdhahn
[70c9ddf58f3de6d7635d2269ff0b0a51a87ab914] Swapped dump and toString
[6e760d23c008d5c686132369bb905b00ad5aa256] Added double2, double3, double4
[f5cf04b30d214c6cad1cbde13d9b66e7481bb983] restrict → @restrict
[#145, 3fc6753ebf9aaf9e1890bc340957264ad254d0ed] Added dlerror messages to dlopen and dlsym
[cb5eec725fe2c5e892a2d8e12a8fd3890f18068a] Fixed ~/ expansion
[d3bec390e9a7495e7bf513881932f347e2052fa7, 9d26399dc24aabff35ad13781198eb2f985da1a2] Added compile and translate options to occa
[#133, fd248c62df1fca4a6ae91fcc8871ff34c9f13e0c] Added vartype nodes for parenCast expressions
[#136, #140, e672972bba4e7b6337ec9fd04a9de9b6285f5089] Added type expansion to get around issue
[#147, f8a4ac81f3fc7bd16a1b49d1d779deab2dd39f51] Fixed statement attributes getting overridden
[db2f26361c6c72ac6a3cc4a05c516953c8b3a455] withLauncher success also depends on the host
@jedbrown
kernel::free() removes itself from the device kernel cache
In OpenCL mode the kernel compilation will be verbose:
{
  kernel: { verbose: false },
  mode: {
    OpenCL: {
      kernel: { verbose: true },
    }
  }
}
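One way to read the properties above is as a base set of values plus per-mode overrides that win when that mode is active. The sketch below illustrates that reading with flattened key paths in standard C++. It is an assumption about the semantics, not OCCA's implementation, and the names FlatProps and resolveForMode are hypothetical.

```cpp
#include <map>
#include <string>

// Flattened properties: "kernel/verbose" -> "true" (illustrative only)
typedef std::map<std::string, std::string> FlatProps;

// Hypothetical illustration, not OCCA's implementation: start from the
// base props and apply the overrides registered for the active mode.
FlatProps resolveForMode(const FlatProps &base,
                         const std::map<std::string, FlatProps> &modeProps,
                         const std::string &mode) {
  FlatProps resolved = base;
  const auto it = modeProps.find(mode);
  if (it != modeProps.end()) {
    for (const auto &entry : it->second) {
      // A mode-specific value replaces the base value for the same key
      resolved[entry.first] = entry.second;
    }
  }
  return resolved;
}
```

With kernel/verbose set to false in the base props and to true under OpenCL, resolving for OpenCL gives a verbose kernel compilation while any other mode keeps the base value.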
occa::json (still keeps it as \uXXXX for the user to parse)
OCCA_VERBOSE works
occaPtr fix