Add support for getKernelMaxWorkGroupSize(), getKernelCompileWorkGroupSize(), getKernelPreferredWorkGroupSizeMultiple(), getKernelMinimumPrivateMemSizeInUsePerWorkItem() and getKernelLocalMemSizeInUse()
Fixed barriers giving inconsistent results on the NVIDIA backend.
New Kernel.compile(...) methods for forcing pre-compilation of a kernel without executing it
Fixed NPE bug for Kernel.getProfileReportCurrentThread(device) and similar methods
Fixed bug where ClassModel would throw an error when loaded if the number of bootstrap methods was 0.
Aparapi can now run on any OpenCL version; rather than failing on untested versions, it produces a warning.
Fix Java execution mode with barriers to not deadlock when a thread dies or is interrupted (InterruptedException)
Fix Java execution mode to fail-fast when Kernel execution fails
Java execution mode now provides detailed backtraces of failed Kernel threads including passId, groupIds, globalIds and localIds
Internal translation of bytecode is now facilitated by the BCEL library
Scala support has been added (see unit tests).
Fixed arrays of AtomicInteger stored in local variables failing with a type cast exception while generating OpenCL (added support for I_ALOAD_0,1,2,3 bytecode instructions)
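
The barrier deadlock entry above describes a general hazard: if one thread in a group dies before reaching a barrier, the remaining threads can wait forever. The sketch below illustrates that hazard in plain Java with java.util.concurrent.CyclicBarrier and a bounded wait. It is not Aparapi's implementation; the class name BarrierFailFast and the 200 ms timeout are assumptions chosen for illustration.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class, not part of Aparapi.
final class BarrierFailFast {

    // Returns true if the surviving thread was released by an exception
    // instead of blocking forever on a barrier its partner never reaches.
    static boolean survivorIsReleased() {
        // Two parties are expected, but only one thread will ever arrive,
        // which models a partner thread that died before the barrier.
        CyclicBarrier barrier = new CyclicBarrier(2);
        AtomicBoolean released = new AtomicBoolean(false);

        Thread survivor = new Thread(() -> {
            try {
                // A bounded wait: an untimed await() here would never return.
                barrier.await(200, TimeUnit.MILLISECONDS);
            } catch (TimeoutException | BrokenBarrierException | InterruptedException e) {
                released.set(true);
            }
        });
        survivor.start();

        try {
            survivor.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return released.get();
    }

    public static void main(String[] args) {
        System.out.println(survivorIsReleased()); // prints "true"
    }
}
```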
v1.9.0
6 years ago
Fixed local arrays handling 1D and ND, to cope with arrays resizing across kernel executions
Significant speed-up on discrete GPUs with dedicated memory - OpenCLDevice.setSharedMemory(false)
Now supports efficient execution on discrete GPU and other devices with dedicated memory
Support for OpenCLDevice configurator/configure API
v1.8.0
6 years ago
Updated KernelManager to facilitate class extensions having constructors with non-static parameters
Enable kernel profiling and execution simultaneously on multiple devices (multiple threads calling same kernel class on multiple devices)
Fixed JVM crash when multi-dimensional arrays were used in Local memory (2D and 3D local arrays are now supported)
Fixed bug where signed integer constants were being interpreted as unsigned values during Codegen.
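
The changelog does not say which constants were affected by the signed/unsigned fix above. As a generic illustration of this class of bug, the plain-Java sketch below reads the same raw byte once as a signed value and once as an unsigned value. The class and method names are hypothetical and are not taken from Aparapi's code generator.

```java
// Hypothetical demo class, not part of Aparapi.
final class SignedConstant {

    // Correct for a signed constant: widening a byte to int sign-extends it.
    static int asSigned(byte raw) {
        return raw;
    }

    // The buggy interpretation: masking discards the sign and yields 0..255.
    static int asUnsigned(byte raw) {
        return raw & 0xFF;
    }

    public static void main(String[] args) {
        byte raw = (byte) 0xFF;              // bit pattern 1111 1111
        System.out.println(asSigned(raw));   // prints -1
        System.out.println(asUnsigned(raw)); // prints 255
    }
}
```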
v1.7.0
6 years ago
Fully support barrier() - localBarrier(), globalBarrier() and localGlobalBarrier() on OpenCL 1.2 and later.
Improved exception handling: stack traces are no longer printed twice, and Error and other throwables are never caught.
Fix issue causing SEVERE log messages on Aparapi kernel profiling under multithreading.
Provide new interfaces for thread safe kernel profiling (multiple threads calling same kernel class on same device).
Fixed occasional deadlock in JTP execution mode.
Significant speedup when running in JTP execution mode.
v1.6.0
6 years ago
Added support for Local arguments in kernel functions
Added full support for atomic operations of arrays of integers on OpenCL 1.2 and later.
Parent pom no longer points to a snapshot.
v1.5.0
6 years ago
Support for OpenCL 2.1 added.
Support inline array creation in the kernel; on the GPU such arrays are placed in private memory.
Updated parent pom to v6.
createProgram had the wrong signature, producing an unsatisfied link exception; this is now fixed.
Build now requires version 3.5.0 of maven due to changes in surefire plugin.
Added the functions popcount and clz
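
For readers unfamiliar with the two bit functions named above: popcount counts the set bits in an integer and clz counts its leading zero bits. The plain-Java sketch below shows the same quantities for 32-bit values using the standard library. The wrapper class BitOps is hypothetical and does not show Aparapi's own method signatures.

```java
// Hypothetical demo class, not part of Aparapi.
final class BitOps {

    // Number of bits set to 1 in a 32-bit value.
    static int popcount(int value) {
        return Integer.bitCount(value);
    }

    // Number of zero bits above the highest set bit of a 32-bit value.
    static int clz(int value) {
        return Integer.numberOfLeadingZeros(value);
    }

    public static void main(String[] args) {
        System.out.println(popcount(0b1011)); // prints 3
        System.out.println(clz(1));           // prints 31
        System.out.println(clz(0x00FF0000));  // prints 8
    }
}
```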
v1.4.1
6 years ago
Fixed NullPointerException when using KernelManager from the KernelManagers class
Now requires maven 3.0.4 or later.
Bumped parent pom version to 4.
Removed explicit versions from several plugins in the pom as these are defined in the parent.
v1.4.0
6 years ago
Updated nexus staging plugin: 1.6.7 -> 1.6.8
Added Fused Multiply Add support.
Fixed a bug whereby the library failed to load if an OpenCL implementation isn't present; the system now loads and falls back to native Java.
Fixed several mistakes and typos in the various examples.
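
As background for the Fused Multiply Add entry above: a fused multiply-add computes a * b + c with a single rounding at the end, whereas the separate multiply and add round twice. The sketch below shows the numerical difference in plain Java using Math.fma (Java 9 or later). It does not show Aparapi's API; the class name FmaDemo and the input values are assumptions chosen so that the two results differ.

```java
// Hypothetical demo class, not part of Aparapi. Requires Java 9+ for Math.fma.
final class FmaDemo {

    // Two roundings: the product is rounded, then the sum is rounded.
    static double unfused(double a, double b, double c) {
        return a * b + c;
    }

    // One rounding: the exact product plus c is rounded once.
    static double fused(double a, double b, double c) {
        return Math.fma(a, b, c);
    }

    public static void main(String[] args) {
        double a = 1.0 + 0x1p-30;  // 1 + 2^-30
        double b = 1.0 - 0x1p-30;  // 1 - 2^-30
        double c = -1.0;
        // The exact product is 1 - 2^-60, which rounds to 1.0 as a double,
        // so the unfused form loses the small term entirely.
        System.out.println(unfused(a, b, c));            // prints 0.0
        System.out.println(fused(a, b, c) == -0x1p-60);  // prints true
    }
}
```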