Cekirdekler Versions

Multi-device OpenCL kernel load balancer and pipeliner API for C#. It uses a shared-distributed memory model to keep GPUs updated quickly while running the same kernel on all devices (for simplicity).

v1.4.1_update5

6 years ago

flushLastUsedCommandQueue() function added to numberCruncher class.

Used in enqueue mode.

Flushes commands in last used command queue.

Useful for some graphics cards that don't start accepting commands immediately.

Use rar files named ...._update5

v1.4.1_update4

6 years ago

The only difference from update3 is the addition of the "-cl-std=CL1.2" option to the compile parameters in the C++ project. The rebuilt DLL replaces the old one in ..._update3.rar, and the archive is re-added as the ..._update4 rar file in the entrance folder.

This should let the project run on more cards (in case any of yours failed with earlier builds), because it now targets OpenCL 1.2 instead of each device's default version.

If you are already using update3, then you only need to copy KutuphaneCL.dll from update4 to your project folder.

v1.4.1_update3

6 years ago

Fixed v1.4.1_update2's device-pool dispose() method to release worker threads successfully.

CekirdeklerCPP project: https://github.com/tugrul512bit/CekirdeklerCPP

CekirdeklerCPP2 project: https://github.com/tugrul512bit/CekirdeklerCPP2

v1.4.1_update2

6 years ago

Added "OpenCL 2.0 Dynamic parallelism" support (along with kernel-only features of OpenCL 2.0 such as work-group-reduction functions).

To use OpenCL 2.0, KutuphaneCL.dll must be built from the CekirdeklerCPP2 project (it produces KutuphaneCL2.dll, which needs to be renamed to KutuphaneCL.dll). CekirdeklerCPP uses only OpenCL 1.2. CekirdeklerCPP2 uses only OpenCL 2.0 and selects v2.0 platforms and v2.0 devices.

There is no need to create a default queue for dynamic parallelism. Cekirdekler.dll handles it whenever the enqueue_kernel() function is found in the kernel's C99 code string. The device-queue size is automatically adjusted between the device's preferred and max values.
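The two automatic steps described above (detecting enqueue_kernel() in the kernel string and keeping the device-queue size between the preferred and max values) can be sketched as follows. This is only an illustration under stated assumptions, not the library's code: the function names usesDeviceSideEnqueue and chooseDeviceQueueSize are hypothetical, the detection is assumed to be a plain substring search, and the preferred/max values are assumed to come from the OpenCL 2.0 device queries CL_DEVICE_QUEUE_ON_DEVICE_PREFERRED_SIZE and CL_DEVICE_QUEUE_ON_DEVICE_MAX_SIZE.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Cekirdekler): reports whether the kernel
// source mentions enqueue_kernel. A plain substring search, so it would also
// match the name inside a comment.
bool usesDeviceSideEnqueue(const std::string& kernelSource) {
    return kernelSource.find("enqueue_kernel") != std::string::npos;
}

// Hypothetical helper: keeps a requested device-queue size (in bytes) inside
// the range [preferred, maximum]. Assumes preferred <= maximum.
std::size_t chooseDeviceQueueSize(std::size_t requested,
                                  std::size_t preferred,
                                  std::size_t maximum) {
    return std::max(preferred, std::min(requested, maximum));
}

// Small usage example with made-up sizes.
void demoDeviceQueue() {
    const std::string parent =
        "__kernel void parent() { enqueue_kernel(q, flags, nd, ^{ child(); }); }";
    const std::string plain = "__kernel void add(__global float* a) { }";
    assert(usesDeviceSideEnqueue(parent));
    assert(!usesDeviceSideEnqueue(plain));
    // A request below the preferred size is raised to the preferred size.
    assert(chooseDeviceQueueSize(1024, 16384, 262144) == 16384);
    // A request above the maximum is lowered to the maximum.
    assert(chooseDeviceQueueSize(1000000, 16384, 262144) == 262144);
}
```

A real implementation would pass the chosen size as CL_QUEUE_SIZE when creating the on-device default queue; that part is omitted here to keep the sketch free of OpenCL headers.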

CekirdeklerCPP2 project:

https://github.com/tugrul512bit/CekirdeklerCPP2

v1.3.8

6 years ago

added a callback option to ClTask. When a task is executed in the device pool, its callback function is called.
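The callback idea can be illustrated with a generic sketch. Nothing below is the Cekirdekler API: Task, runAll and demoCallbacks are hypothetical stand-ins written in C++, and the sketch assumes the callback runs right after the task's work finishes, on the same thread, which the release note does not specify.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a task: some work plus an optional callback.
struct Task {
    std::string name;
    std::function<void()> work;
    std::function<void(const std::string&)> onDone;  // may be empty
};

// Hypothetical stand-in for a pool worker: runs each task in order and
// invokes its callback (if one was set) once the work has returned.
void runAll(const std::vector<Task>& tasks) {
    for (const Task& task : tasks) {
        if (task.work) task.work();
        if (task.onDone) task.onDone(task.name);
    }
}

// Usage example: only tasks that registered a callback report completion.
std::vector<std::string> demoCallbacks() {
    std::vector<std::string> finished;
    auto record = [&finished](const std::string& n) { finished.push_back(n); };

    std::vector<Task> tasks;
    tasks.push_back({"a", [] {}, record});
    tasks.push_back({"b", [] {}, nullptr});  // no callback registered
    tasks.push_back({"c", [] {}, record});

    runAll(tasks);
    assert(finished.size() == 2);  // "a" and "c" reported, "b" did not
    return finished;
}
```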

uses CekirdeklerCPP project v1.3.8

64-bit binary zip file contains all needed files

v1.3.5

6 years ago

device pool finish() optimized

v1.3.4

6 years ago

bug fixed: pool quitting early when finish() is called and a device hasn't received a task.

v1.3.3

6 years ago

Added serial mode to ClTask to run in ClDevicePool.

Added broadcast mode to ClTask to run in ClDevicePool.

Added single device mode to ClTask to run in ClDevicePool.

Optimized pool performance.

Dropped round-robin mode compatibility. (only compute_at_will is working)

v1.3.2

6 years ago

added multiple kernel instance generation based on compute-id + kernel name (decreases the number of clSetKernelArg() calls and enables async queue computing with the same kernel name and different parameters) (for tiled computing by task pool + device pool)

added task (to compute() later instead of immediately)

added task pool and device pool features (non-separable kernels are distributed to devices with a greedy algorithm)
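The release note only says "greedy algorithm", so the exact policy is not known. As one common reading, the sketch below assigns each task, in order, to the device with the smallest accumulated cost so far. The function name greedyAssign, the per-task cost estimates and the static (ahead-of-time) assignment are all assumptions made for illustration; the actual pool may instead hand tasks to whichever device becomes free first.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch of a greedy distribution: each task goes to the device
// whose accumulated cost is currently lowest (ties go to the lowest index).
// Returns, for each task, the index of the device it was assigned to.
std::vector<std::size_t> greedyAssign(const std::vector<double>& taskCosts,
                                      std::size_t deviceCount) {
    std::vector<std::size_t> assignment;
    if (deviceCount == 0) return assignment;  // nothing to assign to

    std::vector<double> load(deviceCount, 0.0);
    assignment.reserve(taskCosts.size());
    for (double cost : taskCosts) {
        std::size_t best = 0;
        for (std::size_t d = 1; d < deviceCount; ++d) {
            if (load[d] < load[best]) best = d;
        }
        load[best] += cost;
        assignment.push_back(best);
    }
    return assignment;
}

// Usage example: one expensive task and four cheap ones on two devices.
void demoGreedy() {
    std::vector<std::size_t> a = greedyAssign({4.0, 1.0, 1.0, 1.0, 1.0}, 2);
    // Device 0 takes the expensive task; device 1 absorbs the cheap ones.
    assert((a == std::vector<std::size_t>{0, 1, 1, 1, 1}));
}
```

A greedy choice like this needs no knowledge of future tasks, which suits kernels that cannot be split across devices and must each run whole on a single device.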

uses CekirdeklerCPP 1.3.1 binary (KutuphaneCL.dll, 64-bit)

v1.2.12_update

6 years ago

fixed version info, deleted old binaries

v1.2.12:

needs KutuphaneCL.dll (CekirdeklerCPP project) v1.2.12

added concurrency option to the single-device pipeline class, to limit its number of command queues to between 1 and 16 inclusive

optimized it for performance

fixed minor bugs.