Intel® Implicit SPMD Program Compiler
An ISPC release with language extensions for performance fine-tuning, CPU definitions for AlderLake and SapphireRapids targets, support for macOS ARM targets, and a major update of Intel GPU support. Windows and Linux binaries in this release support both CPU and GPU targets, while the macOS binary supports only CPU. This release is based on patched LLVM 12.0.0.
The language changes include the following:
- --enable-llvm-intrinsics switch. Please refer to the LLVM Intrinsic Functions section of the user manual for more details.
- assume() optimization hint, which can be used for communicating assumptions to the optimizer. Unlike assert() calls, it does not lead to a runtime check. It is intended for optimizations such as removing null pointer checks, removing loop remainders, and communicating alignment information to the optimizer. Please refer to the Compiler Optimization Hints section of the user manual for more details.
- alloca() calls.
- trunc() standard library functions.
Changes for CPU targets:
- CPU definitions for AlderLake and SapphireRapids were added: alderlake and sapphirerapids, respectively.
- apple-a7, apple-a10, apple-a11, apple-a12, apple-a13, apple-a14.
Using GPU-enabled binaries, you can build ISPC programs and run them on Intel(R) Core(tm) Processors with Gen9 graphics (formerly Skylake, Kaby Lake, Coffee Lake) and Gen12 graphics (TigerLake mobile CPU) using the --target options (genx-x8 and genx-x16) and the --cpu option for specifying a particular platform (e.g. --cpu=TGLLP).
The main GPU feature of the current release is Windows support. There are also a bunch of stability and performance improvements. Here are some of them:
- TaskQueue::submit() method, which starts execution without waiting for completion.
More details about the current state of GPU support are available here: https://ispc.github.io/ispc_for_gen.html
For build instructions check our docker recipe: https://github.com/ispc/ispc/blob/main/docker/ubuntu/xpu_ispc_build/Dockerfile
GPU support is still in Beta stage, so you may experience some issues, but we strongly encourage you to try it out and give us feedback! You can reach us through GitHub discussions and issues, or on Twitter (@ispc_updates).
Runtime Dependencies when targeting GPU:
Linux:
Windows:
Components revisions used in GPU-enabled build:
- KhronosGroup/SPIRV-LLVM-Translator@0592c4f
- intel/vc-intrinsics@2d0795c
- oneapi-src/level-zero@0d30b1f (v1.2.3)
- llvm/llvm-project@d28af7c (llvmorg-12.0.0) + patches from llvm_patches folder
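Returning to the assume() hint from the language changes in this release: the difference between a checked assertion and an unchecked assumption can be sketched with a C++ analogy. This is not ISPC code. The ASSUME macro and both function names are hypothetical, and the macro relies on compiler-specific builtins that are assumed to be available (Clang, GCC, MSVC), with a no-op fallback elsewhere.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical ASSUME macro (not part of ISPC or the C++ standard): passes a
// condition to the optimizer without generating a runtime check.
#if defined(__clang__)
#define ASSUME(cond) __builtin_assume(cond)
#elif defined(__GNUC__)
#define ASSUME(cond) do { if (!(cond)) __builtin_unreachable(); } while (0)
#elif defined(_MSC_VER)
#define ASSUME(cond) __assume(cond)
#else
#define ASSUME(cond) ((void)0)  // unknown compiler: no hint, still correct
#endif

// Checked variant: assert() tests the condition at run time (unless NDEBUG
// is defined) and aborts the program if it does not hold.
float sum_checked(const float* data, std::size_t count) {
    assert(count % 8 == 0);
    float total = 0.0f;
    for (std::size_t i = 0; i < count; ++i) {
        total += data[i];
    }
    return total;
}

// Hinted variant: the caller promises that count is a multiple of 8, so the
// optimizer may drop remainder handling when it vectorizes the loop.
// If the promise is broken, the behavior is undefined.
float sum_assumed(const float* data, std::size_t count) {
    ASSUME(count % 8 == 0);
    float total = 0.0f;
    for (std::size_t i = 0; i < count; ++i) {
        total += data[i];
    }
    return total;
}
```

Both functions return the same result for valid input. The hinted variant only changes what the compiler is allowed to assume, which is why such a hint should be used only for conditions the caller can guarantee.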
An ISPC release with several improvements for CPU and Beta support of Intel graphics hardware architectures. The binaries in this release include CPU versions for Windows, Linux, and macOS, and a GPU-enabled Linux binary, which supports both CPU and GPU. CPU binaries are based on patched LLVM 11.0.0, GPU binary is based on patched LLVM 10.0.1.
CPU changes include:
- #pragma unroll and #pragma nounroll directives provide loop unrolling optimization hints to the compiler. These pragmas may be used immediately before a loop statement. Currently, this functionality is limited to uniform for and do-while loops.
- packed_[load|store]_active() stdlib functions implementation (up to 2.5x faster), which now supports 64 bit types.
- icelake-server, tigerlake, alderlake, sapphirerapids.
ISPC support was added to CMake 3.19, so you can now use the standard CMake approach to find ISPC on the system and use it in your build. https://cmake.org/cmake/help/latest/release/3.19.html#languages
Using the GPU-enabled Linux binary, you can build ISPC programs and run them on Intel(R) Core(tm) Processors with Gen9 graphics (formerly Skylake, Kaby Lake, Coffee Lake) and Gen12 graphics (TigerLake mobile CPU) using the --target options (genx-x8 and genx-x16) and the --cpu option for specifying a particular platform (e.g. --cpu=TGLLP).
Stability and performance were significantly improved in this release. Here is the list of new features:
- --emit-zebin switch. You can use this binary from ISPC Runtime by setting the ISPCRT_USE_ZEBIN environment variable to 1. Please note that SPIR-V is still the recommended and default format.
- --cpu=TGLLP).
We also added examples to demonstrate interoperability with the oneAPI DPC++ Compiler. More details about the current state of GPU support are available here: https://ispc.github.io/ispc_for_gen.html
For build instructions check our docker recipe: https://github.com/ispc/ispc/blob/main/docker/ubuntu/gen/Dockerfile
GPU support is in Beta stage, so you may experience some issues, but we strongly encourage you to try it out and give us feedback! You can reach us through GitHub discussions and issues, ISPC mailing list ([email protected]), or on Twitter (@ispc_updates).
Runtime Dependencies:
- Intel(R) Graphics Compute Runtime: https://github.com/intel/compute-runtime/releases/tag/20.50.18716
- Level Zero Loader: https://github.com/oneapi-src/level-zero/releases/tag/v1.0.22
- OpenMP Runtime. Consult your Linux distribution documentation for OpenMP runtime installation instructions. No specific version is required.
Components revisions used in GPU-enabled build:
- KhronosGroup/SPIRV-LLVM-Translator@ab5e12a
- intel/vc-intrinsics@2de2dd4
- oneapi-src/level-zero@c6fa2cd (v1.0.22)
- llvm/llvm-project@ef32c61 (llvmorg-10.0.1) + patches from llvm_patches folder
UPDATE: macOS build was updated on 21 Dec 2020.
A minor ISPC update with a bug fix for an AVX512 detection problem on macOS (for more details see issue #1854) and an update of the GPU version to use Level Zero v1.0. CPU binaries are based on patched LLVM 10.0.1.
Runtime Dependencies for GPU-enabled build:
Components revisions used in GPU-enabled build:
- KhronosGroup/SPIRV-LLVM-Translator@1a5c52f
- intel/vc-intrinsics@f39ff1e
- oneapi-src/level-zero@fcc7b7a (v1.0)
- llvm/llvm-project@ef32c61 (llvmorg-10.0.1) + patches from llvm_patches folder
An ISPC release with several improvements for CPU and initial support of Intel graphics hardware architectures. The binaries in this release include CPU versions for Windows, Linux, and macOS, as in previous releases, plus a GPU-enabled Linux binary, which supports both CPU and GPU. CPU binaries are based on patched LLVM 10.0.1.
CPU changes include:
Using the GPU-enabled Linux binary, you can build ISPC programs and run them on Intel(R) Core(tm) Processors with Gen9 graphics (formerly Skylake, Kaby Lake, Coffee Lake) using the new '--target' options: 'genx-x8' and 'genx-x16'. For code generation, ISPC uses the Vector Compute backend, which is part of 'Intel(R) Graphics Compute Runtime', through the SPIR-V interface. This release also includes ISPC Runtime, based on 'oneAPI Level Zero' for GPU and 'OpenMP Runtime' for CPU, which creates a unified abstraction for executing ISPC code on CPU and GPU.
More details are available here: https://ispc.github.io/ispc_for_gen.html
For build instructions check our docker recipe: https://github.com/ispc/ispc/blob/main/docker/ubuntu/gen/Dockerfile
The stability and performance of the GPU part of this release are not mature yet, but we strongly encourage you to try it out and give us feedback! You can reach us through GitHub issues, ISPC mailing list ([email protected]), or on Twitter (@ispc_updates).
Runtime Dependencies
Components revisions used in this build:
- KhronosGroup/SPIRV-LLVM-Translator@1e661b2
- intel/vc-intrinsics@a0b66f2
- oneapi-src/level-zero@317bc0d (v0.91.21)
- llvm/llvm-project@d32170d (llvmorg-10.0.0)
An ISPC update, which graduates cross-compilation support to production and has multiple code generation improvements and bug fixes. AVX512 targets may get the biggest performance boost due to the changed internal representation of masks (we observed up to 5% speedups) and the new switch --opt=disable-zmm, which disables using zmm registers in favour of ymm for the avx512skx-i32x16 target. All targets are expected to benefit from the LLVM 10.0 backend used in this release.
Here is the list of other changes:
- --support-matrix was added to display information about supported cross-compilation targets, which are managed by the --target-os=<os>, --target=<ispc-target>, and --arch=<arch> switches.
- bool occupies one byte) for better interoperability.
- uint8, uint16, uint32, uint64, and uint. To detect whether these types are supported, check whether the ISPC_UINT_IS_DEFINED macro is defined.
- extract()/insert() for boolean arguments, and abs() for all integer and FP types were added to the standard library.
Supported platforms in this release are below. Rows are hosts, columns are targets. x86 and arm are both 32 and 64 bits, where appropriate.
Windows | Linux | macOS | Android | iOS | PS4 | FreeBSD | |
---|---|---|---|---|---|---|---|
Windows | x86 | x86, arm | x86 | x86, arm | x86 | x86, arm | |
Linux | x86, arm | x86 | x86, arm | x86, arm | |||
macOS | x86, arm | x86 | x86, arm | arm | x86, arm |
This ISPC update includes experimental cross OS compilation support, ARM and AARCH64 support and a bunch of language features and stability fixes.
Here are the details:
- neon-i32x4, neon-i8x16 and neon-i16x8 targets. AARCH64 is supported for neon-i32x4 as well as for a new "double-pumped" 8-wide target: neon-i32x8.
- avx2-i32x4) was added.
- --cpu=icl). Note that there is no special target for the new instructions in the Ice Lake flavor of AVX512 yet. For now, you can use SKX targets (avx512skx-i32x8 and avx512skx-i32x16) with --cpu=icl.
- avx512knl-i32x16).
- noinline function qualifier.
- rsqrt_fast() and rcp_fast() functions.
- --emit-llvm-text was added to dump LLVM IR in text format.
An ISPC top-of-trunk build is now available in the Compiler Explorer.
The release is based on a patched LLVM 8.0.0 backend.
An ISPC update with a bunch of new features and stability bug fixes based on a patched LLVM 8.0.0 backend.
Notable new features are:
- avx512skx-i32x8).
- -O1 switch to optimize for size.
- #pragma once in auto-generated headers.
- -O0.
Also, we resumed support for the PS4 build.
To efficiently write ISPC programs you can now use the ISPC plug-in for VSCode.
An ISPC update, which brings several new features, has a bunch of stability and performance bug fixes, and infrastructure improvements for those who are interested in participating in hacking on the ISPC trunk. We are also deprecating KNC support and the KNL-generic target (in favor of the native KNL target, i.e. avx512knl-i32x16).
We've added:
- aos_to_soa/soa_to_aos intrinsics
- --x86-asm-syntax switch documentation is help message)
- #pragma ignore in documentation)
Our examples include a new SGEMM example, which demonstrates different versions of matrix multiply with various levels of optimization. It is useful for learning how to start from a naive implementation and then add optimizations step by step. Also, our build system is now based on CMake, as are the examples, so you can use them as a reference for integrating ISPC into your CMake-based project.
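To make the naive-to-optimized progression concrete, here is a minimal C++ sketch. It is not the code of the ISPC SGEMM example: the function names sgemm_naive and sgemm_reordered are hypothetical, row-major storage is an assumption, and loop reordering is just one typical first optimization picked for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Naive version: C = A * B for row-major A (m x k), B (k x n), C (m x n).
// The innermost loop walks B column-wise, i.e. with a stride of n elements.
void sgemm_naive(const std::vector<float>& A, const std::vector<float>& B,
                 std::vector<float>& C,
                 std::size_t m, std::size_t n, std::size_t k) {
    assert(A.size() == m * k && B.size() == k * n && C.size() == m * n);
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            float sum = 0.0f;
            for (std::size_t p = 0; p < k; ++p) {
                sum += A[i * k + p] * B[p * n + j];
            }
            C[i * n + j] = sum;
        }
    }
}

// Reordered version (i, p, j): the innermost loop now walks one row of B and
// one row of C contiguously, which is friendlier to caches and to
// vectorization. The arithmetic is the same as in the naive version.
void sgemm_reordered(const std::vector<float>& A, const std::vector<float>& B,
                     std::vector<float>& C,
                     std::size_t m, std::size_t n, std::size_t k) {
    assert(A.size() == m * k && B.size() == k * n && C.size() == m * n);
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            C[i * n + j] = 0.0f;
        }
        for (std::size_t p = 0; p < k; ++p) {
            const float a = A[i * k + p];
            for (std::size_t j = 0; j < n; ++j) {
                C[i * n + j] += a * B[p * n + j];
            }
        }
    }
}
```

Keeping the naive version around as a reference makes it easy to check each optimized variant against it on small inputs before measuring performance.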
For those who are interested in hacking ISPC or trying a bleeding edge development version, we have CI on Linux (Travis-CI) and Windows (Appveyor), including automatic package builds on Windows. We also have Dockerfiles, which demonstrate bringing up your environment for ISPC development.
The release is based on a patched LLVM 5.0.2 backend.
An ISPC update, which brings out-of-the-box debug support on Windows, better performance of most of the targets and a bunch of stability and performance bug fixes.
The release is based on patched LLVM 5.0 backend.
The Windows build now supports only VS2015 and newer. If you are using an earlier version, the only known problem that you may encounter is with the print ISPC library function.
AVX512 targets are the main beneficiaries of a newer LLVM backend and demonstrate the biggest performance improvements. SVML support is also now available on these targets (requires linking by ICC compiler).
An ISPC update with a new native AVX512 target for future Xeon CPUs and improvements for debugging, including a new switch --dwarf-version to support debugging on old systems.
The release is based on patched LLVM 3.8.