Productive, portable, and performant GPU programming in Python.
Highlights:
Full changelog:
This is a bug fix release for v1.1.0. Full changelog:
High-resolution simulations can deliver great visual quality, but they are often limited by the capacity of onboard GPU memory. This release adds quantized data types, allowing you to define your own integers, fixed-point numbers, or floating-point numbers of arbitrary bit width, so that you can strike a balance between hardware limits and simulation quality. See Using quantized data types for a comprehensive introduction.
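As a minimal sketch of the workflow (signatures follow the v1.1 documentation; consult Using quantized data types for the authoritative API), you define a quantized type with `ti.types.quant` and pack fields into a fixed number of bits with `ti.BitpackedFields`:

```python
import taichi as ti

ti.init()

# A 10-bit fixed-point type covering [-20.0, 20.0] and a 5-bit unsigned integer.
fixed10 = ti.types.quant.fixed(frac=10, range=20.0)
u5 = ti.types.quant.int(bits=5, signed=False)

x = ti.field(dtype=fixed10)
y = ti.field(dtype=u5)

# Pack both fields into a single 32-bit word per element.
bitpack = ti.BitpackedFields(max_num_bits=32)
bitpack.place(x, y)
ti.root.dense(ti.i, 1024).place(bitpack)
```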
A Taichi kernel is implicitly compiled the first time it is called. The compilation results are kept in an online in-memory cache to reduce the overhead of subsequent calls: as long as the kernel function is unchanged, it can be loaded and launched directly. The cache, however, does not survive the program's termination, so on the next run Taichi has to re-compile all kernel functions and reconstruct the in-memory cache, making the first launch of each kernel slow again due to compilation overhead. To address this problem, this release adds the offline cache feature, which dumps the compilation cache to disk for future runs, drastically reducing the first-launch overhead in subsequent runs. Taichi now constructs and maintains an offline cache by default. The following table shows the launch overhead of running `cornell_box` on the CUDA backend with and without the offline cache:
| | Time spent on compilation and cached data loading |
| --- | --- |
| Offline cache disabled | 24.856s |
| Offline cache enabled (1st run) | 25.435s |
| Offline cache enabled (2nd run) | 0.677s |
Note that, for now, the offline cache feature works only on the CPU and CUDA backends. If your code behaves abnormally, disable the offline cache by setting the environment variable `TI_OFFLINE_CACHE=0` or `ti.init(offline_cache=False)`, and file an issue with us on Taichi's GitHub repo. See Offline cache for more information.
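For example, to opt out for a single program (using the `offline_cache` flag mentioned above):

```python
import taichi as ti

# The offline cache is on by default; disable it explicitly when debugging.
ti.init(arch=ti.cuda, offline_cache=False)
```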
Adds forward-mode automatic differentiation via `ti.ad.FwdMode`. Unlike the existing reverse-mode automatic differentiation, which computes the vector-Jacobian product (vJp), forward mode computes the Jacobian-vector product (Jvp) when evaluating derivatives. Therefore, forward mode is much more efficient in situations where the number of a function's outputs is greater than the number of its inputs. Read this example, which demonstrates Jacobian matrix computation in forward mode and reverse mode.
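A minimal sketch of the usage, assuming the v1.1 forward-mode API (fields allocated with `needs_dual=True`, derivatives read back from the `dual` counterpart; see the linked example for the authoritative version):

```python
import taichi as ti

ti.init()

x = ti.field(ti.f32, shape=(), needs_dual=True)
y = ti.field(ti.f32, shape=(), needs_dual=True)

@ti.kernel
def compute():
    y[None] = x[None] ** 2

x[None] = 3.0
# Forward mode propagates a derivative "seed" from inputs to outputs.
with ti.ad.FwdMode(loss=y, param=x):
    compute()

print(y.dual[None])  # dy/dx = 2 * x = 6.0
```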
A GPU's shared memory is a small, fast memory region visible to all threads in the same thread block (or workgroup in Vulkan). It is widely used in scenarios where performance is a crucial concern. To give you access to your GPU's shared memory, this release adds the `SharedArray` API under the namespace `ti.simt.block`.

The following diagram illustrates the performance benefits of Taichi's `SharedArray`. With `SharedArray`, Taichi Lang is comparable to or even outperforms the equivalent CUDA code.
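A minimal sketch of the pattern (this assumes that a range-for on CUDA maps consecutive indices to consecutive threads in a block, so `i % BLOCK` acts as the thread index; the names `tile` and `neighbor_sum` are illustrative):

```python
import taichi as ti

ti.init(arch=ti.cuda)

BLOCK = 64
N = 1024
src = ti.field(ti.f32, shape=N)
dst = ti.field(ti.f32, shape=N)

@ti.kernel
def neighbor_sum():
    ti.loop_config(block_dim=BLOCK)
    for i in range(N):
        tid = i % BLOCK
        # Per-block scratchpad living in shared memory.
        tile = ti.simt.block.SharedArray((BLOCK,), ti.f32)
        tile[tid] = src[i]
        ti.simt.block.sync()
        # Neighbor reads now hit fast shared memory instead of global memory.
        dst[i] = tile[tid] + tile[(tid + 1) % BLOCK]
```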
Taichi now supports texture bilinear sampling and raw texel fetch on both the Vulkan and OpenGL backends. This feature leverages the hardware texture unit and removes the need to hand-write bilinear interpolation code in image processing tasks. It also provides an easy way to do texture mapping in tasks such as rasterization or ray-tracing. On the Vulkan backend, Taichi additionally supports image load and store: you can directly manipulate the texels of an image and use that image in subsequent texture mapping.

Note that the current texture and image APIs are at an early stage and subject to change. In the future, we plan to support bindless textures to extend to tasks such as ray-tracing, and to extend full texture support to all backends with texture APIs.

Run `ti example simple_texture` to see an example of texture support!
GGUI improvements:
- Supports fetching the depth information of the current scene into a Taichi field with `ti.ui.Window.get_depth_buffer(field)`, or as a NumPy array with `ti.ui.Window.get_depth_buffer_as_numpy()`.
- Supports drawing 3D lines with `Scene.lines(vertices, width)`.
- Supports drawing mesh instances: pass a field of transform matrices (`ti.Matrix.field(4, 4, ti.f32, shape=N)`) and call `ti.ui.Scene.mesh_instance(vertices, transforms=TransformMatrixField)` to put various mesh instances at different places.
- Supports showing the wireframe of a mesh in `Scene.mesh()` or `Scene.mesh_instance()` by setting `show_wireframe=True`.

Taichi dataclass: Taichi now recommends using the `@ti.dataclass` decorator to define struct types, and even attach functions to them. See Taichi dataclasses for more information.
```python
import math

import taichi as ti
from taichi.math import vec3

ti.init()

@ti.dataclass
class Sphere:
    center: vec3
    radius: ti.f32

    @ti.func
    def area(self):
        # A function to run in the Taichi scope.
        return 4 * math.pi * self.radius * self.radius

    def is_zero_sized(self):
        # A Python-scope function.
        return self.radius == 0.0
```
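You can then instantiate the struct and call its Taichi-scope method from inside a kernel. A small usage sketch based on the definition above (`total_area` is an illustrative name, not part of the release):

```python
@ti.kernel
def total_area() -> ti.f32:
    s = Sphere(center=vec3(0.0, 0.0, 0.0), radius=2.0)
    return s.area()

print(total_area())  # 4 * pi * 2**2 ≈ 50.27
```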
As shown in the dataclass example above, `vec2`, `vec3`, and `vec4` in the `taichi.math` module (same for `ivec` and `uvec`) can be directly used as type hints. The numeric precision of these types is determined by `default_ip` or `default_fp` in `ti.init()`.
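These types work as annotations on kernel signatures as well; a small sketch (the `normalized` kernel is illustrative, not part of the release):

```python
import taichi as ti
from taichi.math import vec3

ti.init()

@ti.kernel
def normalized(v: vec3) -> vec3:
    # vec3 doubles as a type annotation; its precision follows default_fp.
    return v / v.norm()

print(normalized(vec3(3.0, 0.0, 4.0)))  # [0.6, 0.0, 0.8]
```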
More flexible instantiation for a `struct` or `dataclass`:

In earlier releases, to instantiate a `taichi.types.struct` or `taichi.dataclass`, you had to explicitly spell out a complete list of member-value pairs:
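(The snippets below assume a `Ray` dataclass along the following lines; the comments on member meanings are inferred from the example and are not taken from the release notes.)

```python
import taichi as ti
from taichi.math import vec3

ti.init()

@ti.dataclass
class Ray:
    ro: vec3   # ray origin
    rd: vec3   # ray direction
    t: ti.f32  # parameter along the ray
```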
```python
ray = Ray(ro=vec3(0), rd=vec3(1, 0, 0), t=1.0)
```
As of this release, you are given more options. The positional arguments are passed to the struct members in the order they are defined; the keyword arguments set the corresponding struct members. Unspecified struct members are automatically set to zero. For example:
```python
# Use positional arguments to set struct members in order.
ray = Ray(vec3(0), vec3(1, 0, 0), 1.0)
# ro is set to vec3(0) and t is set to 0.
ray = Ray(vec3(0), rd=vec3(1, 0, 0))
# Both ro and rd are set to vec3(0).
ray = Ray(t=1.0)
# ro is set to vec3(1), rd to vec3(0), and t to 0.0.
ray = Ray(1)
# All members are set to 0.
ray = Ray()
```
Supports calling `fill()` from both the Python scope and the Taichi scope.

In earlier releases, you could only call `fill()`, a method of the `ScalarField` and `MatrixField` classes, from the Python scope. As of this release, you can call this method from either the Python scope or the Taichi scope. See the following code snippet:
```python
x = ti.field(int, shape=(10, 10))
x.fill(1)  # In the Python scope

@ti.kernel
def test():
    x.fill(-1)  # In the Taichi scope
```
More flexible initialization for customized matrix types:

As the following code snippet shows, matrix types created with `taichi.types.matrix()` or `taichi.types.vector()` can be initialized more flexibly: Taichi automatically combines the inputs and converts them to a matrix whose shape matches the target matrix type.
```python
# mat2 and vec3 are also predefined types in the ti.math module.
mat2 = ti.types.matrix(2, 2, float)
vec3 = ti.types.vector(3, float)

m = mat2(1)               # [[1., 1.], [1., 1.]]
m = mat2(1, 2, 3, 4)      # [[1., 2.], [3., 4.]]
m = mat2([1, 2], [3, 4])  # [[1., 2.], [3., 4.]]
m = mat2([1, 2, 3, 4])    # [[1., 2.], [3., 4.]]
v = vec3(1, 2, 3)
m = mat2(v, 4)            # [[1., 2.], [3., 4.]]
```
Makes `ti.f32(x)` syntax sugar for `ti.cast(x, ti.f32)`, provided that `x` is neither a literal nor of a compound data type. The same holds for other primitive types such as `ti.i32`, `ti.u8`, and `ti.f64`.
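A small illustration (the kernel name is illustrative):

```python
import taichi as ti

ti.init()

@ti.kernel
def cast_demo() -> ti.f32:
    a = 42         # A runtime i32 variable, not a literal.
    b = ti.f32(a)  # Syntax sugar for ti.cast(a, ti.f32).
    return b
```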
More convenient axes order adjustment: A common way to improve the performance of a Taichi program is to adjust the order of axes when laying out field data in memory. In earlier releases, this required in-depth knowledge of the data definition language (the SNode system) and could become an extra burden in situations where sparse data structures are not needed. As of this release, Taichi supports specifying the order of axes directly when defining a Taichi field.
```python
M, N = 64, 32

# Before
x = ti.field(ti.i32)
y = ti.field(ti.i32)
ti.root.dense(ti.i, M).dense(ti.j, N).place(x)  # row-major
ti.root.dense(ti.j, N).dense(ti.i, M).place(y)  # column-major

# New syntax
x = ti.field(ti.i32, shape=(M, N), order='ij')
y = ti.field(ti.i32, shape=(M, N), order='ji')

# SoA vs. AoS example
p = ti.Vector.field(3, ti.i32, shape=(M, N), order='ji', layout=ti.Layout.SOA)
q = ti.Vector.field(3, ti.i32, shape=(M, N), order='ji', layout=ti.Layout.AOS)
```
Related fixes and improvements:
- `pow()` with a negative exponent (#5275)
- `ti.ndrange` (#4478)

New APIs in this release:
- `ti.BitpackedFields`
- `ti.from_paddle`
- `ti.to_paddle`
- `ti.FieldsBuilder.lazy_dual`
- `ti.math` module
- `ti.Texture`
- `ti.ref`
- `ti.dataclass`
- `ti.simt.block.SharedArray`
| Old API | New API |
| --- | --- |
| `ti.clear_all_gradients` | `ti.ad.clear_all_gradients` |
| `ti.Tape` | `ti.ad.Tape` |
| `ti.FieldsBuilder.bit_array` | `ti.FieldsBuilder.quant_array` |
| `ti.ui.Window.write_image` | `ti.ui.Window.save_image` |
| `ti.ui.Window.GUI` | `ti.ui.Window.get_gui` |
Deprecated `ti.ui.make_camera`: please construct cameras with `ti.ui.Camera` instead.

As announced in the v1.0.0 release, we no longer provide official Python 3.6 wheels through PyPI. Users who need Taichi with Python 3.6 may still build from source, but its support is not guaranteed.
The `taichi_glsl` package on PyPI will no longer be maintained as of this release. GLSL-related features will be implemented in the official `taichi.math` module, which includes data types and handy functions for daily math and shader development:
- Vector types `vec2`, `vec3`, and `vec4`.
- Matrix types `mat2`, `mat3`, and `mat4`.
- GLSL functions such as `step()`, `clamp()`, and `smoothstep()`.

Official support for macOS Mojave (10.14, released in 2018) will be dropped starting from v1.2.0. Please upgrade your macOS if possible, or let us know if you have any concerns.
Highlights:
Full changelog:
- `OpLoad` with physical address (#5212) (by PENGUINLIONG)

Highlights:
Full changelog:
Highlights:
The v1.0.2 release is a patch fix that improves Taichi's stability on multiple platforms, especially for GGUI and the Vulkan backend.
Full changelog:
Highlights:
Full changelog:
v1.0.0 was released on April 13, 2022.
Taichi's license is changed from MIT to Apache-2.0 after a public vote in #4607.
This release supports Python 3.10 on all supported operating systems (Windows, macOS, and Linux).
Before v1.0.0, Taichi worked only on Linux distributions that support glibc 2.27+ (for example, Ubuntu 18.04+). As of v1.0.0, in addition to the normal Taichi wheels, Taichi provides manylinux2014-compatible wheels that work on most modern Linux distributions, including CentOS 7.
Deprecates `ti.ext_arr()`; use `ti.types.ndarray()` instead. `ti.types.ndarray()` supports both Taichi Ndarrays and external arrays, for example NumPy arrays.

By working together with OPPO US Research Center, Taichi delivers Taichi AOT, a solution for deploying kernels in non-Python environments, such as mobile devices.
Compiled Taichi kernels can be saved from a Python process, then loaded and run by the provided C++ runtime library. With a set of APIs, your Python/Taichi code can be easily deployed in any C++ environment. We demonstrate the simplicity of this workflow by porting the implicit FEM (finite element method) demo released in v0.9.0 to an Android application. Download the Android package and find out what Taichi AOT has to offer! If you want to try out this solution, please also check out the taichi-aot-demo repo.
```python
# In Python: app.py
module = ti.aot.Module(ti.vulkan)
module.add_kernel(my_kernel, template_args={'x': x})
module.save('my_app')
```
The following code snippet shows the C++ workflow for loading the compiled AOT modules.
```cpp
// Initialize the Vulkan program pipeline
taichi::lang::vulkan::VulkanDeviceCreator::Params evd_params;
evd_params.api_version = VK_API_VERSION_1_2;
auto embedded_device =
    std::make_unique<taichi::lang::vulkan::VulkanDeviceCreator>(evd_params);

std::vector<uint64_t> host_result_buffer;
host_result_buffer.resize(taichi_result_buffer_entries);
taichi::lang::vulkan::VkRuntime::Params params;
params.host_result_buffer = host_result_buffer.data();
params.device = embedded_device->device();
auto vulkan_runtime =
    std::make_unique<taichi::lang::vulkan::VkRuntime>(std::move(params));

// Load the AOT module saved from Python
taichi::lang::vulkan::AotModuleParams aot_params{"my_app", vulkan_runtime.get()};
auto module = taichi::lang::aot::Module::load(taichi::Arch::vulkan, aot_params);
auto my_kernel = module->get_kernel("my_kernel");

// Allocate a device buffer
taichi::lang::Device::AllocParams alloc_params;
alloc_params.host_write = true;
alloc_params.size = /*Ndarray size for `x`*/;
alloc_params.usage = taichi::lang::AllocUsage::Storage;
auto devalloc_x = embedded_device->device()->allocate_memory(alloc_params);

// Execute my_kernel without a Python environment
taichi::lang::RuntimeContext host_ctx;
host_ctx.set_arg_devalloc(/*arg_id=*/0, devalloc_x, /*shape=*/{128}, /*element_shape=*/{3, 1});
my_kernel->launch(&host_ctx);
```
Note that Taichi only supports the Vulkan backend in the C++ runtime library. The Taichi team is working on supporting more backends.
All Taichi functions are inlined into the Taichi kernel during compile time. However, the kernel becomes lengthy and requires longer compile time if it has too many Taichi function calls. This becomes especially obvious if a Taichi function involves compile-time recursion. For example, the following code calculates the Fibonacci numbers recursively:
```python
@ti.func
def fib_impl(n: ti.template()):
    if ti.static(n <= 0):
        return 0
    if ti.static(n == 1):
        return 1
    return fib_impl(n - 1) + fib_impl(n - 2)

@ti.kernel
def fibonacci(n: ti.template()):
    print(fib_impl(n))
```
In this code, `fib_impl()` recursively calls itself until `n` reaches `1` or `0`. The total number of calls to `fib_impl()` grows exponentially with `n`, so the length of the inlined kernel grows exponentially as well. When `n` reaches `25`, it takes more than a minute to compile the kernel.
This release introduces the "real function", a new type of Taichi function that is compiled independently instead of being inlined into the kernel. It is an experimental feature and supports only scalar arguments and a scalar return value for now. You can use it by decorating the function with `@ti.experimental.real_func`. For example, the following is the real-function version of the code above:
```python
@ti.experimental.real_func
def fib_impl(n: ti.i32) -> ti.i32:
    if n <= 0:
        return 0
    if n == 1:
        return 1
    return fib_impl(n - 1) + fib_impl(n - 2)

@ti.kernel
def fibonacci(n: ti.i32):
    print(fib_impl(n))
```
The length of the kernel does not increase as `n` grows because the kernel only makes a call to the function instead of inlining its whole body. As a result, the code takes far less than a second to compile, regardless of the value of `n`.
The main differences between a normal Taichi function and a real function are listed below:
- A normal Taichi function is inlined into the kernel at compile time, while a real function is compiled independently and called at runtime.
- A real function supports runtime recursion and allows return statements inside runtime `if`/`for`/`while` statements, which is not allowed in a normal Taichi function.

Previously, you could not explicitly give a type to a literal. For example:
```python
@ti.kernel
def foo():
    a = 2891336453  # i32 overflow (> 2^31 - 1)
```
In the code snippet above, `2891336453` is first turned into the default integer type (`ti.i32` if not changed), which causes an overflow. Starting from v1.0.0, you can write type annotations for literals:
```python
@ti.kernel
def foo():
    a = ti.u32(2891336453)  # similar to 2891336453u in C
```
You can use `ti.loop_config` to control the behavior of the subsequent top-level for-loop. Available parameters are:

- `block_dim`: Sets the number of threads in a block on the GPU.
- `parallelize`: Sets the number of threads to use on the CPU.
- `serialize`: If you set `serialize` to `True`, the for-loop runs serially, and you can write break statements inside it (this applies only to range/ndrange for-loops). Setting `serialize` to `True` is equivalent to setting `parallelize` to `1`.

Here are two examples:
```python
@ti.kernel
def break_in_serial_for() -> ti.i32:
    a = 0
    ti.loop_config(serialize=True)
    for i in range(100):  # This loop runs serially
        a += i
        if i == 10:
            break
    return a

break_in_serial_for()  # returns 55
```
```python
n = 128
val = ti.field(ti.i32, shape=n)

@ti.kernel
def fill():
    ti.loop_config(parallelize=8, block_dim=16)
    # If the kernel runs on the CPU backend, 8 threads will be used.
    # If the kernel runs on the CUDA backend, each block will have 16 threads.
    for i in range(n):
        val[i] = i
```
`math` module: This release adds a `math` module to support GLSL-standard vector operations and to make it easier to port GLSL shader code to Taichi. For example, vector types, including `vec2`, `vec3`, `vec4`, `mat2`, `mat3`, and `mat4`, and functions, including `mix()`, `clamp()`, and `smoothstep()`, act similarly to their counterparts in GLSL. See the following examples:
You can use the `rgba`, `xyzw`, and `uvw` properties to get and set vector entries:
```python
import taichi as ti
import taichi.math as tm

ti.init()

@ti.kernel
def example():
    v = tm.vec3(1.0)  # (1.0, 1.0, 1.0)
    w = tm.vec4(0.0, 1.0, 2.0, 3.0)
    v.rgg += 1.0      # v = (2.0, 3.0, 1.0)
    w.zxy += tm.sin(v)
```
Each Taichi vector is implemented as a column vector, so make sure that you put the matrix before the vector in a matrix-vector multiplication.
```python
@ti.kernel
def example():
    M = ti.Matrix([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
    v = tm.vec3(1, 2, 3)
    w = (M @ v).xyz  # [1, 2, 3]
```
```python
@ti.kernel
def example():
    v = tm.vec3(0., 1., 2.)
    w = tm.smoothstep(0.0, 1.0, v.xyz)
    w = tm.clamp(w, 0.2, 0.8)
```
`ti gallery` command: This release introduces the CLI command `ti gallery`, which lets you select and run Taichi examples in a pop-up window. To do so, run:

```
ti gallery
```

A window with the example gallery pops up.
As of v1.0.0, Taichi accepts matrix or vector types as parameters and return values. You can use `ti.types.matrix` or `ti.types.vector` as the type annotations.

Taichi also supports basic, read-only matrix slicing. Use the `mat[:, :]` syntax to quickly retrieve a specific portion of a matrix. See Slicings for more information.
The following code example shows how to get the numbers at the four corners of a 3x3 matrix `mat`:
```python
import taichi as ti

ti.init()

@ti.kernel
def foo(mat: ti.types.matrix(3, 3, ti.i32)) -> ti.types.matrix(2, 2, ti.i32):
    corners = mat[::2, ::2]
    return corners

mat = ti.Matrix([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
corners = foo(mat)  # [[1 3] [7 9]]
```
Note that in a slice, the lower bound, the upper bound, and the stride must be constant integers. If you want to use a variable index together with a slice, set `ti.init(dynamic_index=True)`. For example:
```python
import taichi as ti

ti.init(dynamic_index=True)

@ti.kernel
def foo(mat: ti.types.matrix(3, 3, ti.i32), ind: ti.i32) -> ti.types.matrix(3, 1, ti.i32):
    col = mat[:, ind]
    return col

mat = ti.Matrix([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
col = foo(mat, 2)  # [3 6 9]
```
Flexibility is key to the user experience of an automatic differentiation (AD) system. Before v1.0.0, Taichi's AD system required that a differentiable Taichi kernel consist only of multiple simply nested for-loops (shown in `task1` below). This was once called the Kernel Simplicity Rule (KSR). KSR prevented users from writing differentiable kernels with multiple serial for-loops (shown in `task2` below) or with a mixture of a serial for-loop and non-for statements (shown in `task3` below).
```python
# OK: multiple simply nested for-loops
@ti.kernel
def task1():
    for i in range(2):
        for j in range(3):
            for k in range(3):
                y[None] += x[None]

# Error: multiple serial for-loops
@ti.kernel
def task2():
    for i in range(2):
        for j in range(3):
            y[None] += x[None]
        for j in range(3):
            y[None] += x[None]

# Error: a mixture of a serial for-loop and non-for statements
@ti.kernel
def task3():
    for i in range(2):
        y[None] += x[None]
        for j in range(3):
            y[None] += x[None]
```
With KSR removed in this release, code with different kinds of for-loop structures can be differentiated, as shown in the snippet below.
```python
# OK: a complicated control flow that is still differentiable in Taichi
for j in range(2):
    for i in range(3):
        y[None] += x[None]
    for i in range(3):
        for ii in range(2):
            y[None] += x[None]
        for iii in range(2):
            y[None] += x[None]
            for iv in range(2):
                y[None] += x[None]
    for i in range(3):
        for ii in range(2):
            for iii in range(2):
                y[None] += x[None]
```
Taichi provides a demo to demonstrate how to implement a differentiable simulator using this enhanced Taichi AD system.
f-strings in `assert` statements: This release supports including an f-string in an `assert` statement as its error message. You can include scalar variables in the f-string. See the example below:
```python
import taichi as ti

ti.init(debug=True)

@ti.kernel
def assert_is_zero(n: ti.i32):
    assert n == 0, f"The number is {n}, not zero"

assert_is_zero(42)  # TaichiAssertionError: The number is 42, not zero
```
Note that the `assert` statement works only in debug mode.
This release comes with the first version of the Taichi language specification, which attempts to provide an exhaustive description of the syntax and semantics of the Taichi language. It serves as a reference for Taichi's users and developers when they need to determine whether a specific behavior is correct, buggy, or undefined.
| Deprecated | Replaced by |
| --- | --- |
| `ti.ext_arr()` | `ti.types.ndarray()` |
Highlights:
Full changelog:
Highlights:
Full changelog: