Open deep learning compiler stack for CPU, GPU and specialized accelerators
The TVM community has worked since the v0.15.0 release to deliver the following exciting new improvements! This release version is:
The main tags are listed below (bold text marks areas with significant progress):
Please visit the full listing of commits for a complete view: v0.16.dev0...v0.16.0.rc0.
This new RFC explores how TVM can be utilized to generate code for the SME ISA to achieve improved inference performance on supported Arm®-based hardware implementing the SME extension.
Session.import_python_module method
tvm.ir.make_node
DataflowBlocks and encourage using ConvertToDataflow
extern.py for Sphinx
dma_wait builtin
opt_level of tune_relay() adjustable
transform.LazyGetInput
tanh, exp, negative, and permute
let var = R.const bindings
call_tir_inplace in FuseTIR and FuseOps
reconstruct_from_cache kernel and add test
BindTarget to specify target for FP8 legalization
arm_cpu targets
call_packed semantics to support empty sinfo_args
kv_state and rnn_state to wasm_runtime
The TVM community has worked since the v0.14.0 release to deliver the following exciting new improvements! The main tags are listed below (bold text marks areas with significant progress):
Please visit the full listing of commits for a complete view: v0.14.0...v0.15.0.
TVM_MODULE_VTABLE
Macros.dtype
topi.rms_norm with float32 upscale
ci_arm
--rev argument
aten::unflatten
aten::bitwise_and
aten::scaled_dot_product_attention
aten::linalg_vector_norm
param_debug_name_map to node output name in fx-quantized graph node replacement
topi.nn.matmul
arm_cpu
InjectPermutedLayout pass
get_mma_intrin_group utility
Bind primitive
T.thread_binding
NOTE: This is the last release before the unity branch becomes the main branch. It contains no unity features.
The TVM community has worked since the v0.13.0 release to deliver the following exciting new improvements! The main tags are listed below (bold text marks areas with significant progress):
Please visit the full listing of commits for a complete view: v0.13.0...v0.14.0.
arm_cpu int8 conv2d strategy for dotprod and i8mm targets
arm_cpu int8 conv2d schedule selection for 32-bit targets
CSourceModule and StaticLibraryModule Binary Serializable
export_library parameters after file_name keyword-only
PackArgs
match_buffer's in block visitor functions (#15153)
arm_cpu int8 conv2d interleaved schedule
arm_cpu specific pooling schedules"
arm_cpu specific pooling schedules
arm_cpu specific pooling schedules"
arm_cpu specific pooling schedules
black_format by default
show_object_address in printing by default
rms_norm into TOPI
The TVM community has worked since the v0.12.0 release to deliver the following exciting new improvements! The main tags are listed below (bold text marks areas with significant progress):
Please visit the full listing of commits for a complete view: v0.12.0...v0.13.0.
scatter_nd type relation
tune_tir to tune IRModule of TIR Collections
match_buffer's in block visitor functions
compute-inline for fusion
unsafe_hide_buffer_access
var = arg_var in ArgBinder
reindex_cache_write do not mutate init statement
__name__ attr for parsed PrimFunc and IRModule
LOG(INFO) from unsupported dtype legalization pass
reorder_block_iter_var
This is v0.11.1, a bug fix release on top of v0.11.0 (see https://github.com/apache/tvm/issues/13899), incorporating a fix to the Python dependencies description.
The TVM community has worked since the v0.10.0 release to deliver the following exciting new improvements!
MetaSchedule
TVMScript metaprogramming
And many other general improvements to microTVM, code quality, CI, frontends, and more! Please visit the full listing of commits for a complete view: https://github.com/apache/tvm/compare/v0.10.0...v0.11.0.
These RFCs have been merged in apache/tvm-rfcs since the last release.
Note that this list is not comprehensive of all PRs and discussions since v0.10. Please visit the full listing of commits for a complete view: https://github.com/apache/tvm/compare/v0.10.0...v0.11.0.
target test fixture in Hexagon tests (#12981)
MultiLevelTiling apply condition customizable (#13535)
VerifyGPUCode for quantized model workload (#13334)
AutoInline in ScheduleUsingAnchorTrace (#13329)
num_threads parameter in tuning API (#13561)
from-target Defaults for x86 VNNI Targets (#13383)
serial_number to project options and tests (#13518)
dense -> add to qnn.dense -> add -> requantize (#13578)
CombineParallelDense slicing axis (#13597)
T.int32x4 (#13361)
ReverseComputeInline
(#13338)