OAID Tengine Versions

Tengine is a lightweight, high-performance, modular inference engine for embedded devices.

lite-v1.5-nvdla

2 years ago

Release v1.5 for NVDLA

Baseline version

  • lite-v1.5

Hardware backend support

  • Zynq UltraScale+ MPSoC ZCU102

Software Dependencies

  • Ubuntu 20.04
  • OpenCV 4.2
  • gcc 9.3.0
  • cmake 3.16.3

NVDLA type support

  • Small

NVDLA Operator support

  • Batchnorm
  • Concat
  • Convolution
  • Deconvolution
  • Eltwise
  • FC
  • Pooling
  • ReLU
  • Scale
  • Split
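
The operators above execute on the NVDLA engine; a rough C sketch of how a graph is bound to this backend through the create_context/set_context_device API is given below. The device name string "OPENDLA" and the int8 model file name are assumptions, so check the OpenDLA device registration in your source tree for the exact values.

    /* Minimal sketch: run an int8 tmfile on NVDLA via the OpenDLA backend.
     * Operators from the list above execute on NVDLA; anything else falls
     * back to the CPU when the graph is split. */
    #include <stdio.h>
    #include "tengine/c_api.h"

    int main(void)
    {
        if (init_tengine() != 0)
            return -1;

        context_t ctx = create_context("nvdla", 1);            /* empty context */
        if (set_context_device(ctx, "OPENDLA", NULL, 0) != 0)  /* assumed name  */
        {
            fprintf(stderr, "failed to attach the NVDLA device\n");
            return -1;
        }

        /* "resnet18_int8.tmfile" is a placeholder for a quantized model. */
        graph_t graph = create_graph(ctx, "tengine", "resnet18_int8.tmfile");
        if (graph == NULL)
            return -1;

        if (prerun_graph(graph) == 0)
            run_graph(graph, 1);   /* blocking run; input setup omitted */

        postrun_graph(graph);
        destroy_graph(graph);
        destroy_context(ctx);
        release_tengine();
        return 0;
    }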

NVDLA Network support

Model               Input size    Inference time on ZCU102+NVDLA (ms)
ResNet18            3x32x32       12.6
YOLOv3-Tiny-ReLU    3x416x416     630.5
YOLOX-Nano-ReLU     3x416x416     1138.8

Tengine NVDLA example support

Reference Documents

  • The Ubuntu image for ZCU102

lite-v1.5

2 years ago

Release v1.5

New Demos and Examples

  • Pipeline Demos
    • face enroll
    • pedestrian distance estimation
    • arcface
    • centerface
    • scrfd
    • yolo
  • Examples
    • YOLOX
    • Segformer
    • Seghuman
    • Scrfd

New hardware backend support

  • Support NVDLA by OpenDLA

New Tools support

  • Align tool
    • ONNX alignment tool for comparing the original ONNX model with the converted tmfile model
  • Convert tools
    • ONNX
    • Caffe
    • MXNet
    • Darknet
    • TensorFlow (WIP)
    • TFLite (WIP)
  • Optimize tools
    • segformer-opt
  • Quantization tools
    • ACIQ
    • DFQ
    • EasyQuant

New Online Documents

  • Move the Markdown files of the online documents into the master branch

New Feature

  • Refactor the Python API

CI/CD

  • Add model test module in CI action
  • Add operator test module in CI action
  • Add backend device runners in CI action
    • Khadas VIM3
    • Jetson AGX

P.S.

  • NV GPU: tested with the following devices
    • GeForce RTX 3090
    • GeForce GTX 1080Ti
    • QUADRO RTX 8000
    • Jetson AGX/NX/NANO
  • VeriSilicon NPU: tested with the following devices
    • A311D
    • S905D3
    • RV1109
    • RV1126
    • i.MX 8M Plus
    • JA310
  • NVDLA: tested with the following devices
    • ZCU102

latest

2 years ago

Commits

  • 31903cf: Tensorflow serializer (#1109) (bzhang5)
  • b52c4b0: fix bug: convert char '/' to '-' (#1110) (xiguadong)
  • f676175: Fix documentation error (#1111) (sysgiven)
  • 3e71f04: update DFQ/EQ/Evaluate int8 per-channel quant tool (#1112) (BowShotDS)
  • 945a371: Update convolution.c (#1113) (Thunder)
  • 7bf86d1: Tensorflow serializer (#1114) (bzhang5)

lite-v1.4-superedge

2 years ago

Release v1.4 for SuperEdge

Baseline version

  • lite-v1.4

Hardware backend support

  • Khadas VIM3 (A311D)

Software Dependencies

  • Ubuntu 20.04
  • OpenCV 4.2
  • gcc 9.3.0
  • cmake 3.16.3

NPU Network support

Model               Inference time on A311D (ms)
MobileNet v1        4.3
MobileNet v2        5.2
ResNet18            5.5
ResNet50            14.6
SqueezeNet v1.1     2.6
VGG16               18.7
YOLOv3              78.6
YOLOv5s             68.9
YOLOX-S             55.2

lite-v1.4-amlogic

2 years ago

Release v1.4 for Amlogic

Baseline version

  • lite-v1.4

Hardware backend support

  • A311D
  • S905D3

NPU Network support

Model               Inference time on A311D (ms)
MobileNet v1        4.3
MobileNet v2        5.2
ResNet18            5.5
ResNet50            14.6
SqueezeNet v1.1     2.6
VGG16               18.7
YOLOv3              78.6
YOLOv5s             68.9

lite-v1.4-allwinner

2 years ago

Release v1.4 for Allwinner

Baseline version

  • lite-v1.4

Hardware backend support

  • D1 (RISC-V C906)

CPU Network support

Models
MobileNet v1
MobileNet v2
ResNet18
SqueezeNet v1.1
YOLO-Fastest

lite-v1.4-nxp

3 years ago

Release v1.4 for NXP

Baseline version

  • lite-v1.4

Hardware backend support

  • i.MX 8M Plus

NPU Network support

Model               Inference time on i.MX 8M Plus (ms)
MobileNet v1        2.3
MobileNet v2        5.1
ResNet18            4.5
ResNet50            11.7
SqueezeNet v1.1     2.5
VGG16               22.8
YOLOv3              78.2

lite-v1.4

3 years ago

Release v1.4

New hardware backend support

  • Support RISC-V CPU for C906/C910
  • Support NV/AMD/Mali GPU by OpenCL

New training framework model support

  • The tengine-convert-tool now supports PaddlePaddle 2.0 format models and will continue to add formats per users' requests (please leave your requests in our GitHub issues)

Fixes

  • Refactor the register module code
  • Refactor the compile module code to support Visual Studio

CI/CD

  • Add code quality module in CI action

P.S.

  • NV GPU: tested with the following devices
    • GeForce RTX 3090
    • GeForce GTX 1080Ti
    • QUADRO RTX 8000
    • Jetson AGX/NX/NANO
  • VeriSilicon NPU: tested with the following devices
    • A311D
    • S905D3
    • i.MX 8M Plus
    • JA310

lite-v1.3

3 years ago

Release v1.3

New hardware backend support

  • Support NV GPU by CUDA and cuDNN
  • Support NV GPU by TensorRT Plugin
  • Support VeriSilicon NPU by TIM-VX Plugin
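
All of these plugins are selected the same way from the C API: attach the device to a context, then create the graph on that context. The sketch below assumes init_tengine() has already been called; the device name strings are taken from the bundled example programs and should be treated as assumptions if your tree differs.

    /* Minimal sketch: the same context/device pattern selects any of the
     * new plugins; only the device name string changes. */
    #include "tengine/c_api.h"

    /* dev_name examples (assumed, check the example programs):
     *   "TensorRT" for the TensorRT plugin,
     *   "TIMVX"    for the VeriSilicon TIM-VX plugin. */
    graph_t load_graph_on_device(const char* dev_name, const char* tmfile)
    {
        context_t ctx = create_context(dev_name, 1);
        if (set_context_device(ctx, dev_name, NULL, 0) != 0)
            return NULL;
        return create_graph(ctx, "tengine", tmfile);
    }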

New training framework model support

  • The tengine-convert-tool adds preliminary support for OneFlow models

Fixes

  • Refactor the ACL plugin code to fix compilation and inference bugs on Mali GPU

CI/CD

  • Add code coverage mode in CI action
  • Add model tests in CI action, covering classification, detection, recognition, and segmentation

P.S.

  • NV GPU: tested with the following devices
    • GeForce RTX 3090
    • GeForce GTX 1080Ti
    • QUADRO RTX 8000
    • Jetson AGX/NX/NANO
  • VeriSilicon NPU: tested with the following devices
    • Khadas VIM3

lite-v1.2

3 years ago

Release v1.2

New feature

  • CPU affinity API (see the sketch after this list)
  • CPU profile tool
  • Inference mode supports Int8 (symmetric, per-channel)
  • Release quantization tools (Int8, UInt8)
  • Support compiling for HarmonyOS
  • Support compiling with Visual Studio 2019
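
A rough sketch of how the CPU affinity API and the Int8 inference mode are expressed through the options struct of the C API follows; the field and constant names match the current Tengine Lite header, and the concrete values are purely illustrative assumptions.

    /* Minimal sketch: pin inference to the big cluster and request the
     * symmetric per-channel Int8 mode.  Graph creation/teardown omitted. */
    #include "tengine/c_api.h"

    int run_int8_on_big_cores(graph_t graph)
    {
        struct options opt;
        opt.num_thread = 4;                    /* worker thread count              */
        opt.cluster    = TENGINE_CLUSTER_BIG;  /* CPU affinity: big cores          */
        opt.precision  = TENGINE_MODE_INT8;    /* int8 inference mode              */
        opt.affinity   = 0;                    /* no explicit core mask (assumed)  */

        if (prerun_graph_multithread(graph, opt) != 0)
            return -1;

        return run_graph(graph, 1);            /* blocking run                     */
    }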

New network support

  • alphapose
  • crnn
  • yolov4_tiny

New operator support

  • Int8 reference op (experimental)

Performance

  • Int8 performance ops for armv7/v8 (experimental)
  • Int8 performance ops for x86-64 (experimental)