Releases: Samsung/ONE
ONE Release 1.25.0
Release Note 1.25.0
ONE Runtime
- Support Ubuntu 20.04
CPU Backend Operation
- The CPU backend supports per-channel hybrid quantization with int8 weights and float activations (TFLite's dynamic range quantization); see the sketch below.
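For illustration, a minimal sketch of the arithmetic behind per-channel hybrid quantization: weights stay int8 with one scale per output channel, activations stay float, and the accumulated product is rescaled per channel. This is not onert's actual kernel, just the computation it implements:

```c
#include <stddef.h>
#include <stdint.h>

/* One output channel of a hybrid fully-connected layer: int8 weights with a
 * per-channel scale, float activations. Accumulation happens in float and the
 * per-channel weight scale is applied once at the end, since
 * (scale * w_int8) . x == scale * (w_int8 . x). */
float hybrid_channel_dot(const int8_t *w_row, float w_scale,
                         const float *x, size_t n)
{
  float acc = 0.0f;
  for (size_t i = 0; i < n; ++i)
    acc += (float)w_row[i] * x[i]; /* multiply int8 weight (as float) by float activation */
  return acc * w_scale;            /* per-channel rescale */
}
```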
On-device Quantization
- onert supports a new experimental API for on-device quantization.
- As a first step, onert supports per-channel hybrid quantization with int8/int16 weights and float activations.
- The API requires a file path to export the quantized model (see the sketch below).
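A minimal sketch of how this flow might look from C. The session calls (`nnfw_create_session`, `nnfw_load_model_from_file`, `nnfw_close_session`) are part of the public API; the quantization calls and the `NNFW_QUANTIZE_TYPE_WO_I8_SYM` value are assumptions about the experimental header and may differ from the released signatures:

```c
/* Hypothetical sketch of the experimental on-device quantization flow.
 * The quantization symbols below (nnfw_set_quantization_type,
 * nnfw_set_quantized_model_path, nnfw_quantize, NNFW_QUANTIZE_TYPE_WO_I8_SYM)
 * are ASSUMED names for the experimental API and may not match the release. */
#include "nnfw.h"
#include "nnfw_experimental.h"

int quantize_on_device(const char *nnpackage_path, const char *out_path)
{
  nnfw_session *session = NULL;
  if (nnfw_create_session(&session) != NNFW_STATUS_NO_ERROR)
    return -1;
  if (nnfw_load_model_from_file(session, nnpackage_path) != NNFW_STATUS_NO_ERROR)
    return -1;

  /* Select per-channel hybrid quantization (int8 weights, float activations)
   * and the file path the quantized model is exported to. */
  nnfw_set_quantization_type(session, NNFW_QUANTIZE_TYPE_WO_I8_SYM);
  nnfw_set_quantized_model_path(session, out_path);
  nnfw_quantize(session); /* quantizes and writes the model to out_path */

  nnfw_close_session(session);
  return 0;
}
```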
Minmax Recorder
- onert supports minmax recording of each layer as an experimental feature. It is not yet exposed through the API.
- The output file format is HDF5. (The file format may change later.)
ONE Release 1.24.0
Release Note 1.24.0
ONE Compiler
- Introduce one-import-onnx extension interface
- onecc supports profiling of multiple backends with a single cfg file
- Enable more operators for quantization: FloorMod, Squeeze
- visq supports multi-out nodes
- onecc introduces `dynamic_batch_to_single_batch` option
ONERT-MICRO 0.1.0
Release Notes for onert-micro 0.1.0
onert-micro is a tiny runtime specialized for running NN models on MCU boards. Note that onert-micro is under active development and is subject to change.
Supported operations
For MCU boards, the following 22 operations are supported:
ADD, FULLY_CONNECTED, CONV_2D, LOGISTIC, GATHER, EXPAND_DIMS, PACK, RESHAPE, REDUCE_PROD, LESS, MUL, MAX_POOL_2D, CONCATENATION, SHAPE, SLICE, SUB, SPLIT, STRIDED_SLICE, TANH, SOFTMAX, WHILE, UNIDIRECTIONAL_SEQUENCE_LSTM
RNN Model
LSTM
onert-micro supports Keras models with LSTM operations, provided they are converted to the UNIDIRECTIONAL_SEQUENCE_LSTM operation in circle format.
GRU
onert-micro supports models with GRU operations converted from Keras models. Please refer to #10465 for the GRU operation supported by onert-micro.
Benchmark
onert-micro shows better performance than tflite-micro, especially in memory consumption and binary size.
The measurements were taken on TizenRT, running the reference models on a development board with the following spec:
- 32-bit Arm Cortex-M33 200MHz
- 4MB RAM, 8MB Flash
Commits used for measurement:
- tflite-micro commit: tensorflow/tflite-micro@4e62ea7
- onert-micro commit: c763867
L model
| Params | tflite-micro | onert-micro |
|---|---|---|
| Execution time (us)* | 2,912,700 | 2,953,000 |
| RAM consumption (bytes) | 126,800 | 93,376 |
| Binary file size overhead (bytes) | 57,676 | 32,248 |
T1 model
| Params | tflite-micro | onert-micro |
|---|---|---|
| Execution time (us)* | 1,340 | 1,510 |
| RAM consumption (bytes) | 1,640 | 1,152 |
| Binary file size overhead (bytes) | 35,040 | 19,432 |
T2 model
| Params | tflite-micro** | onert-micro |
|---|---|---|
| Execution time (us)* | N/A | 5,090 |
| RAM consumption (bytes) | N/A | 3,360 |
| Binary file size overhead (bytes) | N/A | 30,488 |
Model with GRU operations
- model link : https://github.com/Samsung/ONE/files/8368702/gru.zip
| Params | tflite-micro** | onert-micro |
|---|---|---|
| Execution time (us)* | N/A | 335,000 |
| RAM consumption (bytes) | N/A | 14,816 |
| Binary file size overhead (bytes) | N/A | 43,444 |
(*) Average over 100 inferences
(**) tflite-micro could not launch this model
ONE Release 1.23.0
Release Note 1.23.0
ONE Compiler
- Support more Op(s): GeLU
- Support more option(s): `--fuse-gelu`
- Support multiple backends compilation with a single configuration file
- Upgrade Circle schema to 0.5
ONE Release 1.22.1
Release Note 1.22.1
ONE Runtime
Multimodel nnpackage
- Runtime supports running an nnpackage with three or more models
- Runtime supports running a multimodel nnpackage with multiple subgraphs
- Runtime supports type casting when tensor data types differ across an edge between models (a minimal usage sketch follows)
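For illustration, a minimal sketch of running an nnpackage through the public C API (`nnfw.h`): from the application's point of view, a multimodel package is loaded and run exactly like a single-model one, and the runtime resolves the edges between models (including the type casting noted above) internally. The package path and tensor shapes below are placeholders:

```c
#include "nnfw.h"

static float input[1 * 224 * 224 * 3]; /* placeholder input shape */
static float output[1 * 1000];         /* placeholder output shape */

int main(void)
{
  nnfw_session *session = NULL;
  nnfw_create_session(&session);
  /* An nnpackage directory that bundles several models and their edges. */
  nnfw_load_model_from_file(session, "path/to/multimodel_nnpackage");
  nnfw_prepare(session);

  nnfw_set_input(session, 0, NNFW_TYPE_TENSOR_FLOAT32, input, sizeof(input));
  nnfw_set_output(session, 0, NNFW_TYPE_TENSOR_FLOAT32, output, sizeof(output));

  nnfw_run(session); /* executes every model in the package in one call */
  nnfw_close_session(session);
  return 0;
}
```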
ONE Release 1.22.0
Release Note 1.22.0
ONE Compiler
- Introduce new optimization options: `unroll_unidirseqlstm`, `forward_transpose_op`, `fold_fully_connected`, `fuse_prelu`
- Support more Ops for fake quantization: `Depth2Space`, `Space2Depth`, `Pack`, `Unpack`, `Abs`
- Support more Ops for quantization: `Abs`, `ReduceProd`
- Introduce visq tool for quantization error visualization
- Introduce Environment section into configuration file
- Improve speed of `convert_nchw_to_nhwc` option
- Support `Add`, `Mul` of index-type (int32, int64) tensors in `one-quantize`
- Support Ubuntu 20.04
ONE Release 1.21.0
Release Note 1.21.0
ONE Compiler
- Support unrolling of LSTM and RNN Ops in `one-import-onnx` tool
- Introduced new tools: `one-infer`, `circle-operator`, `circle-interpreter`
- Introduced `Workflow` (WIP) in `one-cmds`
- New option `quant_config` in `one-quantize`
- New option `fake_quantize` in `one-quantize`
- More Ops supported: Densify
- More Ops for quantization: ReduceMax
- More Ops for mixed-precision quantization (MPQ): LeakyRelu, Neg, Relu6, Squeeze
- More Ops for `convert_nchw_to_nhwc` option: LogSoftmax, ReduceMax, SplitV, Softmax
- New optimization options in `one-optimize`: `replace_non_const_fc_with_bmm`, `resolve_customop_splitv`, `fold_densify`
- Improved reshape elimination in `convert_nchw_to_nhwc` option
- Support fusion of Channel-wise Add + Relu with TConv
- Support negative axis in ArgMin/Max
- Show errors for unrecognized options in `one-optimize`
- Fix shape inference for `StridedSlice`
- Fix FuseBatchNormWithTConvPass to support TConv with bias
- Deprecate `--O1` option in `circle2circle`
- Support gcc-11
- Support limited Float16 for kernels constants with dequantization to Float32
ONE Runtime
Basic Multimodel nnpackage
- Runtime supports running an nnpackage with two models
Channel Wise Quantization on Conv2D and Depthwise Conv2D
- Conv2D and Depthwise Conv2D support per-channel quantization of uint8 type.
Batch Execution with TRIX backend
- TRIX backend supports batch execution, which runs in parallel on multiple cores
ONE Release 1.20.0
Release Note 1.20.0
ONE Compiler
Compiler Frontend
- luci-interpreter supports multiple kernels with PAL layer, including Cortex-M
- luci-interpreter supports integer tensors for some kernels
- luci import supports constants without copying, to reduce memory for luci-interpreter
- Reduce duplicate code when packaging released modules
- Limited support for ONNX LSTM/RNN unrolling while importing
- Limited support for ARM32 cross build
- Support new operator: SVDF
- New virtual CircleVariable to support variable tensors
- Support quantization of BatchMatMul Op
- Support mixed(UINT8 + INT16) quantization
- Support backward propagation of quantization parameters
- Upgrade default python to version 3.8
- Support TensorFlow 2.8.0, ONNX-TF 1.10.0, ONNX 1.11.0
- Upgrade circle schema to follow tflite schema v3b
- Refactor to mio-tflite280, mio-circle04 with version and helper methods
- Use a single flatbuffers 2.0 version
- Drop support for TensorFlow 1.x
- Several bug fixes, performance enhancements, and typo fixes
ONE Runtime
Introduce TRIX backend
- TRIX backend supports trix binary with NHWC layout
- TRIX backend supports trix binary with input/output of Q8 and Q16 type
API supports new data type
- Symmetric quantized int16 type named `NNFW_TYPE_TENSOR_QUANT16_SYMM_SIGNED` (see the sketch below)
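A minimal sketch of detecting the new type on a session input via the public C API (`nnfw_input_tensorinfo` and `nnfw_tensorinfo` are part of `nnfw.h`); the model path is a placeholder:

```c
#include <stdio.h>
#include "nnfw.h"

int main(void)
{
  nnfw_session *session = NULL;
  nnfw_create_session(&session);
  nnfw_load_model_from_file(session, "path/to/nnpackage"); /* placeholder path */

  /* Query the dtype, rank, and dims of input tensor 0. */
  nnfw_tensorinfo info;
  nnfw_input_tensorinfo(session, 0, &info);
  if (info.dtype == NNFW_TYPE_TENSOR_QUANT16_SYMM_SIGNED)
    printf("input 0 is symmetric-quantized int16\n");

  nnfw_close_session(session);
  return 0;
}
```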
ONE Release 1.19.0
Release Note 1.19.0
ONE Compiler
Compiler Frontend
- `circle-quantizer` supports input/output type option
- Introduce configuration file for optimization options
ONE Release 1.18.0
Release Note 1.18.0
ONE Compiler
Compiler Frontend
- More optimization passes:
- Fold DepthwiseConv2D
- Substitute SplitV to Split
- Expand BroadCast Const
- Force QuantParam