The StableHLO dialect currently supports quantization via:

1) Supporting `quant.uniform` element types.
2) Having dedicated ops like `uniform_quantize` / `uniform_dequantize`.
3) Allowing regular ops like `add` / `convolution` to take quantized tensors.

This support was inherited from MHLO when StableHLO was bootstrapped, and MHLO's support was in turn motivated by mobile use cases and inherited from TFLite. As pointed out in #1149, the StableHLO specification doesn't cover quantization at the moment, and this is an important gap that we would like to close before StableHLO v1.0 (see #588).

To continue the discussion started in #1149 and to make progress towards v1.0, this pull request:

A) Adds QuantizedType to the StableHLO specification, modelled after the [TFLite quantization spec](https://www.tensorflow.org/lite/performance/quantization_spec).
B) Proposes semantics for quantized `add`, to start a conversation about the applications of QuantizedType and the semantics of quantized ops.

The TFLite quantization spec doesn't cover everything: it specifies constraints on types (which we captured accordingly in this pull request), but it doesn't describe the semantics of quantized ops. As a result, the proposed semantics for quantized `add` are intentionally naive compared with the much more involved implementations in the TensorFlow repository, e.g.:

* [tfl.add](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/kernels/add.cc)
* [tf.UniformQuantizedAdd](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/kernels/uniform_quant_ops/uniform_quantized_add_op.cc)

upd: After community discussion, we removed the spec for quantized `add`, leaving that for future work, since further alignment is required.

---------

Co-authored-by: Eugene Burmako <[email protected]>
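For readers less familiar with the existing dialect-level support, here is a minimal, hypothetical MLIR sketch of the three mechanisms listed at the top of this description. The shapes, the i8/f32 storage and expressed types, and the 0.0039 scale and -128 zero point are made-up illustrative values, and the ops are written in MLIR's generic form rather than any particular pretty-printed syntax:

```mlir
// Hypothetical example: add two tensors in quantized form.
// Quantization parameters below are illustrative, not from the spec.
func.func @quantized_add(%lhs: tensor<4xf32>, %rhs: tensor<4xf32>) -> tensor<4xf32> {
  // (1) `quant.uniform` element types carry scale / zero-point metadata.
  // (2) Dedicated ops convert between float and quantized representations.
  %q_lhs = "stablehlo.uniform_quantize"(%lhs)
      : (tensor<4xf32>) -> tensor<4x!quant.uniform<i8:f32, 0.0039:-128>>
  %q_rhs = "stablehlo.uniform_quantize"(%rhs)
      : (tensor<4xf32>) -> tensor<4x!quant.uniform<i8:f32, 0.0039:-128>>
  // (3) Regular ops such as `add` accept quantized tensors directly.
  %q_sum = "stablehlo.add"(%q_lhs, %q_rhs)
      : (tensor<4x!quant.uniform<i8:f32, 0.0039:-128>>,
         tensor<4x!quant.uniform<i8:f32, 0.0039:-128>>)
      -> tensor<4x!quant.uniform<i8:f32, 0.0039:-128>>
  %sum = "stablehlo.uniform_dequantize"(%q_sum)
      : (tensor<4x!quant.uniform<i8:f32, 0.0039:-128>>) -> tensor<4xf32>
  return %sum : tensor<4xf32>
}
```

The sketch only shows where `quant.uniform` element types, the dedicated conversion ops, and quantized operands to regular ops each appear; the contribution of this pull request is specifying QuantizedType itself, not these ops.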