Inference timing issue in simulation mode during Quantisation

anant.phatak
Join Date: 20 Apr 22
Posts: 7
Posted: Mon, 2022-05-23 23:52

We are running QAic-exec in simulation mode.

When we run inference at FP32 precision, the inference time is 191 sec for one image.

Once we set the flags for INT8 quantisation, the inference time rises to 26 min per image. We have tried both static and dynamic quantisation, but the timing is similar for both methods.

We followed these steps:

1. Generate the profile for the model.

./qaic-exec -m=<ONNX model path> -input-list-file=<text file path> -dump-profile=<path to dump profile(.yaml file)>

2. Run inference using static quantization.

./qaic-exec -m=<ONNX model path> -input-list-file=<text file path> -load-profile=<path to load profile(.yaml file)> -write-output-dir=<path to store the outputs> -quantization-precision-bias=Int8 -quantization-precision=Int8 -quantization-schema=symmetric_with_uint8
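To compare the two runs on equal footing, the wall-clock time of each command can be measured the same way. The following is only a rough sketch: `time_command` and `compare_runs` are hypothetical helper names, and the timing covers the whole process (model loading and any compilation included), not just the inference step.

```python
import subprocess
import time


def time_command(argv):
    """Run argv once and return (wall-clock seconds, exit code)."""
    start = time.perf_counter()
    completed = subprocess.run(argv, check=False)
    return time.perf_counter() - start, completed.returncode


def compare_runs(commands):
    """Time each labelled command once and return {label: seconds}."""
    results = {}
    for label, argv in commands.items():
        seconds, code = time_command(argv)
        if code != 0:
            raise RuntimeError(f"{label} exited with code {code}")
        results[label] = seconds
    return results


# Example call, using only the flags from step 2 above (placeholders kept):
# compare_runs({
#     "int8": [
#         "./qaic-exec",
#         "-m=<ONNX model path>",
#         "-input-list-file=<text file path>",
#         "-load-profile=<path to load profile(.yaml file)>",
#         "-write-output-dir=<path to store the outputs>",
#         "-quantization-precision-bias=Int8",
#         "-quantization-precision=Int8",
#         "-quantization-schema=symmetric_with_uint8",
#     ],
# })
```

Adding the FP32 command as a second entry in the dictionary gives both timings from one call.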

 

We observed that the inference time for INT8 quantization is far higher than for FP32 in simulator mode. Is this the expected behaviour of QAic-exec in simulator mode?

If not, please let us know suitable command-line arguments for execution in the simulator.

weihuan
Join Date: 12 Apr 20
Posts: 270
Posted: Sun, 2022-06-12 04:53

Dear customer,

Inference is faster if you feed fixed-point input tensors rather than float tensors, because float input data takes extra time to quantize to fixed point.
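As a rough illustration of what converting float input to fixed point involves, here is a minimal sketch that assumes plain symmetric per-tensor INT8 quantization. This is an assumption for illustration only: it is not necessarily the scheme qaic-exec applies, the function names are hypothetical, and this thread does not cover how to pass pre-quantized input to the tool.

```python
def symmetric_int8_scale(values):
    """Pick a scale so the largest magnitude maps to 127 (assumed scheme)."""
    max_abs = max((abs(v) for v in values), default=0.0)
    return max_abs / 127.0 if max_abs > 0 else 1.0


def quantize_int8(values, scale):
    """Map floats to integers in [-127, 127]: divide, round, then clamp."""
    return [max(-127, min(127, round(v / scale))) for v in values]


def dequantize_int8(quantized, scale):
    """Recover approximate float values from the quantized integers."""
    return [q * scale for q in quantized]


# Example: quantize a small float tensor (flattened to a list) once, offline.
pixels = [-1.0, -0.25, 0.0, 0.5, 1.0]
scale = symmetric_int8_scale(pixels)
fixed = quantize_int8(pixels, scale)
approx = dequantize_int8(fixed, scale)
```

Every input element needs a divide, a round, and a clamp, which is the extra per-image work that float input adds before the INT8 network can run.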

BR.

Wei

