Following the ONNX model provided in the example (microsoft/Phi-3-mini-4k-instruct-onnx/cpu_and_mobile/cpu-int4-rtn-block-32-acc-level-4), I tried to convert the original phi-3-mini-4k model using the quantization parameters given in the folder name (cpu, int4, rtn, block-32, acc-level-4). The model was generated successfully. However, loading fails when I replace the original model with mine, whichever of the two conversion methods below I use. I also found that the size of the generated .onnx file differs from the provided version. Are there any suggestions for converting the ONNX model so that it matches the official version?
**Method 1:**
Convert to ONNX:
!/data/miniconda/envs/onnx_env/bin/optimum-cli export onnx \
  --model /data/www/llm_api/dist/{model}/ \
  --task text-generation-with-past ./{model}
Quantization:
!bash run_quant.sh --model_input=./{model} \
  --model_output=./{model}/quantized/model \
  --algorithm=RTN --block_size=32 --accuracy_level=4

**Method 2:**
Build ONNX:
!/data/miniconda/envs/onnx_env/bin/python3 -m onnxruntime_genai.models.builder \
  -m /data/www/llm_api/dist/{model}/ \
  -o ./{model}/ \
  -p int4 \
  -e cpu \
  --extra_options int4_block_size=32 \
  --extra_options int4_accuracy_level=4
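Since the generated .onnx file differs in size from the provided one, it may help to compare the two output folders file by file before copying anything to the device. The following is only a minimal sketch using the Python standard library; the function names `file_sizes` and `compare_dirs` and both directory arguments are hypothetical, and a size difference by itself does not show which conversion step is responsible.

```python
from pathlib import Path


def file_sizes(model_dir):
    """Map each file's path (relative to model_dir) to its size in bytes."""
    root = Path(model_dir)
    return {
        p.relative_to(root).as_posix(): p.stat().st_size
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_dirs(generated_dir, reference_dir):
    """Compare a generated model folder against a reference folder.

    Returns a tuple (missing, extra, size_diffs):
      missing    - files present only in the reference folder
      extra      - files present only in the generated folder
      size_diffs - {name: (generated_size, reference_size)} for files
                   present in both folders with different sizes
    """
    gen = file_sizes(generated_dir)
    ref = file_sizes(reference_dir)
    missing = sorted(set(ref) - set(gen))
    extra = sorted(set(gen) - set(ref))
    size_diffs = {
        name: (gen[name], ref[name])
        for name in sorted(set(gen) & set(ref))
        if gen[name] != ref[name]
    }
    return missing, extra, size_diffs
```

Calling `compare_dirs("./my_model", "./official_model")` (both paths are placeholders) returns the three collections, which can be printed to see at a glance whether a file such as `model.onnx.data` exists in only one of the folders.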
The exception message in Android is as follows:
FATAL EXCEPTION: main
Process: ai.onnxruntime.genai.demo, PID: 29596
java.lang.RuntimeException: Unable to start activity ComponentInfo{ai.onnxruntime.genai.demo/ai.onnxruntime.genai.demo.MainActivity}: java.lang.RuntimeException: ai.onnxruntime.genai.demo.GenAIException: Deserialize tensor model.layers.21.mlp.gate_proj.MatMul.weight_Q4 failed.GetFileLength for /data/user/0/ai.onnxruntime.genai.demo/files/model.onnx.data failed:Invalid fd was supplied: -1
at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:3742)
at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:3879)
at android.app.servertransaction.LaunchActivityItem.execute(LaunchActivityItem.java:108)
at android.app.servertransaction.TransactionExecutor.executeCallbacks(TransactionExecutor.java:135)
at android.app.servertransaction.TransactionExecutor.execute(TransactionExecutor.java:95)
at android.app.ActivityThread$H.handleMessage(ActivityThread.java:2373)
at android.os.Handler.dispatchMessage(Handler.java:106)
at android.os.Looper.loopOnce(Looper.java:242)
at android.os.Looper.loop(Looper.java:359)
at android.app.ActivityThread.main(ActivityThread.java:8114)
at java.lang.reflect.Method.invoke(Native Method)
at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:548)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:942)
Caused by: java.lang.RuntimeException: ai.onnxruntime.genai.demo.GenAIException: Deserialize tensor model.layers.21.mlp.gate_proj.MatMul.weight_Q4 failed.GetFileLength for /data/user/0/ai.onnxruntime.genai.demo/files/model.onnx.data failed:Invalid fd was supplied: -1
at ai.onnxruntime.genai.demo.MainActivity.onCreate(MainActivity.java:51)
at android.app.Activity.performCreate(Activity.java:8356)
at android.app.Activity.performCreate(Activity.java:8335)
at android.app.Instrumentation.callActivityOnCreate(Instrumentation.java:1385)
at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:3723)
at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:3879)
at android.app.servertransaction.LaunchActivityItem.execute(LaunchActivityItem.java:108)
at android.app.servertransaction.TransactionExecutor.executeCallbacks(TransactionExecutor.java:135)
at android.app.servertransaction.TransactionExecutor.execute(TransactionExecutor.java:95)
at android.app.ActivityThread$H.handleMessage(ActivityThread.java:2373)
at android.os.Handler.dispatchMessage(Handler.java:106)
at android.os.Looper.loopOnce(Looper.java:242)
at android.os.Looper.loop(Looper.java:359)
at android.app.ActivityThread.main(ActivityThread.java:8114)
at java.lang.reflect.Method.invoke(Native Method)
at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:548)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:942)
Caused by: ai.onnxruntime.genai.demo.GenAIException: Deserialize tensor model.layers.21.mlp.gate_proj.MatMul.weight_Q4 failed.GetFileLength for /data/user/0/ai.onnxruntime.genai.demo/files/model.onnx.data failed:Invalid fd was supplied: -1
at ai.onnxruntime.genai.demo.GenAIWrapper.loadModel(Native Method)
at ai.onnxruntime.genai.demo.GenAIWrapper.<init>(GenAIWrapper.java:22)
at ai.onnxruntime.genai.demo.MainActivity.createGenAIWrapper(MainActivity.java:185)
at ai.onnxruntime.genai.demo.MainActivity.downloadModels(MainActivity.java:163)
at ai.onnxruntime.genai.demo.MainActivity.onCreate(MainActivity.java:48)
at android.app.Activity.performCreate(Activity.java:8356)
at android.app.Activity.performCreate(Activity.java:8335)
at android.app.Instrumentation.callActivityOnCreate(Instrumentation.java:1385)
at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:3723)
at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:3879)
at android.app.servertransaction.LaunchActivityItem.execute(LaunchActivityItem.java:108)
at android.app.servertransaction.TransactionExecutor.executeCallbacks(TransactionExecutor.java:135)
at android.app.servertransaction.TransactionExecutor.execute(TransactionExecutor.java:95)
at android.app.ActivityThread$H.handleMessage(ActivityThread.java:2373)
at android.os.Handler.dispatchMessage(Handler.java:106)
at android.os.Looper.loopOnce(Looper.java:242)
at android.os.Looper.loop(Looper.java:359)
at android.app.ActivityThread.main(ActivityThread.java:8114)
at java.lang.reflect.Method.invoke(Native Method)
at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:548)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:942)
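The trace reports that `model.onnx.data` could not be opened (`Invalid fd was supplied: -1`). That would be consistent with the file not being present at the path the app reads from, although the trace alone does not confirm the cause. A minimal pre-deployment check could be sketched as follows, assuming the model folder is expected to contain exactly the two files named in the trace; the function name `check_model_files` and its default file list are hypothetical and should be adjusted to whatever the conversion actually produces.

```python
from pathlib import Path


def check_model_files(model_dir, expected=("model.onnx", "model.onnx.data")):
    """Return a list of problems found for the expected files in model_dir.

    The default names are taken from the stack trace above and are an
    assumption about what the app needs; an empty list means every
    expected file exists and is non-empty.
    """
    root = Path(model_dir)
    problems = []
    for name in expected:
        path = root / name
        if not path.is_file():
            problems.append(f"{name}: missing")
        elif path.stat().st_size == 0:
            problems.append(f"{name}: empty")
    return problems
```

Running this against the converted output folder before replacing the model in the app shows whether the external data file was generated and copied along with the `.onnx` file.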