Try to convert the original phi-3-mini-4k and put it in the provided APP, but get the exception. #452

Open
scchiustone opened this issue Aug 6, 2024 · 0 comments

Comments

@scchiustone

Following the ONNX model provided in the example (microsoft/Phi-3-mini-4k-instruct-onnx/cpu_and_mobile/cpu-int4-rtn-block-32-acc-level-4), I tried to convert the original phi-3-mini-4k model myself, using the quantization parameters described in the folder name (cpu, int4, rtn, block-32, acc-level-4). The model was generated successfully, but the app fails when I replace the original model with it, regardless of which of the two conversion methods below I use. I also found that the size of the generated .onnx file differs from the provided version. Are there any suggestions for converting the ONNX model so that it matches the official version?

**Method 1:**

Convert to ONNX:

!/data/miniconda/envs/onnx_env/bin/optimum-cli export onnx \
  --model /data/www/llm_api/dist/{model}/ \
  --task text-generation-with-past ./{model}

Quantization:

!bash run_quant.sh --model_input=./{model} \
  --model_output=./{model}/quantized/model \
  --algorithm=RTN --block_size=32 --accuracy_level=4

**Method 2:**

Build ONNX:

!/data/miniconda/envs/onnx_env/bin/python3 -m onnxruntime_genai.models.builder \
  -m /data/www/llm_api/dist/{model}/ \
  -o ./{model}/ \
  -p int4 \
  -e cpu \
  --extra_options int4_block_size=32 \
  --extra_options int4_accuracy_level=4
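
Because the generated .onnx file differs in size from the published one, it may help to list both model directories side by side and see which files exist in only one of them and which differ in size. The following is a minimal sketch using only the Python standard library; the function names and any directory paths passed to them are hypothetical and not part of the tools used above.

```python
from pathlib import Path


def file_sizes(model_dir):
    """Map each file name directly inside model_dir to its size in bytes."""
    return {
        p.name: p.stat().st_size
        for p in Path(model_dir).iterdir()
        if p.is_file()
    }


def compare_model_dirs(reference_dir, converted_dir):
    """Return (only_in_reference, only_in_converted, size_mismatches).

    size_mismatches maps a shared file name to (reference_size, converted_size).
    """
    ref = file_sizes(reference_dir)
    new = file_sizes(converted_dir)
    only_ref = sorted(ref.keys() - new.keys())
    only_new = sorted(new.keys() - ref.keys())
    mismatched = {
        name: (ref[name], new[name])
        for name in sorted(ref.keys() & new.keys())
        if ref[name] != new[name]
    }
    return only_ref, only_new, mismatched


# Example call (both paths are placeholders):
# compare_model_dirs("./official_model", "./converted_model")
```

A file that appears in only one directory, such as an external weights file next to model.onnx, would show up in the first or second list and could account for part of the size difference.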

The exception raised on Android is shown below.

FATAL EXCEPTION: main
Process: ai.onnxruntime.genai.demo, PID: 29596
java.lang.RuntimeException: Unable to start activity ComponentInfo{ai.onnxruntime.genai.demo/ai.onnxruntime.genai.demo.MainActivity}: java.lang.RuntimeException: ai.onnxruntime.genai.demo.GenAIException: Deserialize tensor model.layers.21.mlp.gate_proj.MatMul.weight_Q4 failed.GetFileLength for /data/user/0/ai.onnxruntime.genai.demo/files/model.onnx.data failed:Invalid fd was supplied: -1
  at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:3742)
  at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:3879)
  at android.app.servertransaction.LaunchActivityItem.execute(LaunchActivityItem.java:108)
  at android.app.servertransaction.TransactionExecutor.executeCallbacks(TransactionExecutor.java:135)
  at android.app.servertransaction.TransactionExecutor.execute(TransactionExecutor.java:95)
  at android.app.ActivityThread$H.handleMessage(ActivityThread.java:2373)
  at android.os.Handler.dispatchMessage(Handler.java:106)
  at android.os.Looper.loopOnce(Looper.java:242)
  at android.os.Looper.loop(Looper.java:359)
  at android.app.ActivityThread.main(ActivityThread.java:8114)
  at java.lang.reflect.Method.invoke(Native Method)
  at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:548)
  at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:942)
Caused by: java.lang.RuntimeException: ai.onnxruntime.genai.demo.GenAIException: Deserialize tensor model.layers.21.mlp.gate_proj.MatMul.weight_Q4 failed.GetFileLength for /data/user/0/ai.onnxruntime.genai.demo/files/model.onnx.data failed:Invalid fd was supplied: -1
  at ai.onnxruntime.genai.demo.MainActivity.onCreate(MainActivity.java:51)
  at android.app.Activity.performCreate(Activity.java:8356)
  at android.app.Activity.performCreate(Activity.java:8335)
  at android.app.Instrumentation.callActivityOnCreate(Instrumentation.java:1385)
  at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:3723)
  at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:3879)
  at android.app.servertransaction.LaunchActivityItem.execute(LaunchActivityItem.java:108)
  at android.app.servertransaction.TransactionExecutor.executeCallbacks(TransactionExecutor.java:135)
  at android.app.servertransaction.TransactionExecutor.execute(TransactionExecutor.java:95)
  at android.app.ActivityThread$H.handleMessage(ActivityThread.java:2373)
  at android.os.Handler.dispatchMessage(Handler.java:106)
  at android.os.Looper.loopOnce(Looper.java:242)
  at android.os.Looper.loop(Looper.java:359)
  at android.app.ActivityThread.main(ActivityThread.java:8114)
  at java.lang.reflect.Method.invoke(Native Method)
  at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:548)
  at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:942)
Caused by: ai.onnxruntime.genai.demo.GenAIException: Deserialize tensor model.layers.21.mlp.gate_proj.MatMul.weight_Q4 failed.GetFileLength for /data/user/0/ai.onnxruntime.genai.demo/files/model.onnx.data failed:Invalid fd was supplied: -1
  at ai.onnxruntime.genai.demo.GenAIWrapper.loadModel(Native Method)
  at ai.onnxruntime.genai.demo.GenAIWrapper.<init>(GenAIWrapper.java:22)
  at ai.onnxruntime.genai.demo.MainActivity.createGenAIWrapper(MainActivity.java:185)
  at ai.onnxruntime.genai.demo.MainActivity.downloadModels(MainActivity.java:163)
  at ai.onnxruntime.genai.demo.MainActivity.onCreate(MainActivity.java:48)
  at android.app.Activity.performCreate(Activity.java:8356)
  at android.app.Activity.performCreate(Activity.java:8335)
  at android.app.Instrumentation.callActivityOnCreate(Instrumentation.java:1385)
  at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:3723)
  at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:3879)
  at android.app.servertransaction.LaunchActivityItem.execute(LaunchActivityItem.java:108)
  at android.app.servertransaction.TransactionExecutor.executeCallbacks(TransactionExecutor.java:135)
  at android.app.servertransaction.TransactionExecutor.execute(TransactionExecutor.java:95)
  at android.app.ActivityThread$H.handleMessage(ActivityThread.java:2373)
  at android.os.Handler.dispatchMessage(Handler.java:106)
  at android.os.Looper.loopOnce(Looper.java:242)
  at android.os.Looper.loop(Looper.java:359)
  at android.app.ActivityThread.main(ActivityThread.java:8114)
  at java.lang.reflect.Method.invoke(Native Method)
  at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:548)
  at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:942)
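
The message reports that GetFileLength failed for model.onnx.data with an invalid file descriptor, which suggests that this file could not be opened in the app's files directory. One possible explanation, which is an assumption and not confirmed anywhere in this report, is that the app fetches only a fixed set of file names and the converted model's weights file is not among them. The sketch below lists the files in a converted model directory that are absent from such a set; `fetched_names` is a hypothetical placeholder for whatever names the app actually copies.

```python
from pathlib import Path


def files_missing_from_fetch_list(model_dir, fetched_names):
    """Return names of files in model_dir that are not in fetched_names."""
    present = {p.name for p in Path(model_dir).iterdir() if p.is_file()}
    return sorted(present - set(fetched_names))


# Example call (the path and the name list are placeholders):
# files_missing_from_fetch_list("./converted_model", ["model.onnx", "genai_config.json"])
```

If model.onnx.data appears in the result, the converted model would reference a weights file that never reaches the device.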
