[WIP] Run inference with CB/non-CB SpD Models #155
Hey all, to help with this effort, I previously also created a minimal inference script for SpD
c8d956a to 715ed2f
Hello all, I followed the implementation of Apoorva's inference app in […]. This is far from perfect and needs much work and alignment before it can merge with QEfficient. However, I have validated that this app works only for […]. The general idea is that this PR will be used to validate that the features to export TLM/DLM work as expected.

An example of the statistics that this returns is below:

$ python3 QEfficient/generation/spd_text_generation_inference.py
generated_output=', welcome my forum i new new this my is first.\n.\n am can help with any question problem youI happy to you with any i.\n.\n am here to you any any can.\nI to help you help any can.\nI to help help any i can am help any i i.\n am help help any i can to help you help any can am to help any i am help help any i can to help help any i am help you any am to you any can am help any can am help you any am help help any am am help you am help you any can help you am help am help'
ttft=0.0804790819529444
avg_decode_throughput=22.607819810201747
avg_num_accepted_tokens=1.9838709677419355
generated_output=', welcome my forum i new new this my is first.\n.\n am can help with any question problem youI happy to you with any i.\n.\n am here to you any any can.\nI to help you help any can.\nI to help help any i can am help any i i.\n am help help any i can to help you help any can am to help any i am help help any i can to help help any i am help you any am to you any can am help any can am help you any am help help any am am help you am help you any can help you am help am help'
ttft=0.07963072706479579
avg_decode_throughput=22.61431604245997
avg_num_accepted_tokens=1.9838709677419355
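
For readers unfamiliar with these statistics, here is a minimal sketch of how numbers like `ttft`, `avg_decode_throughput`, and `avg_num_accepted_tokens` could be derived in a greedy speculative-decoding loop. It is not the code in this PR: the helper names `count_accepted`, `summarize`, and `SpdStats` are hypothetical, and the metric definitions (prefill latency for `ttft`, emitted tokens per decode second for throughput, and counting the extra TLM token as "accepted") are assumptions, since the PR does not spell them out.

```python
from dataclasses import dataclass
from typing import List, Sequence


def count_accepted(draft_tokens: Sequence[int], target_tokens: Sequence[int]) -> int:
    """Greedy acceptance for one speculative iteration (hypothetical helper).

    Assumes target_tokens[i] is the TLM's own prediction for the position
    that draft_tokens[i] fills, and that the TLM returns one more token than
    the DLM proposed. The result is the length of the matching draft prefix
    plus one token supplied by the TLM (a correction or a bonus token).
    """
    matched = 0
    for draft, target in zip(draft_tokens, target_tokens):
        if draft != target:
            break
        matched += 1
    return matched + 1


@dataclass
class SpdStats:
    ttft: float                     # seconds until the first token (assumed: prefill time)
    avg_decode_throughput: float    # emitted tokens per second of decode time
    avg_num_accepted_tokens: float  # emitted tokens per speculative iteration


def summarize(prefill_seconds: float, decode_seconds: float,
              accepted_per_iter: List[int]) -> SpdStats:
    """Aggregate per-iteration counts into the three reported statistics."""
    total_tokens = sum(accepted_per_iter)
    return SpdStats(
        ttft=prefill_seconds,
        avg_decode_throughput=total_tokens / decode_seconds,
        avg_num_accepted_tokens=total_tokens / len(accepted_per_iter),
    )


# Example with made-up token ids and timings (not measurements from this PR).
accepted = [
    count_accepted([5, 7, 9], [5, 7, 2, 4]),  # first two drafts match, third is corrected
    count_accepted([1, 2, 3], [9, 2, 3, 4]),  # first draft rejected immediately
    count_accepted([1, 2], [1, 2, 3]),        # all drafts match, plus one bonus token
]
stats = summarize(prefill_seconds=0.08, decode_seconds=2.0, accepted_per_iter=accepted)
print(stats)
```

Under these assumptions, a value just below 2 for `avg_num_accepted_tokens` would mean that on average roughly one draft token survives per iteration in addition to the token the TLM supplies itself.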
b9fc5e9 to b3c6da4
ac52060 to f59a988
Signed-off-by: eplatero <[email protected]>
…M/DLM Signed-off-by: eplatero <[email protected]>