We are adapting OPEA applications for the AMD platform and ran into an issue: the eval tests report negative values for Input Tokens per Second and Input Tokens, and Tokens per Second is implausibly high (~25K).
We use the TGI LLM engine.
Our process:
In evals/benchmark/, we modify benchmark.yaml (screenshot attached).
Main settings:
- examples - we use ["faqgen"].
- deployment_type - should be set to "Docker", as we deploy via Docker.
- service_port - the backend port where the service API is available; it can be checked in the Docker Compose output when starting the service. Currently, 18881 is used for the GPU deployment and 19888 for the service deployed on CPU.
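Before starting the benchmark, it can help to confirm that the configured service_port actually accepts connections. The following is a minimal sketch, not part of the benchmark tooling: the helper name `port_is_open` and the host `localhost` are assumptions for illustration, and only the two port numbers come from our setup.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; it only checks that something is
    listening, not that the service API behaves correctly.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Ports from our setup: 18881 (GPU deployment), 19888 (CPU deployment).
    # "localhost" is an assumption; use the host where the service runs.
    for label, port in (("GPU", 18881), ("CPU", 19888)):
        state = "reachable" if port_is_open("localhost", port) else "not reachable"
        print(f"{label} service port {port}: {state}")
```

A reachable port does not rule out the metric problem described here; it only excludes a wrong service_port as the cause.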
To start the test, use the command:
`python benchmark.py`
Here is the result of the benchmark run, with the strange numbers:
Can you please check that it works fine on your side?
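To make "strange numbers" concrete, a check like the one below could flag them automatically. This is only a sketch under assumptions: the function name, the metric key `tokens_per_second`, and the default ceiling of 10,000 tokens per second are hypothetical choices for illustration and are not taken from the benchmark code.

```python
def find_suspicious_metrics(metrics, max_tokens_per_second=10_000.0):
    """Return human-readable descriptions of implausible metric values.

    `metrics` maps a metric name to a number. The key "tokens_per_second"
    and the default ceiling are assumptions made for this sketch.
    """
    problems = []
    # Token counts and rates can never legitimately be negative.
    for name, value in metrics.items():
        if value < 0:
            problems.append(f"{name} is negative ({value})")
    # Flag a throughput above the assumed plausibility ceiling.
    tps = metrics.get("tokens_per_second")
    if tps is not None and tps > max_tokens_per_second:
        problems.append(
            f"tokens_per_second exceeds {max_tokens_per_second} ({tps})"
        )
    return problems
```

An empty list means nothing was flagged; the ceiling should be adjusted to whatever is realistic for the hardware and model in use.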
joshuayao changed the title from "Performance benchmarks for FaqGen / DocSum are calculated incorrectly (we see negative numbers and very high values)" to "[Bug] Performance benchmarks for FaqGen / DocSum are calculated incorrectly (we see negative numbers and very high values)" on Dec 6, 2024