Add Prometheus metric support #236
base: main
Conversation
# Separate function that can raise exceptions, used for testing
# to assert correct errors and messages.
def run():
    # TMA-1900: refactor CLI handler
    logging.init_logging()
This should not have been deleted; it is needed for the logger to function properly.
@@ -46,6 +50,31 @@ def run():
    else:  # profile
        args.func(args, extra_args)

    if getattr(args, "enable_prometheus", False):
        # Fix: This doesn't actually get logged.
You can fix this by adding main to logger.py.
@@ -70,6 +70,7 @@ def to_lowercase(self):
DEFAULT_SYNTHETIC_FILENAME = "synthetic_data.json"
DEFAULT_WARMUP_REQUEST_COUNT = 0
DEFAULT_BACKEND = "tensorrtllm"
DEFAULT_PROMETHEUS_PORT = 8002
This is the default port for the Triton Inference Server metrics endpoint. I would change it to 9090, which is what Prometheus uses in its getting-started configuration file.
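For context, whichever default is chosen, the Prometheus server discovers the endpoint through its scrape configuration. A minimal scrape job for this endpoint might look like the fragment below; the job name is invented for illustration, and the target port is whatever `--prometheus-port` was set to.

```yaml
scrape_configs:
  - job_name: "genai-perf"            # hypothetical job name
    static_configs:
      - targets: ["localhost:8002"]   # the port passed via --prometheus-port
```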
Support a Prometheus metrics endpoint in GenAI-Perf. This is a proof of concept. With this design, Prometheus metrics can be enabled via the CLI; if they are enabled, the metrics are exported at the configured endpoint. This would only be supported for the profile subcommand to begin with, though it may be useful to expand it to analyze. To avoid breaking anything, that is out of scope for this proof of concept.

When Prometheus metrics are enabled, the user must terminate the program when they want GenAI-Perf to exit. This design is necessary because otherwise the metrics would only be up for a very short time, since the endpoint is closed when GenAI-Perf exits.
TODO: Only system metrics are displayed right now; add the other metrics. It may also be good to omit request_goodput if goodput is not used.
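As a sketch of what the endpoint would serve: Prometheus scrapes a plain-text exposition format of HELP/TYPE/sample lines. The metric names and values below are invented for illustration; the actual PR presumably relies on the prometheus_client library, whose start_http_server() serves this format automatically.

```python
# Hypothetical illustration of the Prometheus text exposition format that
# a GenAI-Perf metrics endpoint would serve. Metric names and values are
# made up; a real implementation would use prometheus_client rather than
# rendering the text by hand.

def render_exposition(metrics: dict) -> str:
    """Render {name: (help_text, value)} as Prometheus exposition text."""
    lines = []
    for name, (help_text, value) in metrics.items():
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"

example = {
    "genai_perf_request_latency_seconds": ("Average request latency.", 0.42),
    "genai_perf_request_count": ("Total completed requests.", 5),
}
print(render_exposition(example))
```

With prometheus_client, the equivalent would be to register Gauge/Histogram objects and call start_http_server(port), which keeps serving until the process is terminated — consistent with the design described above.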
Example command:
genai-perf profile -v -m gpt2 --service-kind openai --endpoint-type completions --num-requests 5 --num-prompts 5 --enable-prometheus --prometheus-port 8002
Example metrics: