Add a scrape timeout parameter in docker compose #435
Comments
In terms of documentation, I believe we should do two things:
Note: this setting is also briefly mentioned in https://boavizta.github.io/cloud-scanner/how-to/set-up-dashboard.html#adapting-configuration-for-production-use; we may have to link to the newly created page and remove the details from that paragraph.
The example scrape interval is already mentioned in the sample Prometheus config file.
So this is more of a documentation issue (though we could be more explicit in the comment of the Prometheus config file).
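Concretely, the comment in the sample config could spell out why the interval is what it is; a sketch (the wording in the repository's actual file may differ):

```yaml
global:
  # How often Prometheus scrapes each target. Gathering cloud-scanner
  # metrics for 100+ instances can take ~25s, so keep this comfortably
  # above the expected scrape duration.
  scrape_interval: 60s
```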
To help debug a possible timeout when scraping metrics (in the docker compose example), you can check the status of individual scrape targets here: http://localhost:9090/targets?search= The global Prometheus UI is at: http://localhost:9090/
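For reference, a minimal sketch of the compose fragment that makes those URLs reachable; the service name, image, and file paths are assumptions, not the repository's actual compose file:

```yaml
services:
  prometheus:
    image: prom/prometheus
    ports:
      - "9090:9090"   # Prometheus UI at http://localhost:9090/
    volumes:
      # Mount the scrape configuration discussed in this issue
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
```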
See 8d4ab37, which makes the config explicit in Prometheus.
Problem
Producing metrics for a large number of instances takes too long (around 25 seconds for 100+ instances).
As a result, Prometheus times out before cloud-scanner returns the metrics, and we see no data in the dashboard (nor in Prometheus).
Solution
As a short-term workaround we can increase the `scrape_timeout` in the Prometheus config. The default is 10 seconds; we could include an example setting the timeout to 60 seconds. This also needs to be mentioned in the docs.
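A minimal sketch of that workaround in `prometheus.yml` (note that Prometheus rejects a `scrape_timeout` larger than `scrape_interval`, so the interval has to be raised alongside it):

```yaml
global:
  scrape_interval: 60s   # must be >= scrape_timeout
  scrape_timeout: 60s    # default is 10s; raised so slow cloud-scanner scrapes can finish
```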
The long-term solution is to optimize the way we gather data and return metrics, but that is another story: #392
Alternatives
Additional context or elements
This condition can be detected in the Prometheus UI by checking the Status / Targets page, which shows scrape duration details for each target.