Releases: ashwanthkumar/marathon-alerts
v0.3.0-RC3
I realised that the work done for #17 produces too much noise in the STDERR logs, so I split the metrics we collect into regular and debug. The tables below list the metrics reported by default and those reported only with the `--debug` flag.
Metrics
We collect some metrics internally in marathon-alerts. They're dumped periodically to STDERR. The following table lists the metrics and what each one counts.
Metric | Description |
---|---|
alerts-suppressed-cleaned | Number of alerts we cleaned up because their suppress duration expired. |
notifications-total | Total number of notifications we sent from AlertManager to NotificationManager |
notifications-warning | Number of Warning check notifications we sent from AlertManager to NotificationManager |
notifications-critical | Number of Critical check notifications we sent from AlertManager to NotificationManager |
notifications-resolved | Number of Pass (aka Resolved) check notifications we sent from AlertManager to NotificationManager |
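
To illustrate the idea of named counters that are dumped periodically to STDERR, here is a minimal sketch using only the Go standard library. The `Counters` type, its methods, the output format, and the dump interval are all assumptions made for illustration; the release notes do not say how marathon-alerts implements its metrics.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"sort"
	"sync"
	"time"
)

// Counters is a hypothetical registry of named counters. It is not the
// type marathon-alerts itself uses.
type Counters struct {
	mu     sync.Mutex
	values map[string]int64
}

// NewCounters returns an empty registry.
func NewCounters() *Counters {
	return &Counters{values: make(map[string]int64)}
}

// Inc adds one to the named counter, creating it on first use.
func (c *Counters) Inc(name string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.values[name]++
}

// Lines returns one "name value" line per counter, sorted by name.
func (c *Counters) Lines() []string {
	c.mu.Lock()
	defer c.mu.Unlock()
	lines := make([]string, 0, len(c.values))
	for name, v := range c.values {
		lines = append(lines, fmt.Sprintf("%s %d", name, v))
	}
	sort.Strings(lines)
	return lines
}

// Dump writes the current counter lines to w.
func (c *Counters) Dump(w io.Writer) {
	for _, line := range c.Lines() {
		fmt.Fprintln(w, line)
	}
}

func main() {
	c := NewCounters()
	c.Inc("notifications-total")
	c.Inc("notifications-critical")

	// Dump to STDERR on a fixed interval. The interval and the number of
	// iterations are arbitrary values chosen for this sketch.
	ticker := time.NewTicker(100 * time.Millisecond)
	defer ticker.Stop()
	for i := 0; i < 3; i++ {
		<-ticker.C
		c.Dump(os.Stderr)
	}
}
```

Sorting the lines keeps successive dumps easy to compare by eye or with `diff`.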
Debug Metrics
Apart from the standard metrics above, we also collect quite a few other metrics, mostly for debugging purposes. You can enable them by running marathon-alerts with the `--debug` flag.
Metric | Description |
---|---|
alerts-suppressed-called | Number of times we called AlertManager.cleanUpSupressedAlerts() |
alerts-process-check-called | Number of times we called AlertManager.processCheck() |
alerts-manager-stopped | Number of times we called AlertManager.Stop() |
apps-checker-stopped | Number of times we called AppChecker.Stop() |
apps-checker-marathon-all-apps-api | Number of times we called Marathon's /v2/apps API |
apps-checker-marathon-app-api | Number of times we called Marathon's /v2/apps/<id> API |
apps-checker-alerts-sent | Number of checks we sent to AlertManager from AppChecker |
apps-checker-check-<name> | Number of checks identified by <name> we sent to AlertManager |
apps-checker-app-<id> | Number of checks for an app identified by <id> we sent to AlertManager |
apps-checker-<id>-<name> | Number of checks identified by <name> for an app identified by <id> we sent to AlertManager |
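
The last four rows of the table describe counters whose names depend on the app and the check. As a rough sketch of how those names could be composed for a single check sent to AlertManager, assuming plain string concatenation (the function name `checkMetricNames` is hypothetical, and the notes do not say how an app id is formatted inside a metric name):

```go
package main

import "fmt"

// checkMetricNames returns the debug counter names from the table above
// that would be incremented for one check sent to AlertManager.
// Assumption: names are built by simple concatenation of id and name.
func checkMetricNames(appID, checkName string) []string {
	return []string{
		"apps-checker-alerts-sent",
		"apps-checker-check-" + checkName,
		"apps-checker-app-" + appID,
		"apps-checker-" + appID + "-" + checkName,
	}
}

func main() {
	// "web" is a made-up app id used only for this example.
	for _, name := range checkMetricNames("web", "min-healthy") {
		fmt.Println(name)
	}
}
```

Because these names grow with the number of apps and checks, keeping them behind the debug flag limits the volume of the regular STDERR output.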
SHA256 Checksums
04a9cb4b0063df3dd2caa4ce9ae936360a369c150a13eec44d119726c00d6e18 marathon-alerts-linux-amd64
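
To check a downloaded binary against a published checksum, something along these lines should work. It is a small sketch using Go's standard `crypto/sha256` package; the program and the helper name `sha256Hex` are illustrative and not part of marathon-alerts.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"os"
)

// sha256Hex returns the lowercase hex SHA-256 digest of data.
func sha256Hex(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

func main() {
	// Expected arguments: <file> <expected-sha256>
	if len(os.Args) != 3 {
		fmt.Fprintln(os.Stderr, "usage: verify <file> <expected-sha256>")
		return
	}
	data, err := os.ReadFile(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	if got := sha256Hex(data); got != os.Args[2] {
		fmt.Fprintf(os.Stderr, "checksum mismatch: got %s\n", got)
		os.Exit(1)
	}
	fmt.Println("checksum OK")
}
```

Run it with the path to `marathon-alerts-linux-amd64` and the checksum listed for the release you downloaded.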
v0.3.0-RC2
- Added Name() to Notifier interface
- Because of #18, we've now moved completely to glide based dependency management.
- Removed the map-utils from here and using it from http://github.com/ashwanthkumar/golang-utils
- Fixed a bug in how we compute "Times" for a check for an app.
- First cut at #17. The following metrics are now available:
Metric | Description |
---|---|
alerts-suppressed-cleaned | Number of alerts we cleaned up because their suppress duration expired. |
alerts-suppressed-called | Number of times we called AlertManager.cleanUpSupressedAlerts() |
alerts-process-check-called | Number of times we called AlertManager.processCheck() |
alerts-manager-stopped | Number of times we called AlertManager.Stop() |
notifications-total | Total number of notifications we sent from AlertManager to NotificationManager |
notifications-warning | Number of Warning check notifications we sent from AlertManager to NotificationManager |
notifications-critical | Number of Critical check notifications we sent from AlertManager to NotificationManager |
notifications-resolved | Number of Pass (aka Resolved) check notifications we sent from AlertManager to NotificationManager |
apps-checker-stopped | Number of times we called AppChecker.Stop() |
apps-checker-marathon-all-apps-api | Number of times we called Marathon's /v2/apps API |
apps-checker-marathon-app-api | Number of times we called Marathon's /v2/apps/<id> API |
apps-checker-alerts-sent | Number of checks we sent to AlertManager from AppChecker |
apps-checker-check-<name> | Number of checks identified by <name> we sent to AlertManager |
apps-checker-app-<id> | Number of checks for an app identified by <id> we sent to AlertManager |
apps-checker-<id>-<name> | Number of checks identified by <name> for an app identified by <id> we sent to AlertManager |
Checksum
30d3e415e94896eec19827ee5d3028ab6c2046bfd4636ae64bd6929047ca33dd marathon-alerts-linux-amd64
v0.3.0-RC1
Changes
- Fixes #13
- Slack alert now has 'Times' field
- Changing all instances of `fail` to `critical`. Now we have 2 alerting levels - Warning and Critical.
- Changed the defaults for `Critical` threshold to `0.5` and `Warning` threshold to `0.75`
- Apps can now subscribe to specific checks using `alerts.checks.subscribe` label.
- Refactoring some tests to remove complicated setup for asserting on channels
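
As a rough illustration of how two thresholds map to the two alerting levels, the sketch below classifies an app by its fraction of healthy instances. The comparison rule (a fraction strictly below a threshold triggers that level), the handling of apps with zero instances, and the name `classify` are assumptions; the release notes only state the new default values.

```go
package main

import "fmt"

// Default thresholds as stated in this release.
const (
	defaultCriticalThreshold = 0.5
	defaultWarningThreshold  = 0.75
)

// classify returns "Critical", "Warning" or "Pass" for an app.
// Assumption: the level is chosen by comparing healthy/total against the
// thresholds with a strict less-than; the real check may differ.
func classify(healthy, total int, critical, warning float64) string {
	if total <= 0 {
		// Assumption: an app with no instances has nothing to alert on.
		return "Pass"
	}
	fraction := float64(healthy) / float64(total)
	switch {
	case fraction < critical:
		return "Critical"
	case fraction < warning:
		return "Warning"
	default:
		return "Pass"
	}
}

func main() {
	for _, healthy := range []int{2, 6, 8} {
		level := classify(healthy, 10, defaultCriticalThreshold, defaultWarningThreshold)
		fmt.Printf("%d/10 healthy: %s\n", healthy, level)
	}
}
```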
Checksum
91233a81c56cc4744e8c6f21c9f7dcab2e78c2f05a386d726e123c02de640992 marathon-alerts-linux-amd64
List of contributors
Commits | Contributors |
---|---|
19 | Ashwanth Kumar |
1 | Alexander Weber |
Generated by `git shortlog -s -n --no-merges v0.2.3..v0.3.0-RC1` for the marathon-alerts repository
v0.2.3
- Fixed a bug in how the Slack notifier alerts when no owner is specified.
7c9a7a37dd1cbf80500984e60c279793790a46155ec0187b2a97e7d813ab4761 marathon-alerts-linux-amd64
v0.2.2
- Added marathon.json.conf script to deploy via `marathonctl deploy`
- Writing a PID file upon start. Useful while doing healthcheck on Marathon.
From this release onward, we'll publish only Linux binaries. I don't see a point in publishing binaries for Mac. Let me know if you think otherwise.
2fbb6aec119556e2fbcb364c9eebd5005eb94b0aa99a3b489158b69602a58852 marathon-alerts-linux-amd64
v0.2.1
- Fixes #1
App Label Configurations
You can now override certain portions of the alerts via App labels
Property | Description | Example |
---|---|---|
alerts.enabled | Controls if the alerts for the app should be enabled or disabled. Defaults - true | false |
alerts.min-healthy.fail.threshold | Failure threshold for min-healthy check. Defaults - `--check-min-healthy-fail-threshold` | 0.5 |
alerts.min-healthy.warn.threshold | Warning threshold for min-healthy check. Defaults - `--check-min-healthy-warn-threshold` | 0.4 |
alerts.slack.webhook | Comma separated list of Slack webhooks to send slack notifications. Overrides - `--slack-webhook` | http://hooks.slack.com/.../ |
alerts.slack.channel | #Channel / @ User to post the alert into. Overrides - `--slack-channel` | z_development |
alerts.slack.owners | Comma separated list of users who should be tagged in the alert. Overrides - `--slack-owner` | ashwanthkumar,slackbot |
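
The override behaviour in the table can be sketched as "start from the flag defaults, then apply whatever labels the app sets". The Go snippet below is an illustration only: the `AppAlertConfig` type, the `fromLabels` and `splitCSV` helpers, the default values, and the choice to keep the default when a label fails to parse are all assumptions, not the actual marathon-alerts code.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// AppAlertConfig holds the effective alert settings for one app.
// Hypothetical type used only for this sketch.
type AppAlertConfig struct {
	Enabled       bool
	FailThreshold float64
	WarnThreshold float64
	SlackWebhooks []string
	SlackChannel  string
	SlackOwners   []string
}

// splitCSV splits a comma separated value and drops empty items.
func splitCSV(s string) []string {
	var out []string
	for _, item := range strings.Split(s, ",") {
		if item = strings.TrimSpace(item); item != "" {
			out = append(out, item)
		}
	}
	return out
}

// fromLabels copies defaults (standing in for the command-line flags) and
// overrides each field for which the app has a label.
// Assumption: a value that fails to parse leaves the default in place.
func fromLabels(labels map[string]string, defaults AppAlertConfig) AppAlertConfig {
	cfg := defaults
	if v, ok := labels["alerts.enabled"]; ok {
		if b, err := strconv.ParseBool(v); err == nil {
			cfg.Enabled = b
		}
	}
	if v, ok := labels["alerts.min-healthy.fail.threshold"]; ok {
		if f, err := strconv.ParseFloat(v, 64); err == nil {
			cfg.FailThreshold = f
		}
	}
	if v, ok := labels["alerts.min-healthy.warn.threshold"]; ok {
		if f, err := strconv.ParseFloat(v, 64); err == nil {
			cfg.WarnThreshold = f
		}
	}
	if v, ok := labels["alerts.slack.webhook"]; ok {
		cfg.SlackWebhooks = splitCSV(v)
	}
	if v, ok := labels["alerts.slack.channel"]; ok {
		cfg.SlackChannel = v
	}
	if v, ok := labels["alerts.slack.owners"]; ok {
		cfg.SlackOwners = splitCSV(v)
	}
	return cfg
}

func main() {
	// Placeholder defaults; the real ones come from the command-line flags.
	defaults := AppAlertConfig{Enabled: true, FailThreshold: 0.5, WarnThreshold: 0.75, SlackChannel: "general"}
	labels := map[string]string{
		"alerts.enabled":       "false",
		"alerts.slack.channel": "z_development",
		"alerts.slack.owners":  "ashwanthkumar,slackbot",
	}
	fmt.Printf("%+v\n", fromLabels(labels, defaults))
}
```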
Checksum
c8fb04a06c5f97f41c1eadab5b26d0625799ea00a1c266b5079a1438c2841c56 marathon-alerts-darwin-amd64
044f8698fa2668fdf4d5efaea2b3880e7ae9ccb0d6043dfd03cd7919d3fff43a marathon-alerts-linux-amd64
0.1.1
- Fixed the issue where the process, when started, spammed the Slack channel with "Passed" messages.
- More logs around *Manager co-routines starting.
- Using netgo and disabled CGO in the build environment.
v.0.1.0
Initial release of marathon-alerts with min-healthy check and slack notifier.
SHA256 Checksums
36acf66a284094db131ccc0b9d17c0661ba13c2f084e410b9c0b16c54e31f8ce marathon-alerts-darwin-amd64
5aef669168b0123dd3510716fd4476de897f6e27a3d593f4a5083d95b6697863 marathon-alerts-linux-amd64