Statsrelay fault tolerance #64
Comments
+1 |
This is intended. If statsrelay diverted metrics to a different statsd instance, you could end up with two statsd instances writing the same (key, timestamp) tuple to graphite with different values, neither of which would include all the data for that key. Statsrelay's use case is focused more on performance, where a single statsd/statsite process cannot keep up with the volume of metrics you're sending. |
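To make the split-aggregation problem concrete, here is a minimal Python sketch. It is not statsrelay's code: the backend names are hypothetical, the modulo hashing is a simple stand-in for the real hash ring, and `simulate` is an illustrative helper that totals counters per backend for a single flush interval.

```python
import hashlib
from collections import defaultdict

# Hypothetical statsd backends; names are for illustration only.
BACKENDS = ["statsd-a:8125", "statsd-b:8125", "statsd-c:8125"]


def pick_backend(key, backends):
    """Map a metric key to one backend with a stable hash.

    Plain modulo hashing, used here as a stand-in for a real hash ring.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]


def simulate(events, failover):
    """Total counters per backend for one flush interval.

    events is a list of (key, value, owner_is_up) tuples. When the owning
    backend is down, the metric is either dropped (failover=False) or
    re-hashed onto the surviving backends (failover=True).
    """
    totals = defaultdict(lambda: defaultdict(int))
    for key, value, owner_is_up in events:
        owner = pick_backend(key, BACKENDS)
        if owner_is_up:
            totals[owner][key] += value
        elif failover:
            survivors = [b for b in BACKENDS if b != owner]
            totals[pick_backend(key, survivors)][key] += value
        # Without failover the metric is simply lost while the owner is down.
    return {backend: dict(keys) for backend, keys in totals.items()}


# The owning backend is briefly unreachable in the middle of the interval.
events = [
    ("api.requests", 3, True),
    ("api.requests", 4, False),
    ("api.requests", 5, True),
]

# Two backends each flush a partial total for the same key.
print(simulate(events, failover=True))
# One backend flushes an incomplete total, but it is the only writer.
print(simulate(events, failover=False))
```

With failover, two backends each report a partial count for `api.requests` in the same interval, and whichever write reaches graphite last wins. Without failover, the count is incomplete but there is only ever one writer for the key.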
Thank you for the explanation @JeremyGrosser... if this is the intention, what would you recommend should be done in the case of failed nodes? |
You might want to take a look at the Lyft fork (https://github.com/lyft/statsrelay). It supports sending metrics to multiple backends simultaneously, so you could run two sets of carbon servers for redundancy. InfluxDB is worth a look too. It would replace your carbon servers for persistence and has its own replication/sharding implementation. I had quite a few issues last time I tried it, but I've heard it's gotten more stable since then. |
Hey guys,
I was under the impression that if a statsd host goes down, statsrelay would use the consistent hash ring to divert the metrics that would otherwise be routed to the dead host. But it seems like this doesn't actually happen. Is this intended, or a bug?