More resilient metrics logging #2

Open · 4 tasks
conorsch opened this issue Mar 22, 2016 · 2 comments
conorsch commented Mar 22, 2016

Right now, the topbeat service outputs directly to logstash on the logserver, and from there the data makes it into Elasticsearch. The filebeat service does the same, but with log files rather than metrics.

The downside to the topbeat config is that metrics are discarded if not immediately received by logstash. A better solution is the approach taken by filebeat, which remembers the offset of the last successfully shipped log line and resumes from there once the connection to logstash is restored.

It should be possible to set the topbeat config to log to local files, then use filebeat to monitor those files and ship them into logstash (a rough config sketch follows below). Then we get the best of both worlds. So:

  • disable logstash output in topbeat config
  • enable local logfiles in topbeat config
  • enable logging of logfiles for topbeat in filebeat config
  • write a JSON logstash parser for ingesting the topbeat metrics

Then, even in the event of a service disruption, we'll still be able to collect and analyze metrics data, rather than having gaps in the metrics.
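
For reference, here's a minimal sketch of what those four steps might look like, assuming the 1.x-era topbeat/filebeat config layout; the paths, hostnames, and rotation settings are illustrative, not a tested configuration:

```yaml
# topbeat.yml -- steps 1 and 2: disable the direct logstash output, write metrics to local files
output:
  # logstash:                    # step 1: comment out the logstash output
  #   hosts: ["logserver:5044"]
  file:                          # step 2: enable local logfiles for the metrics
    path: "/var/log/topbeat"     # illustrative path
    filename: topbeat
    rotate_every_kb: 10000
    number_of_files: 7

# filebeat.yml -- step 3: treat the topbeat files like any other log and ship them to logstash
filebeat:
  prospectors:
    - paths:
        - /var/log/topbeat/topbeat*
      input_type: log
      document_type: topbeat     # lets the logstash side route these lines to a JSON parser
output:
  logstash:
    hosts: ["logserver:5044"]
```

Step 4 would then live on the logserver: a logstash filter (e.g. the json filter, keyed off the topbeat document type) to decode each line back into structured metrics before indexing into Elasticsearch.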

conorsch commented:
No longer convinced this is even possible: the log-to-file settings for topbeat seem to log only metadata about service health, e.g. connections to the remote logstash agent. Further reading is warranted to be sure.

msheiny commented Jul 6, 2017

The answer to this, I think, is having a redis intermediary that all the beats clients send to and that logstash reads from. That way we can reboot and schedule downtime on logstash without losing data. Of course, if redis goes down... same problem. Just mentioning it here because a lot of the documentation I see around the interwebz suggests it's very common for an org to have redis as a buffer in its ELK architecture.

I realize this is a really old ticket... figured it was worth a comment.
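
For illustration, a rough sketch of the redis-buffered layout described above, using 5.x-style Beats syntax (option names vary across Beats versions, and the hostnames and key are placeholders):

```yaml
# filebeat.yml -- send events to a redis list instead of straight to logstash,
# so logstash can be restarted without dropping data; redis itself then becomes
# the single point of failure, as noted above
output.redis:
  hosts: ["redis.internal:6379"]   # placeholder host
  key: "beats"                     # redis list that logstash's redis input reads from
```

On the logstash side, the matching piece is the redis input plugin reading the same key. The usual caveat: the list can grow without bound while logstash is down, so redis memory limits deserve some thought.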
