More resilient metrics logging #2

Open · 4 tasks
conorsch opened this issue Mar 22, 2016 · 2 comments
conorsch commented Mar 22, 2016

Right now, the topbeat service outputs directly to logstash on the logserver, and from there the data makes it into Elasticsearch. The filebeat service does the same, but with log files rather than metrics.

The downside to the topbeat config is that metrics are discarded if not immediately received by logstash. A better solution is the approach taken by filebeat, which remembers the offset of the last successfully shipped log line and resumes from there once the connection to logstash is restored.

It should be possible to set the topbeat config to log to local files, then use filebeat to monitor those files and ship them into logstash (a rough config sketch follows below). Then we get the best of both worlds. So:

  • disable logstash output in topbeat config
  • enable local logfiles in topbeat config
  • enable logging of logfiles for topbeat in filebeat config
  • write a JSON logstash parser for ingesting the topbeat metrics

Then, even in the event of a service disruption, we'll still be able to collect and analyze metrics data, rather than having gaps in the metrics.
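
For reference, here's a minimal sketch of what those four steps might look like, assuming the 1.x-era topbeat/filebeat config layout; the paths, hostnames, and rotation settings are illustrative, not a tested configuration:

```yaml
# topbeat.yml -- steps 1 and 2: disable the direct logstash output, write metrics to local files
output:
  # logstash:                    # step 1: comment out the logstash output
  #   hosts: ["logserver:5044"]
  file:                          # step 2: enable local logfiles for the metrics
    path: "/var/log/topbeat"     # illustrative path
    filename: topbeat
    rotate_every_kb: 10000
    number_of_files: 7

# filebeat.yml -- step 3: treat the topbeat files like any other log and ship them to logstash
filebeat:
  prospectors:
    - paths:
        - /var/log/topbeat/topbeat*
      input_type: log
      document_type: topbeat     # lets the logstash side route these lines to a JSON parser
output:
  logstash:
    hosts: ["logserver:5044"]
```

Step 4 would then live on the logserver: a logstash filter (e.g. the json filter, keyed off the topbeat document type) to decode each line back into structured metrics before indexing into Elasticsearch.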

conorsch commented:
No longer convinced this is even possible: the log-to-file settings for topbeat seem to log only metadata about service health, e.g. connections to the remote logstash agent. Further reading is warranted to be sure.

msheiny commented Jul 6, 2017

The answer to this, I think, is having a redis intermediary that all the beats clients send to and that logstash reads from. That way we can reboot and schedule downtime on logstash without losing data. Of course, if redis goes down... same problem. Just mentioning it here because a lot of the documentation I see around the interwebz suggests it's very common for an org to have redis as a buffer in its ELK architecture.

I realize this is a really old ticket... figured it was worth a comment.
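
For illustration, a rough sketch of the redis-buffered layout described above, using 5.x-style Beats syntax (option names vary across Beats versions, and the hostnames and key are placeholders):

```yaml
# filebeat.yml -- send events to a redis list instead of straight to logstash,
# so logstash can be restarted without dropping data; redis itself then becomes
# the single point of failure, as noted above
output.redis:
  hosts: ["redis.internal:6379"]   # placeholder host
  key: "beats"                     # redis list that logstash's redis input reads from
```

On the logstash side, the matching piece is the redis input plugin reading the same key. The usual caveat: the list can grow without bound while logstash is down, so redis memory limits deserve some thought.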
