
Consider a single default log location instead of two #3546

Closed
PettitWesley opened this issue Jan 26, 2023 · 2 comments

Comments

@PettitWesley
Contributor

PettitWesley commented Jan 26, 2023

Summary

The ECS Agent outputs its logs to two locations by default:

  1. Seelog writes and rotates log files in /var/log/ecs.
  2. ECS_LOG_DRIVER defaults to json-file which means that logs are also written to disk in /var/lib/docker/containers.
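
Roughly, the two copies end up at the following paths (as I understand the defaults; the container ID is a placeholder):

```
/var/log/ecs/ecs-agent.log*
/var/lib/docker/containers/<container-id>/<container-id>-json.log
```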

Description

This causes the following minor inconveniences:

  1. Disk space used by Agent logs is duplicated.
  2. A lot of log collection tools use the json-file log files in /var/lib/docker/containers to collect container logs. IIRC, the Datadog agent can do this. This is also inconvenient for the project I am currently working on to collect EC2 cluster logs using a Daemon Service: [ECS] [request]: Full Support for Running Fluent Bit as Daemon Service to collect all logs on an EC2 node containers-roadmap#1876 . I collect task logs from the /var/lib/docker/containers files, and agent logs get collected from there too. Since those log files are only named with the container ID, there is no easy way to tell that a given file came from the Agent, whereas for task logs I can use the Agent introspection metadata endpoint to match a container ID to the task it belongs to (see the sketch after this list).
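
For illustration, here is a minimal sketch (not the actual Daemon Service code) of the kind of lookup I can do for task containers but not for the Agent's own container, assuming the introspection API at http://localhost:51678/v1/tasks; the container ID is a placeholder:

```python
# Map a Docker container ID taken from a
# /var/lib/docker/containers/<id>/<id>-json.log path back to its ECS task
# via the agent introspection endpoint. Sketch only; names are placeholders.
import json
import urllib.request

INTROSPECTION_URL = "http://localhost:51678/v1/tasks"

def task_for_container(docker_id: str):
    """Return the introspection metadata for the task owning docker_id, or None."""
    with urllib.request.urlopen(INTROSPECTION_URL) as resp:
        tasks = json.load(resp).get("Tasks", [])
    for task in tasks:
        for container in task.get("Containers", []):
            # DockerId is the full container ID that names the json-file log directory.
            if container.get("DockerId", "").startswith(docker_id):
                return task
    # Containers that belong to no task (like the Agent's own container) fall through here.
    return None

if __name__ == "__main__":
    print(task_for_container("0123456789ab"))  # placeholder container ID prefix
```

Anything that does not belong to a task, such as the Agent's own container, returns None here, which is exactly the attribution gap described above.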

Discussion

My understanding is that this default agent experience also has the following benefits:

  1. A customer can easily log into their instance and run docker ps and docker logs to get Agent logs.
  2. My understanding is that if the agent crashes suddenly, the last logs may not yet have been written by Seelog to /var/log/ecs, whereas they will have been written to the json-file log driver files, which is useful for debugging.
  3. The ECS Log collection script uses the /var/log/ecs path to collect logs: https://github.com/aws/amazon-ecs-logs-collector

For #1, it should be noted that Docker now always creates log files in /var/lib/docker/containers called "cache" log files (https://docs.docker.com/config/containers/logging/dual-logging/), which work with the docker logs command.

So since the agent sends its logs to stdout, you do not need to explicitly configure the json-file log driver in order to use docker logs.

For #2, explicitly configuring the json-file log driver continues to be useful. In my testing, the cache log files are removed as soon as the container stops, whereas the json-file log driver files stick around for some time (if someone knows how they get cleaned up for stopped containers, please let me know; I can't find this in any Docker docs).

However, is use case #2 something that is needed for prod workloads, or just for developing the agent?

Basically, my main question here is: should prod deployments of the ECS Agent write their logs to two locations by default?

@amogh09
Contributor

amogh09 commented Feb 1, 2023

Hi @PettitWesley, thanks for raising this issue.

I am looking into the reason for the duplicate logging by default and will update this thread when I have more information. In the meantime, it is possible to turn off instance logging (the Seelog writes) either by setting ECS_LOGLEVEL_ON_INSTANCE=none or by explicitly setting ECS_LOG_DRIVER to a valid logging driver in the /etc/ecs/ecs.config file and restarting the Agent, as described in the Agent configuration docs. Accepted values for ECS_LOG_DRIVER are ["awslogs","fluentd","gelf","json-file","journald","logentries","splunk","syslog"].
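
For example, either one of the following lines in /etc/ecs/ecs.config, followed by an Agent restart, should do it (a sketch only; journald is just one of the accepted values above):

```
ECS_LOGLEVEL_ON_INSTANCE=none
ECS_LOG_DRIVER=journald
```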

@sparrc
Contributor

sparrc commented Mar 17, 2023

Thank you for the suggestion. We don't have a super strong reason at the moment to stop logging to /var/log/ecs, as it's also how we currently debug customer issues using the ecs logs collector: https://github.com/aws/amazon-ecs-logs-collector.

As far as the json-file driver goes, we do like to have more logs available by default in the event of customer issues. To avoid the double logging, it should be possible to change that log driver with the ECS_LOG_DRIVER env var.

I'm going to close this as a "won't fix", since changing this behavior seems like it could carry some risk in case customers are relying on json-file logging.

@sparrc sparrc closed this as completed Mar 17, 2023