
Consider a single default log location instead of two #3546

Closed
PettitWesley opened this issue Jan 26, 2023 · 2 comments

Comments

@PettitWesley
Contributor

PettitWesley commented Jan 26, 2023

Summary

The ECS Agent outputs its logs to two locations by default:

  1. Seelog writes and rotates log files in /var/log/ecs.
  2. ECS_LOG_DRIVER defaults to json-file which means that logs are also written to disk in /var/lib/docker/containers.
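
Roughly, the two copies end up at the following paths (as I understand the defaults; the container ID is a placeholder):

```
/var/log/ecs/ecs-agent.log*
/var/lib/docker/containers/<container-id>/<container-id>-json.log
```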

Description

This causes the following minor inconveniences:

  1. Disk space used by Agent logs is duplicated.
  2. A lot of log collection tools use the json-file log files in /var/lib/docker/containers to collect container logs. IIRC, the Datadog agent can do this. This is also inconvenient for the project I am currently working on to collect EC2 cluster logs using a Daemon Service: [ECS] [request]: Full Support for Running Fluent Bit as Daemon Service to collect all logs on an EC2 node containers-roadmap#1876 . I collect task logs from the /var/lib/docker/containers files, and agent logs get collected from there too. Since those log files are only named with the container ID, there is no easy way to tell that a given file came from the Agent, whereas for task logs I can use the Agent introspection metadata endpoint to match a container ID to the task it belongs to (see the sketch after this list).
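
For illustration, here is a minimal sketch (not the actual Daemon Service code) of the kind of lookup I can do for task containers but not for the Agent's own container, assuming the introspection API at http://localhost:51678/v1/tasks; the container ID is a placeholder:

```python
# Map a Docker container ID taken from a
# /var/lib/docker/containers/<id>/<id>-json.log path back to its ECS task
# via the agent introspection endpoint. Sketch only; names are placeholders.
import json
import urllib.request

INTROSPECTION_URL = "http://localhost:51678/v1/tasks"

def task_for_container(docker_id: str):
    """Return the introspection metadata for the task owning docker_id, or None."""
    with urllib.request.urlopen(INTROSPECTION_URL) as resp:
        tasks = json.load(resp).get("Tasks", [])
    for task in tasks:
        for container in task.get("Containers", []):
            # DockerId is the full container ID that names the json-file log directory.
            if container.get("DockerId", "").startswith(docker_id):
                return task
    # Containers that belong to no task (like the Agent's own container) fall through here.
    return None

if __name__ == "__main__":
    print(task_for_container("0123456789ab"))  # placeholder container ID prefix
```

Anything that does not belong to a task, such as the Agent's own container, returns None here, which is exactly the attribution gap described above.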

Discussion

My understanding is that this default agent experience also has the following benefits:

  1. A customer can easily log into their instance and run docker ps and docker logs to get Agent logs.
  2. My understanding is that if the agent crashes suddenly, the last logs may not yet have been written by Seelog to /var/log/ecs, whereas they will have been written to the json-file log driver files, which is useful for debugging.
  3. The ECS Log collection script uses the /var/log/ecs path to collect logs: https://github.com/aws/amazon-ecs-logs-collector

For #1, it should be noted that Docker now always creates log files in /var/lib/docker/containers called "cache" log files (https://docs.docker.com/config/containers/logging/dual-logging/), which work with the docker logs command.

So since the agent sends its logs to stdout, you do not need to explicitly configure the json-file log driver in order to use docker logs.

For #2, explicitly configuring the json-file log driver continues to be useful. In my testing, the cache log files are removed as soon as the container stops, whereas the json-file log driver files stick around for some time (if someone knows how they get cleaned up for stopped containers, please let me know; I can't find this in any Docker docs).

However, is use case #2 something that is needed for prod workloads, or just for developing the agent?

Basically, my main question here is: should prod deployments of the ECS Agent write their logs to two locations by default?

@amogh09
Contributor

amogh09 commented Feb 1, 2023

Hi @PettitWesley, thanks for raising this issue.

I am looking into the reason for the duplicate logging by default and will update this thread when I have more information. In the meantime, it is possible to turn off instance logging (the Seelog writes) either by setting ECS_LOGLEVEL_ON_INSTANCE=none or by explicitly setting ECS_LOG_DRIVER to a valid logging driver in the /etc/ecs/ecs.config file and restarting the Agent, as described in the Agent configuration docs. Accepted values for ECS_LOG_DRIVER are ["awslogs","fluentd","gelf","json-file","journald","logentries","splunk","syslog"].
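
For example, either one of the following lines in /etc/ecs/ecs.config, followed by an Agent restart, should do it (a sketch only; journald is just one of the accepted values above):

```
ECS_LOGLEVEL_ON_INSTANCE=none
ECS_LOG_DRIVER=journald
```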

@sparrc
Contributor

sparrc commented Mar 17, 2023

Thank you for the suggestion. We don't have a super strong reason at the moment to stop logging to /var/log/ecs, as it's also how we currently debug customer issues using the ecs logs collector: https://github.com/aws/amazon-ecs-logs-collector.

As far as the json-file driver goes, we do like to have more logs available by default in the event of customer issues. To avoid the double logging, it should be possible to change that log driver with the ECS_LOG_DRIVER env var.

I'm going to close this as a "won't fix", since changing this behavior seems like it could carry some risk in case customers are relying on json-file logging.

@sparrc sparrc closed this as completed Mar 17, 2023