Parameterize start-amazon-cloudwatch-agent for portability. #1319

Open
commiterate opened this issue Aug 26, 2024 · 2 comments


commiterate commented Aug 26, 2024

Issue

start-amazon-cloudwatch-agent currently:

  1. Looks for /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json.
  2. Translates it into separate .toml (for CloudWatch Logs + Metrics), .yaml (for X-Ray + OpenTelemetry), and .json (for environment variables) configuration files by executing /opt/aws/amazon-cloudwatch-agent/bin/config-translator.
  3. Writes these files to /opt/aws/amazon-cloudwatch-agent/etc.
  4. Calls amazon-cloudwatch-agent with these generated configuration files and a PID file path of /opt/aws/amazon-cloudwatch-agent/var/amazon-cloudwatch-agent.pid.
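To make the hardcoding concrete, here is a minimal Go sketch of how every location in the steps above follows from one fixed root. The type and function names are hypothetical, and placing the agent binary under bin/ is an assumption based on where config-translator lives. This is an illustration, not the actual upstream code.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// agentPaths collects the locations mentioned in the steps above.
// The struct itself is hypothetical and only serves the illustration.
type agentPaths struct {
	InputJSON  string // unified JSON configuration file
	Translator string // config-translator binary
	Agent      string // amazon-cloudwatch-agent binary (location assumed)
	OutputDir  string // where the generated .toml/.yaml/.json files land
	PIDFile    string // PID file passed to the agent
}

// hardcodedPaths derives every location from a single installation root,
// which is why a different or read-only root breaks the whole flow.
func hardcodedPaths(root string) agentPaths {
	return agentPaths{
		InputJSON:  filepath.Join(root, "etc", "amazon-cloudwatch-agent.json"),
		Translator: filepath.Join(root, "bin", "config-translator"),
		Agent:      filepath.Join(root, "bin", "amazon-cloudwatch-agent"),
		OutputDir:  filepath.Join(root, "etc"),
		PIDFile:    filepath.Join(root, "var", "amazon-cloudwatch-agent.pid"),
	}
}

func main() {
	p := hardcodedPaths("/opt/aws/amazon-cloudwatch-agent")
	fmt.Println("input:     ", p.InputJSON)
	fmt.Println("translator:", p.Translator)
	fmt.Println("agent:     ", p.Agent)
	fmt.Println("output dir:", p.OutputDir)
	fmt.Println("pid file:  ", p.PIDFile)
}
```

Both the read path (etc/amazon-cloudwatch-agent.json) and the write paths (etc/ and var/) hang off the same root, so relocating the package moves all of them at once.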

The tool hardcodes the installation path as /opt/aws/amazon-cloudwatch-agent. This breaks setups where the package is installed elsewhere and causes problems on Linux distributions with immutable installation directories.

It would be nice if start-amazon-cloudwatch-agent exposed CLI parameters to customize:

  1. The expected installation directory.
    • Or can the program just use the current directory and assume config-translator and amazon-cloudwatch-agent are also present?
  2. The input unified JSON configuration file path.
    • For example, let users set it to /etc/amazon-cloudwatch-agent/amazon-cloudwatch-agent.json.
  3. The output directory of the generated .toml and .yaml configuration files and any runtime files (e.g. the .pid file).
    • For example, let users set it to /run/amazon-cloudwatch-agent.
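As a rough idea of what the three parameters could look like, here is a sketch using Go's standard flag package. The flag names (-install-dir, -config, -output-dir) and the options type are hypothetical and do not exist in start-amazon-cloudwatch-agent today. The defaults are chosen to stay close to the current layout so existing installs keep working.

```go
package main

import (
	"flag"
	"fmt"
	"os"
	"path/filepath"
)

// options holds the three proposed settings. All names are hypothetical.
type options struct {
	InstallDir string // directory tree containing bin/config-translator etc.
	ConfigPath string // input unified JSON configuration file
	OutputDir  string // generated configuration files and runtime files
}

// parseOptions parses the proposed flags and falls back to paths under
// InstallDir when -config or -output-dir are not given.
// Simplification: today the PID file goes under var/, not etc/.
func parseOptions(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("start-amazon-cloudwatch-agent", flag.ContinueOnError)
	fs.StringVar(&o.InstallDir, "install-dir", "/opt/aws/amazon-cloudwatch-agent",
		"expected installation directory")
	fs.StringVar(&o.ConfigPath, "config", "",
		"input unified JSON configuration file (default: <install-dir>/etc/amazon-cloudwatch-agent.json)")
	fs.StringVar(&o.OutputDir, "output-dir", "",
		"directory for generated and runtime files (default: <install-dir>/etc)")
	if err := fs.Parse(args); err != nil {
		return options{}, err
	}
	if o.ConfigPath == "" {
		o.ConfigPath = filepath.Join(o.InstallDir, "etc", "amazon-cloudwatch-agent.json")
	}
	if o.OutputDir == "" {
		o.OutputDir = filepath.Join(o.InstallDir, "etc")
	}
	return o, nil
}

func main() {
	o, err := parseOptions(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	fmt.Printf("install dir: %s\nconfig:      %s\noutput dir:  %s\n",
		o.InstallDir, o.ConfigPath, o.OutputDir)
}
```

With something like this, a NixOS unit could pass -config /etc/amazon-cloudwatch-agent/amazon-cloudwatch-agent.json and -output-dir /run/amazon-cloudwatch-agent while leaving the read-only install directory untouched.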

Background

Currently, we're trying to add the amazon-cloudwatch-agent package to the Nix package manager. This involves creating a systemd service configuration so NixOS (a Linux distribution using Nix as the system package manager, taking the place of apt/dnf/apk/etc.) users can enable the service (typically for NixOS EC2 AMIs).

Nix installs all packages under the Nix store. The Nix store is a read-only content-addressable store where packages are installed to their own directory named {input hash}-{package-name}-{version}.

This allows multiple versions of a program (e.g. go) to be installed, with individual shells getting a modified PATH environment variable to point to the desired version.

For example, amazon-cloudwatch-agent looks like this in the Nix store:

/nix/store/
├── {input hash}-{package-name}-{version}/
│   └── ...
└── p93y5qyqgwgby3sm9cdd83plal035c9z-amazon-cloudwatch-agent-v1.300045.0/
    └── bin/
        ├── CWAGENT_VERSION
        ├── amazon-cloudwatch-agent
        ├── amazon-cloudwatch-agent-config-wizard
        ├── config-downloader
        ├── config-translator
        └── start-amazon-cloudwatch-agent

Unfortunately, this creates two problems for start-amazon-cloudwatch-agent.

  1. The program can't assume it's installed at /opt/aws/amazon-cloudwatch-agent.
  2. The Nix store is read-only for all users except the special Nix build user, which builds Nix packages and writes the results to the Nix store.
    • start-amazon-cloudwatch-agent cannot make config-translator write generated configuration files to the package install directory (/nix/store/p93y5qyqgwgby3sm9cdd83plal035c9z-amazon-cloudwatch-agent-v1.300045.0).
    • amazon-cloudwatch-agent cannot write its PID file to the package directory.
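For the first problem, one possible approach is to locate the sibling binaries relative to the running executable instead of a fixed root. The sketch below assumes config-translator and amazon-cloudwatch-agent sit in the same directory as start-amazon-cloudwatch-agent, as they do in the Nix store layout shown earlier. The siblingBinary helper is hypothetical. It does not address the second problem: generated files and the PID file would still need a separate writable directory.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// siblingBinary returns the path of a binary expected to sit in the same
// directory as exePath. Hypothetical helper for illustration only.
func siblingBinary(exePath, name string) string {
	return filepath.Join(filepath.Dir(exePath), name)
}

func main() {
	// os.Executable reports the path of the currently running binary.
	exe, err := os.Executable()
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot determine executable path:", err)
		os.Exit(1)
	}
	// Follow symlinks (e.g. from a profile's bin/ directory) so the
	// siblings are looked up next to the real file.
	if resolved, err := filepath.EvalSymlinks(exe); err == nil {
		exe = resolved
	}
	fmt.Println(siblingBinary(exe, "config-translator"))
	fmt.Println(siblingBinary(exe, "amazon-cloudwatch-agent"))
}
```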

NixOS/nixpkgs#337212 (comment)

This raises a few questions:

  1. Why can't amazon-cloudwatch-agent work with the unified JSON configuration file?
  2. What is the PID file used for?

The reason for question 1 seems to be a limitation of the OpenTelemetry Collector's confmap package (used by the CloudWatch Agent for exporting traces), which requires a URI (e.g. a local or remote file URI) for configuration.

https://pkg.go.dev/go.opentelemetry.io/collector/confmap#ResolverSettings

https://github.com/aws/amazon-cloudwatch-agent/blob/v1.300045.0/service/configprovider/provider.go#L14-L25

Question 2 is less clear: judging from the agent code, the PID file appears to be optional. It might exist so the CloudWatch Agent can upload its own process metrics with procstat (docs)?

One option on the NixOS side is to make systemd execute amazon-cloudwatch-agent directly. However, this also requires re-implementing the logic that first changes the user (if agent.run_as_user is a non-root user) and then calls config-translator. Granted, this isn't a lot of work, but if upstream decides to make start-amazon-cloudwatch-agent do more, downstream needs to stay in sync.
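To give a sense of the first piece downstream would have to re-implement, here is a small Go sketch that reads agent.run_as_user from the unified JSON configuration and decides whether a user switch is needed. The type and function names are hypothetical, the sample user name is made up, and treating a missing field as "no switch" is an assumption rather than confirmed upstream behavior.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// unifiedConfig models only the one field this sketch needs from the
// unified JSON configuration file; everything else is ignored.
type unifiedConfig struct {
	Agent struct {
		RunAsUser string `json:"run_as_user"`
	} `json:"agent"`
}

// runAsUser extracts agent.run_as_user. An empty result means the field
// was not set.
func runAsUser(data []byte) (string, error) {
	var cfg unifiedConfig
	if err := json.Unmarshal(data, &cfg); err != nil {
		return "", err
	}
	return cfg.Agent.RunAsUser, nil
}

// needsUserSwitch reports whether the launcher would have to change users
// before calling config-translator. Assumption: unset means no switch.
func needsUserSwitch(user string) bool {
	return user != "" && user != "root"
}

func main() {
	// Sample input; "exampleuser" is a placeholder, not a required name.
	sample := []byte(`{"agent": {"run_as_user": "exampleuser"}}`)
	user, err := runAsUser(sample)
	if err != nil {
		fmt.Println("invalid configuration:", err)
		return
	}
	fmt.Printf("run_as_user=%q switch=%v\n", user, needsUserSwitch(user))
}
```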

@philipmw

I have a PR to add Amazon CloudWatch Agent package and its NixOS service module:
NixOS/nixpkgs#341688

I'd appreciate a review to help ship this successfully.

@commiterate

@philipmw I tagged you in the existing Nixpkgs PR I opened last month for the CloudWatch Agent.
