Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Ruby app metrics support #172

Merged
merged 40 commits into from
Sep 20, 2023
Merged

Conversation

schneems
Copy link
Contributor

@schneems schneems commented Jul 19, 2023

Ruby supports "language specific metrics (beta)" https://devcenter.heroku.com/articles/language-runtime-metrics-ruby through the barnes gem https://github.com/heroku/barnes and the heroku/metrics buildpack https://github.com/heroku/heroku-buildpack-metrics. This PR ports the logic of heroku/metrics to the Ruby CNB buildpack.

Implementation

At build time, this buildpack will download a statsd client (named "agentmon") from S3 and place it on the path. There are currently no explicit interfaces for running background/daemon processes via the CNB spec, however there is an interface that guarantees it will execute arbitrary code (exec.d). This interface is intended to set dynamic environment variables at runtime. As a side effect, this interface is used to schedule a background daemon via the start-stop-daemon command docs.

The start-stop-daemon calls our program, a new rust script in bin/agentmon_loop. This rust script will boot agentmon in a loop if the necessary environment variables are set.

GUS-W-13115697.
GUS-W-13815707.

@schneems schneems force-pushed the schneems/metrics-agent-download-rebase branch from d417ef9 to 72ecafa Compare July 19, 2023 18:58
@schneems schneems changed the title Schneems/metrics agent download rebase CNB heroku/metrics buildpack replacement Jul 19, 2023
@schneems schneems force-pushed the schneems/metrics-agent-download-rebase branch 2 times, most recently from 64215f5 to 6abe4b9 Compare July 19, 2023 19:04
@schneems schneems force-pushed the schneems/metrics-agent-download-rebase branch 2 times, most recently from 4cd19d4 to 06133d0 Compare August 9, 2023 18:14
@schneems schneems marked this pull request as ready for review August 9, 2023 18:40
@schneems schneems requested a review from a team as a code owner August 9, 2023 18:40
Copy link
Member

@Malax Malax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool to see progress on language metrics, I love that feature! I timeboxed the review to two hours and stopping now, I will do a second pass on this after the discussions I added in this review are resolved.

In general: This doesn't seem to be Ruby specific and could be of use for other languages as well - it might be worth extracting out at a later point in time. However, since this relies on the Ruby helpers in this repo it's already tied to it to some degree and will "calcify" more over time. It might therefore make sense to extract it now (or immediately after this PR is merged) to avoid it being long-term stuck in this repo.

buildpacks/metrics-agent/src/bin/agentmon_loop.rs Outdated Show resolved Hide resolved
buildpacks/metrics-agent/src/bin/agentmon_loop.rs Outdated Show resolved Hide resolved
buildpacks/metrics-agent/src/bin/agentmon_loop.rs Outdated Show resolved Hide resolved
buildpacks/metrics-agent/src/bin/agentmon_loop.rs Outdated Show resolved Hide resolved
buildpacks/metrics-agent/src/bin/agentmon_loop.rs Outdated Show resolved Hide resolved
buildpacks/metrics-agent/src/layers/install_agentmon.rs Outdated Show resolved Hide resolved
buildpacks/metrics-agent/src/layers/install_agentmon.rs Outdated Show resolved Hide resolved
buildpacks/metrics-agent/src/layers/install_agentmon.rs Outdated Show resolved Hide resolved
buildpacks/metrics-agent/src/layers/install_agentmon.rs Outdated Show resolved Hide resolved
buildpacks/metrics-agent/src/main.rs Outdated Show resolved Hide resolved
@schneems
Copy link
Contributor Author

I updated agentmon_loop I'm still looking at your other comments for main.rs and the layer probably tomorrow.

@schneems
Copy link
Contributor Author

Updated to address the rest of the comments. I reworked a lot. Moved the bash script that schedules the daemon into a rust script. Moved to a hardcoded URL.

Copy link
Member

@joshwlewis joshwlewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me; though I'd like to see the question about the generic buildpack resolved.

Cargo.toml Outdated Show resolved Hide resolved
buildpacks/metrics-agent/src/bin/agentmon_loop.rs Outdated Show resolved Hide resolved
buildpacks/metrics-agent/buildpack.toml Outdated Show resolved Hide resolved
buildpacks/metrics-agent/src/bin/agentmon_loop.rs Outdated Show resolved Hide resolved
buildpacks/metrics-agent/src/bin/agentmon_loop.rs Outdated Show resolved Hide resolved
buildpacks/metrics-agent/src/bin/agentmon_loop.rs Outdated Show resolved Hide resolved
buildpacks/metrics-agent/src/bin/agentmon_loop.rs Outdated Show resolved Hide resolved
buildpacks/metrics-agent/src/bin/launch_daemon.rs Outdated Show resolved Hide resolved
buildpacks/metrics-agent/src/layers/install_agentmon.rs Outdated Show resolved Hide resolved
buildpacks/metrics-agent/src/layers/install_agentmon.rs Outdated Show resolved Hide resolved
buildpacks/metrics-agent/src/layers/install_agentmon.rs Outdated Show resolved Hide resolved
commons/src/build_output.rs Outdated Show resolved Hide resolved
The classic `heroku/metrics` buildpack sets up a statsd daemon that the dyno can use to send Heroku language specific metrics. For Ruby the biggest one is Puma Pool Capacity.

This code will download the metrics client and put it on the path, however there's an issue affecting our ability to actually boot a daemon.

## Context

Classic buildpacks (Heroku v2) have acces to a `profile.d` directory they can use to write scripts which execute at boot time. This was largely needed to set build time environment variables because the directory structure of build and runtime was different. It was eventually used by the `heroku/metrics` buildpack to download an run a metrics daemon in the background.

Cloud Native Buildpacks (CNB) removed support for bash in an effort to simplify the spec and allow buildpacks to target extremely lightweight distributions that do not include `bash` (for example if you are running a pure JVM application, then there's no runtime need to have bash on the system, removing it reduces the image size and can be viewed as a reduction in security surface area).

As part of this work to remove bash from the spec the `profile.d`-like interface was removed from the CNB spec.

## Problem

Without a spec supported way to execute a script at startup there is no way for a buildpack to provide a background daemon.

## Possible solutions

- TBD
Errors in the middle of an `Installing ... ` section looked weird because it would start rendering on the same line:

```
# Heroku Statsd Metrics Agent

- Metrics agent
  - Installing .! ERROR: Could not install Statsd agent
!
! An error occured while downloading and installing the metrics agent
! the buildpack cannot continue
!

Debug information: Could not untar: failed to iterate over archive
ERROR: failed to build: exit status 1
ERROR: failed to build: executing lifecycle: failed with status code: 51
```

I changed it so it's now like this:

```

# Heroku Statsd Metrics Agent

- Metrics agent
  - Downloading ..
! ERROR: Could not install Statsd agent
!
! An error occured while downloading and installing the metrics agent
! the buildpack cannot continue
!
! Debug information:
Could not untar: failed to iterate over archive

ERROR: failed to build: exit status 1
ERROR: failed to build: executing lifecycle: failed with status code: 51
```

I think it would look better to indent the "debug information" at least. Maybe add an extra newline.
Since exec.d will run any program given to it I want to intentionally leak a bash process that will continue to run in the background. It works locally:

```
$ cat lol.sh
#!/usr/bin/env bash

echo "spawning agentmon" &

while true; do echo "pretend agentmon"; sleep 2; done &
⛄️ 3.1.4 🚀 /tmp
$ ./lol.sh
spawning agentmon
pretend agentmon
⛄️ 3.1.4 🚀 /tmp
$ echo "pretend agentmon
it works"
it works
⛄️ 3.1.4 🚀 /tmp
$ pretend agentmon
echo "lpretend agentmon
ol"
lol
⛄️ 3.1.4 🚀 /tmp
$ pretend agentmon
pretend agentmon
epretend agentmon
echo "pretend agentmon
It keeps runnipretend agentmon
ng even apretend agentmon
s I can pretend agentmon
use the systpretend agentmon
em"
It keeps running even as I can use the system
⛄️ 3.1.4 🚀 /tmp
$ pretend agentmon
```

However when I try to replicate this with exec.d it never seems to end the `exec.d` program and never yields control to a different process:

```
$ cargo libcnb package; docker rmi my-image; pack build  my-image --buildpack /Users/rschneeman/Documents/projects/work/buildpacks/buildpacks-ruby/target/buildpack/debug/heroku_statsd-metrics  --path /tmp/47b6249d5e0a353f91910f848a700061 --pull-policy never
🔍 Locating buildpacks...
📦 [1/2] Building heroku/ruby
Determining automatic cross-compile settings...
Building binaries (x86_64-unknown-linux-musl)...
    Finished dev [unoptimized] target(s) in 0.89s
Writing buildpack directory...
Successfully wrote buildpack directory: ../../target/buildpack/debug/heroku_ruby (12.64 MiB)
📦 [2/2] Building heroku/statsd-metrics
Determining automatic cross-compile settings...
Building binaries (x86_64-unknown-linux-musl)...
    Blocking waiting for file lock on package cache
    Blocking waiting for file lock on package cache
    Blocking waiting for file lock on package cache
   Compiling heroku-statsd-metrics v0.0.0 (/Users/rschneeman/Documents/projects/work/buildpacks/buildpacks-ruby/buildpacks/metrics-agent)
warning: unused import: `DownloadAgentmon`
 --> buildpacks/metrics-agent/src/main.rs:3:40
  |
3 | use crate::layers::download_agentmon::{DownloadAgentmon, DownloadAgentmonError};
  |                                        ^^^^^^^^^^^^^^^^
  |
  = note: `#[warn(unused_imports)]` on by default

warning: `heroku-statsd-metrics` (bin "heroku-statsd-metrics") generated 1 warning (run `cargo fix --bin "heroku-statsd-metrics"` to apply 1 suggestion)
    Finished dev [unoptimized] target(s) in 7.67s
Writing buildpack directory...
Successfully wrote buildpack directory: ../../target/buildpack/debug/heroku_statsd-metrics (6.58 MiB)
✨ Packaging successfully finished!

💡 To test your buildpack locally with pack, run:
pack build my-image-name \
  --buildpack /Users/rschneeman/Documents/projects/work/buildpacks/buildpacks-ruby/target/buildpack/debug/heroku_ruby \
  --buildpack /Users/rschneeman/Documents/projects/work/buildpacks/buildpacks-ruby/target/buildpack/debug/heroku_statsd-metrics \
  --path /path/to/application

/Users/rschneeman/Documents/projects/work/buildpacks/buildpacks-ruby/target/buildpack/debug/heroku_ruby
/Users/rschneeman/Documents/projects/work/buildpacks/buildpacks-ruby/target/buildpack/debug/heroku_statsd-metrics
Error: No such image: my-image
===> ANALYZING
Image with name "my-image" not found
===> DETECTING
heroku/statsd-metrics 2.0.0
===> RESTORING
===> BUILDING

# Heroku Statsd Metrics Agent

- Done
===> EXPORTING
Adding layer 'heroku/statsd-metrics:layer_name'
Adding layer 'buildpacksio/lifecycle:launch.sbom'
Adding 1/1 app layer(s)
Adding layer 'buildpacksio/lifecycle:launcher'
Adding layer 'buildpacksio/lifecycle:config'
Adding label 'io.buildpacks.lifecycle.metadata'
Adding label 'io.buildpacks.build.metadata'
Adding label 'io.buildpacks.project.metadata'
no default process type
Saving my-image...
*** Images (516b6641ecda):
      my-image
Successfully built image my-image
⛄️ 3.1.4 🚀 /Users/rschneeman/Documents/projects/work/buildpacks/buildpacks-ruby (schneems/metrics-agent-download-rebase)
$ docker run -it --rm --entrypoint='/cnb/lifecycle/launcher' my-image 'sleep 60'
spawning agentmon
pretend agentmon
pretend agentmon
pretend agentmon
pretend agentmon
pretend agentmon
pretend agentmon
pretend agentmon
pretend agentmon
pretend agentmon
pretend agentmon
pretend agentmon
pretend agentmon
pretend agentmon
pretend agentmon
pretend agentmon
pretend agentmon
pretend agentmon
pretend agentmon
^[pretend agentmon
^C%
⛄️ 3.1.4 🚀 /Users/rschneeman/Documents/projects/work/buildpacks/buildpacks-ruby (schneems/metrics-agent-download-rebase)
$ docker run -it --rm --entrypoint='/cnb/lifecycle/launcher' my-image 'bash'
spawning agentmon
pretend agentmon

lol
pretend agentmon
echo lopretend agentmon
l
pretend agentmon
ls
pretend agentmon
cd
echpretend agentmon
o "it never pretend agentmon
ends"
pretend agentmon
pretend agentmon
pretend agentmon
pretend agentmon
pretend agentmon
pretend agentmon
pretend agentmon
pretend agentmon
pretend agentmon
```

Here's the test directory (needs barnes in the Gemfile.lock to execute)

```
$ ls /tmp/47b6249d5e0a353f91910f848a700061
Gemfile		Gemfile.lock
⛄️ 3.1.4 🚀 /tmp/47b6249d5e0a353f91910f848a700061
$ cat /tmp/47b6249d5e0a353f91910f848a700061/Gemfile
source "https://rubygems.org"

gem "barnes"
⛄️ 3.1.4 🚀 /tmp/47b6249d5e0a353f91910f848a700061
$ cat /tmp/47b6249d5e0a353f91910f848a700061/Gemfile.lock
GEM
  remote: https://rubygems.org/
  specs:
    barnes (0.0.9)
      multi_json (~> 1)
      statsd-ruby (~> 1.1)
    multi_json (1.15.0)
    statsd-ruby (1.5.0)

PLATFORMS
  x86_64-darwin-22

DEPENDENCIES
  barnes

BUNDLED WITH
   2.4.15
```
## Fix URL logic

On the plane I assumed the URL was a redirect to the most current file. It is not. It returns a plain text body with the most recent URL. I fixed this logic.

## Cache download logic

The URL is versioned. For example:

```
https://agentmon-releases.s3.amazonaws.com/agentmon-0.3.1-linux-amd64.tar.gz
```

So I used it as a cache key to not have to download the binary when the URL hasn't changed.

## Background process

While my initial experiment wasn't successful David suggested to try out `start-stop-daemon` (though it is only supported on Ubuntu).

The `start-stop-daemon` can successfully boot a background process and allow `execd` to return.

For example this works:

```rust
        let background_script = layer_path.join("agentmon_script");
        let execd_script = layer_path.join("agentmon_exec.d");

        write_bash_script(
            &background_script,
            r#"
            while true; do
                echo 'pretend agentmon' >> /tmp/agentmon.txt
                sleep 2
            done
        "#,
        )
        .unwrap();

        // Intentionally leak background process
        let background_script = background_script.canonicalize().unwrap();
        let background_script = background_script.display();
        write_bash_script(
            &execd_script,
            format!(
                r#"start-stop-daemon --start --background --exec "{background_script}""#
            ),
        )
        .unwrap();
```

With this output:

```
$  docker run -it --rm --entrypoint='/cnb/lifecycle/launcher' my-image 'bash'
e3b22af1a126:/workspace$
e3b22af1a126:/workspace$
e3b22af1a126:/workspace$ ls /tmp
agentmon.txt
e3b22af1a126:/workspace$ tail -f /tmp/agentmon.txt
pretend agentmon
pretend agentmon
pretend agentmon
pretend agentmon
pretend agentmon
```

As to why this is needed. From David:

> [...] it appears that Go waits for all processes where PPID = launcher's PID (1), instead of the spawned PID. Still digging to understand why, since from my understanding of the code so far, that should not be the case.

> launcher uses Cmd.Run()  which does Cmd.Start() and then Cmd.Wait()
> For Wait(), internally, Go does wait4(pid, &status, 0, &rusage): https://cs.opensource.google/go/go/+/refs/tags/go1.20.6:src/os/exec_unix.go;l=16-60
But as you can see, before that, it first calls blockUntilWaitable() which is implemented here for systems that have waitid(), such as Linux: https://cs.opensource.google/go/go/+/refs/tags/go1.20.6:src/os/wait_waitid.go
The signature for that is int waitid(idtype_t idtype, id_t id, siginfo_t *infop, int options);, see https://man7.org/linux/man-pages/man2/waitpid.2.html
> I am starting to wonder if this instead has to do with open FDs which the sleep() would also inherit, since launcher obviously reads from the process.

I don't know why one works and the other doesn't. That needs more investigation. My immediate goal is to get agentmon working on CNBs.

TODO:

- Add an integration test
- Move the while-loop agentmon bash script to a Rust script
```
$ cargo libcnb package &&
docker rmi my-image --force  &&
pack build my-image --buildpack target/buildpack/debug/heroku_statsd-metrics --path /tmp/47b6249d5e0a353f91910f848a700061/ --pull-policy never
🔍 Locating buildpacks...
📦 [1/2] Building heroku/ruby
Determining automatic cross-compile settings...
Building binaries (x86_64-unknown-linux-musl)...
    Finished dev [unoptimized] target(s) in 0.48s
Writing buildpack directory...
Successfully wrote buildpack directory: ../../target/buildpack/debug/heroku_ruby (12.64 MiB)
📦 [2/2] Building heroku/statsd-metrics
Determining automatic cross-compile settings...
Building binaries (x86_64-unknown-linux-musl)...
    Blocking waiting for file lock on package cache
    Blocking waiting for file lock on package cache
    Finished dev [unoptimized] target(s) in 0.68s
    Finished dev [unoptimized] target(s) in 0.51s
Writing buildpack directory...
Successfully wrote buildpack directory: ../../target/buildpack/debug/heroku_statsd-metrics (12.15 MiB)
✨ Packaging successfully finished!

💡 To test your buildpack locally with pack, run:
pack build my-image-name \
  --buildpack /Users/rschneeman/Documents/projects/work/buildpacks/buildpacks-ruby/target/buildpack/debug/heroku_ruby \
  --buildpack /Users/rschneeman/Documents/projects/work/buildpacks/buildpacks-ruby/target/buildpack/debug/heroku_statsd-metrics \
  --path /path/to/application

/Users/rschneeman/Documents/projects/work/buildpacks/buildpacks-ruby/target/buildpack/debug/heroku_ruby
/Users/rschneeman/Documents/projects/work/buildpacks/buildpacks-ruby/target/buildpack/debug/heroku_statsd-metrics
Untagged: my-image:latest
Deleted: sha256:b73cef10f9747c93c0589f68bd30b76298f57a178c9e3efb68dbe77a9bd70cdc
Deleted: sha256:9c2fb27824ec55b3e82d5245a623de6ae85ffbcfd91d9ca93cda6e220fb4ae28
Deleted: sha256:0864c2235caabe94562658b4003c64d2834a8d15915bffdb080d9d6d39d37ad1
Deleted: sha256:e0f324b7867aecebbe166b6a773859cd8a9779aed246c0e6029430b5558bfc95
Deleted: sha256:c578d8034a4c45399cf1a55207eb25d07464681fe44caf88b94b6082dee74df8
Deleted: sha256:d45865c0091f9fde9ec70ba62d1d8dd48b3cbb6bcad730be8fed56344bc36ae3
===> ANALYZING
Image with name "my-image" not found
===> DETECTING
heroku/statsd-metrics 2.0.0
===> RESTORING
===> BUILDING

# Heroku Statsd Metrics Agent

- Metrics agent
  - Downloading ..... (2.596s)
  - Writing scripts
- Done (finished in 2.602s)
===> EXPORTING
Adding layer 'heroku/statsd-metrics:statsd-metrics-agent'
Adding layer 'buildpacksio/lifecycle:launch.sbom'
Adding 1/1 app layer(s)
Adding layer 'buildpacksio/lifecycle:launcher'
Adding layer 'buildpacksio/lifecycle:config'
Adding label 'io.buildpacks.lifecycle.metadata'
Adding label 'io.buildpacks.build.metadata'
Adding label 'io.buildpacks.project.metadata'
no default process type
Saving my-image...
*** Images (b73cef10f974):
      my-image
Reusing cache layer 'heroku/statsd-metrics:statsd-metrics-agent'
Successfully built image my-image
⛄️ 3.1.4 🚀 /Users/rschneeman/Documents/projects/work/buildpacks/buildpacks-ruby (schneems/metrics-agent-download-rebase)
$ docker run -it --rm --entrypoint='/cnb/lifecycle/launcher' -e HEROKU_METRICS_URL=https://example.com my-image 'bash'
1bb0e5cf43f5:/workspace$ ps -aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
heroku       1  2.0  0.0   4624  3792 pts/0    Ss   22:18   0:00 bash
heroku      14 11.3  0.0   2176     4 ?        S    22:18   0:00 /layers/heroku_statsd-metrics/statsd-metrics-agent/agentmon_loop --path /layers/heroku_statsd-metrics/statsd-metrics-agent/b
heroku      16  0.0  0.0   7060  1584 pts/0    R+   22:18   0:00 ps -aux
```

Manually running the script seems to work:

```
1bb0e5cf43f5:/workspace$ PORT=3000 /layers/heroku_statsd-metrics/statsd-metrics-agent/agentmon_loop --path /layers/heroku_statsd-metrics/statsd-metrics-agent/bin/agentmon
agentmon: Listening on :3000...
```

I'm not sure how we test this.
Asserts that an app build with the heroku statsd metrics buildpack that has `barnes` in the Gemfile.lock will have a process `agentmon_loop` running in the background on dyne boot with both HEROKU_METRICS_URL and DYNO env vars are present at boot.
Not everything on docker is a buildpack, we use prefixes to make it clear at a glance what is.
- Fix bad docstrings
- Move sleep_for to a constant
- Refactor logic into smaller, more consistent functions
- Added tests for argument generation
Agentmon hasn't seen a release in 6 years. We can hardcode the url for now.

When the URL doesn't change we can keep the last download around. I always want to update the scripts on disk so then any changes to the files will be picked up, so we'll always call update or create, and never keep.

We could get fancy and use MetadataDigest to cache those too, but writing them is pretty much instantaneous so I don't think it buys us anything.

This change allows us to remove the `cached` dependency.
@schneems schneems force-pushed the schneems/metrics-agent-download-rebase branch from 802b064 to 64a9ffe Compare August 28, 2023 21:28
@schneems schneems requested review from edmorley and Malax September 11, 2023 20:04
The `download_to_dir` did more than download, it also untar-d and set executable permissions. 

The `agentmon_download` was a very loose wrapper around `download_to_dir`.

I made removed the chmod functionality from `download_to_dir` and renamed it `download_untar`. I moved the chmod functionality into `agentmond_download` and renamed it `agentmon_install`. It's a minor change but I think it's an improvement.
This presents a tip to the user to tell them how to enable metrics collection.
@schneems schneems force-pushed the schneems/metrics-agent-download-rebase branch from b0694c0 to c61dd9b Compare September 11, 2023 21:34
@schneems
Copy link
Contributor Author

I think there's a race condition in the tests previously it was failing with

thread 'test_barnes_app' panicked at 'assertion failed: `(left contains right)`
left (unescaped):
    PID TTY      STAT   TIME COMMAND
      1 ?        Rs     0:00 ps x
     14 ?        R      0:00 [start-stop-daem]


left (escaped): `"    PID TTY      STAT   TIME COMMAND\n      1 ?        Rs     0:00 ps x\n     14 ?        R      0:00 [start-stop-daem]\n"`
right: `"agentmon_loop --path"`', buildpacks/ruby/tests/integration_test.rs:138:21
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Now it's failing with

---- test_barnes_app stdout ----
    PID TTY      STAT   TIME COMMAND
      1 ?        Rs     0:00 ps x
     14 ?        R      0:00 start-stop-daemon --start --background --exec /layers/heroku_ruby/metrics_agent/agentmon_loop -- --path /layers/heroku_ruby/metrics_agent/bin/agentmon


thread 'test_barnes_app' panicked at 'assertion failed: `(left contains right)`
left (unescaped):
    PID TTY      STAT   TIME COMMAND
      1 ?        Rs     0:00 ps x
     14 ?        R      0:00 start-stop-daemon --start --background --exec /layers/heroku_ruby/metrics_agent/agentmon_loop -- --path /layers/heroku_ruby/metrics_agent/bin/agentmon


left (escaped): `"    PID TTY      STAT   TIME COMMAND\n      1 ?        Rs     0:00 ps x\n     14 ?        R      0:00 start-stop-daemon --start --background --exec /layers/heroku_ruby/metrics_agent/agentmon_loop -- --path /layers/heroku_ruby/metrics_agent/bin/agentmon\n"`
right: `"agentmon_loop --path"`', buildpacks/ruby/tests/integration_test.rs:[138](https://github.com/heroku/buildpacks-ruby/actions/runs/6153895381/job/16698442352?pr=172#step:8:139):21
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

I think it looks like this:

  • start-stop-daemon is called, this can show up as one of the above outputs
  • that will cause the agentmon_loop --path <> command to be spawned, but maybe not right away

So depending on when we read the results of ps we can get a different result. I could possibly start the container and then loop/sleep a few times to try to get some determinisim. Maybe there's a better way to assert that:

  • The script we want to boot does in fact boot at launch
  • It does not immediately exit (which might indicate a problem in our code)

Another tool to use is to look at the specified output log file (if we set AGENTMON_DEBUG=1). It would probably be best to print that out anyway in case an error comes up.

Cargo.toml Show resolved Hide resolved
Cargo.toml Outdated Show resolved Hide resolved
buildpacks/ruby/src/bin/agentmon_loop.rs Show resolved Hide resolved
buildpacks/ruby/src/layers/metrics_agent_install.rs Outdated Show resolved Hide resolved
buildpacks/ruby/tests/integration_test.rs Outdated Show resolved Hide resolved
@edmorley edmorley dismissed their stale review September 12, 2023 12:48

I don't have any more blocking comments, though CI is failing and there's a pending review request for Manuel. Cancelling my "changes requested" so that Manuel can approve once he's reviewed.

The method we're using to launch a daemon process does not give us guarantees that `agentmon_loop --path` will be executed by the time the process boots. To address this we can loop and re-check for some period of time.

This method utilizes the output log from agentmon loop as directed by the spawn daemon command.
@schneems schneems force-pushed the schneems/metrics-agent-download-rebase branch from c78629e to 81d2fa4 Compare September 12, 2023 19:52
schneems and others added 3 commits September 12, 2023 15:11
This is the default. It is only needed when multiple build packs are specified
@schneems schneems force-pushed the schneems/metrics-agent-download-rebase branch from a521b5d to 40a5b81 Compare September 12, 2023 20:16
@schneems
Copy link
Contributor Author

I fixed the tests (#172 (comment)) to be more deterministic, but it's still based on a timeout (which can fail if the machine is slow or docker is slow due to timing conditions). It's better than it was, though. Please take a look. Specifically, this line does not seem great https://github.com/heroku/buildpacks-ruby/pull/172/files#diff-17c88161984214fafac5afdc89a5919f2dbcfb1165f2560fd255809316095fe8R131.

I applied all of your requested changes. Thanks for the 👀 . I've asked @Malax for another round of reviews.

Copy link
Member

@edmorley edmorley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

buildpacks/ruby/CHANGELOG.md Show resolved Hide resolved
buildpacks/ruby/tests/integration_test.rs Show resolved Hide resolved
Copy link
Member

@Malax Malax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making the requested changes and sorry for the delay with the re-review! Apart from the checksum check, I see nothing blocking.

@schneems
Copy link
Contributor Author

I've updated to use a checksum with the libherokubuildpack shared tooling.

@schneems schneems enabled auto-merge (squash) September 19, 2023 14:38
@schneems schneems merged commit 4b56b94 into main Sep 20, 2023
5 checks passed
@schneems schneems deleted the schneems/metrics-agent-download-rebase branch September 20, 2023 16:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants