Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
* Introduce new metrics buildpack The classic `heroku/metrics` buildpack sets up a statsd daemon that the dyno can use to send Heroku language specific metrics. For Ruby the biggest one is Puma Pool Capacity. This code will download the metrics client and put it on the path, however there's an issue affecting our ability to actually boot a daemon. ## Context Classic buildpacks (Heroku v2) have acces to a `profile.d` directory they can use to write scripts which execute at boot time. This was largely needed to set build time environment variables because the directory structure of build and runtime was different. It was eventually used by the `heroku/metrics` buildpack to download an run a metrics daemon in the background. Cloud Native Buildpacks (CNB) removed support for bash in an effort to simplify the spec and allow buildpacks to target extremely lightweight distributions that do not include `bash` (for example if you are running a pure JVM application, then there's no runtime need to have bash on the system, removing it reduces the image size and can be viewed as a reduction in security surface area). As part of this work to remove bash from the spec the `profile.d`-like interface was removed from the CNB spec. ## Problem Without a spec supported way to execute a script at startup there is no way for a buildpack to provide a background daemon. ## Possible solutions - TBD * Update build output Errors in the middle of an `Installing ... ` section looked weird because it would start rendering on the same line: ``` # Heroku Statsd Metrics Agent - Metrics agent - Installing .! ERROR: Could not install Statsd agent ! ! An error occured while downloading and installing the metrics agent ! the buildpack cannot continue ! Debug information: Could not untar: failed to iterate over archive ERROR: failed to build: exit status 1 ERROR: failed to build: executing lifecycle: failed with status code: 51 ``` I changed it so it's now like this: ``` # Heroku Statsd Metrics Agent - Metrics agent - Downloading .. ! ERROR: Could not install Statsd agent ! ! An error occured while downloading and installing the metrics agent ! the buildpack cannot continue ! ! Debug information: Could not untar: failed to iterate over archive ERROR: failed to build: exit status 1 ERROR: failed to build: executing lifecycle: failed with status code: 51 ``` I think it would look better to indent the "debug information" at least. Maybe add an extra newline. * Attempting workaround with exec.d Since exec.d will run any program given to it I want to intentionally leak a bash process that will continue to run in the background. It works locally: ``` $ cat lol.sh #!/usr/bin/env bash echo "spawning agentmon" & while true; do echo "pretend agentmon"; sleep 2; done & ⛄️ 3.1.4 🚀 /tmp $ ./lol.sh spawning agentmon pretend agentmon ⛄️ 3.1.4 🚀 /tmp $ echo "pretend agentmon it works" it works ⛄️ 3.1.4 🚀 /tmp $ pretend agentmon echo "lpretend agentmon ol" lol ⛄️ 3.1.4 🚀 /tmp $ pretend agentmon pretend agentmon epretend agentmon echo "pretend agentmon It keeps runnipretend agentmon ng even apretend agentmon s I can pretend agentmon use the systpretend agentmon em" It keeps running even as I can use the system ⛄️ 3.1.4 🚀 /tmp $ pretend agentmon ``` However when I try to replicate this with exec.d it never seems to end the `exec.d` program and never yields control to a different process: ``` $ cargo libcnb package; docker rmi my-image; pack build my-image --buildpack /Users/rschneeman/Documents/projects/work/buildpacks/buildpacks-ruby/target/buildpack/debug/heroku_statsd-metrics --path /tmp/47b6249d5e0a353f91910f848a700061 --pull-policy never 🔍 Locating buildpacks... 📦 [1/2] Building heroku/ruby Determining automatic cross-compile settings... Building binaries (x86_64-unknown-linux-musl)... Finished dev [unoptimized] target(s) in 0.89s Writing buildpack directory... Successfully wrote buildpack directory: ../../target/buildpack/debug/heroku_ruby (12.64 MiB) 📦 [2/2] Building heroku/statsd-metrics Determining automatic cross-compile settings... Building binaries (x86_64-unknown-linux-musl)... Blocking waiting for file lock on package cache Blocking waiting for file lock on package cache Blocking waiting for file lock on package cache Compiling heroku-statsd-metrics v0.0.0 (/Users/rschneeman/Documents/projects/work/buildpacks/buildpacks-ruby/buildpacks/metrics-agent) warning: unused import: `DownloadAgentmon` --> buildpacks/metrics-agent/src/main.rs:3:40 | 3 | use crate::layers::download_agentmon::{DownloadAgentmon, DownloadAgentmonError}; | ^^^^^^^^^^^^^^^^ | = note: `#[warn(unused_imports)]` on by default warning: `heroku-statsd-metrics` (bin "heroku-statsd-metrics") generated 1 warning (run `cargo fix --bin "heroku-statsd-metrics"` to apply 1 suggestion) Finished dev [unoptimized] target(s) in 7.67s Writing buildpack directory... Successfully wrote buildpack directory: ../../target/buildpack/debug/heroku_statsd-metrics (6.58 MiB) ✨ Packaging successfully finished! 💡 To test your buildpack locally with pack, run: pack build my-image-name \ --buildpack /Users/rschneeman/Documents/projects/work/buildpacks/buildpacks-ruby/target/buildpack/debug/heroku_ruby \ --buildpack /Users/rschneeman/Documents/projects/work/buildpacks/buildpacks-ruby/target/buildpack/debug/heroku_statsd-metrics \ --path /path/to/application /Users/rschneeman/Documents/projects/work/buildpacks/buildpacks-ruby/target/buildpack/debug/heroku_ruby /Users/rschneeman/Documents/projects/work/buildpacks/buildpacks-ruby/target/buildpack/debug/heroku_statsd-metrics Error: No such image: my-image ===> ANALYZING Image with name "my-image" not found ===> DETECTING heroku/statsd-metrics 2.0.0 ===> RESTORING ===> BUILDING # Heroku Statsd Metrics Agent - Done ===> EXPORTING Adding layer 'heroku/statsd-metrics:layer_name' Adding layer 'buildpacksio/lifecycle:launch.sbom' Adding 1/1 app layer(s) Adding layer 'buildpacksio/lifecycle:launcher' Adding layer 'buildpacksio/lifecycle:config' Adding label 'io.buildpacks.lifecycle.metadata' Adding label 'io.buildpacks.build.metadata' Adding label 'io.buildpacks.project.metadata' no default process type Saving my-image... *** Images (516b6641ecda): my-image Successfully built image my-image ⛄️ 3.1.4 🚀 /Users/rschneeman/Documents/projects/work/buildpacks/buildpacks-ruby (schneems/metrics-agent-download-rebase) $ docker run -it --rm --entrypoint='/cnb/lifecycle/launcher' my-image 'sleep 60' spawning agentmon pretend agentmon pretend agentmon pretend agentmon pretend agentmon pretend agentmon pretend agentmon pretend agentmon pretend agentmon pretend agentmon pretend agentmon pretend agentmon pretend agentmon pretend agentmon pretend agentmon pretend agentmon pretend agentmon pretend agentmon pretend agentmon ^[pretend agentmon ^C% ⛄️ 3.1.4 🚀 /Users/rschneeman/Documents/projects/work/buildpacks/buildpacks-ruby (schneems/metrics-agent-download-rebase) $ docker run -it --rm --entrypoint='/cnb/lifecycle/launcher' my-image 'bash' spawning agentmon pretend agentmon lol pretend agentmon echo lopretend agentmon l pretend agentmon ls pretend agentmon cd echpretend agentmon o "it never pretend agentmon ends" pretend agentmon pretend agentmon pretend agentmon pretend agentmon pretend agentmon pretend agentmon pretend agentmon pretend agentmon pretend agentmon ``` Here's the test directory (needs barnes in the Gemfile.lock to execute) ``` $ ls /tmp/47b6249d5e0a353f91910f848a700061 Gemfile Gemfile.lock ⛄️ 3.1.4 🚀 /tmp/47b6249d5e0a353f91910f848a700061 $ cat /tmp/47b6249d5e0a353f91910f848a700061/Gemfile source "https://rubygems.org" gem "barnes" ⛄️ 3.1.4 🚀 /tmp/47b6249d5e0a353f91910f848a700061 $ cat /tmp/47b6249d5e0a353f91910f848a700061/Gemfile.lock GEM remote: https://rubygems.org/ specs: barnes (0.0.9) multi_json (~> 1) statsd-ruby (~> 1.1) multi_json (1.15.0) statsd-ruby (1.5.0) PLATFORMS x86_64-darwin-22 DEPENDENCIES barnes BUNDLED WITH 2.4.15 ``` * Run agentmon with Ubuntu's start-stop-daemon ## Fix URL logic On the plane I assumed the URL was a redirect to the most current file. It is not. It returns a plain text body with the most recent URL. I fixed this logic. ## Cache download logic The URL is versioned. For example: ``` https://agentmon-releases.s3.amazonaws.com/agentmon-0.3.1-linux-amd64.tar.gz ``` So I used it as a cache key to not have to download the binary when the URL hasn't changed. ## Background process While my initial experiment wasn't successful David suggested to try out `start-stop-daemon` (though it is only supported on Ubuntu). The `start-stop-daemon` can successfully boot a background process and allow `execd` to return. For example this works: ```rust let background_script = layer_path.join("agentmon_script"); let execd_script = layer_path.join("agentmon_exec.d"); write_bash_script( &background_script, r#" while true; do echo 'pretend agentmon' >> /tmp/agentmon.txt sleep 2 done "#, ) .unwrap(); // Intentionally leak background process let background_script = background_script.canonicalize().unwrap(); let background_script = background_script.display(); write_bash_script( &execd_script, format!( r#"start-stop-daemon --start --background --exec "{background_script}""# ), ) .unwrap(); ``` With this output: ``` $ docker run -it --rm --entrypoint='/cnb/lifecycle/launcher' my-image 'bash' e3b22af1a126:/workspace$ e3b22af1a126:/workspace$ e3b22af1a126:/workspace$ ls /tmp agentmon.txt e3b22af1a126:/workspace$ tail -f /tmp/agentmon.txt pretend agentmon pretend agentmon pretend agentmon pretend agentmon pretend agentmon ``` As to why this is needed. From David: > [...] it appears that Go waits for all processes where PPID = launcher's PID (1), instead of the spawned PID. Still digging to understand why, since from my understanding of the code so far, that should not be the case. > launcher uses Cmd.Run() which does Cmd.Start() and then Cmd.Wait() > For Wait(), internally, Go does wait4(pid, &status, 0, &rusage): https://cs.opensource.google/go/go/+/refs/tags/go1.20.6:src/os/exec_unix.go;l=16-60 But as you can see, before that, it first calls blockUntilWaitable() which is implemented here for systems that have waitid(), such as Linux: https://cs.opensource.google/go/go/+/refs/tags/go1.20.6:src/os/wait_waitid.go The signature for that is int waitid(idtype_t idtype, id_t id, siginfo_t *infop, int options);, see https://man7.org/linux/man-pages/man2/waitpid.2.html > I am starting to wonder if this instead has to do with open FDs which the sleep() would also inherit, since launcher obviously reads from the process. I don't know why one works and the other doesn't. That needs more investigation. My immediate goal is to get agentmon working on CNBs. TODO: - Add an integration test - Move the while-loop agentmon bash script to a Rust script * Fix spelling * Move background loop logic from bash to Rust ``` $ cargo libcnb package && docker rmi my-image --force && pack build my-image --buildpack target/buildpack/debug/heroku_statsd-metrics --path /tmp/47b6249d5e0a353f91910f848a700061/ --pull-policy never 🔍 Locating buildpacks... 📦 [1/2] Building heroku/ruby Determining automatic cross-compile settings... Building binaries (x86_64-unknown-linux-musl)... Finished dev [unoptimized] target(s) in 0.48s Writing buildpack directory... Successfully wrote buildpack directory: ../../target/buildpack/debug/heroku_ruby (12.64 MiB) 📦 [2/2] Building heroku/statsd-metrics Determining automatic cross-compile settings... Building binaries (x86_64-unknown-linux-musl)... Blocking waiting for file lock on package cache Blocking waiting for file lock on package cache Finished dev [unoptimized] target(s) in 0.68s Finished dev [unoptimized] target(s) in 0.51s Writing buildpack directory... Successfully wrote buildpack directory: ../../target/buildpack/debug/heroku_statsd-metrics (12.15 MiB) ✨ Packaging successfully finished! 💡 To test your buildpack locally with pack, run: pack build my-image-name \ --buildpack /Users/rschneeman/Documents/projects/work/buildpacks/buildpacks-ruby/target/buildpack/debug/heroku_ruby \ --buildpack /Users/rschneeman/Documents/projects/work/buildpacks/buildpacks-ruby/target/buildpack/debug/heroku_statsd-metrics \ --path /path/to/application /Users/rschneeman/Documents/projects/work/buildpacks/buildpacks-ruby/target/buildpack/debug/heroku_ruby /Users/rschneeman/Documents/projects/work/buildpacks/buildpacks-ruby/target/buildpack/debug/heroku_statsd-metrics Untagged: my-image:latest Deleted: sha256:b73cef10f9747c93c0589f68bd30b76298f57a178c9e3efb68dbe77a9bd70cdc Deleted: sha256:9c2fb27824ec55b3e82d5245a623de6ae85ffbcfd91d9ca93cda6e220fb4ae28 Deleted: sha256:0864c2235caabe94562658b4003c64d2834a8d15915bffdb080d9d6d39d37ad1 Deleted: sha256:e0f324b7867aecebbe166b6a773859cd8a9779aed246c0e6029430b5558bfc95 Deleted: sha256:c578d8034a4c45399cf1a55207eb25d07464681fe44caf88b94b6082dee74df8 Deleted: sha256:d45865c0091f9fde9ec70ba62d1d8dd48b3cbb6bcad730be8fed56344bc36ae3 ===> ANALYZING Image with name "my-image" not found ===> DETECTING heroku/statsd-metrics 2.0.0 ===> RESTORING ===> BUILDING # Heroku Statsd Metrics Agent - Metrics agent - Downloading ..... (2.596s) - Writing scripts - Done (finished in 2.602s) ===> EXPORTING Adding layer 'heroku/statsd-metrics:statsd-metrics-agent' Adding layer 'buildpacksio/lifecycle:launch.sbom' Adding 1/1 app layer(s) Adding layer 'buildpacksio/lifecycle:launcher' Adding layer 'buildpacksio/lifecycle:config' Adding label 'io.buildpacks.lifecycle.metadata' Adding label 'io.buildpacks.build.metadata' Adding label 'io.buildpacks.project.metadata' no default process type Saving my-image... *** Images (b73cef10f974): my-image Reusing cache layer 'heroku/statsd-metrics:statsd-metrics-agent' Successfully built image my-image ⛄️ 3.1.4 🚀 /Users/rschneeman/Documents/projects/work/buildpacks/buildpacks-ruby (schneems/metrics-agent-download-rebase) $ docker run -it --rm --entrypoint='/cnb/lifecycle/launcher' -e HEROKU_METRICS_URL=https://example.com my-image 'bash' 1bb0e5cf43f5:/workspace$ ps -aux USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND heroku 1 2.0 0.0 4624 3792 pts/0 Ss 22:18 0:00 bash heroku 14 11.3 0.0 2176 4 ? S 22:18 0:00 /layers/heroku_statsd-metrics/statsd-metrics-agent/agentmon_loop --path /layers/heroku_statsd-metrics/statsd-metrics-agent/b heroku 16 0.0 0.0 7060 1584 pts/0 R+ 22:18 0:00 ps -aux ``` Manually running the script seems to work: ``` 1bb0e5cf43f5:/workspace$ PORT=3000 /layers/heroku_statsd-metrics/statsd-metrics-agent/agentmon_loop --path /layers/heroku_statsd-metrics/statsd-metrics-agent/bin/agentmon agentmon: Listening on :3000... ``` I'm not sure how we test this. * Add metrics integration test with WIP libcnb-test feature Asserts that an app build with the heroku statsd metrics buildpack that has `barnes` in the Gemfile.lock will have a process `agentmon_loop` running in the background on dyne boot with both HEROKU_METRICS_URL and DYNO env vars are present at boot. * Enable agentmon logging * Clippy * Fix tool key * Prefix image with `buildpack-` Not everything on docker is a buildpack, we use prefixes to make it clear at a glance what is. * Update agentmon_loop - Fix bad docstrings - Move sleep_for to a constant - Refactor logic into smaller, more consistent functions - Added tests for argument generation * Remove unneeded enum contortions * Inline binary logic and consolidate test logic * Hardcode agentmon url, improve caching Agentmon hasn't seen a release in 6 years. We can hardcode the url for now. When the URL doesn't change we can keep the last download around. I always want to update the scripts on disk so then any changes to the files will be picked up, so we'll always call update or create, and never keep. We could get fancy and use MetadataDigest to cache those too, but writing them is pretty much instantaneous so I don't think it buys us anything. This change allows us to remove the `cached` dependency. * Move metrics logic inside of heroku/ruby * Revert "Update build output" This reverts commit 5f449e6. * Changelog entry and fix tests * Update buildpacks/ruby/Cargo.toml Co-authored-by: Ed Morley <[email protected]> * Prefer unwrap_or_else over match with Ok(_) * Remove changelog prefix * Prefer unwrap_or_else over match with Ok(_) * Prefer HasMap::from() over mut HashMap::new() * Fix docs * Use `Path::try_exists()` instead of `exists()` Differentiate between "no such file" versus "there's a file here, but something stops us from accessing it. * Fix stringly typed errors The error result of `absolute_path_exists` was `String` to normalize the error type and allow inserting the path name. This refactors that logic into a proper error enum. * Env var key is static & add disable instructions To ensure we're not misspelling the key name somewhere in the output I made it a static. I also added notes on how to disable the feature consistent with how it currently works (and also how all env var based enable/disable flags work in the existing buildpack I.e. they check for presence, not for content. This is also consistent with the Ruby language as `0`, `"0"`, and `"false"` (string false) are all "truthy". The only "falsey" values in Ruby are `false` (boolean false ) and `nil` * Static env var keys and tests Ed mentioned "There are three error variants, but only one is tested? Should there be tests for the other cases?" So I added test cases for them. Moved env var key name to statics for consistency. * Make error more specific > Ed: This struct is only returned by one function, rather than being a general error struct for the whole binary. Should it have a more specific name than Error? eg BuildArgsError or similar? Here's what that looks like. I'm okay with either. I think this is better if someone needs to add a different error in the future as this will guide them to make a new enum where appropriate. * Simplify spawning daemon * Match functionality to function name The `download_to_dir` did more than download, it also untar-d and set executable permissions. The `agentmon_download` was a very loose wrapper around `download_to_dir`. I made removed the chmod functionality from `download_to_dir` and renamed it `download_untar`. I moved the chmod functionality into `agentmond_download` and renamed it `agentmon_install`. It's a minor change but I think it's an improvement. * Log when barnes is installed, or isn't This presents a tip to the user to tell them how to enable metrics collection. * Address process booting determinism The method we're using to launch a daemon process does not give us guarantees that `agentmon_loop --path` will be executed by the time the process boots. To address this we can loop and re-check for some period of time. This method utilizes the output log from agentmon loop as directed by the spawn daemon command. * Update buildpacks/ruby/src/layers/metrics_agent_install.rs Co-authored-by: Ed Morley <[email protected]> * Remove unneeded BuildpackReference::Crate This is the default. It is only needed when multiple build packs are specified * Fix accidentally removed line * Remove unused workspace declarations * Add download checksum for agentmon TGZ * Update changelog to use Keep a Changelog format * Prefer shared tooling --------- Co-authored-by: Ed Morley <[email protected]>
- Loading branch information