Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add capability to set an instance identifier which labels all metrics #1198

Merged
merged 10 commits into from
Apr 24, 2024

Conversation

gvelez17
Copy link
Contributor

set instance identifier - #WS2-3187

Description

The current CAS metrics are being sent from the 6 different api nodes with the same name and labels, resulting in inaccurate metrics, stepping on totals, odd reset patterns. Each instance should have a distinct label. This PR uses the task id as the label and uses the new observability functionality of Instance Identifier to automatically label all metrics (see ceramicnetwork/observability#29)

How Has This Been Tested?

Describe the tests that you ran to verify your changes. Provide instructions for reproduction.

  • [ x] observability package new unit tests
  • unit tests in this pr

Definition of Done

Before submitting this PR, please make sure:

  • [x ] The work addresses the description and outcomes in the issue
  • [ x] I have added relevant tests for new or updated functionality
  • My code follows conventions, is well commented, and easy to understand
  • My code builds and tests pass without any errors or warnings
  • I have tagged the relevant reviewers
  • I have updated the READMEs of affected packages
  • I have made corresponding changes to the documentation
  • The changes have been communicated to interested parties

References:

Please list relevant documentation (e.g. tech specs, articles, related work etc.) relevant to this change, and note if the documentation has been updated.

@gvelez17 gvelez17 requested review from smrz2001 and 3benbox April 22, 2024 21:07
@codecov-commenter
Copy link

codecov-commenter commented Apr 22, 2024

Codecov Report

Attention: Patch coverage is 50.00000% with 1 lines in your changes are missing coverage. Please review.

Project coverage is 81.48%. Comparing base (dc77998) to head (6cafafe).

Files Patch % Lines
src/app.ts 50.00% 1 Missing ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #1198      +/-   ##
===========================================
- Coverage    81.51%   81.48%   -0.04%     
===========================================
  Files           46       46              
  Lines         1829     1831       +2     
  Branches       289      290       +1     
===========================================
+ Hits          1491     1492       +1     
- Misses         338      339       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@gvelez17 gvelez17 marked this pull request as ready for review April 22, 2024 21:49
Copy link
Contributor

@3benbox 3benbox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a question about how the value gets set.

config/env/dev.json Outdated Show resolved Hide resolved
@gvelez17 gvelez17 requested a review from samika98 April 23, 2024 21:08
Copy link
Contributor

@3benbox 3benbox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@@ -56,6 +56,7 @@
"traceRatio": "@@METRICS_TRACE_RATIO",
"exportIntervalMillis": "@@METRICS_EXPORT_INTERVAL_MS",
"exportTimeoutMillis": "@@METRICS_EXPORT_TIMEOUT_MS",
"instanceIdentifier": "@@ECS_CONTAINER_METADATA_URI",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to inject this through the Dockerfile via an ENTRYPOINT script and call it INSTANCE_ID or something like that.

We do have AWS specific stuff in the env vars and code but parsing a very specific ECS env var directly in the code seems undesirable.

entrypoint.sh:

#!/bin/bash

uri=$ECS_CONTAINER_METADATA_URI

# Parse the last portion of the URI
value=${uri##*/}

# Remove everything after the hyphen to get the task ID
value=${value%%-*}

export INSTANCE_ID=$value

exec "$@"

Dockerfile:

.
.
.
WORKDIR /

COPY entrypoint.sh .
RUN chmod +x ./entrypoint.sh

COPY runner.sh .

ENTRYPOINT ["./entrypoint.sh"]
CMD [ "./runner.sh" ]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do have AWS specific stuff in the env vars and code but parsing a very specific ECS env var directly in the code seems undesirable.

Oh yeah, good call!

src/app.ts Outdated
@@ -129,6 +129,18 @@ export class CeramicAnchorApp {
)
Metrics.count('HELLO', 1)
logger.imp('Metrics exporter started')
if (this.config.metrics.instanceIdentifier) {
// In the API server, the instanceIdentifer is set from ECS_CONTAINER_METADATA_URI
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If done using the entrypoint.sh script above, we won't need to parse this in the code here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right! the logic isn't quite the same - not sure if it matters, i was defaulting to the full uri if there wasn't a task id at the end - but i agree yours is better, since it moves it to the deployment and therefore we can make it very specific to what we expect!

@gvelez17 gvelez17 requested a review from smrz2001 April 24, 2024 19:27
@gvelez17 gvelez17 merged commit fa093e3 into develop Apr 24, 2024
6 checks passed
@gvelez17 gvelez17 deleted the feat/set-instance-identifier branch April 24, 2024 21:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants