Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added check to events-to-s3 for label canary #40

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

bshien
Copy link
Contributor

@bshien bshien commented Nov 27, 2024

Description

Added check to events-to-s3 for label canary

Issues Resolved

Part of opensearch-project/opensearch-metrics#105

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Copy link

codecov bot commented Nov 27, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.47%. Comparing base (4a90494) to head (c07ade7).

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #40      +/-   ##
==========================================
+ Coverage   99.44%   99.47%   +0.02%     
==========================================
  Files          10       10              
  Lines         181      190       +9     
  Branches       25       26       +1     
==========================================
+ Hits          180      189       +9     
  Misses          1        1              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@@ -23,6 +24,28 @@ export default async function githubEventsToS3(app: Probot, context: any): Promi
//
const repoName = context.payload.repository?.name;
if (context.payload.repository?.private === false) {
// Handle canary event for monitoring purposes
if (context.name === 'label' && context.payload.label?.name === 's3-data-lake-app-canary-label') {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really sure what you are trying to do.

Are you expecting a label called s3-data-lake-app-canary-label to be added to an issue, then sending a metrics to cloudwatch ?

Thanks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain your use case here?

Thanks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I'm trying to set up monitoring for this automation app running in the docker container in the metrics aws account so if the app goes down, we will get notified. There will be a lambda that will create and delete a label every 10 minutes, and once this app listens on that event, it will send a CW metric. There will be an alarm on that metric that triggers where there is no metric data(indicating that the automation app is down).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In which repo you are doing this label creation/deletion?
Also is there any other ways to monitor this than touching a label?

Thanks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm doing the label creation/deletion in opensearch-metrics. One sec, let me add a check for this.

Prudhvi and I discussed about it, there are a few alternative approaches like checking for logs but the canary seems best since it directly tests it in real time. For example, the issue with checking for logs is that certain times, like the weekend, have less events potentially causing the alarm to trigger where there is nothing wrong. The name I chose for this label is long and specific(s3-data-lake-app-canary-label) and it gets deleted immediately after creation, so it won't have much impact on anything else.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then you code in metrics repo should ensure that creation/deletion both return success before false positives being captured here.

Also I think you can force create label as well, so you might not need to delete the label anyway.
Just create a label with -Force param or similar would be enough, since you only want canary to add data to cloudwatch every X min to ensure there is no alarm.

@bshien bshien force-pushed the label-canary-monitor branch from b433512 to c07ade7 Compare November 27, 2024 21:46
@@ -23,6 +24,28 @@ export default async function githubEventsToS3(app: Probot, context: any): Promi
//
const repoName = context.payload.repository?.name;
if (context.payload.repository?.private === false) {
// Handle canary event for monitoring purposes
if (repoName === 'opensearch-metrics' && context.name === 'label' && context.payload.label?.name === 's3-data-lake-app-canary-label') {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any small and simplified name please?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that a long name for the label might be better in case someone adds this label randomly

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or you can just use a random hash if you just want to test the addition and deletion.

@peterzhuamazon
Copy link
Member

peterzhuamazon commented Dec 9, 2024

Please rebase and update your version to 0.2.2.

Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants