-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DEV-43744-alert-eval-api #1
Conversation
*DEV-31493 Override datasource URL and pass custom headers to alert rule evaluator *DEV-17927 - add LogzIoHeaders, Add error msg, change time to StartTime *DEV-30169, DEV-30170: api alert evaluation and alert processing (processing in comments) *DEV-16492 Support external alert evaluation *(no ticket) Upgrade to 8.4.0 *DEV-18900 - fix For check to be inclusive
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as discussed. the pr is not ready yet for review
pkg/api/alerting.go
Outdated
@@ -9,6 +9,7 @@ import ( | |||
|
|||
"github.com/grafana/grafana/pkg/api/dtos" | |||
"github.com/grafana/grafana/pkg/api/response" | |||
"github.com/grafana/grafana/pkg/models" // LOGZ.IO GRAFANA CHANGE :: DEV-17927 - add LogzIoHeaders |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the comment, I think better to use the reference to the new tickets. Also note in my PR I used a slightly different comment, so better be consistent (you can change mine if you wish).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
pkg/api/alerting.go
Outdated
@@ -179,7 +180,8 @@ func (hs *HTTPServer) AlertTest(c *contextmodel.ReqContext) response.Response { | |||
return response.Error(400, "The dashboard needs to be saved at least once before you can test an alert rule", nil) | |||
} | |||
|
|||
res, err := hs.AlertEngine.AlertTest(c.SignedInUser.GetOrgID(), dto.Dashboard, dto.PanelId, c.SignedInUser) | |||
// LOGZ.IO GRAFANA CHANGE :: DEV-17927 - add LogzIoHeaders | |||
res, err := hs.AlertEngine.AlertTest(c.SignedInUser.GetOrgID(), dto.Dashboard, dto.PanelId, c.SignedInUser, &models.LogzIoHeaders{RequestHeaders: c.Req.Header}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are LogzIoHeaders
required for test alert flow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is technically part of the logzio datasources support, but because a lot of the implementation for the logz evaluation includes this i put it in this PR.
i don't know if at that point it's worth the time of taking out of this PR for the next , because the next PR anyway will be based on that and making these changes can add up a day of work i don't see any value for it 😅
in hindsight i would make the split if i knew about it from the beginning 😓
DEV-18005 - add requestId in error message
DEV-43744 , DEV-43883 , DEV-43895 Removed all code changes related to old alerting which isn't used, therefor not needed.
…-pasting thing relevant code into our API. !!still work in progress!! need to: *remove commented code *add logz headers somehow to request *fix scheduler so it will not evaluate so we can use it for the api *remove processing part
* not returning results in API and keeping the processing part in eval * added a logzio router to replace the regular router and send notifications to logz internal api (in config) * removed unused code of processing * changed the evaluation api to run as bulk
pkg/services/ngalert/api/api.go
Outdated
@@ -2,6 +2,7 @@ package api | |||
|
|||
import ( | |||
"context" | |||
"github.com/grafana/grafana/pkg/services/ngalert/schedule" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mark with // LOGZ.IO GRAFANA CHANGE
pkg/services/ngalert/api/api.go
Outdated
@@ -75,6 +76,7 @@ type API struct { | |||
Tracer tracing.Tracer | |||
AppUrl *url.URL | |||
UpgradeService migration.UpgradeService | |||
Schedule schedule.ScheduleService |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mark with // LOGZ.IO GRAFANA CHANGE
} | ||
|
||
// NewLogzioAlertingApi creates a new LogzioAlertingApi instance | ||
func NewLogzioAlertingApi(service *LogzioAlertingService) *LogzioAlertingApi { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Didn't we say its redundant? (leave only the service)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is the structure of all APIs in this project. there is a class of "XXXApi" that has the route to the endpoint, and the function to call the service class that contains the main logic.
i think what we said is that maybe the LogzioAlertingService can be redundant and we can put the logic directly in this class. but actually i think this structure has some sense in it.. also i'm pretty sure we'll want it also for later when more logic/apis will be added (or at least it gives the infra for it).
@@ -4,6 +4,9 @@ import ( | |||
"context" | |||
"encoding/json" | |||
"fmt" | |||
|
|||
"github.com/grafana/grafana/pkg/services/ngalert/models" | |||
// LOGZ.IO GRAFANA CHANGE :: End |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where is the start :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably some merge messed this up 😅 will fix it 👍
@@ -467,6 +551,8 @@ func (sch *schedule) ruleRoutine(grafanaCtx context.Context, key ngmodels.AlertR | |||
attribute.Int64("results", int64(len(results))), | |||
)) | |||
} | |||
|
|||
logger.Debug("RuleRoutine evaluation: processing evaluation results") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you should cleanup eventually all the logger.Debug(...) and comments ....
pkg/promlib/querydata/request.go
Outdated
traceCtx, end := s.trace(ctx, q) | ||
defer end() | ||
|
||
logger := s.log.FromContext(traceCtx) | ||
logger.Debug("Sending query", "start", q.Start, "end", q.End, "step", q.Step, "query", q.Expr) | ||
logger.Debug("fetching query", "query", q, "headers", headers) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will fix this to be the original - for some reason didn't have the log so i added it , not sure how this happened
…lert-eval-api # Conflicts: # pkg/services/ngalert/api/api.go # pkg/services/ngalert/schedule/schedule.go # pkg/setting/setting_unified_alerting.go
69867a8
to
46080fc
Compare
* Renamed logzio files to same standard * Changed request object for logzio router notification API and changed default url behaviour
What is this feature?
Logzio Alert Evaluation API
Explaining the main logic and changes:
We added a new internal api for evaluating alerts externally from the alert-manager.
External in this meaning is that the evaluation is called outside the alert-manager and not as part of the engine schedule as it is today.
To be able to reuse the evaluation code as much as possible, we are using the ScheduleService code to run alert evaluation.
In order to use ScheduleService, but without having it running the evaluation, I needed to add a configuration not to execute the evaluations from the scheduler. That way we have all the management of the rule routines and updates, but only the execution is disabled, and then can run from our API.
For running the external evaluation I also added another function for the Scheduler interface that the API is using.
External notification support -
In order to send the notification in an external API, added a LogzioRouter that implements the Send interface which sends the PostableAlerts to the url coming from configuration
Logzio datasources support required the following changes:
Why do we need this feature?
metrics alerts evaluation process
Who is this feature for?
Logzio Metrics Alerts
Which issue(s) does this PR fix?:
DEV-43744
DEV-43883
DEV-43889
Added code from the following tickets: