Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(schedules): criteria for triggering a build #893

Merged
merged 43 commits into from
Jul 10, 2023
Merged

Conversation

jbrockopp
Copy link
Contributor

@jbrockopp jbrockopp commented Jun 26, 2023

Co-authored-by: @ecrupper

xref: go-vela/community#538

Part of go-vela/community#772

Based off of #846

This change attempts to address feedback provided by David in Slack.

NOTE: Below snippets show the current code in main that has bugs or causes issues with expected behavior.

First Time

The first change removes this code which would immediately trigger a build for a schedule if it hasn't already ran:

// create a variable to track if a build should be triggered based off the schedule
trigger := false
// check if a build has already been triggered for the schedule
if schedule.GetScheduledAt() == 0 {
// trigger a build for the schedule since one has not already been scheduled
trigger = true
} else {

This is likely unexpected behavior from a user stand point so we should minimize the amount of triggered builds.

Processing Schedules

The second change modifies the logic used to determine how often we scan the schedules table:

// cut the configured minimum frequency duration for schedules in half
//
// We need to sleep for some amount of time before we attempt to process schedules
// setup in the database. Since the minimum frequency is configurable, we cut it in
// half and use that as the base duration to determine how long to sleep for.
base := c.Duration("schedule-minimum-frequency") / 2
logrus.Infof("sleeping for %v before scheduling builds", base)
// sleep for a duration of time before processing schedules
//
// This should prevent multiple servers from processing schedules at the same time by
// leveraging a base duration along with a standard deviation of randomness a.k.a.
// "jitter". To create the jitter, we use the configured minimum frequency duration
// along with a scale factor of 0.1.
time.Sleep(wait.Jitter(base, 0.1))

Currently, we're leveraging the schedule-minimum-frequency flag and jitter to control this behavior:

&cli.DurationFlag{
EnvVars: []string{"VELA_SCHEDULE_MINIMUM_FREQUENCY", "SCHEDULE_MINIMUM_FREQUENCY"},
Name: "schedule-minimum-frequency",
Usage: "minimum time between each schedule entry",
Value: 1 * time.Hour,
},

However, this leads to complications for schedules setup to run at a specific time every hour.

Now, we'll leverage a completely separate schedule-interval flag to enable admins to configure this behavior.

Timed Schedules

The final change modifies the logic used to determine if we should trigger a build for a schedule:

// parse the previous occurrence of the entry for the schedule
prevTime, err := gronx.PrevTick(schedule.GetEntry(), true)
if err != nil {
logrus.WithError(err).Warnf("%s for %s", baseErr, schedule.GetName())
continue
}
// parse the next occurrence of the entry for the schedule
nextTime, err := gronx.NextTick(schedule.GetEntry(), true)
if err != nil {
logrus.WithError(err).Warnf("%s for %s", baseErr, schedule.GetName())
continue
}
// parse the UNIX timestamp from when the last build was triggered for the schedule
t := time.Unix(schedule.GetScheduledAt(), 0).UTC()
// check if the time since the last triggered build is greater than the entry duration for the schedule
if time.Since(t) > nextTime.Sub(prevTime) {
// trigger a build for the schedule since it has not previously ran
trigger = true
}

This causes issues because adhocore/gronx rounds to the nearest whole interval.

i.e. if the current time is 2:03 with a */5 * * * * schedule, PrevTick() will be 2:00 and NextTick() will be 2:05

@jbrockopp jbrockopp self-assigned this Jun 26, 2023
@codecov
Copy link

codecov bot commented Jun 26, 2023

Codecov Report

Merging #893 (b6ee9d7) into main (71a484d) will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##             main     #893   +/-   ##
=======================================
  Coverage   71.47%   71.47%           
=======================================
  Files         304      304           
  Lines       12450    12450           
=======================================
  Hits         8899     8899           
  Misses       3120     3120           
  Partials      431      431           

@jbrockopp jbrockopp marked this pull request as ready for review June 26, 2023 20:08
JordanSussman
JordanSussman previously approved these changes Jul 6, 2023
cmd/vela-server/schedule.go Outdated Show resolved Hide resolved
cmd/vela-server/schedule.go Outdated Show resolved Hide resolved
cmd/vela-server/schedule.go Outdated Show resolved Hide resolved
cmd/vela-server/schedule.go Outdated Show resolved Hide resolved
cmd/vela-server/schedule.go Outdated Show resolved Hide resolved
cmd/vela-server/schedule.go Outdated Show resolved Hide resolved
@plyr4
Copy link
Contributor

plyr4 commented Jul 6, 2023

looks good, just some minor tweaks to use the schedule we fetch from the database while iterating

@JordanSussman
Copy link
Collaborator

@jbrockopp should the test-docker-compose.yml file be removed?

@jbrockopp
Copy link
Contributor Author

@JordanSussman yeah I plan to remove that but initially had it there for testing purposes for Easton and I.

I'm finalizing one last change here shortly and then I'll also remove that file.

Copy link
Member

@wass3rw3rk wass3rw3rk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks 👍🏼

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants