Expression of trigger: Backup {#JOBTYPENAME} {#JOBNAME} in progress > {$VEEAM_JOB_HOURS_WARN:"{#JOBID}"}h #19
Hello @thmcon,

When using last(LastStart) I was getting the trigger active right when the backup was beginning (an "early trigger"), which cleared after a few seconds. Using avg() resolved that and I had no more "early triggers", but only when the time argument of avg() was smaller than the data sampling interval.

However, I have never had a situation where it would trigger an alert at 5 hours when the threshold was defined at 8 hours. It would either trigger right at the start of the job (and clear 5-10 seconds later), or when the job had actually been running for more than 8 hours (one of the full backups, which runs weekly, takes around 12-14 hours, so the trigger would be active from the 8th hour until the job completed).

On the infrastructures that I'm managing, all the backups run on a daily basis (the backup job I mentioned does incrementals daily and a full on Saturdays). The normal "Backup jobs" run daily and the corresponding "Backup copy" jobs run after them. You mentioned that the trigger is calculating "in the middle of the 5 days", so I would guess you have a longer interval between backups, which might affect the trigger behavior.

Since I don't have a test infrastructure for this, please feel free to experiment with the trigger definition and see what works for you. Then please let us know what works and we could incorporate that information into the trigger. Before testing, please use the template from this pull request here to make sure you are using the latest version. @romainsi has yet to merge this (if he approves it), but I've tweaked the trigger in that version to try to avoid those "false triggers".

Hope this helps. Regards. |
@aholiveira Thanks for your reply! I was using the main branch, which I get when I just click on the "code" tab on GitHub (https://github.com/romainsi/zabbix-VB-R-SQL). I will use the one you have linked in your reply and then investigate in more depth. It will probably take some time until I can give a more detailed reply on that. |
Hello @aholiveira, my current observations and changes: Let's assume that the Veeam software might change the status flag sooner than the start time field, and that it takes some time until both values match the real situation. I also observed that in my scenario the value veeam.Jobs.LastStart for the copy job changes according to the backup job, but the value veeam.Jobs.Result for the copy job always remains -1 (running), even when the Veeam GUI says that it's idle. I also checked this in the Veeam database. That made my problems disappear. I will observe that further. Currently my trigger looks like this:
|
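The assumption in the comment above (the status flag changes before the start time field does) would explain an "early trigger" when last() is applied to LastStart. The following is a minimal Python sketch of that race, not Zabbix code: `trigger_fires`, the timestamps, and the 24-hour gap between runs are all hypothetical values chosen for illustration.

```python
HOUR = 3600
THRESHOLD = 8 * HOUR  # stands in for {$VEEAM_JOB_HOURS_WARN} * 3600


def trigger_fires(result, last_start, now, threshold=THRESHOLD):
    """Mirror of: last(Result) < 0 and (now() - last(LastStart)) > threshold."""
    return result < 0 and (now - last_start) > threshold


# Hypothetical timestamps: the previous run started 24 h before the new one.
previous_start = 1_700_000_000
new_start = previous_start + 24 * HOUR

# Poll A: the status already reports "running" (-1), but LastStart is
# still the stale value from the previous run (assumed ordering).
early = trigger_fires(-1, previous_start, now=new_start + 5)

# Poll B: a few seconds later LastStart has caught up with the new run.
settled = trigger_fires(-1, new_start, now=new_start + 10)

print(early, settled)
```

Under these assumptions the first poll fires and the second one clears, which matches the "triggers at the start and clears a few seconds later" behavior described earlier in the thread.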
Hello @thmcon, Regards. |
Hello again, However, based on your input I've built a new expression that should trigger only under the correct conditions (the job has actually been running for more than 8 hours). It should also not trigger at the start of the job. Expression rationale:
I've removed the use of avg() and am now only using last() Let me know how it goes. Regards. |
Hello @aholiveira, thanks a lot!
Here you can see my copy jobs in the Veeam GUI: this is my current version of Veeam: I've updated my script and template to the pull request #17 you have linked above. I can let you know about the results tomorrow, after the backup jobs have run tonight. Thanks & Regards, |
Good Morning, from the results of last night's backup I would say the trigger works correctly now! There is still another issue with a job showing the wrong progress. I will create another issue for that one. |
Hello @thmcon, Regards. |
Hello, Regards. |
Hello @aholiveira, sorry for the delayed response.
No, I never got "early triggers" with any of the different trigger settings (the first version in the template, my version, and your latest version). I had that early trigger problem in mind when I added the "last, #1" = "last, #2" expression for the job status item in my version of the trigger. My idea behind that was: the job status is collected every 5 minutes, and the trigger should only fire when the job status is below 0 (meaning the job is running) two times in a row, the progress has not changed within 5 minutes, the progress is not 100%, and the duration is "too long" according to the macro value. My trigger expression was not perfect, at least regarding the assumption that a job progresses by at least 1% in 5 minutes. Since I started using your latest version of the expression, the only problem I get is that my job is 0% completed in the database even though the Veeam GUI says it's idle. It is just this one job that is idle with 0% progress. I think I have to apply the latest updates to Veeam, but I can't do that on short notice. |
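The four conditions described in the comment above can be sketched as a plain predicate. This is an illustrative Python sketch, not the actual trigger expression (which is not shown in the thread); the `Sample` type, the `should_fire` helper, and all sample values are hypothetical.

```python
from dataclasses import dataclass

HOUR = 3600


@dataclass
class Sample:
    """One 5-minute poll of a job (hypothetical structure)."""
    status: int    # below 0 means the job is running
    progress: int  # percent complete, 0-100


def should_fire(prev, curr, elapsed_seconds, warn_hours):
    """Fire only if all four conditions from the comment hold."""
    running_twice = prev.status < 0 and curr.status < 0
    stalled = curr.progress == prev.progress  # no change within 5 minutes
    unfinished = curr.progress != 100
    too_long = elapsed_seconds > warn_hours * HOUR
    return running_twice and stalled and unfinished and too_long


# Stalled at 40% after 9 h with an 8 h limit.
stuck = should_fire(Sample(-1, 40), Sample(-1, 40), 9 * HOUR, 8)

# Same duration, but progress moved between the two polls.
moving = should_fire(Sample(-1, 40), Sample(-1, 41), 9 * HOUR, 8)

print(stuck, moving)
```

The weakness mentioned in the comment shows up directly in `stalled`: a job that is still working but advances by less than 1% between two polls is indistinguishable from a hung one.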
Also, after upgrading Veeam to the latest version, I still do not get false "early triggers".
Hi,
I am seeing strange problems with the trigger: Backup {#JOBTYPENAME} {#JOBNAME} in progress > {$VEEAM_JOB_HOURS_WARN:"{#JOBID}"}h
Jobs that run for less than 5 hours raise problems claiming that the job has been running for longer than 8 hours.
I investigated the trigger which is defined as:
last(/Veeam Backup And Replication/veeam.Jobs.Result['{#JOBTYPEID}','{#JOBID}'])<0 and (now()-avg(/Veeam Backup And Replication/veeam.Jobs.LastStart['{#JOBTYPEID}','{#JOBID}'],5m))>{$VEEAM_JOB_HOURS_WARN:"{#JOBID}"}*3600
I am pretty sure the avg function is wrong here.
My thoughts about the current trigger logic:
As I understand it, the avg function should be removed completely, and the trigger should then work as expected, like this:
last(/Veeam Backup And Replication/veeam.Jobs.Result['{#JOBTYPEID}','{#JOBID}'])<0 and (now()-last(/Veeam Backup And Replication/veeam.Jobs.LastStart['{#JOBTYPEID}','{#JOBID}']))>({$VEEAM_JOB_HOURS_WARN:"{#JOBID}"}*3600)
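To illustrate why averaging a start timestamp can mislead, here is a small Python sketch, not Zabbix code. It assumes the evaluation window happens to contain both the previous and the current LastStart sample; whether that occurs depends on the window length and the item's polling interval. The helper names and all timestamps are hypothetical.

```python
from statistics import mean

HOUR = 3600
THRESHOLD = 8 * HOUR  # stands in for {$VEEAM_JOB_HOURS_WARN} * 3600


def elapsed_with_avg(now, start_samples):
    """Elapsed seconds when the start time is averaged over the window."""
    return now - mean(start_samples)


def elapsed_with_last(now, start_samples):
    """Elapsed seconds when only the newest start time is used."""
    return now - start_samples[-1]


# Hypothetical values: the previous run started 24 h before the current one.
current_start = 1_700_000_000
previous_start = current_start - 24 * HOUR
window = [previous_start, current_start]  # oldest first
now = current_start + 60                  # job has run for one minute

avg_fires = elapsed_with_avg(now, window) > THRESHOLD
last_fires = elapsed_with_last(now, window) > THRESHOLD

print(avg_fires, last_fires)
```

With these numbers the averaged start time lands midway between the two runs, so the job appears to have been running for about 12 hours after one minute, while the last()-based calculation reports one minute. The longer the gap between runs, the larger the apparent duration becomes.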