Implementing the following features would improve the performance of the indexer manager significantly:
Before jobs are submitted to LSF, the log directory should be cleared. When old log files are overwritten instead, it is harder to tell which jobs failed.
When checking job statuses, the script needs to be more resilient to missing statuses.
When the script finds a failed job, it should check why the job failed and take appropriate action.
If a study is stuck in UNKWN (unknown) status, it should be killed and resubmitted.
Some jobs with RUN status look active but are actually hanging. I think it makes sense to implement a check for whether a job is still actively running; if a job does nothing for days, it should be restarted.
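The status handling described above could be sketched roughly as follows. This is only an illustration, not the indexer manager's actual code: the function name `decide_action`, the action labels, and the two-day stall threshold are assumptions, and the time of the last log update is used here as a stand-in for "the job is still doing something". LSF reports the unknown state as `UNKWN`; the misspelled `UNKNW` is accepted as well.

```python
from datetime import datetime, timedelta
from typing import Optional

# Assumed threshold: a RUN job whose log has not changed for this long is treated as hung.
STALL_LIMIT = timedelta(days=2)


def decide_action(
    status: Optional[str],
    last_log_update: Optional[datetime],
    now: datetime,
    stall_limit: timedelta = STALL_LIMIT,
) -> str:
    """Map a job's LSF status to a (hypothetical) action label."""
    if not status:
        # Missing status: do not crash, just check again on the next pass.
        return "recheck"
    status = status.strip().upper()
    if status in ("UNKWN", "UNKNW"):
        # Stuck in unknown status: kill and resubmit.
        return "resubmit"
    if status == "EXIT":
        # Failed job: look at the log to find out why before acting.
        return "inspect"
    if status == "RUN":
        # A RUN job with no recent log activity is treated as hanging.
        if last_log_update is not None and now - last_log_update > stall_limit:
            return "resubmit"
        return "wait"
    if status == "DONE":
        return "done"
    # PEND and any other status: nothing to do yet.
    return "wait"
```

A caller would loop over the submitted jobs, pass in whatever status it managed to read (or `None` when none was found), and dispatch on the returned label.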
Deleting old log files is now part of the data release plan: before the jobs are submitted, old log files are deleted as part of the Solr indexing tasks.
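For illustration, the cleanup step might look something like the sketch below. The function name `clear_log_dir` and the file patterns are assumptions, not the names used in the actual indexing tasks.

```python
from pathlib import Path
from typing import Iterable, Union


def clear_log_dir(
    log_dir: Union[str, Path],
    patterns: Iterable[str] = ("*.out", "*.err", "*.log"),  # assumed log file suffixes
) -> int:
    """Delete old log files in log_dir and return how many were removed."""
    directory = Path(log_dir)
    # Create the directory if it is missing so job submission can write to it.
    directory.mkdir(parents=True, exist_ok=True)
    removed = 0
    for pattern in patterns:
        for path in directory.glob(pattern):
            if path.is_file():
                path.unlink()
                removed += 1
    return removed
```

Files that do not match any pattern are left untouched, so the directory can safely hold other material.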