Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kie-issues#777: Allow restricting jenkins agent labels... #1165

Merged
merged 1 commit into from
Jan 31, 2024
Merged

kie-issues#777: Allow restricting jenkins agent labels... #1165

merged 1 commit into from
Jan 31, 2024

Conversation

cimbalek
Copy link
Contributor

@cimbalek cimbalek commented Jan 26, 2024

#777
This change should allow restricting on which nodes pipeline can run. If environment variable 'FAULTY_NODES' is set or inherited from parent folders, nodes listed here will be added to the label expression to avoid them. This way we can have a single source (KIE folder in jenkins) when we can exclude faulty nodes before Apache admins remove the node, which is sometimes lengthy procedure.

@cimbalek cimbalek added the WIP label Jan 26, 2024
@cimbalek cimbalek removed the WIP label Jan 26, 2024
@cimbalek cimbalek changed the title WIP: [kie-issues#777] Allow restricting jenkins agent labels... [kie-issues#777] Allow restricting jenkins agent labels... Jan 26, 2024
@cimbalek cimbalek changed the title [kie-issues#777] Allow restricting jenkins agent labels... kie-issues#777: Allow restricting jenkins agent labels... Jan 26, 2024
@cimbalek
Copy link
Contributor Author

Should be working correctly.

  • History of basic testing jobs can be found here
  • Clonned Kogito nightly job which fails in tests (Because of use of the library function) works ok here as well (Label and FAULTY_NODE is set as contradiction purposely to see in the log how it translated to actual label)

@cimbalek
Copy link
Contributor Author

Possible next steps: Update jobs using docker agent to also use a label option

@rodrigonull
Copy link
Contributor

Thank you for the PR @cimbalek. The code LGTM, although I think it could be simplified using @jstastny-cz suggestion to use the env.AGENT_LABEL and set that env var in the KIE folder on Jenkins: https://ci-builds.apache.org/job/KIE/. In this case we wouldn't need to have a function to get the labels we would just need to set label env.AGENT_LABEL in all Jenkins / groovy files so then when we have a faulty node we could just go to the KIE folder configuration and update the AGENT_LABEL env var there to exclude the faulty nodes.
Using that approach we remove the need of an extra env var for the faulty nodes. WDYT?

@cimbalek
Copy link
Contributor Author

@rodrigonull I chose this approach because not all jobs in the future might need to use the same label. This way in the KIE folder (you mentioned) you set variable FAULTY_NODES instead of AGENT_LABEL. You then set the label you want for the job, and call the function, which then adds expression to avoid those faulty nodes. In the case faulty node appears, the process is the same. You go to KIE folder (or other parent folder) and update the FAULTY_NODES variable. No need to touch the code.

Copy link
Contributor

@rodrigonull rodrigonull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code LGTM. You just need to mock the getLabel function in the JenkinsfileNightly.groovy file for the tests to pass.

@@ -2,7 +2,7 @@ import org.jenkinsci.plugins.workflow.libs.Library

pipeline {
agent {
label 'ubuntu'
label util.getLabel('ubuntu')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this file the library is being loaded after the agent section which will cause an error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it better like this?

Copy link
Contributor

@domhanak domhanak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update the docs and readme so that they explain how to use this variable and what is does.

@cimbalek
Copy link
Contributor Author

cimbalek commented Jan 30, 2024

Please update the docs and readme so that they explain how to use this variable and what is does.

@domhanak can you point me to a place where it would be best to mention this change? I was looking into documentation to update it and mention this change there, but it doesn't go deep into implementation details

@jstastny-cz
Copy link
Contributor

Taking a quick look.

Having the FAULTY_NODES env var is not a problem IMO, as long as it does not surface all around the place (which thanks to the library method is not the case at the moment). The footprint of the library method is minimal in the jenkinsfiles, at the same time, we do NOT want to bind this to the DSL processing phase, we'd need a pipelines seeds run in case of changing the filter.

One catch here is that in some cases the node might be failing only in certain aspects like docker env, but that does not affect pipelines that don't use docker (even the kogito-ci-build docker image we have), example is pure "orchestrating" pipeline, just triggering other jobs. Though this construct should serve only as a temporary workaround until things are figured out by the infra team, so I think we can take the blow and apply generally.

I'll take a closer look later today.

@domhanak
Copy link
Contributor

domhanak commented Jan 30, 2024

@cimbalek I guess here - https://github.com/apache/incubator-kie-kogito-pipelines/?tab=readme-ov-file#configuration-of-pipelines

Maybe add someting like ( just a suggestion, but I think this could work ):

Available Environment variables
VARIABLE

  • Description
  • How to set
  • Example

@cimbalek
Copy link
Contributor Author

@jstastny-cz I get it. In that case we would need to define several labels at the level of KIE folder. OR - you still can override this variable anywhere down the directory tree. Either with empty one or an extended one

@domhanak Yes, that's all I found and it seemed odd to me to add there such detail. Usually such documentation is achieved by inline comments in jenkinsfile code and/or the library function javadoc,. Others WDYT?

@jstastny-cz
Copy link
Contributor

@cimbalek problem that Dominik correctly points out is - "where/why the hack the env var is set?!"
"Where" part is covered by possible code comments ( that it's set on a folder).
But the "why" part - looking at the folder in Jenkins and wondering why the env is here - it's not clear at all.

We have readmes or confluence page, both seem a valid place.

I am personally responsible for a few pipelines docs debts, so if you track this as followup issue, I'd contribute also other missing info, while we'd sync in the issue on preferred destination and format.

@cimbalek
Copy link
Contributor Author

OK, thank you both. Now I have a picture how it could look. I'm going to add some documentation, but I am not very familiar with documentation language, so change requests will be more than welcome.

@cimbalek
Copy link
Contributor Author

please see 5dec5bf

Suggestions are welcome

Copy link
Contributor

@jstastny-cz jstastny-cz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just small readme formatting issue.

README.md Outdated Show resolved Hide resolved
@jstastny-cz
Copy link
Contributor

@domhanak or @rodrigonull we need one more approving review here.

@rodrigonull
Copy link
Contributor

@domhanak or @rodrigonull we need one more approving review here.

@jstastny-cz @cimbalek Can we update the tests to get them passing?

@rodrigonull rodrigonull self-requested a review January 31, 2024 11:02
@jstastny-cz
Copy link
Contributor

@rodrigonull @cimbalek of course, unless there's a good justification on why failing PR check is not a problem. From a quick look it seems like a missing adjustment of tests after the changes, so should be fixed.

Copy link
Contributor

@domhanak domhanak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing the docs issue, appreciated.

Let's have green builds before merging please

This change should allow restricting on which nodes pipeline can run. If environment variable 'FAULTY_NODES' is set or inherited from parent folders, nodes listed here will be added to the label expression to avoid them. This way we can have a single source (KIE folder in jenkins) when we can exclude faulty nodes before Apache admins remove the node, which is sometimes lengthy procedure.
@cimbalek cimbalek merged commit 3ca94f0 into apache:main Jan 31, 2024
20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants