Implement Token Expiry Check and Refresh using Refresh Tokens #310

Merged on Jul 15, 2024 (1 commit)

@krezovic krezovic commented May 1, 2024

This merge request adds support for checking expiration of issued access tokens and token refresh using refresh tokens

Strict Token Expiration results in a 401 when the access token has expired and either no refresh token was provided or the refresh token itself has expired or been revoked. This option is not enabled by default. I am not sure whether the user should be logged out automatically in this case, and I am even less sure how to achieve this in code other than an HTTP 302 to /logout or whatever the endpoint is.
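
The strict-mode rule described above can be read as a small decision function. The following is only a sketch of that rule; `ExpiryPolicy`, `Outcome`, and the parameter names are hypothetical and are not taken from the plugin.

```java
final class ExpiryPolicy {
    enum Outcome { PROCEED, REJECT_401 }

    /**
     * @param tokenExpired  the access token is past its expiry
     * @param refreshWorked a refresh token was available and the refresh request succeeded
     * @param strict        the strict token expiration option is enabled
     */
    static Outcome decide(boolean tokenExpired, boolean refreshWorked, boolean strict) {
        if (!tokenExpired || refreshWorked) {
            return Outcome.PROCEED;
        }
        // Expired and not refreshable: only strict mode turns this into a 401.
        return strict ? Outcome.REJECT_401 : Outcome.PROCEED;
    }
}
```

With strict mode off, an expired token that cannot be refreshed is ignored, which matches the default described in this PR.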

Access token refresh is enabled automatically when using the well-known configuration and the server supports the "refresh_token" grant. Otherwise, it has to be enabled manually via a checkbox. In addition to refreshing the access token, the ID token and user info are also refreshed, and any new or updated groups are added for the currently logged-in user.
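
As a rough illustration of how refresh support could be decided, the sketch below checks the `grant_types_supported` list from the OpenID Connect discovery document when automatic configuration is used, and falls back to the checkbox otherwise. `RefreshSupport` and its parameters are hypothetical names, not the plugin's API.

```java
import java.util.List;

final class RefreshSupport {
    /**
     * @param automaticConfig     true when the well-known configuration is used
     * @param grantTypesSupported grant_types_supported from the discovery response, may be null
     * @param manualToggle        state of the checkbox in manual configuration
     */
    static boolean isRefreshEnabled(boolean automaticConfig,
                                    List<String> grantTypesSupported,
                                    boolean manualToggle) {
        if (automaticConfig) {
            // A missing list is treated as "no refresh_token grant advertised".
            return grantTypesSupported != null && grantTypesSupported.contains("refresh_token");
        }
        return manualToggle;
    }
}
```

Treating a missing list as unsupported is consistent with OpenID Connect Discovery, where the default for an omitted `grant_types_supported` does not include `refresh_token`.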

Fixes: #100

Testing done

Submitter checklist


@krezovic krezovic requested a review from a team as a code owner May 1, 2024 19:56
@michael-doubez michael-doubez linked an issue May 1, 2024 that may be closed by this pull request
@michael-doubez michael-doubez added enhancement draft Work in progress labels May 1, 2024
@krezovic krezovic force-pushed the token_refresh branch 4 times, most recently from e737987 to 1a5e4f2 Compare May 1, 2024 22:31

codecov bot commented May 1, 2024

Codecov Report

Attention: Patch coverage is 62.58503% with 55 lines in your changes missing coverage. Please review.

Project coverage is 71.81%. Comparing base (bca3705) to head (f7dcfd0).
Report is 11 commits behind head on master.

| Files | Patch % | Lines |
|---|---|---|
| ...va/org/jenkinsci/plugins/oic/OicSecurityRealm.java | 63.02% | 30 Missing and 14 partials ⚠️ |
| ...java/org/jenkinsci/plugins/oic/OicCredentials.java | 64.00% | 8 Missing and 1 partial ⚠️ |
| ...gins/oic/WellKnownOpenIDConfigurationResponse.java | 33.33% | 1 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@             Coverage Diff              @@
##             master     #310      +/-   ##
============================================
- Coverage     72.46%   71.81%   -0.65%     
- Complexity      201      233      +32     
============================================
  Files             9       11       +2     
  Lines           839      990     +151     
  Branches        119      143      +24     
============================================
+ Hits            608      711     +103     
- Misses          170      201      +31     
- Partials         61       78      +17     

☔ View full report in Codecov by Sentry.

@krezovic
Contributor Author

krezovic commented May 1, 2024

Hi @michael-doubez, this is a working draft where the most basic and common functionality works:

  • Added "Enable/disable refresh token" support: a simple checkbox for manual configuration, detected for automatic configuration by querying the "grant_types_supported" attribute of the discovery response
  • Added "strict" mode (disabled by default), which controls whether an expired access token with a missing refresh token, or a failed refresh request (for whatever reason), results in a "401" or is simply ignored
  • Added a configurable clock skew that extends the lifetime of an access token by a pre-configured number of seconds to compensate for time sync issues. Defaults to 60 seconds
  • Any updated roles/groups are reflected on every token refresh, as they should be
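
The clock-skew point in the list above can be sketched as follows. This is a minimal illustration, assuming the expiry is stored as epoch milliseconds; `TokenClock`, the parameter names, and the choice of `>=` at the boundary are assumptions rather than the plugin's actual code.

```java
final class TokenClock {
    /**
     * Returns true once the current time has passed the token's expiry
     * plus the allowed clock skew.
     */
    static boolean isExpired(long expiresAtMillis, long nowMillis, long allowedSkewSeconds) {
        return nowMillis >= expiresAtMillis + allowedSkewSeconds * 1000L;
    }
}
```

With the default skew of 60 seconds, a token whose nominal expiry has just passed is still accepted for up to one more minute.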

Before I move on to updating the documentation and writing tests, I'd like some feedback on whether the approach is correct. I have also noticed a few further points that I am not sure whether, or how, to handle.

  • First and most obvious point: In the build pipeline there is a warning that "accessToken" and "refreshToken" look sensitive and may be stored in plain text files. I have no idea where the SecurityContext is stored, so do you think we should mark the tokens as credentials?
  • There is a concurrency issue: two parallel requests will trigger two token refresh requests at the same time. I am unable to find a good place for a synchronization object that does not involve manual cleanup. Should I take care of this? If yes, do you know of any place where I could keep a synchronization primitive, such as the timestamp of the first request that triggered the refresh, so that subsequent parallel requests skip the refresh because it is "being handled"? Is there a "self-cleaning" cache used elsewhere in the project that I could hook into?

Just for fun: without any activity on my part, I keep my Jenkins tab open and the UI triggers a lot of requests every x seconds, resulting in at least 2 token refresh requests.

image
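
One possible shape for the "skip if a refresh is already being handled" idea raised in the concurrency point is sketched below. It is not what the PR implements; `RefreshGate` and `runIfIdle` are hypothetical names, and keying by a user id string is an assumption.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class RefreshGate {
    // User ids with a refresh currently running; entries are removed when the refresh ends.
    private final Set<String> inProgress = ConcurrentHashMap.newKeySet();

    /**
     * Runs the refresh unless one is already running for the same user.
     * Returns true if this call performed the refresh, false if it was skipped.
     */
    boolean runIfIdle(String userId, Runnable refresh) {
        if (!inProgress.add(userId)) {
            return false; // another request is already refreshing for this user
        }
        try {
            refresh.run();
            return true;
        } finally {
            inProgress.remove(userId); // no manual cleanup needed
        }
    }
}
```

A skipped request would carry on with the old token, so this only fits if the caller tolerates that for the short time the first refresh takes.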

  • When a Token is expired, no refresh can be performed, and strict mode is on, should we simply redirect the user to the logout URL or do we clean up the session manually and simply "halt"?

Any additional feedback is welcome.

@michael-doubez
Contributor

Hello,

Thanks for the work. You can ignore the warnings about secret leaking, they are mostly irrelevant.
I'll check the concurrency issue but if the information is local to a user context, there shouldn't be any issue.

There are a lot of features in the PR. I would break them down into:

  • handle session timeout: if refresh/offline token is unavailable or expired, how to clear session information and request re-login
  • handle additional (refresh) token lifetime: how to request it if available, as you did. It should be stored in the session information so that it is available globally; it may also be revoked at logout because its lifetime is typically longer than the access token's
  • handle refresh of user info

I should be able to review in depth and (hopefully) provide relevant feedback next Sunday.

@krezovic
Contributor Author

krezovic commented May 2, 2024

Hello,

I believe I have handled most of the use cases except the logout. The question remains how we proceed.

Answering the points in the order they were raised

  • Easiest way to do this is to request log out at IdP. This is an already supported feature. This will invalidate the refresh token, and next refresh will fail. This use-case has been handled and works without clearing the Jenkins related session stuff (as of yet - we have to agree on how to proceed)
  • Refresh tokens may be extended indefinitely as long as the session is active on the IdP. I wouldn't go beyond what was done here, as described in step 1: the refresh request returns "invalid_grant" -> assume the RT is expired or revoked -> destroy the user session. The user may have been logged out of the IdP elsewhere, and detecting this is more bothersome than I'd like to go into, but there are standards on how to handle this scenario (see below)
  • Refresh of user info and ID token has already been incorporated in the MR and I re-used the existing userinfo request code and IdToken decoding (without the validations - but that can be re-used if needed, as well)
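
The "invalid_grant" handling mentioned in the list above could look roughly like this. The error code itself is the standard OAuth 2.0 one; `RefreshErrors`, `Action`, and the decision to keep the session on every other error are assumptions made for the sketch.

```java
final class RefreshErrors {
    enum Action { END_SESSION, KEEP_SESSION }

    /**
     * Maps the "error" field of a failed refresh response to an action.
     * invalid_grant is taken to mean the refresh token is expired or revoked.
     */
    static Action onRefreshError(String oauthErrorCode) {
        return "invalid_grant".equals(oauthErrorCode)
                ? Action.END_SESSION
                : Action.KEEP_SESSION; // e.g. transient server errors; an assumption here
    }
}
```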

Back to the first point, we can implement additional features if needed, namely

@michael-doubez
Contributor

@krezovic the change looks good to me.

What bothers me most is the idtoken kept in the session information. It should be removed and kept only in the credentials but that can be ironed out later.

@krezovic krezovic force-pushed the token_refresh branch 2 times, most recently from fdea1a2 to 063ec59 Compare June 10, 2024 19:21
@krezovic krezovic changed the title [WIP] Implement Token Expiry Check and Refresh using Refresh Tokens Implement Token Expiry Check and Refresh using Refresh Tokens Jun 10, 2024
@krezovic
Contributor Author

Thanks for the feedback @michael-doubez. I have removed IdToken from Session and moved it strictly to SecurityContext.

Thank you @jglick for spotting missing Serializable and for bumping this MR as I completely forgot about it.

With that said and done, the MR is now ready for review and hopefully merging.

@krezovic
Contributor Author

krezovic commented Jun 13, 2024

Thanks to @jglick for brief explanation on what I've missed and the original purpose of issue #100.

The plugin will now serialize OIDC tokens into User object so they are valid when accessing the instance with API Keys.

Logging in with an API key will also trigger a token refresh if the AT has expired. If the RT has also expired or has been revoked, the request made with the API key will correctly return 401.

The downside is that all API keys automatically become invalid when the "Log out from OIDC Provider" option is selected. For this scenario, I have nullified all token values, so no token refresh will be attempted but the expiration logic will still correctly detect an expired token.
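
To make the "nullified tokens, kept expiry" idea concrete, here is a small stand-in for the stored credentials. `StoredTokens` is a hypothetical class written for this illustration; it is not the plugin's `OicCredentials` and only mirrors the behaviour described above.

```java
final class StoredTokens {
    private final String accessToken;   // may be null after logout at the IdP
    private final String refreshToken;  // may be null after logout at the IdP
    private final long expiresAtMillis;

    StoredTokens(String accessToken, String refreshToken, long expiresAtMillis) {
        this.accessToken = accessToken;
        this.refreshToken = refreshToken;
        this.expiresAtMillis = expiresAtMillis;
    }

    /** Copy that drops all token values but keeps the expiry time. */
    StoredTokens withoutTokens() {
        return new StoredTokens(null, null, expiresAtMillis);
    }

    /** A refresh can only be attempted while a refresh token is stored. */
    boolean canRefresh() {
        return refreshToken != null;
    }

    boolean isExpired(long nowMillis) {
        return nowMillis >= expiresAtMillis;
    }
}
```

After `withoutTokens()`, an API-key request made past the expiry time still sees an expired token, but there is nothing left to refresh with.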

Edit: I am not sure how to write a test for this scenario. I am unable to find any documentation for public REST API that may return 401/403 for API Tokens with Jenkins Test Instance (I'd expect at least a "whoami" request to exist). Any help on this topic is welcome.

@mikecirioli
Contributor

I've been playing with this in a local cluster and the easiest way i've found is to:

  • Create a job or folder that your test user should be able to access
  • set the AT expiration to something short (60s)
  • use the cli command list-jobs (using an access token) - verify that you can see the job
  • wait 60 seconds (or manually invalidate the access token)
  • use the cli command list-jobs (using an access token) - verify that you now get a 403 type response
    You can also use the HTTP api to do the same list-jobs check
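
For the HTTP variant of the check above, a request authenticated with a Jenkins API token can be built as in this sketch. Jenkins accepts API tokens via HTTP Basic authentication and exposes its remote API under `/api/json`; the class and method names, base URL, and credentials are placeholders, and the sketch only builds the request without sending it.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class ApiTokenProbe {
    /** Builds a GET request for the Jenkins root API, authenticated as user:apiToken. */
    static HttpRequest listJobsRequest(String baseUrl, String user, String apiToken) {
        String basic = Base64.getEncoder()
                .encodeToString((user + ":" + apiToken).getBytes(StandardCharsets.UTF_8));
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/json"))
                .header("Authorization", "Basic " + basic)
                .GET()
                .build();
    }
}
```

Sending this with `java.net.http.HttpClient` before and after the access token expires should, per the steps above, show the job list first and a 403-type response afterwards.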

Contributor

@michael-doubez michael-doubez left a comment

LGTM

@krezovic
Contributor Author

> I've been playing with this in a local cluster and the easiest way i've found is to:
>
>   • Create a job or folder that your test user should be able to access
>   • set the AT expiration to something short (60s)
>   • use the cli command list-jobs (using an access token) - verify that you can see the job
>   • wait 60 seconds (or manually invalidate the access token)
>   • use the cli command list-jobs (using an access token) - verify that you now get a 403 type response
>     You can also use the HTTP api to do the same list-jobs check

Hi. I was asking more in the context of the "integration test" Jenkins instance (in PluginTest, which uses JenkinsRule). That is why I said I'd prefer a REST API, so that I do not have to configure or enable security manually on the test instance; I'm not sure what the consequences would be.

Do note that I can easily call the HTML page users/testUser (which is the uid of the test user that's created), but this works both with authentication and without any authentication - so I'm kinda in conflict.

@mikecirioli
Contributor

mikecirioli commented Jun 14, 2024

I'll try to take a look later today or this weekend and see if I can help contribute a test. So far, I have not been able to get the plugin to refresh my user's authorities after I removed them from the IdP, and the access token keeps working.

Strict token expiration does successfully revoke jenkins api token access when the primary token expires.

UPDATE:

I can now see that the granted authorities change when i remove the user from the org. I need to tweak my auth strategy in jenkins to fully test though because the token requests still show authenticated which i'm not sure is what i expect.

if (this.logoutFromOpenidProvider && !Strings.isNullOrEmpty(this.endSessionEndpoint)) {
    // This ensures that token will be expired at the right time with API Key calls, but no refresh can be
    // made.
    user.addProperty(new OicCredentials(null, null, null, credentials.getExpiresAtMillis()));
Contributor

Invalidating the refresh token here seems to cause jenkins to fall back to lastGrantedAuthorities for the user when doing token based access. This means (afaict in my testing) that tokens will still be effectively valid even if that access was removed at the idp level. The new authorities are only propagated to jenkins if the user (whose access was removed) attempts to log back in again.

If the refreshToken is left persisted in the user properties then it will be used to refresh the API token (once the id token has expired) and at that point it will get the updated authorities for the user associated with the token. For example, if their access to the particular jenkins instance was updated at the IdP then this will correctly get reflected in their jenkins user's grantedAuthorities

Contributor

hmm, it seems that if the refresh token is saved then the next time the user tries to log directly into the instance, that token will be used, the credentials refreshed, and the user will not be prompted to log back in at the idp, which doesn't really seem ideal either.

Contributor Author

> Invalidating the refresh token here seems to cause jenkins to fall back to lastGrantedAuthorities for the user when doing token based access. This means (afaict in my testing) that tokens will still be effectively valid even if that access was removed at the idp level. The new authorities are only propagated to jenkins if the user (whose access was removed) attempts to log back in again.
>
> If the refreshToken is left persisted in the user properties then it will be used to refresh the API token (once the id token has expired) and at that point it will get the updated authorities for the user associated with the token. For example, if their access to the particular jenkins instance was updated at the IdP then this will correctly get reflected in their jenkins user's grantedAuthorities

This is only true if "strict expiration check" is disabled - which is default at the moment. Once it's changed to require refresh token by default, user will be greeted with 403.

Contributor Author

> hmm, it seems that if the refresh token is saved then the next time the user tries to log directly into the instance, that token will be used, the credentials refreshed, and the user will not be prompted to log back in at the idp, which doesn't really seem ideal either.

I'm not sure why you see this as a problem. This is exactly the purpose of a refresh token. However, if the user logs out at the IdP, the refresh token will be invalidated and the user will be greeted with a 403/redirect to login.

@krezovic krezovic force-pushed the token_refresh branch 2 times, most recently from 75ee08a to b7cf91c Compare June 23, 2024 13:18
michael-doubez previously approved these changes Jun 24, 2024
Contributor

@michael-doubez michael-doubez left a comment

LGTM

Contributor

@mikecirioli mikecirioli left a comment

Looks good to me, thanks for the discussion

@mikecirioli
Contributor

@krezovic what else needs to be done in order to merge this PR? Is there something you need a hand with or are you just waiting for my feedback?

@michael-doubez
Contributor

@krezovic conflicts need to be resolved.

IMHO we can merge it as it is.

@michael-doubez michael-doubez removed the draft Work in progress label Jul 3, 2024
@michael-doubez
Contributor

@krezovic I have been working on the merge, I should be able to contribute it before the end of the week

@michael-doubez
Contributor

michael-doubez commented Jul 11, 2024

See https://github.com/michael-doubez/oic-auth-plugin/tree/token_refresh

Note: some unit tests still need to be adapted/fixed

@krezovic
Contributor Author

> See https://github.com/michael-doubez/oic-auth-plugin/tree/token_refresh
>
> Note: some unit tests still need to be adapted/fixed

Hi. Thanks for doing the heavy lifting. I was already halfway through and was hoping to finish it over the weekend. Your changes to the tests really helped; I have now finished fixing the tests and took a slightly different approach from yours to passing the nullable refresh token.

Contributor

@michael-doubez michael-doubez left a comment

LGTM

@michael-doubez michael-doubez merged commit cddbd8a into jenkinsci:master Jul 15, 2024
16 of 18 checks passed
Vlatombe added a commit to Vlatombe/oic-auth-plugin that referenced this pull request Jul 19, 2024
…quired

In a CloudBees CI HA setup, I recently upgraded to 4.297.vcddb_d8a_e4694 (including jenkinsci#310). The various changes to OicSecurityRealm broke serialization due to usage of an anonymous inner class (extending OicSession) that gets stored in session.

Storing `OicSession` in session is actually only required for a brief amount of time, only between doCommenceLogin and doFinishLogin. Once the user is logged in, it is no longer necessary to store the whole object. The only thing that needs to be persisted is the state, as it gets used later for logging out.
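
One common way an anonymous inner class breaks Java serialization is by capturing its non-serializable enclosing instance. The sketch below demonstrates that mechanism with made-up classes (`SerializationDemo`, `Session`, `Realm`); whether this is exactly what happened in `OicSecurityRealm` is not stated in the commit message, so treat it as an illustration only.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class SerializationDemo {
    static class Session implements Serializable {
        private static final long serialVersionUID = 1L;
        final String state = "abc123";
        String describe() { return state; }
    }

    /** Not Serializable, standing in for an enclosing realm-like object. */
    static class Realm {
        final String name = "oic";

        Session anonymousSession() {
            // Uses Realm.this.name, so the instance holds a reference to the Realm.
            return new Session() {
                @Override
                String describe() { return name + ":" + state; }
            };
        }

        Session plainSession() { return new Session(); }
    }

    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false; // some reachable object is not Serializable
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Here `canSerialize(realm.anonymousSession())` fails because the hidden outer reference drags the `Realm` into the object graph, while `canSerialize(realm.plainSession())` succeeds.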
Successfully merging this pull request may close these issues.

OIDC logins are cached forever
4 participants