Refresh Token Concurrency issue #225

Closed
elestedt opened this issue Nov 13, 2024 · 9 comments · Fixed by #226 or #230
@elestedt

Describe the bug
When multiple clients use the same account (read: pipelines in a CI environment) and happen to refresh the token at the same time, the refresh token seems to become invalid and future refreshes fail.

To Reproduce
Steps to reproduce the behavior:

  1. Save a user refresh token
  2. Read user token to validate it properly
  3. Wait for it to have expired so that a refresh is triggered
  4. Try to read a token from two different external processes at the same time

After these steps there is a chance that the stored token has been "corrupted" and can no longer be used.
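
Step 4 can be approximated with a small script. This is only a sketch: it reuses the `user_token` path and `X-Vault-Token` header from the curl command later in this thread, but it uses threads rather than separate processes, and the helper names `read_user_token` and `concurrent_reads` are made up for illustration.

```python
import threading
import urllib.error
import urllib.request


def read_user_token(vault_addr, vault_token, username, timeout=60):
    """GET the user_token path once and return (HTTP status, body text)."""
    url = f"{vault_addr}/v1/artifactory/user_token/{username}"
    req = urllib.request.Request(url, headers={"X-Vault-Token": vault_token})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, resp.read().decode()
    except urllib.error.HTTPError as err:
        # Non-2xx answers (401, 504, ...) are returned rather than raised.
        return err.code, err.read().decode()


def concurrent_reads(fetch, n=2):
    """Call fetch() from n threads that are released together."""
    barrier = threading.Barrier(n)
    results = [None] * n

    def worker(index):
        barrier.wait()  # line the workers up so the calls overlap
        results[index] = fetch()

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

After the stored token has expired, calling `concurrent_reads(lambda: read_user_token(addr, token, user))` with your own Vault address, token, and username issues the two overlapping reads. Whether a single run hits the race depends on timing, so it may need repeating.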

Artifactory version 7.98.7
Plugin version 1.8.1
Vault version 1.17.3+ent

Expected behavior
Refresh events (when the refresh token is used to renew the generator token) should not cause the refresh token to become invalidated.

Additional context
I suspect this happens due to concurrency issues and the fact that a refresh token is invalid once it's been used (i.e. can only be used once).
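
To make the suspicion concrete, here is a toy model of it in Python. It is not the plugin's code (the plugin is not written in Python), and `FakeTokenServer`, `interleaved_refresh`, and `serialized_refresh` are hypothetical names. It models only the single-use property: a refresh token is rejected once it has been exchanged.

```python
import threading


class FakeTokenServer:
    """Hypothetical server that issues single-use refresh tokens."""

    def __init__(self):
        self._counter = 0
        self._valid_refresh = "refresh-0"

    def refresh(self, refresh_token):
        if refresh_token != self._valid_refresh:
            raise PermissionError("401: refresh token already used")
        self._counter += 1
        self._valid_refresh = f"refresh-{self._counter}"
        return f"access-{self._counter}", self._valid_refresh


def interleaved_refresh(server, store):
    """Replay the suspected race: both clients read before either writes."""
    seen_by_a = store["refresh_token"]
    seen_by_b = store["refresh_token"]  # same value, read before A writes back
    outcomes = []
    for seen in (seen_by_a, seen_by_b):
        try:
            _access, new_refresh = server.refresh(seen)
            store["refresh_token"] = new_refresh
            outcomes.append("ok")
        except PermissionError:
            outcomes.append("rejected")
    return outcomes


def serialized_refresh(server, store, lock):
    """Read, refresh, and write back as one critical section."""
    with lock:
        access, new_refresh = server.refresh(store["refresh_token"])
        store["refresh_token"] = new_refresh
        return access
```

In the interleaved replay the second caller presents a token that has already been consumed and is rejected, while holding a lock around the whole read-refresh-write sequence lets each caller see the token written by the previous one. What the real server does to the stored token after such a rejection is not modelled here.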

@elestedt elestedt added the bug Something isn't working label Nov 13, 2024
@alexhung
Member

@elestedt Thanks for the report. I'll try to reproduce this on our end.

alexhung added a commit that referenced this issue Nov 21, 2024
Add read/write mutex for refreshing access token
@elestedt
Author

elestedt commented Nov 22, 2024

I'm not sure if this new version is the issue here, but after upgrading to this version I'm seeing 504 errors that we didn't have before - and the requests never reach Artifactory

❯ curl -X GET -H "X-Vault-Token: $VAULT_TOKEN" $VAULT_ADDR/v1/artifactory/user_token/$SUSER -v
<html>
<head><title>504 Gateway Time-out</title></head>
<body>
<center><h1>504 Gateway Time-out</h1></center>
<hr><center>nginx</center>
</body>
</html>

The timeout is maybe 30s. It looks to be a vault internal error, and not something we ever got on 1.8.1.
I've omitted most of the verbose log output, but it's noteworthy that there is no proxy in between.

@alexhung
Member

@elestedt I think it's very unlikely that this is due to the changes in 1.8.2. I added a write mutex around the operations that update the Vault store to ensure concurrency is handled correctly. Nothing else was changed.

The error is from Nginx so I assume this is from Artifactory?

@elestedt
Author

Actually, the Artifactory logs show nothing at all. As far as I can see, the request never reaches Artifactory.

@alexhung
Member

@elestedt In that case you're probably right that this is Vault server related. Perhaps there are clues in the verbose/trace logs.

@elestedt
Author

elestedt commented Dec 4, 2024

@alexhung I reconfigured the entire Vault instance and things are still not working properly, but now at least I can use the plugin again.

So, I configure the plugin for a user with a refreshable token. An hour later, when that token has expired and only the refresh token is still valid, the plugin fails to renew it.

Logs:

vault-0 vault 2024-12-04T10:22:03.591Z [INFO]  secrets.artifactory.artifactory_de8b22ec.artifactory.artifactory-secrets-plugin_1.8.2_linux_amd64: fetching user token configuration: func=fetchUserTokenConfiguration path=config/user_token/felested timestamp=2024-12-04T10:22:03.591Z
vault-0 vault 2024-12-04T10:22:03.591Z [DEBUG] secrets.artifactory.artifactory_de8b22ec.artifactory.artifactory-secrets-plugin_1.8.2_linux_amd64: use checkVersion to see if access token is expired: func=RefreshExpiredAccessToken timestamp=2024-12-04T10:22:03.591Z
vault-0 vault 2024-12-04T10:22:03.591Z [DEBUG] secrets.artifactory.artifactory_de8b22ec.artifactory.artifactory-secrets-plugin_1.8.2_linux_amd64: fetching Artifactory version: func=getVersion timestamp=2024-12-04T10:22:03.591Z
vault-0 vault 2024-12-04T10:22:03.595Z [DEBUG] secrets.artifactory.artifactory_de8b22ec.artifactory.artifactory-secrets-plugin_1.8.2_linux_amd64: found Artifactory version: func=getVersion version=7.98.7 timestamp=2024-12-04T10:22:03.595Z
vault-0 vault 2024-12-04T10:22:03.596Z [DEBUG] secrets.artifactory.artifactory_de8b22ec.artifactory.artifactory-secrets-plugin_1.8.2_linux_amd64: initialize maxLeaseTTL to system value: func=pathUserTokenCreatePerform maxLeaseTTL="2.7648e+15" timestamp=2024-12-04T10:22:03.596Z
vault-0 vault 2024-12-04T10:22:03.596Z [DEBUG] secrets.artifactory.artifactory_de8b22ec.artifactory.artifactory-secrets-plugin_1.8.2_linux_amd64: using user token config MaxTTL: userTokenConfig.MaxTTL="1.44e+13" func=pathUserTokenCreatePerform timestamp=2024-12-04T10:22:03.596Z
vault-0 vault 2024-12-04T10:22:03.596Z [DEBUG] secrets.artifactory.artifactory_de8b22ec.artifactory.artifactory-secrets-plugin_1.8.2_linux_amd64: Max lease TTL (sec): func=pathUserTokenCreatePerform maxLeaseTTL="1.44e+13" timestamp=2024-12-04T10:22:03.596Z
vault-0 vault 2024-12-04T10:22:03.596Z [DEBUG] secrets.artifactory.artifactory_de8b22ec.artifactory.artifactory-secrets-plugin_1.8.2_linux_amd64: using user config DefaultTTL: func=pathUserTokenCreatePerform userTokenConfig.DefaultTTL="3.6e+12" timestamp=2024-12-04T10:22:03.596Z
vault-0 vault 2024-12-04T10:22:03.596Z [DEBUG] secrets.artifactory.artifactory_de8b22ec.artifactory.artifactory-secrets-plugin_1.8.2_linux_amd64: TTL (sec): func=pathUserTokenCreatePerform ttl="3.6e+12" timestamp=2024-12-04T10:22:03.596Z
vault-0 vault 2024-12-04T10:22:03.596Z [DEBUG] secrets.artifactory.artifactory_de8b22ec.artifactory.artifactory-secrets-plugin_1.8.2_linux_amd64: fetching Artifactory version: func=getVersion timestamp=2024-12-04T10:22:03.596Z
vault-0 vault 2024-12-04T10:22:03.602Z [DEBUG] secrets.artifactory.artifactory_de8b22ec.artifactory.artifactory-secrets-plugin_1.8.2_linux_amd64: found Artifactory version: func=getVersion version=7.98.7 timestamp=2024-12-04T10:22:03.602Z

Plugin config:

vault write artifactory/config/admin url=https://our-hostname
vault write artifactory/config/admin use_expiring_tokens=true
vault write artifactory/config/admin default_description="Generated by Vault" max_ttl=14400 default_ttl=3600
vault write artifactory/config/user_token use_expiring_tokens=true
vault write artifactory/config/user_token default_description="Generated by Vault" max_ttl=14400 default_ttl=3600
vault write artifactory/config/user_token scope="applied-permissions/groups:*"
vault write artifactory/config/user_token audience="jfrt@* jfxr@*"
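
A note on reading the TTL values in the logs against this config: the logged numbers look like nanosecond durations, even though two of the messages are labelled "(sec)". That is an assumption based on the magnitudes, not something confirmed in this thread. A quick conversion check (the helper name `ns_to_timedelta` is made up):

```python
from datetime import timedelta

NANOSECONDS_PER_SECOND = 1_000_000_000


def ns_to_timedelta(nanoseconds):
    """Convert a nanosecond count, as it appears in the logs, to a timedelta."""
    return timedelta(seconds=nanoseconds / NANOSECONDS_PER_SECOND)


# Values copied from the log lines above.
logged = {
    "system maxLeaseTTL": 2.7648e15,
    "userTokenConfig.MaxTTL": 1.44e13,
    "userTokenConfig.DefaultTTL": 3.6e12,
}

for name, value in logged.items():
    print(f"{name}: {ns_to_timedelta(value)}")
```

Under that assumption, `1.44e+13` and `3.6e+12` correspond to the configured `max_ttl=14400` and `default_ttl=3600` seconds, and `2.7648e+15` is 768 hours, so the TTL settings themselves appear to be picked up as configured.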

Command used to get token:

curl -X GET -H "X-Vault-Token: $VAULT_TOKEN" $VAULT_ADDR/v1/artifactory/user_token/$SUSER

It seems pretty clear that it fails to refresh the expired token using the refresh token; why, I cannot say.

@alexhung alexhung reopened this Dec 4, 2024
@elestedt
Author

elestedt commented Dec 4, 2024

I did manage to find this request in our Artifactory logs: just a single request to the token endpoint, and it's rejected with a 401. I tried to capture the traffic but didn't manage to (it's encrypted).

@elestedt
Author

elestedt commented Dec 6, 2024

After upgrading prod to 1.8.2, more and more users are starting to report Context Deadline Exceeded responses from Vault itself. It seems to be a deadlock in the mutex that was added. There are no logs in the plugin when this happens, and it's impossible to do anything against the plugin. The rest of Vault works fine.

@alexhung
Member

alexhung commented Dec 6, 2024

@elestedt In my testing, I ran into an issue (one that wasn't present when I implemented the token refresh) where, soon after the token expired, requests to the Artifactory API still worked (i.e. no token expired error) but requests to the Access API failed. The delay in token expiration between the Access and Artifactory services lasted about 1 min, so the check using the Artifactory version API passed while the create token call against Access failed.

This accounts for the behavior you reported, so I'm switching from the version API to the role API (which is served by the Access service). Hopefully the delay will not be present there, making the check more reliable.
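
The effect of that delay can be shown with a toy model. This Python sketch is not the plugin's code, and `FakeService` and `needs_refresh` are hypothetical names; the only number taken from this thread is the roughly one minute lag.

```python
class FakeService:
    """Hypothetical service that keeps accepting a token for `lag_seconds` past expiry."""

    def __init__(self, lag_seconds):
        self.lag_seconds = lag_seconds

    def accepts(self, token_expires_at, now):
        return now < token_expires_at + self.lag_seconds


def needs_refresh(probe_service, token_expires_at, now):
    """Treat the token as expired only if the probed service rejects it."""
    return not probe_service.accepts(token_expires_at, now)


# Assumed lags: about 1 min on the Artifactory side, none on the Access side.
artifactory = FakeService(lag_seconds=60)
access = FakeService(lag_seconds=0)

expires_at = 1_000  # arbitrary timestamp in seconds
now = expires_at + 30  # inside the lag window

# Probing Artifactory says "still valid", yet Access already rejects the token.
probe_artifactory = needs_refresh(artifactory, expires_at, now)
probe_access = needs_refresh(access, expires_at, now)
create_token_works = access.accepts(expires_at, now)
```

Inside the lag window, `probe_artifactory` is `False` while `create_token_works` is also `False`, so no refresh is attempted and the token request fails. Probing the same service that creates the token (`probe_access` is `True`) triggers the refresh first.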

Since the issue is unrelated to the mutex I added, the mutex will be removed along with the API switch.

alexhung added a commit that referenced this issue Dec 9, 2024
Switched to use Access API for checking expired token