Automated copy feature/OIDC oauth groups #1199

Open
wants to merge 44 commits into
base: master
25d1654
added plan
uwwint Aug 20, 2024
96300d4
Use idp groups
flashguerdon Aug 27, 2024
d8d0955
remove print
flashguerdon Aug 28, 2024
5dfdd7b
tests updates
flashguerdon Aug 28, 2024
05c04c8
test fixes
flashguerdon Aug 28, 2024
724b76f
Code update and Unit Test
flashguerdon Aug 28, 2024
dd8efdd
merge upstream
uwwint Sep 11, 2024
388cf69
check group flag
flashguerdon Sep 12, 2024
d77cec3
testing
flashguerdon Sep 12, 2024
bcbcee5
Unit Test updates
flashguerdon Sep 12, 2024
c5c0519
token refreshing
flashguerdon Sep 13, 2024
5ec4fb7
refresh token update
flashguerdon Sep 16, 2024
6f30aeb
Renaming to AccessTokenUpdater, unit testing
flashguerdon Sep 17, 2024
10c59ec
fixed tests
flashguerdon Sep 23, 2024
d3419d5
Updated tests
flashguerdon Sep 23, 2024
3692d79
Use OAuth user groups and implementation of token refresh in fence.
flashguerdon Sep 24, 2024
2984f05
Update fence/resources/openid/idp_oauth2.py
flashguerdon Oct 7, 2024
b125321
Code revision and refactoring
flashguerdon Oct 7, 2024
18336df
Remove new Arborist client instance
flashguerdon Oct 13, 2024
4645494
Re-add Arborist client, to fix fence_create update-visas job
flashguerdon Oct 14, 2024
7008e94
2nd revision
flashguerdon Oct 31, 2024
d0074f5
check group sync config on startup
flashguerdon Nov 6, 2024
55cfdc4
added test for generic3
flashguerdon Nov 7, 2024
ab7dcfa
Add link to user.yaml guide
paulineribeyre Oct 18, 2024
8520425
feat: config with option to allow only existing OR active users to login
pieterlukasse Jul 2, 2024
7774fc9
feat: remove unnecessary else
pieterlukasse Oct 24, 2024
7300d99
fix: remove unnecessary code
pieterlukasse Nov 4, 2024
3869656
feat: add extra fields to /admin/user POST endpoint
pieterlukasse Sep 6, 2024
39d5217
fix: fix tests/admin
pieterlukasse Oct 14, 2024
ab6e17d
feat: add extra debug logging to create_user method
pieterlukasse Oct 15, 2024
f0f9d28
fix: add session.commit() to create_user
pieterlukasse Oct 15, 2024
0d72ec7
fix: store tags and add unit test for tags and new fields
pieterlukasse Oct 18, 2024
659bf5a
feat: update dependencies
pieterlukasse Oct 22, 2024
d791800
fix: add docstring to new test
pieterlukasse Nov 4, 2024
f1b8e31
feat: improve unit test checks on error messages
pieterlukasse Nov 4, 2024
bd28aba
Fix/bucket name (#1193)
mfshao Oct 28, 2024
67c318c
Update documentation link in setup.md (#1194)
ocshawn Oct 30, 2024
436c08a
feat: udpate admin_login_required decorator
pieterlukasse Oct 17, 2024
d546b02
fix: update /admin/user tests to mock arborist call
pieterlukasse Nov 5, 2024
cd810a1
feat: add rainy path test for when arborist check fails
pieterlukasse Nov 5, 2024
a5ccac6
Merge branch 'master' into feature/oidc-oauth-groups
flashguerdon Nov 8, 2024
90667c3
reverted host option
flashguerdon Nov 10, 2024
3cb11a7
Merge branch 'master' into feature/oidc-oauth-groups
flashguerdon Nov 12, 2024
4a206ae
Merge branch 'master' into automatedCopy-feature/oidc-oauth-groups
Avantol13 Dec 19, 2024
3 changes: 3 additions & 0 deletions .gitignore
Expand Up @@ -108,3 +108,6 @@ tests/resources/keys/*.pem
.DS_Store
.vscode
.idea

# snyk
.dccache
1 change: 1 addition & 0 deletions fence/__init__.py
Expand Up @@ -470,6 +470,7 @@ def _setup_oidc_clients(app):
logger=logger,
HTTP_PROXY=config.get("HTTP_PROXY"),
idp=settings.get("name") or idp.title(),
arborist=app.arborist,
)
clean_idp = idp.lower().replace(" ", "")
setattr(app, f"{clean_idp}_client", client)
Expand Down
188 changes: 184 additions & 4 deletions fence/blueprints/login/base.py
@@ -1,8 +1,12 @@
import time
import base64
import json
from urllib.parse import urlparse, urlencode, parse_qsl
import jwt
import requests
import flask
from cdislogging import get_logger
from flask_restful import Resource
from urllib.parse import urlparse, urlencode, parse_qsl

from fence.auth import login_user
from fence.blueprints.login.redirect import validate_redirect
from fence.config import config
Expand All @@ -20,7 +24,7 @@ def __init__(self, idp_name, client):
Args:
idp_name (str): name for the identity provider
client (fence.resources.openid.idp_oauth2.Oauth2ClientBase):
Some instaniation of this base client class or a child class
Some instantiation of this base client class or a child class
"""
self.idp_name = idp_name
self.client = client
Expand Down Expand Up @@ -92,8 +96,27 @@ def __init__(
self.is_mfa_enabled = "multifactor_auth_claim_info" in config[
"OPENID_CONNECT"
].get(self.idp_name, {})

# Config option to explicitly persist refresh tokens
self.persist_refresh_token = False

self.read_authz_groups_from_tokens = False

self.app = app

# This block of code probably needs to be made more concise
Review comment (Contributor Author): Can we do this cleanup before we merge?

if "persist_refresh_token" in config["OPENID_CONNECT"].get(self.idp_name, {}):
self.persist_refresh_token = config["OPENID_CONNECT"][self.idp_name][
"persist_refresh_token"
]

if "is_authz_groups_sync_enabled" in config["OPENID_CONNECT"].get(
self.idp_name, {}
):
self.read_authz_groups_from_tokens = config["OPENID_CONNECT"][
self.idp_name
]["is_authz_groups_sync_enabled"]

def get(self):
# Check if user granted access
if flask.request.args.get("error"):
Expand All @@ -119,7 +142,11 @@ def get(self):

code = flask.request.args.get("code")
result = self.client.get_auth_info(code)

refresh_token = result.get("refresh_token")

username = result.get(self.username_field)

if not username:
raise UserError(
f"OAuth2 callback error: no '{self.username_field}' in {result}"
Expand All @@ -129,11 +156,157 @@ def get(self):
id_from_idp = result.get(self.id_from_idp_field)

resp = _login(username, self.idp_name, email=email, id_from_idp=id_from_idp)
self.post_login(user=flask.g.user, token_result=result, id_from_idp=id_from_idp)

expires = self.extract_exp(refresh_token)

# if the refresh token is not a JWT, or does not carry exp,
# default to now + REFRESH_TOKEN_EXPIRES_IN
if expires is None:
Review comment (Contributor Author): We should add a log here so it's clear in an audit trail that we assumed a specific expiration.

expires = int(time.time()) + config["REFRESH_TOKEN_EXPIRES_IN"]

# Store refresh token in db
should_persist_token = (
self.persist_refresh_token or self.read_authz_groups_from_tokens
)
if should_persist_token:
# Ensure flask.g.user exists to avoid a potential AttributeError
if getattr(flask.g, "user", None):
self.client.store_refresh_token(flask.g.user, refresh_token, expires)
else:
logger.error(
"User information is missing from flask.g; cannot store refresh token."
)

self.post_login(
user=flask.g.user,
token_result=result,
id_from_idp=id_from_idp,
)

return resp

def extract_exp(self, refresh_token):
"""
Extract the expiration time (`exp`) from a refresh token.

This function attempts to retrieve the expiration time from the provided
refresh token using three methods:

1. Using PyJWT to decode the token (without signature verification).
2. Introspecting the token (if supported by the identity provider).
3. Manually base64 decoding the token's payload (if it's a JWT).

**Disclaimer:** This function assumes that the refresh token is valid and
does not perform any JWT validation. For JWTs from an OpenID Connect (OIDC)
provider, validation should be done using the public keys provided by the
identity provider (from the JWKS endpoint) before using this function to
extract the expiration time. Without validation, the token's integrity and
authenticity cannot be guaranteed, which may expose your system to security
risks. Ensure validation is handled prior to calling this function,
especially in any public or production-facing contexts.

Args:
refresh_token (str): The JWT refresh token from which to extract the expiration.

Returns:
int or None: The expiration time (`exp`) in seconds since the epoch,
or None if extraction fails.
"""

# Method 1: PyJWT
try:
# Skipping keys since we're not verifying the signature
decoded_refresh_token = jwt.decode(
refresh_token,
options={
"verify_aud": False,
"verify_at_hash": False,
"verify_signature": False,
},
algorithms=["RS256", "HS512"],
)
exp = decoded_refresh_token.get("exp")

if exp is not None:
return exp
except Exception as e:
logger.info(f"Refresh token expiry: Method (PyJWT) failed: {e}")

# Method 2: Introspection
try:
introspection_response = self.introspect_token(refresh_token)
exp = introspection_response.get("exp")

if exp is not None:
return exp
except Exception as e:
logger.info(f"Refresh token expiry: Method Introspection failed: {e}")

# Method 3: Manual base64 decoding
try:
# Assuming the token is a JWT (header.payload.signature)
payload_encoded = refresh_token.split(".")[1]
# Add necessary padding for base64 decoding
payload_encoded += "=" * (-len(payload_encoded) % 4)
payload_decoded = base64.urlsafe_b64decode(payload_encoded)
payload_json = json.loads(payload_decoded)
exp = payload_json.get("exp")

if exp is not None:
return exp
except Exception as e:
logger.info(f"Refresh token expiry: Method (manual decoding) failed: {e}")

# If all methods fail, return None
return None

def introspect_token(self, token):
"""Introspects an access token to determine its validity and retrieve associated metadata.

This method sends a POST request to the introspection endpoint specified in the OpenID
discovery document. The request includes the provided token and client credentials,
allowing verification of the token's validity and retrieval of any additional metadata
(e.g., token expiry, scopes, or user information).

Args:
token (str): The access token to be introspected.

Returns:
dict or None: A dictionary containing the token's introspection data if the request
is successful and the response status code is 200. If the introspection fails or an
exception occurs, returns None.

Note:
Exceptions are not raised to the caller; any error during introspection is logged and None is returned.
"""
try:
introspect_endpoint = self.client.get_value_from_discovery_doc(
"introspection_endpoint", ""
)

# Headers and payload for the introspection request
headers = {"Content-Type": "application/x-www-form-urlencoded"}
data = {
"token": token,
"client_id": flask.session.get("client_id"),
"client_secret": flask.session.get("client_secret"),
}

response = requests.post(introspect_endpoint, headers=headers, data=data)

if response.status_code == 200:
return response.json()
else:
logger.info(f"Error introspecting token: {response.status_code}")
return None

except Exception as e:
logger.info(f"Error introspecting token: {e}")
return None

def post_login(self, user=None, token_result=None, **kwargs):
prepare_login_log(self.idp_name)

metrics.add_login_event(
user_sub=flask.g.user.id,
idp=self.idp_name,
Expand All @@ -142,6 +315,13 @@ def post_login(self, user=None, token_result=None, **kwargs):
client_id=flask.session.get("client_id"),
)

# this attribute is only applicable to some OAuth clients
# (e.g., not all clients enable is_authz_groups_sync_enabled)
if self.read_authz_groups_from_tokens:
self.client.update_user_authorization(
user=user, pkey_cache=None, db_session=None, idp_name=self.idp_name
)

if token_result:
username = token_result.get(self.username_field)
if self.is_mfa_enabled:
Expand Down
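For reviewers, the manual-decoding fallback in `extract_exp` and the default-expiry rule in `get` can be looked at in isolation. The following is only a minimal sketch under stated assumptions, not the PR's code: the function names `exp_from_unverified_jwt` and `expiry_or_default` are hypothetical, the token is assumed to be JWT-shaped (`header.payload.signature`), and no signature validation is performed, so the result should be treated as untrusted.

```python
import base64
import json
import time


def exp_from_unverified_jwt(token):
    """Return the `exp` claim of a JWT-shaped token without verifying it, or None.

    Hypothetical helper for illustration; it performs NO signature validation.
    """
    try:
        payload_b64 = token.split(".")[1]
        # Pad to a multiple of 4; adds nothing when the length is already aligned
        payload_b64 += "=" * (-len(payload_b64) % 4)
        payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    except (AttributeError, IndexError, ValueError):
        # None token, opaque (non-JWT) token, or a payload that is not base64/JSON
        return None
    return payload.get("exp") if isinstance(payload, dict) else None


def expiry_or_default(token, default_lifetime_seconds, now=None):
    """Use the token's own `exp` if present, else now + default lifetime."""
    exp = exp_from_unverified_jwt(token)
    if exp is not None:
        return exp
    now = int(time.time()) if now is None else now
    return now + default_lifetime_seconds
```

Passing `now` explicitly keeps the fallback deterministic in unit tests; in the diff the default lifetime would come from `config["REFRESH_TOKEN_EXPIRES_IN"]`.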
19 changes: 19 additions & 0 deletions fence/config-default.yaml
Expand Up @@ -94,6 +94,7 @@ DB_MIGRATION_POSTGRES_LOCK_KEY: 100
# - WARNING: Be careful changing the *_ALLOWED_SCOPES as you can break basic
# and optional functionality
# //////////////////////////////////////////////////////////////////////////////////////

OPENID_CONNECT:
# any OIDC IDP that does not differ from the generic implementation can be
# configured without code changes
Expand All @@ -115,6 +116,24 @@ OPENID_CONNECT:
multifactor_auth_claim_info: # optional, include if you're using arborist to enforce mfa on a per-file level
claim: '' # claims field that indicates mfa, either the acr or amr claim.
values: [ "" ] # possible values that indicate mfa was used. At least one value configured here is required to be in the token
# When true, refresh tokens are stored even if is_authz_groups_sync_enabled is set to false.
# When false, refresh tokens are stored only if is_authz_groups_sync_enabled is true.
persist_refresh_token: false
# is_authz_groups_sync_enabled: A configuration flag that determines whether the application should
# verify and synchronize user group memberships between the identity provider (IdP)
# and the local authorization system (Arborist). When enabled, the refresh token is stored and the
# system retrieves the user's group information from the token issued by the IdP, then compares it
# against the groups defined in the local system. Based on the comparison, the user is added to
# or removed from the relevant groups in the local system so that their group memberships
# remain up to date. If this flag is disabled, no group synchronization occurs.
is_authz_groups_sync_enabled: true
authz_groups_sync:
# This defines the prefix used to identify authorization groups.
group_prefix: "some_prefix"
# This flag indicates whether the audience (aud) claim in the JWT should be verified during token validation.
verify_aud: true
# This specifies the expected audience (aud) value for the JWT, ensuring that the token is intended for use with the 'fence' service.
audience: fence
# These Google values must be obtained from Google's Cloud Console
# Follow: https://developers.google.com/identity/protocols/OpenIDConnect
#
Expand Down
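The comparison described in the `is_authz_groups_sync_enabled` comment (IdP groups versus local groups, then add or remove) can be illustrated with plain sets. This is a sketch of one way such a plan could be computed, not the implementation in this PR: `plan_group_sync` is a hypothetical name, and the handling of `group_prefix` (keep only token groups that start with the prefix, then strip it) is an assumption made for the example.

```python
def plan_group_sync(token_groups, local_groups, group_prefix):
    """Return (groups_to_add, groups_to_remove) for one user.

    Hypothetical helper. Assumption: only token groups starting with
    `group_prefix` count as authorization groups, and the prefix (plus any
    leading slash that follows it) is stripped to get the local group name.
    """
    wanted = {
        group[len(group_prefix):].lstrip("/")
        for group in token_groups
        if group.startswith(group_prefix)
    }
    wanted.discard("")  # a token group equal to the bare prefix names nothing
    current = set(local_groups)
    # Add what the IdP grants but the local system lacks; remove the reverse
    return sorted(wanted - current), sorted(current - wanted)


to_add, to_remove = plan_group_sync(
    token_groups=["/covid/a", "/covid/b", "/other/c"],
    local_groups=["b", "d"],
    group_prefix="/covid",
)
```

With these inputs the user would be added to `a` and removed from `d`; `/other/c` is ignored because it does not carry the prefix.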
7 changes: 7 additions & 0 deletions fence/config.py
Expand Up @@ -145,6 +145,13 @@ def post_process(self):
f"IdP '{idp_id}' is using multifactor_auth_claim_info '{mfa_info['claim']}', which is neither AMR or ACR. Unable to determine if a user used MFA. Fence will continue and assume they have not used MFA."
)

groups_sync_enabled = idp.get("is_authz_groups_sync_enabled", False)
# when is_authz_groups_sync_enabled is true, authz_groups_sync (including group_prefix) must be provided
if groups_sync_enabled and not idp.get("authz_groups_sync"):
error = f"Error: is_authz_groups_sync_enabled is true but authz_groups_sync is not configured for idp: {idp_id}"
logger.error(error)
raise Exception(error)

self._validate_parent_child_studies(self._configs["dbGaP"])

@staticmethod
Expand Down
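The startup check added to `post_process` can be exercised on its own against a plain dict. The sketch below is an illustration under assumptions, not the code in this PR: `validate_groups_sync` is a hypothetical name, it raises `ValueError` rather than a bare `Exception`, and it is slightly stricter than the diff in that it also requires `group_prefix` to be set.

```python
def validate_groups_sync(idp_id, idp_settings):
    """Raise ValueError if group sync is enabled without its required settings.

    Hypothetical standalone version of the startup check. Assumption: a
    missing or empty `group_prefix` is also treated as a configuration error.
    """
    enabled = idp_settings.get("is_authz_groups_sync_enabled", False)
    if not enabled:
        return
    sync_settings = idp_settings.get("authz_groups_sync") or {}
    if not sync_settings.get("group_prefix"):
        raise ValueError(
            f"is_authz_groups_sync_enabled is true but authz_groups_sync.group_prefix "
            f"is not configured for idp: {idp_id}"
        )


# Passes: sync is enabled and a prefix is configured
validate_groups_sync(
    "generic3",
    {
        "is_authz_groups_sync_enabled": True,
        "authz_groups_sync": {"group_prefix": "some_prefix"},
    },
)
```

Failing at startup keeps a misconfigured IdP from surfacing only at a user's first login.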
8 changes: 7 additions & 1 deletion fence/error_handler.py
Expand Up @@ -8,12 +8,13 @@

from fence.errors import APIError
from fence.config import config
import traceback


logger = get_logger(__name__)


def get_error_response(error):
def get_error_response(error: Exception):
details, status_code = get_error_details_and_status(error)
support_email = config.get("SUPPORT_EMAIL_FOR_ERRORS")
app_name = config.get("APP_NAME", "Gen3 Data Commons")
Expand All @@ -27,6 +28,11 @@ def get_error_response(error):
)
)

# TODO: Error messages are obfuscated; the line below needs to be
Review comment (Contributor Author): Can we get this fixed before we merge?
Reply: OK

# uncommented when troubleshooting errors.
# Breaks tests if not commented out / removed. We need a fix for this.
# raise error

# don't include internal details in the public error message
# to do this, only include error messages for known http status codes
# that are less that 500
Expand Down