-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature/dynamic aws param store to airflow dag #48
Feature/dynamic aws param store to airflow dag #48
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
left one suggestion for error handling, but not sure if it's even a real error case. If this has been tested in SC already, LGTM
|
||
# If a wildcard is in the prefix string, extract and set the tenant code as the first path element. | ||
if self.TENANT_REPR in self._prefix: | ||
inferred_tenant = re.search(self._prefix.replace(self.TENANT_REPR, "(.*)"), p['Name']).group(1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would there every be a case where no TENANT_REPR doesn't exist anywhere in p['Name']? If so, this would error, I think
also, this line 104 could use some more friendly commenting, but that's a nitpick :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added more comments. I would like to come up with an example of the situation you describe so we can work around it.
* Update project version metadata. * Feature/Sharefile custom users DAG & S3 to Snowflake standardization (#50) * add sharefile dag and helpers * use S3ToSnowflake operator * use correct operator * fix metadata column name * datetime issues * rename to num_expected_files and make use optional * Feature/simplify sharefile to disk (#49) * Add type hints and comments; make ds_nodash and ts_nodash optional; reduce number of loops and simply variable-referencing. * Simplify Regex call. * Deprecate ds_nodash and ts_nodash from sharefile callable. * Raise HTTP errors for all ShareFile hook exceptions, instead of generic Airflow exceptions. * Update .gitignore. * add logging when deleting from sharefile * Add kwargs to sharefile_to_disk() for Airflow context paramter overflow. --------- Co-authored-by: johncmerfeld <[email protected]> * Feature/dynamic aws param store to airflow dag (#48) * Allow wildcards to be passed to SSM prefix strings (untested). * BugFix (wrong datatype for ParameterFilters). * Bugfix (revert logic when not using wildcards). * Minor cleanup. * Add 1 second sleep between page gets. * Ignore unexpected keys in collected connections. * Attempt to fix parameter filters to use both Recursive and Contains options. * Clean up error handling in AWS ParamStore DAG. * Cleanup to try to use a single filter. * Output missing paramstore keys in DAG. * Bugfix: set.difference() * Update metadata files. * Update ssm.py * Update CHANGELOG with missing updates. * Callable: S3 to Sharefile (#51) * init * tested code * update docs * use None instead of empty string * update allshared comment --------- Co-authored-by: Jay Kaiser <[email protected]> * Hotfix/support files in ftp (#52) * Overload FTP.download_all() to support files. * Fix file passed as remote_dir. * Update CHANGELOG. --------- Co-authored-by: Samantha LeBlanc <[email protected]> Co-authored-by: johncmerfeld <[email protected]>
Feature: Dynamic
AWSParamStoreToAirflowDAG
PathsDescription & motivation
The previous implementation of the
AWSParamStoreToAirflowDAG
has a serious limitation: the keys in Parameter Store had to be structured as{prefix}/{tenant_code}/secret
. This was the structure of keys historically, but newer batches of keys have placed tenant in the middle of the key (i.e.,{prefix}/{tenant_code}/{suffix}/secret
).This PR updates the DAG and associated SSM helper classes to accept either structure when building out connections.
PR Merge Priority:
This DAG is really only used in South Carolina and GSN to my understanding. I need this feature to populate the SC 2025 connections in Airflow, but I can temporarily checkout the branch in prod while it's under review.
Changes to existing files:
ea_airflow_util/callables/ssm.py
: Add conditionals to handle original and new path formats.ea_airflow_util/dags/aws_param_store_to_airflow_dag.py
: Improve DAG logging.Tests and QC done:
Ran successfully in South Carolina dev.
Future ToDos & Questions:
It is likely that this approach of retrieving credentials for Ed-Fi ODSes will be deprecated very soon.