Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

asserting that ingest exports valid provenance metadata #509

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

rolando-ebi
Copy link
Contributor

No description provided.

@@ -9,7 +9,7 @@ before_script:
- apt-get -y update
- apt-get -y install jq
- pip install -r requirements.txt
- export DEPLOYMENT_ENV=$CI_COMMIT_REF_NAME
- export DEPLOYMENT_ENV=integration
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

$CI_COMMIT_REF_NAME resolves to the deployment being tested in the scheduled dcp-wide integration tests, e.g. integration, staging, and prod. Shouldn't this change be reverted to continue testing in the other environments?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is a temporary change to just get the test running and passing in gitlab. It should be reverted once the changes are ready to merge

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An intermittent issue un-related to the changes caused the previous test failure for this branch. Will re-run and post a link to the completed pipeline

@parthshahva
Copy link
Contributor

What is the status of this PR?

@maniarathi
Copy link
Contributor

Hey @rolando-ebi , somehow I wasn't aware of this ticket getting assigned to me. Do you remember what this was about?

Copy link
Contributor

@maniarathi maniarathi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for the long delay on this review! Better late than never right? :)

@@ -25,6 +25,7 @@ dcp_wide_test_SS2:
only:
- integration
- staging
- validate-schema-versions
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'm a little confused why validate-schema-versions is an environment listed here and likewise below?

@@ -228,6 +230,20 @@ def wait_for_primary_bundles(self):
raise RuntimeError(f'Expected {self.expected_bundle_count} primary bundles, but only '
f'got {primary_bundles_count}')

def assert_valid_schema_versions_in_provenance(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we please move the assert function down to join the group of other assert functions below? (Roughly line 387).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps also write a small docstring for this function please so that folks who aren't familiar with the schema versions in the provenance can understand what is being validated here?

for metadata_file in metadata_files:
schema_url = metadata_file["describedBy"]
schema = requests.get(schema_url).json()
validate(metadata_file, schema=schema)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This validate function won't actually check to make sure that schema and the major and minor versions of the metadata file will match because in the schema, the pattern just says something like "^(http|https)://schema.(.?)humancellatlas.org/type/biomaterial/(([0-9]{1,}.[0-9]{1,}.[0-9]{1,})|([a-zA-Z]?))/donor_organism".

At least this is what I think. Have you verified to make sure this test fails if there is a mismatch?

schema = requests.get(schema_url).json()
validate(metadata_file, schema=schema)


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete extra line break here.

from ..data_store_agent import DataStoreAgent
from ..dataset_fixture import DatasetFixture
from ..dataset_runner import DatasetRunner
from tests.utils import Progress, Timeout
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these changes are not needed once you are done testing in your own environment. Maybe remove them from the PR and keep these changes local/untracked?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants