[FCL-644] Improve typing and code robustness #260

Draft: wants to merge 6 commits into main

Collaborator

@jacksonj04 jacksonj04 commented Feb 10, 2025

So that we're more confident making changes in future, add some missing type annotations to the codebase.

@jacksonj04 jacksonj04 force-pushed the FCL-644-ingester-should-assign-a-uuid-based-uri-to-documents-with-no-ncn branch from 9012489 to 609679f on February 10, 2025 18:00
@jacksonj04 jacksonj04 changed the title from "[FCL-644] Ingester should assign a UUID-based URI to documents" to "[FCL-644] Improve typing and code robustness" on Feb 11, 2025
@jacksonj04 jacksonj04 force-pushed the FCL-644-ingester-should-assign-a-uuid-based-uri-to-documents-with-no-ncn branch 9 times, most recently from 6a6f534 to 91be496 on February 12, 2025 14:49
@jacksonj04 jacksonj04 force-pushed the FCL-644-ingester-should-assign-a-uuid-based-uri-to-documents-with-no-ncn branch from 91be496 to 323208c on February 12, 2025 17:33
@@ -336,23 +337,38 @@ def test_store_metadata(self, api_client, v2_ingest):
     def test_store_file_success(self, mock_print):
         session = boto3.Session
Collaborator

These tests could probably use moto, but maybe not in this PR.

     xml_file = extract_xml_file(tar, xml_file_name)
     if xml_file:
-        contents = xml_file.read()
+        contents = xml_file.read().decode("utf-8")  # We assume here that our XML is in UTF-8
Collaborator

Whilst true in practice, I don't think this should be something we have to assume. I think lxml will raise an exception if it receives XML as a str that still carries an XML declaration with an encoding, since the declaration no longer accurately describes the document.
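
A minimal sketch of why the UTF-8 assumption is fragile, using only the standard library's `xml.etree.ElementTree`. The diff does not show whether `ET` in this codebase is that module or `lxml.etree`, so this does not reproduce any lxml-specific behaviour. The sample document and its ISO-8859-1 encoding are made up for the demonstration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document declaring a non-UTF-8 encoding (not taken from the ingester).
RAW = b'<?xml version="1.0" encoding="ISO-8859-1"?><judgment>caf\xe9</judgment>'


def decode_then_parse(raw: bytes) -> ET.Element:
    """Mirror the diff: assume UTF-8 before the parser ever sees the declaration."""
    return ET.fromstring(raw.decode("utf-8"))


def parse_bytes(raw: bytes) -> ET.Element:
    """Hand the parser raw bytes so it can honour the declared encoding itself."""
    return ET.fromstring(raw)


try:
    decode_then_parse(RAW)
    decode_failed = False
except UnicodeDecodeError:
    # The byte 0xE9 is valid ISO-8859-1 but not valid UTF-8 in this position.
    decode_failed = True

root = parse_bytes(RAW)
```

Passing the bytes through unchanged leaves the choice of encoding to the parser, which reads it from the declaration instead of relying on a comment in the calling code.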

s3_client.copy(source, public_bucket, key, extra_args)


-def parse_xml(xml) -> ET.Element:
+def parse_xml(xml: str) -> ET.Element:
Collaborator

@dragon-dxw dragon-dxw Feb 12, 2025

bytes? (see other comment)
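
One way the suggestion could look, as a sketch only: the annotation becomes `bytes` and the tar member is read without decoding. `read_member_bytes` and `build_sample_tar` are hypothetical helpers written for this example (the real `extract_xml_file` is not shown in the diff), and the standard library's `xml.etree.ElementTree` stands in for whichever module `ET` refers to.

```python
import io
import tarfile
import xml.etree.ElementTree as ET
from typing import Optional


def read_member_bytes(tar: tarfile.TarFile, name: str) -> Optional[bytes]:
    """Hypothetical stand-in for the extract-and-read step; returns raw bytes or None."""
    try:
        member = tar.extractfile(name)
    except KeyError:  # no member with that name in the archive
        return None
    return member.read() if member is not None else None


def parse_xml(xml: bytes) -> ET.Element:
    """Parse undecoded XML so the parser resolves the encoding from the document."""
    return ET.fromstring(xml)


def build_sample_tar(name: str, payload: bytes) -> bytes:
    """Build a tiny in-memory tar archive purely for this demonstration."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        info = tarfile.TarInfo(name=name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


archive = build_sample_tar(
    "doc.xml", b'<?xml version="1.0" encoding="UTF-8"?><doc id="1"/>'
)
with tarfile.open(fileobj=io.BytesIO(archive), mode="r") as tar:
    contents = read_member_bytes(tar, "doc.xml")
    missing = read_member_bytes(tar, "absent.xml")

root = parse_xml(contents) if contents is not None else None
```

Keeping the value as `bytes` from the archive to the parser means the type annotation and the data agree, and no encoding has to be assumed along the way.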

@jacksonj04 jacksonj04 force-pushed the FCL-644-ingester-should-assign-a-uuid-based-uri-to-documents-with-no-ncn branch from 323208c to 8f4bd35 on February 13, 2025 11:01
@jacksonj04 jacksonj04 force-pushed the FCL-644-ingester-should-assign-a-uuid-based-uri-to-documents-with-no-ncn branch from 8f4bd35 to a7fb4ef on February 13, 2025 16:09