Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add parse_raw_upload #63

Merged
merged 9 commits into from
Jan 8, 2025
Merged

feat: add parse_raw_upload #63

merged 9 commits into from
Jan 8, 2025

Conversation

joseph-sentry
Copy link
Contributor

@joseph-sentry joseph-sentry commented Dec 19, 2024

we want to support parsing raw upload files instead of individual JUnit
XML files

the input of this new function is the raw upload in byte form

the output is a messagepacked binary payload containing the results of
the parsing and the raw upload in readable format in byte form

Depends on: #62

@joseph-sentry joseph-sentry marked this pull request as draft December 19, 2024 22:06
we will need to handle msgpacking, base64 decoding and zlib
decompressing
we want to document the new function that we're going to implement
we want to support parsing raw upload files instead of individual JUnit
XML files

the input of this new function is the raw upload in byte form

the output is a messagepacked binary payload containing the results of
the parsing and the raw upload in readable format in byte form
@joseph-sentry joseph-sentry force-pushed the joseph/parse-raw-upload branch from 1117e75 to 7920ff8 Compare December 20, 2024 17:50
@codecov-notifications
Copy link

codecov-notifications bot commented Dec 20, 2024

Codecov Report

Attention: Patch coverage is 82.97872% with 24 lines in your changes missing coverage. Please review.

✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
src/testrun.rs 52.08% 23 Missing ⚠️
src/junit.rs 93.75% 1 Missing ⚠️

📢 Thoughts on this report? Let us know!

Copy link

codecov bot commented Dec 20, 2024

Codecov Report

Attention: Patch coverage is 82.97872% with 24 lines in your changes missing coverage. Please review.

Project coverage is 96.95%. Comparing base (0a663e5) to head (d3a4509).
Report is 10 commits behind head on main.

✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
src/testrun.rs 52.08% 23 Missing ⚠️
src/junit.rs 93.75% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #63      +/-   ##
==========================================
+ Coverage   94.31%   96.95%   +2.63%     
==========================================
  Files          14       15       +1     
  Lines        1900     1871      -29     
==========================================
+ Hits         1792     1814      +22     
+ Misses        108       57      -51     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@joseph-sentry joseph-sentry requested a review from a team December 20, 2024 21:03
@joseph-sentry joseph-sentry marked this pull request as ready for review December 20, 2024 21:03
src/raw_upload.rs Outdated Show resolved Hide resolved
src/testrun.rs Outdated Show resolved Hide resolved
src/raw_upload.rs Outdated Show resolved Hide resolved
src/raw_upload.rs Outdated Show resolved Hide resolved
src/raw_upload.rs Outdated Show resolved Hide resolved
src/raw_upload.rs Outdated Show resolved Hide resolved
src/raw_upload.rs Outdated Show resolved Hide resolved
tests/test_parse_raw_upload.py Outdated Show resolved Hide resolved
we want the parse_raw_upload function to return a list of dictionaries
since that's what will be inserted into postgres

we also replace the pyresult with an anyhow::Result

we also add snapshot testing using insta
we want to have enums for Outcome and Framework because having them as
Strings does not encode the fact that they have a limited range of
values they can take
Copy link
Contributor

@Swatinem Swatinem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, just some minor stylistic improvements that you can do or also skip for later :-)

src/junit.rs Outdated
@@ -167,13 +148,13 @@ pub fn use_reader(
b"skipped" => {
let testrun = saved_testrun
.as_mut()
.ok_or_else(|| ParserError::new_err("Error accessing saved testrun"))?;
.ok_or_else(|| anyhow!("Error accessing saved testrun"))?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The .context() method also exists on Option, so all these can be further simplified :-)

Comment on lines 115 to 116
r#"{{"network": [], "test_results_files": [{{"filename": "{}", "format": "base64+compressed", "data": "{}"}}]}}"#,
filename, base64_data,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
r#"{{"network": [], "test_results_files": [{{"filename": "{}", "format": "base64+compressed", "data": "{}"}}]}}"#,
filename, base64_data,
r#"{{"network": [], "test_results_files": [{{"filename": "{filename}", "format": "base64+compressed", "data": "{base64_data}"}}]}}"#,

I believe you can also use format fields in raw strings as well.

r#"{{"network": [], "test_results_files": [{{"filename": "{}", "format": "base64+compressed", "data": "{}"}}]}}"#,
filename, base64_data,
);
upload_json.as_bytes().to_vec()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
upload_json.as_bytes().to_vec()
upload_json.into()

This should avoid a copy. It should be possible to convert the String into a Vec<u8> trivially.

}
}

// i can't seem to get pyo3(from_item_all) to work when IntoPyObject is also being derived
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that is unfortunate :-(
Is there an upstream issue about this?

Copy link
Contributor Author

@joseph-sentry joseph-sentry Jan 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there does not seem to be one, but at the same time i feel like i'm doing something wrong in the first place, maybe we should have different types for the testruns we expect to be ingesting in the writer and the testruns we're outputting in the parser

anyways i looked into why it's outputting an error and it's because the IntoPyObject macro is parsing the attributes using this function. i don't have a clue on how to fix something like this, other than maybe adding a new FromIntoPyObject derive

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that is actually a good idea, yes.

depending on how we move forward with bigquery or the custom binary format, we wouldn’t want to do a roundtrip through python anyway when using the writer.



assert parsing_infos[0]["framework"] == "Pytest"
assert parsing_infos[0]["testruns"] == [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

insta also exists in Python I believe, if you want to use it for these asserts as well :-)

@joseph-sentry joseph-sentry merged commit cc76ca7 into main Jan 8, 2025
8 of 11 checks passed
@joseph-sentry joseph-sentry deleted the joseph/parse-raw-upload branch January 8, 2025 19:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants