Introduce a faster gt3x parser and infrastructure for older gt3x file formats #24
Merged
…ith new dev parser, not worth it)
…able to dev parser
…cluding a new unit test)
…or ACTIVITY2 and SENSOR_DATA packets)
This is a big PR with two main outcomes related to `AGread::read_gt3x`:

1. I've created a new parsing scheme (available by setting `parser = "dev"`) that runs 3-4x faster for most of the files I've tested.
2. To accommodate the new parser, I've revised the flow of `AGread::read_gt3x` in such a way that additional parsers can be added. This could occur through the `parser` argument, or in the initial extraction of the gt3x zip archive (see `AGread:::read_gt3x_setup`). The latter function now assigns a file type based on the contents of the archive. A new type could be introduced reflecting the structure of old gt3x files (see Reading Old GT3X Format (NHANES) #10), and that type would then be automatically directed to a unique parser.

Some other notes:
- Output from `parser = "legacy"` (default setting) and `parser = "dev"` can be compared using `AGread::legacy_dev_compare`. The comparison hinges on the use of `base::all.equal`, which is more crude than the means I have used to compare `parser = "legacy"` output to what you find in `RAW.csv` and `IMU.csv` files. My thinking is that the legacy parser is in lock step with the latter outputs, so as long as the dev parser is in reasonable agreement with the legacy parser (which it is), we should be able to infer that all three give similar outputs.
- The dev parser's speed advantage comes from several sources, one of which is the decision to forgo rounding values off. (The legacy parser uses a wildly inefficient midpoint rounding system for ACTIVITY2 packets, meaning acceleration values may differ by 1 milli-g between the parsers.)
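
For anyone wondering how a rounding gap of about 1 milli-g interacts with `base::all.equal`, here is a minimal sketch using only base R. The acceleration values are made up for illustration, and the tolerance settings are an assumption; this is not necessarily what `AGread::legacy_dev_compare` does internally.

```r
# Hypothetical acceleration values in g (not taken from a real gt3x file)
legacy <- c(0.012, -0.998, 0.251, 0.500)      # rounded to 3 decimals
dev    <- c(0.0124, -0.9984, 0.2506, 0.4996)  # left unrounded

# With the default tolerance (about 1.5e-8, relative), all.equal() returns
# a character description of the mismatch rather than TRUE
strict <- all.equal(legacy, dev)

# scale = 1 makes the comparison absolute; the tolerance here is 1 milli-g
# and is applied to the mean absolute difference
loose <- all.equal(legacy, dev, tolerance = 0.001, scale = 1)

# Largest element-wise absolute difference, for a stricter per-value check
max_diff <- max(abs(legacy - dev))

print(isTRUE(strict))
print(isTRUE(loose))
print(max_diff)
```

Because `all.equal` summarizes with a mean difference, the explicit `max(abs(...))` check is the one that bounds every individual sample.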
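
The file-type routing described for the revised `AGread::read_gt3x` flow could look roughly like the sketch below. Everything in it is hypothetical: the type labels, the archive member names used for classification, and the functions `classify_archive` and `read_sketch` are illustrative assumptions, not AGread's actual internals.

```r
# Hypothetical sketch only; names and file lists are assumptions.
# Assign a file type from the names of the files inside the archive.
classify_archive <- function(contents) {
  if (all(c("log.bin", "info.txt") %in% contents)) {
    "modern"
  } else if (all(c("activity.bin", "info.txt") %in% contents)) {
    "old"
  } else {
    "unknown"
  }
}

# One handler per file type; supporting a new format means adding an entry.
parsers <- list(
  modern = function(path, parser) paste("modern file read with", parser, "parser"),
  old    = function(path, parser) "old-format parser"
)

read_sketch <- function(path, contents, parser = c("legacy", "dev")) {
  parser <- match.arg(parser)          # defaults to "legacy"
  type <- classify_archive(contents)
  handler <- parsers[[type]]           # NULL when the type has no handler
  if (is.null(handler)) stop("Unsupported gt3x archive layout")
  handler(path, parser)
}

print(read_sketch("example.gt3x", c("log.bin", "info.txt"), parser = "dev"))
```

The point of the lookup table is that a new file type only needs a classification rule and a handler, leaving the top-level reader unchanged.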