Introduce a faster gt3x parser and infrastructure for older gt3x file formats #24

Merged
merged 37 commits into from
May 24, 2020


paulhibbing
Owner

This is a big PR with two main outcomes related to AGread::read_gt3x:

  • I've created a new parsing scheme (available by setting parser = "dev") that runs 3-4x faster for most of the files I've tested.

  • To accommodate the new parser, I've revised the flow of AGread::read_gt3x so that additional parsers can be added. That could happen through the parser argument or during the initial extraction of the gt3x zip archive (see AGread:::read_gt3x_setup), which now assigns a file type based on the contents of the archive. A new type could be introduced to reflect the structure of old gt3x files (see Reading Old GT3X Format (NHANES) #10), and files of that type would then be routed automatically to their own parser.
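
To illustrate the routing idea in the second bullet, here is a minimal sketch in base R. It is not the code in AGread: every function name below is hypothetical, the parsers are stubs, and the archive member names (info.txt, log.bin, activity.bin) are my assumption about what distinguishes current files from old ones.

```r
# Hypothetical sketch only: none of these helpers exist in AGread.

# Assign a file type from the names of the files inside the archive.
# The member names used here are assumptions, not taken from AGread.
classify_archive <- function(file_names) {
  if (all(c("info.txt", "log.bin") %in% file_names)) {
    "current"
  } else if (all(c("info.txt", "activity.bin") %in% file_names)) {
    "old"
  } else {
    stop("Unrecognized gt3x archive contents")
  }
}

# Stub parsers standing in for the real implementations.
parse_legacy <- function(path) paste("legacy parser would read", path)
parse_dev    <- function(path) paste("dev parser would read", path)
parse_old    <- function(path) paste("old-format parser would read", path)

# Pick a parser from the file type first, then from the parser argument.
# An "old" file goes to its own parser regardless of the argument.
select_parser <- function(file_type, parser = c("legacy", "dev")) {
  parser <- match.arg(parser)
  switch(
    file_type,
    current = switch(parser, legacy = parse_legacy, dev = parse_dev),
    old     = parse_old,
    stop("No parser registered for file type: ", file_type)
  )
}

# Example: a current-format archive read with the dev parser.
file_type <- classify_archive(c("info.txt", "log.bin"))
parse_fun <- select_parser(file_type, parser = "dev")
parse_fun("example.gt3x")
```

With this shape, supporting a new format means adding one branch to the classifier and one entry to the switch, without touching the existing parsers.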

Some other notes:

  • Output from parser = "legacy" (the default) and parser = "dev" can be compared using AGread::legacy_dev_compare. The comparison relies on base::all.equal, which is cruder than the checks I have used to compare parser = "legacy" output against RAW.csv and IMU.csv files. My reasoning is that the legacy parser is in lockstep with those files, so as long as the dev parser agrees reasonably well with the legacy parser (which it does), we can infer that all three give similar output.

  • The dev parser's speed advantage comes from several sources, one of which is the decision not to round values. (The legacy parser uses a wildly inefficient midpoint rounding system for ACTIVITY2 packets, so acceleration values may differ by 1 milli-g between the parsers.)
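
As a rough illustration of the two notes above, the base R sketch below shows how rounding to the nearest milli-g makes two otherwise identical outputs differ slightly, and how base::all.equal treats that difference. The acceleration values are made up, round_half_away is a hypothetical helper reflecting my assumption of what "midpoint rounding" means (it is not the legacy parser's code), and this is not a description of what AGread::legacy_dev_compare does internally.

```r
# Made-up acceleration values in g, not taken from any real file.
raw <- c(0.0234567, -0.9875001, 1.0004999, 0.25)

# Hypothetical helper: round half away from zero at a given number of
# decimal places. Assumed here as one meaning of "midpoint rounding".
round_half_away <- function(x, digits = 3) {
  scale <- 10^digits
  sign(x) * floor(abs(x) * scale + 0.5) / scale
}

dev_like    <- raw                      # values left unrounded
legacy_like <- round_half_away(raw, 3)  # values rounded to the milli-g

# Largest element-wise difference between the two versions, in g.
max_diff <- max(abs(legacy_like - dev_like))

# all.equal reports a mean relative difference for numeric vectors, so
# the default tolerance flags the mismatch while a looser one accepts it.
strict <- all.equal(legacy_like, dev_like)
loose  <- all.equal(legacy_like, dev_like, tolerance = 1e-2)

print(max_diff)
print(strict)
print(loose)
```

Because all.equal summarizes the mismatch as a single mean relative difference, it cannot say which samples differ or by how much, which is the sense in which it is a cruder check than a value-by-value comparison.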

@paulhibbing paulhibbing added this to the read_gt3x milestone May 24, 2020
@paulhibbing paulhibbing merged commit 887b31d into master May 24, 2020
@paulhibbing paulhibbing deleted the fread branch May 24, 2020 07:13