Implement efficient trajectory frame reading for LAMMPS dump files #868
This PR implements efficient trajectory frame reading for LAMMPS dump files, allowing users to read only specific frames instead of loading entire trajectories. This addresses the performance issue where workflows only need a subset of frames but must load and process complete trajectories.
Key Features
1. Selective Frame Reading
Added the f_idx parameter to dpdata.System() for loading only specified frames.
2. Multi-Trajectory Pattern
Implemented the frames_dict pattern requested in the issue.
3. Efficient Block-Based Reading
The implementation uses itertools.zip_longest(*[f] * nlines) to read the file one frame-sized block at a time and skip unwanted frames, as suggested in the issue. Skipped frames are never parsed, which reduces load time for large trajectories when only a few frames are needed.
Technical Implementation
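A minimal sketch of the block-based reading idea, not the PR's actual code. The helper names mirror the functions listed in this description, but their bodies and signatures here are assumptions, and the sketch assumes every frame has the same number of lines (constant atom count).

```python
import io
from itertools import zip_longest


def get_frame_nlines(f):
    """Count the lines of the first frame, then rewind (illustrative body)."""
    nlines = 0
    for line in f:
        # A second "ITEM: TIMESTEP" header marks the start of the next frame.
        if line.startswith("ITEM: TIMESTEP") and nlines > 0:
            break
        nlines += 1
    f.seek(0)
    return nlines


def read_frames(f, f_idx):
    """Return the raw lines of the requested frames, in ascending frame order."""
    nlines = get_frame_nlines(f)
    wanted = set(f_idx)
    frames = {}
    # [f] * nlines repeats the *same* iterator, so each tuple produced by
    # zip_longest holds nlines consecutive lines, i.e. one whole frame.
    for i, block in enumerate(zip_longest(*[f] * nlines)):
        if i in wanted:
            frames[i] = [ln.rstrip("\n") for ln in block if ln is not None]
        if len(frames) == len(wanted):
            break  # stop early once every requested frame has been collected
    return [frames[i] for i in sorted(frames)]


def make_frame(step):
    """Build one tiny two-atom frame in LAMMPS dump layout (test data only)."""
    return "\n".join([
        "ITEM: TIMESTEP", str(step),
        "ITEM: NUMBER OF ATOMS", "2",
        "ITEM: BOX BOUNDS pp pp pp", "0 10", "0 10", "0 10",
        "ITEM: ATOMS id type x y z",
        f"1 1 0.0 0.0 {step}.0",
        f"2 2 1.0 1.0 {step}.0",
    ]) + "\n"


# Five frames at timesteps 0, 10, 20, 30, 40; request frames 1 and 3 only.
dump = io.StringIO("".join(make_frame(s) for s in range(0, 50, 10)))
selected = read_frames(dump, [1, 3])
timesteps = [frame[1] for frame in selected]
```

Unwanted blocks are still pulled from the file line by line, but they are discarded without any parsing, and the loop exits as soon as the last requested frame has been read.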
get_frame_nlines() automatically determines the number of lines per frame
read_frames() uses block-based reading to skip unwanted frames entirely
load_file() supports both the traditional begin/step parameters and the new f_idx parameter
Works with the existing system_data() pipeline and dpdata workflow
Performance Benefits
Backward Compatibility
The implementation maintains complete backward compatibility:
Existing code that uses the begin and step parameters continues to work unchanged
The f_idx parameter is optional and defaults to None
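One way the two selection styles could coexist is sketched below. The helper resolve_frame_indices is hypothetical and is not part of dpdata. The rule that f_idx takes precedence when it is given is an assumption made for illustration, since this description does not say how the two interact.

```python
def resolve_frame_indices(n_frames, begin=0, step=1, f_idx=None):
    """Hypothetical helper: turn either selection style into frame indices."""
    if f_idx is None:
        # Legacy path: behaves exactly like slicing with begin and step.
        return list(range(begin, n_frames, step))
    # Assumed rule: an explicit index list overrides begin/step.
    return list(f_idx)


legacy = resolve_frame_indices(10, begin=2, step=3)
explicit = resolve_frame_indices(10, f_idx=[7, 1])
```

Because f_idx defaults to None, callers that never pass it stay on the legacy path.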
Testing
Added a comprehensive test suite with 22 test cases.
Fixes #367.