Add support for memory mapped operation #51
Labels
enhancement
New feature or request
Comments
bvacaliuc pushed two commits to bvacaliuc/mcpevent2hist that referenced this issue on Jan 30, 2024
Thank you for merging #52. I apologize, but there are two bugs in the
I will prepare another PR to resolve this.
bvacaliuc pushed two commits to bvacaliuc/mcpevent2hist that referenced this issue on Feb 7, 2024
The current processing API requires `std::vector<char>&` to provide input to the TPX3-protocol parser. This means that any file must be read in completely via the function `readTPX3RawToCharVec()`. This also creates challenges when attempting to stream from a network port or to integrate with DAS processing systems that operate in a zero-copy mode by referencing physical memory spaces.

A solution is to use the `mmap()` system call to obtain a pointer to the contents of the file (or memory space containing the TPX3-protocol data) such that the software can process hits without first buffering the input into a memory space.

In order for this to integrate into the existing API, it is necessary to provide 3 things:

- `char *` input parameter for the start of the memory space
- `std::size_t` input parameter for the extent of the memory space
- `std::size_t` output parameter to feed back to the caller how much of the memory space was consumed in this iteration

The caller will then need to call the functions repeatedly until all of the input data is consumed (in the memory-mapped case), or keep looping as new input data is received (in the streaming case).
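To make the calling pattern concrete, here is a minimal sketch of the pointer/extent/consumed interface driven from a memory-mapped file. It is not the project's code: `processSpan()` and `processMappedFile()` are hypothetical names, the 8-byte packet size and the `max_words` limit are assumptions for illustration, and the decode step is left as a placeholder. Only the POSIX calls (`open`, `fstat`, `mmap`, `munmap`, `close`) are real APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical stand-in for a pointer-based overload of the TPX3 parser.
// Walks whole 8-byte words in [data, data + size), at most max_words per call,
// reports the bytes consumed via `consumed`, and returns the words processed.
std::size_t processSpan(const char* data, std::size_t size,
                        std::size_t& consumed, std::size_t max_words = 4096) {
    assert(data != nullptr || size == 0);
    constexpr std::size_t kWord = sizeof(std::uint64_t);  // assumed packet size
    std::size_t words = size / kWord;
    if (words > max_words) words = max_words;
    for (std::size_t i = 0; i < words; ++i) {
        std::uint64_t packet;
        std::memcpy(&packet, data + i * kWord, kWord);  // safe for unaligned input
        (void)packet;  // placeholder: real code would decode the hit here
    }
    consumed = words * kWord;
    return words;
}

// Maps a file read-only and calls processSpan() until no more data can be
// consumed. Returns the total number of words processed, or -1 on error.
long long processMappedFile(const char* path) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;
    struct stat st;
    if (fstat(fd, &st) != 0) { close(fd); return -1; }
    if (st.st_size == 0) { close(fd); return 0; }  // mmap rejects a zero length
    const std::size_t length = static_cast<std::size_t>(st.st_size);
    void* base = mmap(nullptr, length, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  // the mapping remains valid after the descriptor is closed
    if (base == MAP_FAILED) return -1;

    const char* cursor = static_cast<const char*>(base);
    std::size_t remaining = length;
    long long total = 0;
    while (remaining > 0) {
        std::size_t consumed = 0;
        total += static_cast<long long>(processSpan(cursor, remaining, consumed));
        if (consumed == 0) break;  // only a trailing partial word is left
        cursor += consumed;
        remaining -= consumed;
    }
    munmap(base, length);
    return total;
}
```

In the streaming case the same loop would apply, except that the caller would carry any unconsumed tail over to the next buffer instead of stopping.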
There is an alternative that keeps the `std::vector<char>` API, but it requires the use of a custom allocator. While this can be made to work, it still requires an API change to `std::vector<char, custom_allocator>` and is far more intrusive to the code than simply providing alternate calling patterns for the TPX3 stream parsing function `findTPX3H()`.

The other thing that will be needed is for `findTPX3H()` to limit its consumption of the input stream to a manageable chunk, if the caller does not do so themselves. If this is not done, then a very large file could exhaust the computer's memory as the intermediate data structures used during clustering expand. When this is implemented, the API should change to indicate how much of the original vector was consumed, even if `std::vector<char>` is used to provide the input.

I am preparing a PR for the above and will reference this issue.
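As a rough illustration of the chunk-limiting idea with the existing vector input, the sketch below caps each call at a byte budget and reports how much of the vector was consumed. The names `consumeChunk()` and `consumeAll()`, the `max_bytes` parameter, and the 8-byte packet size are assumptions made for this example; they are not taken from the project, and the parsing and clustering steps are left as a comment.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical chunk-limited entry point. Starting at `offset`, it accepts at
// most `max_bytes` of input, rounded down to whole 8-byte packets, and returns
// the number of bytes it consumed so the caller knows where to resume.
std::size_t consumeChunk(const std::vector<char>& raw, std::size_t offset,
                         std::size_t max_bytes) {
    assert(offset <= raw.size());
    constexpr std::size_t kPacket = 8;  // assumed packet size
    const std::size_t available = raw.size() - offset;
    std::size_t take = std::min(available, max_bytes);
    take -= take % kPacket;  // never split a packet across two calls
    // Placeholder: real code would parse raw.data() + offset for `take` bytes,
    // cluster the hits, and release the intermediate structures before returning.
    return take;
}

// Driver loop: keeps calling consumeChunk() until no whole packet remains.
// Returns the total bytes consumed and counts the calls that made progress.
std::size_t consumeAll(const std::vector<char>& raw, std::size_t max_bytes,
                       std::size_t& calls) {
    std::size_t offset = 0;
    calls = 0;
    while (offset < raw.size()) {
        const std::size_t used = consumeChunk(raw, offset, max_bytes);
        if (used == 0) break;  // partial packet left, or budget below one packet
        offset += used;
        ++calls;
    }
    return offset;
}
```

Because the intermediate structures only ever hold one chunk's worth of hits, peak memory in a design like this would scale with `max_bytes` rather than with the file size.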