-
I would suggest parsing your large input file directly with staedi and then chunking the output into smaller files, if that is your desired strategy. The parser is designed to minimize memory usage and maintain decent throughput for scenarios like this. Skipping to a particular offset wouldn't work well at the moment: each reader instance is indeed stateful, and an input block (for X12) always needs to begin with the ISA interchange header, since that is where the delimiters are detected.
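For illustration, a minimal sketch of that approach using the streaming API (`EDIInputFactory` / `EDIStreamReader`); the actual chunk-writing is stubbed out, since it depends on your storage layout:

```java
import java.io.BufferedInputStream;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

import io.xlate.edi.stream.EDIInputFactory;
import io.xlate.edi.stream.EDIStreamReader;

public class SplitLargeX12 {
    public static void main(String[] args) throws Exception {
        EDIInputFactory factory = EDIInputFactory.newFactory();

        // staedi consumes the stream incrementally, so a multi-hundred-MB
        // interchange never needs to be fully resident in memory.
        try (InputStream in = new BufferedInputStream(Files.newInputStream(Path.of(args[0])));
             EDIStreamReader reader = factory.createEDIStreamReader(in)) {

            while (reader.hasNext()) {
                switch (reader.next()) {
                case START_GROUP:
                    // A natural split point: open a new output file here and
                    // replay the ISA envelope so each chunk stands alone.
                    break;
                case START_SEGMENT:
                    // reader.getText() holds the segment tag, e.g. "GS" or "ST".
                    break;
                default:
                    break;
                }
            }
        }
    }
}
```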
-
I have use cases that involve storing and processing X12 files in excess of 300 MB each.
Given this, I was considering a chunking strategy where each file is split into chunks of <= 4 KB. I can create a facade InputStream over those chunks to pass to the reader (sketch below), but I was also hoping not to have to start processing each file from the beginning (offset 0) every time. I'd much prefer to start processing after, say, a Group Header segment, with the logical offset Z mapped to an offset within chunk Y.
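Something like this is what I have in mind for the facade, using only JDK pieces (sketch only; how the ordered chunk paths are listed is up to the storage layer):

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Enumeration;
import java.util.Iterator;
import java.util.List;

public final class ChunkFacade {

    // Presents the ordered chunk files as one logical stream, so the EDI
    // reader sees a single contiguous X12 document. Chunks are opened
    // lazily; with 4 KB chunks a 300 MB file has ~75,000 of them, so
    // opening them all eagerly would exhaust file handles.
    public static InputStream open(List<Path> orderedChunks) {
        Iterator<Path> chunks = orderedChunks.iterator();
        return new SequenceInputStream(new Enumeration<InputStream>() {
            @Override
            public boolean hasMoreElements() {
                return chunks.hasNext();
            }

            @Override
            public InputStream nextElement() {
                try {
                    return Files.newInputStream(chunks.next());
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }
        });
    }
}
```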
I believe this might be possible if (1) the reader is stateless, or (2) I can fetch state from and inject state into the reader, e.g., the delimiters detected from the initial block. I'd also need to capture the offset of the start of the next segment, so that I could resume processing the InputStream from the next group segment, and so on.
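For (2), staedi does appear to expose some of that state read-only; something like this is what I'd hope to capture (a sketch assuming `getDelimiters()` and `getLocation()` on `EDIStreamReader` — whether any of it can be injected back into a fresh reader is exactly my question):

```java
import java.io.InputStream;
import java.util.Map;

import io.xlate.edi.stream.EDIInputFactory;
import io.xlate.edi.stream.EDIStreamEvent;
import io.xlate.edi.stream.EDIStreamReader;
import io.xlate.edi.stream.Location;

public class CaptureReaderState {
    public static void capture(InputStream in) throws Exception {
        EDIInputFactory factory = EDIInputFactory.newFactory();
        try (EDIStreamReader reader = factory.createEDIStreamReader(in)) {
            while (reader.hasNext()) {
                if (reader.next() == EDIStreamEvent.START_SEGMENT
                        && "GS".equals(reader.getText())) {
                    // Delimiters detected from the ISA segment.
                    Map<String, Character> delims = reader.getDelimiters();

                    // Position of the group header just read; a resume point
                    // would have to be derived from something like this.
                    Location where = reader.getLocation();
                    System.out.printf("GS at segment %d (delims: %s)%n",
                            where.getSegmentPosition(), delims);
                }
            }
        }
    }
}
```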
Is this making any sense to you? Is it possible with staedi? Do you have any alternative suggestions or pointers for handling large EDI inputs?
Thanks!