Fix row_count always being 1 when \r used as lineTerminator #6
This PR fixes the fact that `row_count` was always reported as 1 for CSVs using `\r` as lineTerminator, even if the dialect explicitly set lineTerminator to `\r`.

The cause of this issue was the use of the `#each_line` method to split streams/files into lines sent to `#parse_line`. With no arguments given, `#each_line` reads lines as determined by the line separator `$/`.[1] By default, `$/` is set to `\n`.[2] For `\r`-terminated input, this results in the entire stream/file being sent to `#parse_line` as a single line, and `#parse_line` was explicitly only treating `\n` as a line break.

This was fixed by:
1. Making all places that checked for line breaks by matching only on `\n` check for either `\n` or `\r`.
2. Deriving the correct `$/` value prior to calling `#each_line`, and then resetting `$/` to the default value after the `#each_line` block is closed.

The latter is currently accomplished by:
`@source` as a String. If this includes `\n`, set `$/` to `\n`. Otherwise, set it to `\r`.
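The cause and the separator heuristic described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `guess_line_terminator` and `count_rows` are hypothetical helper names, and the sketch passes the separator to `#each_line` as an argument instead of assigning `$/` and resetting it afterwards.

```ruby
require "stringio"

# Hypothetical helper (not from this PR): choose a separator the way the
# description outlines, "\n" if the text contains one, otherwise "\r".
def guess_line_terminator(text)
  text.include?("\n") ? "\n" : "\r"
end

# Hypothetical helper: count rows by handing the separator to #each_line
# directly, so the global $/ is never modified.
def count_rows(text)
  sep = guess_line_terminator(text)
  StringIO.new(text).each_line(sep).count
end

cr_csv = "a,b\r1,2\r3,4\r"

# With the default separator ("\n"), the whole "\r"-terminated string is
# yielded as a single line, which is the bug this PR describes.
default_count = StringIO.new(cr_csv).each_line.count

# With a derived separator, each row is yielded separately.
cr_count = count_rows(cr_csv)
lf_count = count_rows("a,b\n1,2\n3,4\n")

puts "default: #{default_count}, cr: #{cr_count}, lf: #{lf_count}"
```

Passing the separator explicitly keeps the change local to one call. Whether that fits the rest of the code base is an open question that the description does not settle.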
The second step has 2 problems:
`\n` or `\r\n`, but the file's actual lineTerminator is `\r`
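The list items above survive only in part, so the following is an illustration of one failure mode consistent with the remaining fragment, not a restatement of the author's exact point. It assumes input whose rows end in `\r` but whose data also contains a `\n` (here inside a quoted field), and reuses the hypothetical `guess_line_terminator` helper to show the heuristic choosing the wrong separator.

```ruby
require "stringio"

# Hypothetical helper mirroring the heuristic described in the PR text:
# "\n" if the text contains one, otherwise "\r".
def guess_line_terminator(text)
  text.include?("\n") ? "\n" : "\r"
end

# Three rows terminated by "\r"; the second row has a "\n" embedded in a
# quoted field.
mixed = "id,note\r1,\"first\nsecond\"\r2,plain\r"

# The heuristic sees the embedded "\n" and selects it as the separator.
sep = guess_line_terminator(mixed)

# Splitting on "\n" cuts the quoted field in half and never splits on the
# real "\r" row boundaries.
chunks = StringIO.new(mixed).each_line(sep).to_a

puts "separator: #{sep.inspect}, chunks: #{chunks.length}"
```

The data has three rows, but splitting on the guessed separator produces two chunks, neither of which is a complete row.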
I am going ahead with this implementation for now because:
I have an idea for an approach that would address both of those issues, but it gets me halfway to writing a CSV parser from scratch and I ain't got time for that at the moment.
Footnotes
[1] https://docs.ruby-lang.org/en/master/IO.html#method-i-each_line
[2] https://docs.ruby-lang.org/en/master/globals_rdoc.html#label-24-2F+-28Input+Record+Separator-29