Add option to specify custom null values for CSV reader #4794

Closed
vrongmeal opened this issue Sep 7, 2023 · 1 comment · Fixed by #4795
Labels
arrow: Changes to the arrow crate
enhancement: Any new improvement worthy of a entry in the changelog

Comments

@vrongmeal
Contributor

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

Add an option to the CSV reader to specify custom null values that might be used in CSV files as placeholders.

This would help when parsing a file such as:

a,b,c
1,2,NA
3,NA,5

Here, NA is a placeholder for NULL.
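
To make the request concrete, here is a minimal sketch of the desired behaviour using only the Rust standard library. It is not the arrow-rs CSV reader or its API: the function name `parse_with_nulls` is hypothetical, the input is assumed to have a header row, no quoting or escaping, and only integer columns. The linked commits describe the real option as a regular expression, whereas this sketch matches exact strings to stay dependency-free.

```rust
use std::collections::HashSet;

/// Parses simple comma-separated text (header row, no quoting) into rows of
/// optional integers. A cell becomes `None` when it is empty or when it
/// exactly matches one of the caller-supplied null placeholders.
/// Hypothetical helper for illustration; not part of arrow-rs.
fn parse_with_nulls(
    input: &str,
    null_values: &HashSet<&str>,
) -> Result<Vec<Vec<Option<i64>>>, std::num::ParseIntError> {
    input
        .lines()
        .skip(1) // skip the header row, e.g. `a,b,c`
        .filter(|line| !line.trim().is_empty())
        .map(|line| {
            line.split(',')
                .map(|cell| {
                    let cell = cell.trim();
                    if cell.is_empty() || null_values.contains(cell) {
                        // Empty cells and placeholders are both treated as NULL.
                        Ok(None)
                    } else {
                        // Anything else must parse as an integer.
                        cell.parse::<i64>().map(Some)
                    }
                })
                .collect::<Result<Vec<_>, _>>()
        })
        .collect()
}
```

Called on the sample above with a null set containing `NA`, this returns the rows `[Some(1), Some(2), None]` and `[Some(3), None, Some(5)]`. Without `NA` in the set, the same input fails with a parse error, which is the problem this issue describes.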


@vrongmeal vrongmeal added the enhancement Any new improvement worthy of a entry in the changelog label Sep 7, 2023
vrongmeal added a commit to vrongmeal/arrow-rs that referenced this issue Sep 7, 2023
Custom strings can be specified as `NULL` values for CSVs. This allows
reading CSV files that use placeholders for NULL values instead of
empty strings.

Fixes apache#4794

Signed-off-by: Vaibhav <[email protected]>
vrongmeal added a commit to vrongmeal/arrow-rs that referenced this issue Sep 8, 2023
vrongmeal added a commit to vrongmeal/arrow-rs that referenced this issue Sep 11, 2023
vrongmeal added a commit to vrongmeal/arrow-rs that referenced this issue Sep 12, 2023
Custom strings can be specified as `NULL` values for CSVs using a regular
expression. This allows reading CSV files that use placeholders for
NULL values instead of empty strings.

Fixes apache#4794

Signed-off-by: Vaibhav <[email protected]>
tustvold added a commit that referenced this issue Sep 13, 2023
* csv: Add option to specify custom null regex

Custom strings can be specified as `NULL` values for CSVs using a regular
expression. This allows reading CSV files that use placeholders for
NULL values instead of empty strings.

Fixes #4794

Signed-off-by: Vaibhav <[email protected]>

* Apply suggestions from code review

---------

Signed-off-by: Vaibhav <[email protected]>
Co-authored-by: Raphael Taylor-Davies <[email protected]>
@tustvold tustvold added the arrow Changes to the arrow crate label Sep 18, 2023
@tustvold
Contributor

label_issue.py automatically added labels {'arrow'} from #4795
