Data Transformations

SmarterCSV automatically transforms the values in each column in order to normalize the data. This behavior can be customized or disabled.

Remove Empty Values

remove_empty_values is enabled by default. It removes any key/value pairs whose value is nil or would be an empty string.
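The effect can be sketched in plain Ruby. This is only an illustration of the described behavior, not SmarterCSV's own code; the helper name `drop_empty_values` is hypothetical.

```ruby
# Hypothetical helper illustrating the effect of remove_empty_values.
# Assumption: a value counts as "empty" when it is nil or "".
def drop_empty_values(row)
  row.reject { |_key, value| value.nil? || value == "" }
end

row = { name: "Ann", phone: "", email: nil, city: "Oslo" }
cleaned = drop_empty_values(row)
# cleaned keeps only :name and :city
```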

Convert Values to Numeric

convert_values_to_numeric is enabled by default. SmarterCSV will convert strings containing Integers or Floats to the appropriate class.

Some numeric-looking columns should stay strings, e.g. ZIP codes with leading zeros. Here is an example of excluding such a column from conversion with the except key of convert_values_to_numeric:

  data = SmarterCSV.process('/tmp/zip.csv', convert_values_to_numeric: { except: [:zip] })
  # => [{:zip=>"00480"}, {:zip=>"51903"}, {:zip=>"12354"}, {:zip=>"02343"}]

This will return the column :zip as a string with all digits intact.
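To see why the exclusion matters, here is a minimal plain-Ruby sketch of numeric conversion with an exclusion list. The helpers `to_numeric` and `convert_row` are hypothetical and only approximate the behavior described above; SmarterCSV's actual matching rules may differ.

```ruby
# Hypothetical helper: converts integer-like and float-like strings,
# and leaves everything else untouched.
def to_numeric(value)
  case value
  when /\A[+-]?\d+\z/      then Integer(value, 10) # base 10, so leading zeros are not read as octal
  when /\A[+-]?\d+\.\d+\z/ then Float(value)
  else value
  end
end

# Hypothetical helper: converts every value except those under the given keys.
def convert_row(row, except: [])
  row.to_h { |key, value| [key, except.include?(key) ? value : to_numeric(value)] }
end

row = { zip: "00480", qty: "3", price: "4.50", city: "Kent" }

converted_all        = convert_row(row)                 # :zip becomes the Integer 480
converted_except_zip = convert_row(row, except: [:zip]) # :zip stays "00480"
```

Without the exclusion, the leading zeros are lost as soon as the ZIP code becomes an Integer.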

Remove Zero Values

remove_zero_values is disabled by default. When enabled, it removes key/value pairs which have a numeric value equal to zero.
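A rough plain-Ruby illustration of the effect follows. It assumes the values have already been converted to numbers; the helper name `drop_zero_values` is hypothetical and this is not the library's implementation.

```ruby
# Hypothetical helper illustrating the effect of remove_zero_values.
# Assumption: values are already Integer or Float instances.
def drop_zero_values(row)
  row.reject { |_key, value| value.is_a?(Numeric) && value.zero? }
end

row = { item: "pen", qty: 0, discount: 0.0, price: 1.25 }
non_zero = drop_zero_values(row)
# non_zero keeps only :item and :price
```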

Remove Values Matching

remove_values_matching is disabled by default. When set to a regular expression, it removes key/value pairs whose values match that expression, which helps drop values that would otherwise cause problems.

For example:

  • remove_values_matching: /^\$0\.0+$/ would remove $0.00
  • remove_values_matching: /^#VALUE!$/ would remove errors from Excel spreadsheets
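The two patterns above can be checked in plain Ruby before passing them to SmarterCSV. The sample row and the filtering step are illustrative assumptions; `Regexp.union` is standard Ruby and is shown here as one way to combine both patterns into a single expression.

```ruby
zero_dollars = /^\$0\.0+$/
excel_error  = /^#VALUE!$/

matches_zero_amount = zero_dollars.match?("$0.00")  # zero amount matches
matches_small_amount = zero_dollars.match?("$0.05") # non-zero amount does not
matches_excel_error = excel_error.match?("#VALUE!") # Excel error marker matches

# Combine both patterns and filter a sample row the way the option is described.
combined = Regexp.union(zero_dollars, excel_error)
row  = { total: "$0.00", tax: "$1.20", ratio: "#VALUE!" }
kept = row.reject { |_key, value| value.is_a?(String) && combined.match?(value) }
# kept retains only :tax
```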

Empty Hashes

After all transformations have been applied, a row of the CSV file may produce a completely empty hash.

By default SmarterCSV uses remove_empty_hashes: true to remove these empty hashes from the result.

Set remove_empty_hashes: false to keep these empty hashes in the results.
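As a plain-Ruby illustration of the difference (not SmarterCSV's own code), assume the second row of a file contained only empty values, so it ended up as an empty hash:

```ruby
# Assumed result of processing three rows, where the middle row
# lost all of its key/value pairs during the transformations.
rows = [{ name: "Ann" }, {}, { name: "Bob" }]

# Comparable to the default, remove_empty_hashes: true
without_empty = rows.reject(&:empty?)

# Comparable to remove_empty_hashes: false, where the row count is preserved
with_empty = rows
```

Keeping the empty hashes can be useful when the position of each result must line up with the row number in the source file.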


PREVIOUS: Header Validations | NEXT: Value Converters