Fix issue with invisible chars while importing csv files #109

Open: wants to merge 8 commits into base: main
1 change: 1 addition & 0 deletions lib/csv_importer/column.rb
@@ -6,5 +6,6 @@ class Column

attribute :name, String
attribute :definition, ColumnDefinition
+ attribute :rank, Integer, default: 0
end
end
4 changes: 2 additions & 2 deletions lib/csv_importer/csv_reader.rb
@@ -70,11 +70,11 @@ def detect_separator(csv_content)
end
end

- # Remove trailing white spaces and ensure we always return a string
+ # Remove trailing white spaces, invisible characters and ensure we always return a string
def sanitize_cells(rows)
rows.map do |cells|
cells.map do |cell|
- cell ? cell.strip : ""
+ cell ? cell.strip.gsub(/\P{Print}|\p{Cf}/, '') : ""
end
end
end
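
A note on the pattern in this hunk: in Ruby, `[[:print:]]` still matches format characters such as U+FEFF (the BOM) and U+200B, which is why the new pattern adds `\p{Cf}`. Below is a minimal standalone sketch of the same idea, not the gem's code. The constant `INVISIBLE` and the helper `sanitize_cell` are hypothetical names, and unlike the diff the sketch removes invisible characters before calling `strip`, so whitespace next to a BOM is trimmed too.

```ruby
# Hypothetical constant: non-printable characters plus Unicode "format"
# characters (category Cf), e.g. U+FEFF (BOM) and U+200B (zero width space).
INVISIBLE = /\P{Print}|\p{Cf}/

# Hypothetical helper: always returns a String, with invisible characters
# removed first and surrounding whitespace stripped afterwards.
def sanitize_cell(cell)
  cell ? cell.gsub(INVISIBLE, "").strip : ""
end

# Same shape as the method in the diff: an array of rows, each an array of cells.
def sanitize_cells(rows)
  rows.map { |cells| cells.map { |cell| sanitize_cell(cell) } }
end

p sanitize_cells([["\u{FEFF}email ", nil]])
# => [["email", ""]]
```

With the order used in the diff (`strip` first, then `gsub`), a cell such as `"\u{FEFF} bob"` keeps its leading space, because `strip` does not treat the BOM as whitespace.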
12 changes: 10 additions & 2 deletions lib/csv_importer/header.rb
@@ -8,13 +8,21 @@ class Header
attribute :column_names, Array[String]

def columns
+ max_column = column_definitions.size
+
column_names.map do |column_name|
# ensure column name escapes invisible characters
- column_name = column_name.gsub(/[^[:print:]]/, '')
+ column_name = column_name.gsub(/\P{Print}|\p{Cf}/, '')
+
+ # the column will be processed not in the order found in the csv
+ # but in the order of the importation code
+ # first column declared, first column processed
+ rank = column_definitions.index { |definition| definition.match?(column_name) }

Column.new(
name: column_name,
- definition: find_column_definition(column_name)
+ definition: find_column_definition(column_name),
+ rank: rank || max_column
)
end
end
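
To illustrate the ranking described in the comment above, here is a small sketch under simplifying assumptions: column definitions are plain strings compared with `==` rather than `ColumnDefinition#match?`, and `Column` is a hypothetical `Struct`, not the gem's class. Columns without a matching definition get the rank `definitions.size`, so they sort after every declared column.

```ruby
# Hypothetical stand-in for the gem's Column class.
Column = Struct.new(:name, :rank)

# Order in which the importer declares its columns (assumed for illustration).
definitions = ["email", "first_name"]

# Order in which the columns appear in the CSV header.
csv_header = ["first_name", "unknown", "email"]

columns = csv_header.map do |name|
  # Declared columns take their declaration index; others go last.
  rank = definitions.index(name) || definitions.size
  Column.new(name, rank)
end

ordered = columns.sort_by(&:rank).map(&:name)
p ordered
# => ["email", "first_name", "unknown"]
```

One caveat: Ruby does not guarantee that `sort_by` is stable, so if several CSV columns have no definition they all share the same rank and their relative order is unspecified.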
2 changes: 1 addition & 1 deletion lib/csv_importer/row.rb
@@ -33,7 +33,7 @@ def csv_attributes

# Set attributes
def set_attributes(model)
- header.columns.each do |column|
+ header.columns.sort_by(&:rank).each do |column|
value = csv_attributes[column.name]
begin
value = value.dup if value
2 changes: 2 additions & 0 deletions lib/csv_importer/runner.rb
@@ -54,6 +54,8 @@ def persist_rows!

if row.skip?
tags << :skip
+ elsif row.errors.size > 0
+ tags << :failure
else
if row.model.save
tags << :success
2 changes: 1 addition & 1 deletion spec/csv_importer_spec.rb
@@ -430,7 +430,7 @@ class ImportUserCSVByFirstName
[email protected] , true, bob ,,"

# insert invisible characters
- csv_content.insert(-1, "\u{FEFF}")
+ csv_content.insert(0, "\u{FEFF}")

csv_io = StringIO.new(csv_content)
import = ImportUserCSV.new(file: csv_io)