Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix last row missed issue(#22) and infinite loop problem #23

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

suexcxine
Copy link

@suexcxine suexcxine commented Jul 24, 2021

Fix issue #22

it's because of the following codes
https://github.com/jsonkenl/xlsxir/blob/master/lib/xlsxir.ex#L602

  defp row_num(tid) do
    # do not count :info key
    :ets.info(tid, :size) - 1
  end

and we don't have this :info key when we load xls files

and the infinite loop problem is because of
https://github.com/jsonkenl/xlsxir/blob/master/lib/xlsxir.ex#L510

  defp fill_empty_cells(from, to, line, cells) do
    next_ref = next_col(from)

    if next_ref == to do
      fill_empty_cells(to, to, line, [[from, nil] | cells])
    else
      fill_empty_cells(next_ref, to, line, [[from, nil] | cells])
    end
  end

the if condition never meets and an infinite loop happens
we can change next_ref == to to next_ref >= to to enhance robustness
but the root cause is because our cell order of a row in ets differs from the cell order of xlsx format

ets contents of a xlsx file parsed by Xlsxir

[
  {1,
   [
     ["A1", "head1"],
     ["B1", "head2"],
     ["C1", "head3"],
     ["D1", "head4"],
     ["E1", "head5"]
   ]},
  {:info, :worksheet_name, "Sheet1"},
  {2, [["A2", "a2"], ["B2", "b2"], ["C2", "c2"], ["D2", "d2"], ["E2", "e2"]]}
]

ets contents of a xls file parsed by Exoffice

[
  {1,
   [
     ["E1", "head5"],
     ["D1", "head4"],
     ["C1", "head3"],
     ["B1", "head2"],
     ["A1", "head1"]
   ]},
  {2,
   [
     ["E2", "e2"],
     ["D2", "d2"],
     ["C2", "c2"],
     ["B2", "b2"],
     ["A2", "a2"]
   ]}
]

@AlexKovalevych
Copy link
Owner

Thank you for your PR. It looks good, but can you please add a test for this case which will fail in master branch and pass in your branch?

@suexcxine
Copy link
Author

I noticed there are still other problems,..
I fixed a case when the order differs with the above patch,
but when I use another excel file, the order differs again...
It seems that cell order in excel binary stream is not certain,
so maybe we need to sort the cells of every row before we call Xlsxir..

And, I have an excel with numbers, but numbers are not read..
However, numbers in another excel is read correctly..

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants