Internal coding error detected at file mlrmap_flatten_unflatten.go line 160 #1733

Closed
irisjae opened this issue Dec 22, 2024 · 9 comments · Fixed by #1735
@irisjae

irisjae commented Dec 22, 2024

Save the following file as test.csv

.,","
,

Now, execute mlr --icsv --ojson cat test.csv, which produces the error Internal coding error detected at file mlrmap_flatten_unflatten.go line 160.

mlr --version yields mlr 6.13.0.

@aborruso
Contributor

Hi,
this is a CSV with no field names. Try adding -N (implicit CSV header):

mlr --icsv --ojson -N cat test.csv

I get this output:

[
{
  "1": ".",
  "2": ","
},
{
  "1": "",
  "2": ""
}
]

@irisjae
Author

irisjae commented Dec 22, 2024

Let me clarify a bit. The file does indeed have headers: the first line holds the two (valid) header names "." and ",". I produced the file above by taking a CSV file (containing a lot of data) that failed to convert to JSON and pruning it down to a minimal failing example.

A perhaps clearer example, shown below, fails in the same way. In any case, if an error is triggered, I would expect Miller to report the error rather than abort as it did.

header 1 .,"header 2 ,"
my info,more info
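
As a quick cross-check that both files really do carry a header row, here is a minimal sketch using Python's standard csv module rather than Miller. The helper name parse_csv is hypothetical, and this only shows how a generic CSV parser reads the two examples, not how Miller reads them.

```python
import csv
import io


def parse_csv(text):
    """Parse CSV text into a list of rows using Python's standard csv module."""
    return list(csv.reader(io.StringIO(text)))


# The minimal example: header names "." and ",", then one row of two empty fields.
minimal = parse_csv('.,","\n,\n')

# The clearer example: header names ending in "." and ",".
clearer = parse_csv('header 1 .,"header 2 ,"\nmy info,more info\n')

print(minimal)  # [['.', ','], ['', '']]
print(clearer)  # [['header 1 .', 'header 2 ,'], ['my info', 'more info']]
```

In both cases the first row parses into two distinct, non-empty field names, so the input is well-formed CSV with headers.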

@aborruso
Contributor

Hi @irisjae, the . is a special character.

Default: if the input has y.1=7,y.2=8,y.3=9 then this unflattens to $y=[7,8,9]. flattens to y.1=7,y.2=8,y.3=9. With --no-auto-flatten, instead we get ${y.1}=7,${y.2}=8,${y.3}=9

There is an auto-unflatten feature that treats . as the key separator.

You can disable it by running mlr --icsv --ojson --no-auto-unflatten cat test.csv, and you will get

[
{
  "header 1 .": "my info",
  "header 2 ,": "more info"
}
]
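
To make the auto-unflatten idea concrete, here is a minimal Python sketch of the general technique: split each key on the separator and build nested maps, turning maps keyed 1..n into lists. It is not Miller's implementation; the names unflatten and _maps_to_lists are hypothetical, and the sketch assumes no key is both a leaf and a prefix of another key.

```python
def unflatten(record, sep="."):
    """Turn keys like 'y.1' into nested values (hypothetical helper, not Miller's code)."""
    nested = {}
    for key, value in record.items():
        parts = key.split(sep)
        node = nested
        # Walk or create the intermediate maps for every segment but the last.
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    # Only nested values are candidates for lists; the record itself stays a map.
    return {k: _maps_to_lists(v) for k, v in nested.items()}


def _maps_to_lists(node):
    """Convert maps whose keys are exactly '1'..'n' (in order) into lists, recursively."""
    if not isinstance(node, dict):
        return node
    converted = {k: _maps_to_lists(v) for k, v in node.items()}
    keys = list(converted)
    if keys and keys == [str(i) for i in range(1, len(keys) + 1)]:
        return [converted[k] for k in keys]
    return converted


print(unflatten({"y.1": 7, "y.2": 8, "y.3": 9}))  # {'y': [7, 8, 9]}
print(unflatten({"a": 1, "b.c": 2, "d": 3}))      # {'a': 1, 'b': {'c': 2}, 'd': 3}
```

With a scheme like this, any header containing . is a candidate for nesting, which is why headers such as "header 1 ." interact with the feature at all.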

@irisjae
Author

irisjae commented Dec 22, 2024

Thank you, that makes sense. Perhaps the error message should be improved.

Also, if . is the separator for flattening, shouldn't

.,","
,

produce

{
  "": {
    "": ""
  },
  ",": ""
}

?
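
The nested shape suggested above is what a literal split on the separator yields: the header . breaks into two empty segments. A minimal Python sketch of that naive reading follows; the function naive_unflatten_key is hypothetical and says nothing about what Miller does or should do.

```python
def naive_unflatten_key(key, value, sep="."):
    """Nest one key by splitting on sep, with no special handling of empty segments."""
    parts = key.split(sep)
    result = value
    # Wrap the value from the innermost segment outward.
    for part in reversed(parts):
        result = {part: result}
    return parts, result


parts, nested = naive_unflatten_key(".", "")
print(parts)   # ['', '']
print(nested)  # {'': {'': ''}}
```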

@johnkerl
Owner

The error-handling logic needs improving -- I will do so.

@johnkerl
Owner

Some more examples of unflatten misbehavior:

a,.,c
1,2,3

a,b.,c
1,2,3

a,.b,c
1,2,3

@johnkerl
Owner

Inputs and outputs with #1735:

$ for x in *.csv; do echo ===================================== $x; cat $x; echo; mlr --icsv --ojson cat $x; echo; done

===================================== bdot.csv
a,b.,c
1,2,3

[
{
  "a": 1,
  "b.": 2,
  "c": 3
}
]

===================================== bdotdotc.csv
a,b..c,d
1,2,3

[
{
  "a": 1,
  "b..c": 2,
  "d": 3
}
]

===================================== dot.csv
a,.,c
1,2,3

[
{
  "a": 1,
  ".": 2,
  "c": 3
}
]

===================================== dotb.csv
a,.b,c
1,2,3

[
{
  "a": 1,
  ".b": 2,
  "c": 3
}
]

===================================== dotdot.csv
a,..,c
1,2,3

[
{
  "a": 1,
  "..": 2,
  "c": 3
}
]

===================================== good.csv
a,b.c,d
1,2,3

[
{
  "a": 1,
  "b": {
    "c": 2
  },
  "d": 3
}
]

===================================== test.csv
.,","
,

[
{
  ".": "",
  ",": ""
}
]

===================================== test2.csv
x,","
,

[
{
  "x": "",
  ",": ""
}
]
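
The outputs above are consistent with a simple rule: a key is unflattened only when splitting it on . leaves no empty segment. The Python sketch below encodes that rule as inferred from these outputs alone; it is an assumption, not code taken from #1735, and the name looks_flattened is hypothetical.

```python
def looks_flattened(key, sep="."):
    """True if key contains sep and every segment between separators is non-empty."""
    parts = key.split(sep)
    return len(parts) > 1 and all(parts)


# Headers taken from the CSV files listed above.
headers = ["b.", "b..c", ".", ".b", "..", "b.c", "x", ","]
results = {header: looks_flattened(header) for header in headers}

# Only "b.c" (from good.csv) qualifies; every other header is kept as-is.
for header, flattened in results.items():
    print(repr(header), flattened)
```

Under this rule a leading dot, a trailing dot, or consecutive dots each produce an empty segment, so those headers pass through unchanged.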

@aborruso
Contributor

Thank you very much

@johnkerl
Owner

@irisjae @aborruso please also see
https://miller.readthedocs.io/en/main/flatten-unflatten/#non-inferencing-cases

(Note: these are docs for main, not a tagged release -- these doc updates will appear with tag latest on the next Miller release.)
