unflatten: arrays of strings and numbers are 'doubled' (CSV and XLSX) #426

Open
duncandewhurst opened this issue Aug 9, 2023 · 3 comments
Labels: bug, RDLS (Issues and PRs relating to the Risk Data Library Standard)

Comments

duncandewhurst commented Aug 9, 2023

The unflatten command adds an extra set of square brackets around the values of string and numeric arrays, so the resulting data fails validation against the schema with an invalid type error.

I'm pretty sure this is a new issue because I don't recall it being a problem when unflattening OCDS releases, in which tag is an array of strings.

I've provided a minimal example below to reproduce the issue.

Input:

array
"a,b,c"

Schema:

{
  "properties": {
    "array": {
      "type": "array",
      "items": {
        "type": "string"
      }
    }
  }
}

Command:

flatten-tool unflatten -f csv -s schema.json input

Expected output:

{
    "main": [
        {
            "array": [
                "a",
                "b",
                "c"
            ]
        }
    ]
}

Actual output:

{
    "main": [
        {
            "array": [
                [
                    "a",
                    "b",
                    "c"
                ]
            ]
        }
    ]
}
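
To illustrate why the doubled brackets fail validation, here is a minimal standard-library sketch of the check implied by the schema's "items": {"type": "string"} constraint. The helper name non_string_items is hypothetical; a real JSON Schema validator would report the same problem as a type error in its own format.

```python
import json


def non_string_items(array):
    """Return (index, item) pairs for entries that are not strings."""
    return [(i, item) for i, item in enumerate(array) if not isinstance(item, str)]


# The expected and actual outputs from this issue, reduced to one line each.
expected = json.loads('{"main": [{"array": ["a", "b", "c"]}]}')
actual = json.loads('{"main": [{"array": [["a", "b", "c"]]}]}')

# Expected output: every item is a string, so nothing is flagged.
print(non_string_items(expected["main"][0]["array"]))

# Actual output: the only item is itself a list, so it is flagged.
print(non_string_items(actual["main"][0]["array"]))
```
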

Bjwebb commented Aug 11, 2023

For one-level arrays, flatten-tool uses a semicolon (;) as the separator, not a comma (,).

This has the results you want:

array
"a;b;c"

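The difference between the two separators can be sketched as follows. This is an illustrative approximation of the behaviour described in this thread, not flatten-tool's actual code: the function name split_array_cell is hypothetical, and the rule that a comma produces a nested array is an assumption inferred from the output reported above.

```python
def split_array_cell(cell):
    """Split a spreadsheet cell into a list, mimicking the reported behaviour.

    Assumption: ";" separates the items of the outer array, and the presence
    of "," makes each outer item an inner array of its own.
    """
    if "," in cell:
        return [part.split(",") for part in cell.split(";")]
    return cell.split(";")


# Comma-separated input: one outer item holding an inner array.
print(split_array_cell("a,b,c"))

# Semicolon-separated input: a plain one-level array.
print(split_array_cell("a;b;c"))
```

Under this assumption, the comma-separated cell is read as an array of arrays containing a single inner array, which matches the doubled brackets in the actual output.
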
Bjwebb closed this as completed Aug 11, 2023

Bjwebb commented Aug 13, 2023

Re-opening this because we should check whether this behaviour is clearly documented.

Bjwebb reopened this Aug 13, 2023
duncandewhurst commented

Ah, my mistake, then. It is documented, but somewhat buried under a heading that suggests it is unsupported: https://flatten-tool.readthedocs.io/en/latest/unflatten/#plain-lists-unsupported

duncandewhurst added the RDLS (Issues and PRs relating to the Risk Data Library Standard) label Aug 21, 2023