Inconsistent behaviour for "empty" CSV column of unrequired file #8

Open
awgymer opened this issue Mar 7, 2024 · 6 comments
Labels
bug Something isn't working

Comments

Collaborator

awgymer commented Mar 7, 2024

A CSV file can be quoted or unquoted.

If there is a field defined like so:

"bai": {
    "type": "string",
    "format": "file-path",
    "exists": true,
    "pattern": "^\\S+\\.bai$",
    "errorMessage": "If provided the BAM index (bai) must exist and have extension '.bai'"
},

then the following two CSV files behave differently:

Unquoted "empty" field parses fine as expected.

field1,bai
myfield,

Quoted "empty" field errors out.

field1,bai
myfield,""

ERROR ~ ERROR: Validation of 'input' file failed!

 -- Check '.nextflow.log' file for details
The following errors have been detected:

* -- Entry 1 - bai: the file or directory '""' does not exist.
* -- Entry 1 - bai: If provided the BAM index (bai) must exist and have extension '.bai' ("")

This feels somewhat counterintuitive: an empty string is obviously not a path, it's an empty column.

I'm not sure whether this behaviour is expected and I'm misunderstanding something, or whether we can improve it so that both cases behave the same way.
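
To illustrate why the quoted row fails, here is a minimal Python sketch (not the plugin's Groovy code). It assumes, based on the error message above, that the value reaching validation is the literal two-character string `""`, and that an absent value is simply skipped for an unrequired field. The helper name `check_bai` is hypothetical.

```python
import re

# Pattern copied from the "bai" schema field above.
BAI_PATTERN = re.compile(r"^\S+\.bai$")

def check_bai(value):
    """Return a list of error strings for an optional 'bai' value.

    Hypothetical helper: None stands for an absent (unrequired) field.
    """
    if value is None:
        return []  # field not provided, nothing to validate
    errors = []
    if not BAI_PATTERN.match(value):
        errors.append(f"pattern mismatch ({value})")
    return errors

unquoted_errors = check_bai(None)  # row: myfield,
quoted_errors = check_bai('""')    # row: myfield,"" kept as literal quotes
print(unquoted_errors)
print(quoted_errors)
```

Under these assumptions the unquoted row produces no errors, while the quoted row is checked as if `""` were a file path and fails the pattern.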

Collaborator

nvnieuwk commented Mar 7, 2024

I don't think this is a bug since the field isn't actually empty, but contains an empty string.

Collaborator Author

awgymer commented Mar 7, 2024

But in a quoted CSV, "" is the same as a truly empty field.
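
For comparison, this is how one widely used CSV reader treats the two rows. The sketch uses Python's standard `csv` module purely as an illustration; it says nothing about how Nextflow's splitCsv behaves. The helper name `parse_rows` is made up for this example.

```python
import csv
import io

def parse_rows(text):
    """Parse CSV text with Python's default dialect and return the rows as dicts."""
    return list(csv.DictReader(io.StringIO(text)))

unquoted = parse_rows("field1,bai\nmyfield,\n")
quoted = parse_rows('field1,bai\nmyfield,""\n')

# With the default dialect both rows give an empty string for "bai".
print(unquoted[0]["bai"] == quoted[0]["bai"])
```

With this reader the quoted and unquoted empty fields are indistinguishable after parsing, which is the behaviour being argued for here.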

Collaborator

nvnieuwk commented Mar 7, 2024

Have you tried this with the unreleased version? Does it behave the same way there? My guess is that Nextflow's .splitCsv function is the issue here: when a CSV field is truly empty, the function returns null, but when it contains an empty string it returns "".

Collaborator Author

awgymer commented Mar 7, 2024

No, I haven't. I guess the new castToType might fix this, since it returns null when the value == "".
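
The idea of mapping empty values to null before validation could be sketched like this in Python. It is not the plugin's castToType, which is not shown in this thread; `empty_to_none` is a hypothetical name, and only the `== ""` rule comes from the comment above.

```python
def empty_to_none(value):
    """Hypothetical normaliser: treat an empty string as a missing value."""
    if value == "":
        return None
    return value

row = {"field1": "myfield", "bai": ""}
normalised = {key: empty_to_none(val) for key, val in row.items()}
print(normalised)
```

Note that this rule only helps if the parser hands over a genuinely empty string; a value consisting of the two literal quote characters would pass through unchanged.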

Collaborator

nvnieuwk commented Mar 7, 2024

Yes that will probably change that behaviour :)

@nvnieuwk nvnieuwk transferred this issue from nextflow-io/nf-validation Apr 23, 2024
@nvnieuwk
Collaborator

Hi @awgymer, can you try this out with nf-schema?

@nvnieuwk nvnieuwk added the bug Something isn't working label Apr 23, 2024