cluster summary file will show incorrect values for numerical entries when loaded in Excel #20

Open
apetkau opened this issue Aug 23, 2024 · 0 comments
Labels
bug Something isn't working

Description of the bug

When running arboratornf on data that includes numerical metadata values, the cluster_summary.tsv file summarizes the values from different samples in a single cell for each grouped set of samples (e.g., as a list of values or min/max).

These values are stored correctly in the TSV file as comma-separated numbers (e.g., 1000,2000). However, Excel interprets such a cell as a single number, treating the commas as thousands separators, and displays it incorrectly (e.g., 1000,2000 is shown as 10,002,000).
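
The misreading can be approximated in a few lines of Python. This is only a sketch of the observable behaviour, assuming a locale where the comma is the thousands separator; `excel_like_parse` is a hypothetical helper for illustration and not Excel's actual parsing logic.

```python
def excel_like_parse(cell: str):
    """Roughly mimic a spreadsheet that treats commas as thousands separators.

    Hypothetical helper for illustration only; not Excel's real algorithm.
    """
    stripped = cell.replace(",", "")
    if stripped.isdigit():
        # The commas are discarded as digit grouping, merging the two numbers.
        return int(stripped)
    # Non-numeric text is left untouched.
    return cell


value = excel_like_parse("1000,2000")
print(value)         # 10002000
print(f"{value:,}")  # 10,002,000
```

Cells holding a single value (e.g., 5000) or text (e.g., unassociated) are unaffected, which matches the rows that display correctly.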

For example, given this set of metadata as input:

Portion of samplesheet.csv with some columns renamed to better match output:

sample  outbreak      value
S1      1             1000
S2      1             2000
S3      2             3000
S4      2             4000
S5      3             5000
S6      unassociated  6000

The cluster summary for the metadata_partition group 1 will be:

Portion of cluster_summary.tsv

Outbreak Code  value
1              1000,2000
2              3000,4000
3              5000
unassociated   6000
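
For reference, the expected summary can be reproduced from the samplesheet excerpt with a short Python sketch. This is not how arboratornf computes the file; `summarize` is a hypothetical function that simply groups values by outbreak and joins them with commas.

```python
from collections import defaultdict

# Rows from the samplesheet excerpt above: (sample, outbreak, value).
rows = [
    ("S1", "1", 1000),
    ("S2", "1", 2000),
    ("S3", "2", 3000),
    ("S4", "2", 4000),
    ("S5", "3", 5000),
    ("S6", "unassociated", 6000),
]


def summarize(rows):
    """Group values by outbreak and join each group with commas."""
    grouped = defaultdict(list)
    for _sample, outbreak, value in rows:
        grouped[outbreak].append(str(value))
    return {outbreak: ",".join(values) for outbreak, values in grouped.items()}


summary = summarize(rows)
for outbreak, values in summary.items():
    print(f"{outbreak}\t{values}")
```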

But, when loading cluster_summary.tsv in Excel, the following is shown:

[Image: cluster_summary.tsv in Excel]

Command used and terminal output

To reproduce this behaviour, run the following command with the attached samplesheet.csv:

nextflow run . -profile test,docker --outdir results --input samplesheet.csv --metadata_1_header value

To verify the output as a text file and in Excel, please download the attached output cluster summary file.
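
One way to check the file as text without involving Excel is to parse it with Python's `csv` module, which keeps each cell as a string. The sketch below uses an in-memory copy of the excerpt shown in this issue; the column names `Outbreak Code` and `value` are taken from that excerpt and may differ in the real output, and `read_summary` is a hypothetical helper.

```python
import csv
import io

# Contents matching the cluster_summary.tsv excerpt in this issue.
tsv_text = (
    "Outbreak Code\tvalue\n"
    "1\t1000,2000\n"
    "2\t3000,4000\n"
    "3\t5000\n"
    "unassociated\t6000\n"
)


def read_summary(handle):
    """Map each outbreak code to the list of integers stored in its cell."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {
        row["Outbreak Code"]: [int(v) for v in row["value"].split(",")]
        for row in reader
    }


parsed = read_summary(io.StringIO(tsv_text))
print(parsed["1"])  # [1000, 2000]
```

To read a file on disk instead, pass `open("cluster_summary.tsv", newline="")` in place of the `io.StringIO` object.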

Relevant files

System information

  • Nextflow version: 24.04.4
  • Hardware: Desktop
  • Executor: local
  • Container engine: Docker
  • OS: WSL/Ubuntu
  • Version of arboratornf: 0.1.0