Conditional execution results in collection including nulls #18905

Closed
pvanheus opened this issue Sep 28, 2024 · 2 comments

@pvanheus
Contributor

Describe the bug
In a workflow where some of the steps are conditionally executed and a collection (list type) is processed, the result is a collection that contains null values. For example, this workflow:

https://usegalaxy.org/u/pvanheus/w/tb-variant-analysis-v11-imported-from-url

and this history:

https://usegalaxy.org/u/pvanheus/h/smartt-samples

The history item "bcftools consensus on collection 2313, collection 4778, and collection 2634: con..." (tagged "consensus_genome") contains null items, e.g. for SRR26331600, which lacks sufficient mapped reads to call a consensus.

Galaxy Version and/or server at which you observed the bug
Galaxy Version: 24.1.3.dev0
(on usegalaxy.org)

Browser and Operating System
Operating System: Windows
Browser: Firefox

To Reproduce

Producing a minimal example is on my TODO list. For now:

  1. Download the reads from SRA project PRJNA1026351 (the list of accessions is an item in the above history)
  2. Run the TB Variant Analysis v1.1 workflow
  3. Check the consensus genome output

I'm opening this issue before I once again forget to do so, and will work on creating a more minimal example.

Expected behavior

Possible solutions I can imagine:

  1. Create a tool to filter nulls out of a collection
  2. Change the logic with regards to the outputs of tools when they are skipped due to conditional logic (i.e. don't make a null)
@mvdbeek
Member

mvdbeek commented Sep 28, 2024

  1. Create a tool to filter nulls out of a collection

That's done using the pick parameter value tool, which you can find in the expression tools section of the workflow editor.

  2. Change the logic with regards to the outputs of tools when they are skipped due to conditional logic (i.e. don't make a null)

This is not an option: if you do this, you won't be able to skip mapping over some elements of a collection (e.g. those that didn't pass QC).
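
To illustrate why the null placeholders matter, here is a minimal sketch in plain Python. It does not use Galaxy's API; the dictionaries stand in for list collections keyed by element identifier, and the names `map_with_condition`, `samples` and `passed_qc` are hypothetical. The point is only that keeping a `None` for each skipped element preserves the collection's shape, and that nulls can still be dropped in a separate, explicit step.

```python
from typing import Callable, Dict, Optional


def map_with_condition(
    elements: Dict[str, int],
    should_run: Dict[str, bool],
    step: Callable[[int], str],
) -> Dict[str, Optional[str]]:
    """Run `step` on each element whose flag is true; keep None for skipped ones."""
    return {
        name: step(value) if should_run[name] else None
        for name, value in elements.items()
    }


# Hypothetical collection: element identifier -> number of mapped reads.
samples = {"sample_a": 120, "sample_b": 3, "sample_c": 95}

# Hypothetical QC rule: at least 50 mapped reads.
passed_qc = {name: reads >= 50 for name, reads in samples.items()}

# The skipped element stays in the output as a null placeholder.
consensus = map_with_condition(samples, passed_qc, lambda reads: f"consensus({reads})")

# Dropping the nulls is a separate, explicit step.
non_null = {name: value for name, value in consensus.items() if value is not None}
```

Here `consensus` has the same identifiers as `samples`, with `None` for the element that failed QC, while `non_null` contains only the elements that were actually processed.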

Let me know if there's anything else you run into.

@mvdbeek mvdbeek closed this as completed Sep 28, 2024
@pvanheus
Contributor Author

pvanheus commented Oct 3, 2024

Just adding a is relevant here.

Also noting that for the workflow in question, the solution was to turn the status of each collection element ("TRUE" for datasets that pass the evaluation criteria and "FALSE" otherwise) into a list of element IDs that represents only the good (i.e. "TRUE") datasets. That list was then used to filter outputs, so that only "good" elements were included in downstream analysis. For this particular use case that solution worked well. It might not always work, though, because it mutates the list of elements: if you needed to combine two lists after conditional execution, you wouldn't want to use this approach.
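
A rough sketch of that approach in plain Python, for anyone reading later. This is not Galaxy code; the dictionaries stand in for collections keyed by element ID, and `good_identifiers`, `filter_by_identifiers` and the sample names are made up for illustration.

```python
from typing import Dict, List, Optional


def good_identifiers(status: Dict[str, str]) -> List[str]:
    """Return the element IDs whose status is "TRUE"."""
    return [name for name, flag in status.items() if flag == "TRUE"]


def filter_by_identifiers(
    collection: Dict[str, Optional[str]], keep: List[str]
) -> Dict[str, Optional[str]]:
    """Keep only the elements whose ID appears in `keep`."""
    keep_set = set(keep)
    return {name: value for name, value in collection.items() if name in keep_set}


# Hypothetical per-element status and conditionally produced outputs.
status = {"sample_a": "TRUE", "sample_b": "FALSE", "sample_c": "TRUE"}
outputs = {"sample_a": "a.fasta", "sample_b": None, "sample_c": "c.fasta"}

keep = good_identifiers(status)
filtered = filter_by_identifiers(outputs, keep)

# A second, unfiltered collection no longer lines up with the filtered one.
other = {"sample_a": "a.vcf", "sample_b": "b.vcf", "sample_c": "c.vcf"}
missing_from_filtered = sorted(set(other) - set(filtered))
```

The last two lines show the caveat: once `sample_b` has been removed from `filtered`, combining it element by element with `other` would require filtering `other` in the same way.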
