Simplify handling of motif files #6

nictru · 2024-04-26T15:59:18Z

This PR adds a new subworkflow, MOTIFS, which converts several input formats into the formats needed by the pipeline. If no motif file is provided to the pipeline, it can fetch motifs from JASPAR based on the taxon_id. It also performs the motif filtering that was previously part of the PEAKS subworkflow.

The outputs of this subworkflow can also be used to simplify the FIMO subworkflow.

github-actions · 2024-04-26T16:00:43Z

`nf-core lint` overall result: Passed ✅ ⚠️

Posted for pipeline commit 2656f79

+| ✅ 198 tests passed       |+
#| ❔   7 tests were ignored |#
!| ❗  14 tests had warnings |!

❗ Test warnings:

readme - README contains the placeholder zenodo.XXXXXXX. This should be replaced with the zenodo doi (after the first release).
pipeline_todos - TODO string in README.md: Add citation for pipeline after first release. Uncomment lines below and update Zenodo doi and badge at the top of this file.
pipeline_todos - TODO string in README.md: Add bibliography of tools and data used in your pipeline
pipeline_todos - TODO string in output.md: Write this documentation describing your workflow's output
pipeline_todos - TODO string in main.nf: Optionally add in-text citation tools to this list.
pipeline_todos - TODO string in main.nf: Optionally add bibliographic entries to this list.
pipeline_todos - TODO string in main.nf: Only uncomment below if logic in toolCitationText/toolBibliographyText has been filled!
pipeline_todos - TODO string in main.nf: A stub section should mimic the execution of the original module as best as possible
pipeline_todos - TODO string in methods_description_template.yml: #Update the HTML below to your preferred methods description, e.g. add publication citation for this pipeline
pipeline_todos - TODO string in ci.yml: You can customise CI pipeline run tests as required
pipeline_todos - TODO string in awsfulltest.yml: You can customise AWS full pipeline tests as required
pipeline_todos - TODO string in base.config: Check the defaults for all processes
pipeline_todos - TODO string in base.config: Customise requirements for specific processes.
nfcore_yml - nf-core version not set in .nf-core.yml

❔ Tests ignored:

template_strings - Ignoring Jinja template strings in file /home/runner/work/tfactivity/tfactivity/modules/local/report/create/app/templates/base.html
template_strings - Ignoring Jinja template strings in file /home/runner/work/tfactivity/tfactivity/modules/local/report/create/app/templates/configuration.html
template_strings - Ignoring Jinja template strings in file /home/runner/work/tfactivity/tfactivity/modules/local/report/create/app/templates/macros.html
template_strings - Ignoring Jinja template strings in file /home/runner/work/tfactivity/tfactivity/modules/local/report/create/app/templates/network.html
template_strings - Ignoring Jinja template strings in file /home/runner/work/tfactivity/tfactivity/modules/local/report/create/app/templates/snp.html
template_strings - Ignoring Jinja template strings in file /home/runner/work/tfactivity/tfactivity/modules/local/report/create/app/templates/tf.html
template_strings - Ignoring Jinja template strings in file /home/runner/work/tfactivity/tfactivity/modules/local/report/create/app/templates/tg.html

✅ Tests passed:

files_exist - File found: .gitattributes
files_exist - File found: .gitignore
files_exist - File found: .nf-core.yml
files_exist - File found: .editorconfig
files_exist - File found: .prettierignore
files_exist - File found: .prettierrc.yml
files_exist - File found: CHANGELOG.md
files_exist - File found: CITATIONS.md
files_exist - File found: CODE_OF_CONDUCT.md
files_exist - File found: LICENSE or LICENSE.md or LICENCE or LICENCE.md
files_exist - File found: nextflow_schema.json
files_exist - File found: nextflow.config
files_exist - File found: README.md
files_exist - File found: .github/.dockstore.yml
files_exist - File found: .github/CONTRIBUTING.md
files_exist - File found: .github/ISSUE_TEMPLATE/bug_report.yml
files_exist - File found: .github/ISSUE_TEMPLATE/config.yml
files_exist - File found: .github/ISSUE_TEMPLATE/feature_request.yml
files_exist - File found: .github/PULL_REQUEST_TEMPLATE.md
files_exist - File found: .github/workflows/branch.yml
files_exist - File found: .github/workflows/ci.yml
files_exist - File found: .github/workflows/linting_comment.yml
files_exist - File found: .github/workflows/linting.yml
files_exist - File found: assets/email_template.html
files_exist - File found: assets/email_template.txt
files_exist - File found: assets/sendmail_template.txt
files_exist - File found: assets/nf-core-tfactivity_logo_light.png
files_exist - File found: conf/modules.config
files_exist - File found: conf/test.config
files_exist - File found: conf/test_full.config
files_exist - File found: docs/images/nf-core-tfactivity_logo_light.png
files_exist - File found: docs/images/nf-core-tfactivity_logo_dark.png
files_exist - File found: docs/output.md
files_exist - File found: docs/README.md
files_exist - File found: docs/README.md
files_exist - File found: docs/usage.md
files_exist - File found: main.nf
files_exist - File found: assets/multiqc_config.yml
files_exist - File found: conf/base.config
files_exist - File found: conf/igenomes.config
files_exist - File found: .github/workflows/awstest.yml
files_exist - File found: .github/workflows/awsfulltest.yml
files_exist - File found: modules.json
files_exist - File not found check: .github/ISSUE_TEMPLATE/bug_report.md
files_exist - File not found check: .github/ISSUE_TEMPLATE/feature_request.md
files_exist - File not found check: .github/workflows/push_dockerhub.yml
files_exist - File not found check: .markdownlint.yml
files_exist - File not found check: .nf-core.yaml
files_exist - File not found check: .yamllint.yml
files_exist - File not found check: bin/markdown_to_html.r
files_exist - File not found check: conf/aws.config
files_exist - File not found check: docs/images/nf-core-tfactivity_logo.png
files_exist - File not found check: lib/Checks.groovy
files_exist - File not found check: lib/Completion.groovy
files_exist - File not found check: lib/NfcoreTemplate.groovy
files_exist - File not found check: lib/Utils.groovy
files_exist - File not found check: lib/Workflow.groovy
files_exist - File not found check: lib/WorkflowMain.groovy
files_exist - File not found check: lib/WorkflowTfactivity.groovy
files_exist - File not found check: parameters.settings.json
files_exist - File not found check: pipeline_template.yml
files_exist - File not found check: Singularity
files_exist - File not found check: lib/nfcore_external_java_deps.jar
files_exist - File not found check: .travis.yml
nextflow_config - Config variable found: manifest.name
nextflow_config - Config variable found: manifest.nextflowVersion
nextflow_config - Config variable found: manifest.description
nextflow_config - Config variable found: manifest.version
nextflow_config - Config variable found: manifest.homePage
nextflow_config - Config variable found: timeline.enabled
nextflow_config - Config variable found: trace.enabled
nextflow_config - Config variable found: report.enabled
nextflow_config - Config variable found: dag.enabled
nextflow_config - Config variable found: process.cpus
nextflow_config - Config variable found: process.memory
nextflow_config - Config variable found: process.time
nextflow_config - Config variable found: params.outdir
nextflow_config - Config variable found: params.input
nextflow_config - Config variable found: params.validationShowHiddenParams
nextflow_config - Config variable found: params.validationSchemaIgnoreParams
nextflow_config - Config variable found: manifest.mainScript
nextflow_config - Config variable found: timeline.file
nextflow_config - Config variable found: trace.file
nextflow_config - Config variable found: report.file
nextflow_config - Config variable found: dag.file
nextflow_config - Config variable (correctly) not found: params.nf_required_version
nextflow_config - Config variable (correctly) not found: params.container
nextflow_config - Config variable (correctly) not found: params.singleEnd
nextflow_config - Config variable (correctly) not found: params.igenomesIgnore
nextflow_config - Config variable (correctly) not found: params.name
nextflow_config - Config variable (correctly) not found: params.enable_conda
nextflow_config - Config timeline.enabled had correct value: true
nextflow_config - Config report.enabled had correct value: true
nextflow_config - Config trace.enabled had correct value: true
nextflow_config - Config dag.enabled had correct value: true
nextflow_config - Config manifest.name began with nf-core/
nextflow_config - Config variable manifest.homePage began with https://github.com/nf-core/
nextflow_config - Config dag.file ended with .html
nextflow_config - Config variable manifest.nextflowVersion started with >= or !>=
nextflow_config - Config manifest.version ends in dev: 1.0dev
nextflow_config - Config params.custom_config_version is set to master
nextflow_config - Config params.custom_config_base is set to https://raw.githubusercontent.com/nf-core/configs/master
nextflow_config - Lines for loading custom profiles found
nextflow_config - nextflow.config contains configuration profile test
nextflow_config - Config default value correct: params.min_peak_occurrence= 1
nextflow_config - Config default value correct: params.window_size= 50000
nextflow_config - Config default value correct: params.decay= true
nextflow_config - Config default value correct: params.expression_aggregation= mean
nextflow_config - Config default value correct: params.affinity_aggregation= max
nextflow_config - Config default value correct: params.chromhmm_states= 10
nextflow_config - Config default value correct: params.chromhmm_threshold= 0.9
nextflow_config - Config default value correct: params.chromhmm_marks= H3K27ac,H3K4me3
nextflow_config - Config default value correct: params.min_count= 50
nextflow_config - Config default value correct: params.min_tpm= 1.0
nextflow_config - Config default value correct: params.min_count_tf= 50
nextflow_config - Config default value correct: params.min_tpm_tf= 1.0
nextflow_config - Config default value correct: params.dynamite_ofolds= 3
nextflow_config - Config default value correct: params.dynamite_ifolds= 6
nextflow_config - Config default value correct: params.dynamite_alpha= 0.1
nextflow_config - Config default value correct: params.dynamite_randomize= false
nextflow_config - Config default value correct: params.dynamite_min_regression= 0.1
nextflow_config - Config default value correct: params.alpha= 0.05
nextflow_config - Config default value correct: params.custom_config_version= master
nextflow_config - Config default value correct: params.custom_config_base= https://raw.githubusercontent.com/nf-core/configs/master
nextflow_config - Config default value correct: params.max_cpus= 16
nextflow_config - Config default value correct: params.max_memory= 128.GB
nextflow_config - Config default value correct: params.max_time= 240.h
nextflow_config - Config default value correct: params.publish_dir_mode= copy
nextflow_config - Config default value correct: params.max_multiqc_email_size= 25.MB
nextflow_config - Config default value correct: params.validate_params= true
nextflow_config - Config default value correct: params.pipelines_testdata_base_path= https://raw.githubusercontent.com/nf-core/test-datasets/
files_unchanged - .gitattributes matches the template
files_unchanged - .prettierrc.yml matches the template
files_unchanged - CODE_OF_CONDUCT.md matches the template
files_unchanged - LICENSE matches the template
files_unchanged - .github/.dockstore.yml matches the template
files_unchanged - .github/CONTRIBUTING.md matches the template
files_unchanged - .github/ISSUE_TEMPLATE/bug_report.yml matches the template
files_unchanged - .github/ISSUE_TEMPLATE/config.yml matches the template
files_unchanged - .github/ISSUE_TEMPLATE/feature_request.yml matches the template
files_unchanged - .github/PULL_REQUEST_TEMPLATE.md matches the template
files_unchanged - .github/workflows/branch.yml matches the template
files_unchanged - .github/workflows/linting_comment.yml matches the template
files_unchanged - .github/workflows/linting.yml matches the template
files_unchanged - assets/email_template.html matches the template
files_unchanged - assets/email_template.txt matches the template
files_unchanged - assets/sendmail_template.txt matches the template
files_unchanged - assets/nf-core-tfactivity_logo_light.png matches the template
files_unchanged - docs/images/nf-core-tfactivity_logo_light.png matches the template
files_unchanged - docs/images/nf-core-tfactivity_logo_dark.png matches the template
files_unchanged - docs/README.md matches the template
files_unchanged - .gitignore matches the template
files_unchanged - .prettierignore matches the template
actions_ci - '.github/workflows/ci.yml' is triggered on expected events
actions_ci - '.github/workflows/ci.yml' checks minimum NF version
actions_awstest - '.github/workflows/awstest.yml' is triggered correctly
actions_awsfulltest - .github/workflows/awsfulltest.yml is triggered correctly
actions_awsfulltest - .github/workflows/awsfulltest.yml does not use -profile test
readme - README Nextflow minimum version badge matched config. Badge: 23.04.0, Config: 23.04.0
pipeline_name_conventions - Name adheres to nf-core convention
template_strings - Did not find any Jinja template strings (215 files)
schema_lint - Schema lint passed
schema_lint - Schema title + description lint passed
schema_lint - Input mimetype lint passed: 'text/csv'
schema_params - Schema matched params returned from nextflow config
system_exit - No System.exit calls found
actions_schema_validation - Workflow validation passed: branch.yml
actions_schema_validation - Workflow validation passed: ci.yml
actions_schema_validation - Workflow validation passed: awsfulltest.yml
actions_schema_validation - Workflow validation passed: fix-linting.yml
actions_schema_validation - Workflow validation passed: linting.yml
actions_schema_validation - Workflow validation passed: download_pipeline.yml
actions_schema_validation - Workflow validation passed: release-announcements.yml
actions_schema_validation - Workflow validation passed: clean-up.yml
actions_schema_validation - Workflow validation passed: awstest.yml
actions_schema_validation - Workflow validation passed: linting_comment.yml
merge_markers - No merge markers found in pipeline files
modules_json - Only installed modules found in modules.json
multiqc_config - assets/multiqc_config.yml found and not ignored.
multiqc_config - assets/multiqc_config.yml contains report_section_order
multiqc_config - assets/multiqc_config.yml contains export_plots
multiqc_config - assets/multiqc_config.yml contains report_comment
multiqc_config - assets/multiqc_config.yml follows the ordering scheme of the minimally required plugins.
multiqc_config - assets/multiqc_config.yml contains a matching 'report_comment'.
multiqc_config - assets/multiqc_config.yml contains 'export_plots: true'.
modules_structure - modules directory structure is correct 'modules/nf-core/TOOL/SUBTOOL'
base_config - conf/base.config found and not ignored.
modules_config - conf/modules.config found and not ignored.
modules_config - CLEAN_BED found in conf/modules.config and Nextflow scripts.
modules_config - BEDTOOLS_SORT found in conf/modules.config and Nextflow scripts.
modules_config - BEDTOOLS_MERGE found in conf/modules.config and Nextflow scripts.
modules_config - ANNOTATE_SAMPLES found in conf/modules.config and Nextflow scripts.
modules_config - CONCAT_SAMPLES found in conf/modules.config and Nextflow scripts.
modules_config - FILTER_MIN_OCCURRENCE found in conf/modules.config and Nextflow scripts.
modules_config - UCSC_GTFTOGENEPRED found in conf/modules.config and Nextflow scripts.
modules_config - COMBINE_TFS_PER_ASSAY found in conf/modules.config and Nextflow scripts.
modules_config - COMBINE_TGS_PER_ASSAY found in conf/modules.config and Nextflow scripts.
nfcore_yml - Repository type in .nf-core.yml is valid: pipeline

Run details

nf-core/tools version 2.14.1
Run at 2024-05-30 13:38:34

nictru · 2024-04-29T09:06:10Z

Requires #5 to be merged first

LeonHafner · 2024-05-18T11:55:55Z

I updated the FIMO workflow by removing the DOWNLOAD_JASPAR process.
The MEME file from MOTIFS is now used in the FIMO/FILTER_MOTIFS process, which previously used the output of DOWNLOAD_JASPAR.

Currently, FIMO still uses the PWM file to convert gene symbols to JASPAR IDs. Since this mapping should also be part of the MEME file from MOTIFS, we could possibly use that file for the mapping as well and remove the PWM file from that subworkflow completely. What do you think, @nictru?

nictru · 2024-05-20T09:57:52Z

Yes, that's the way to go.
Since the PWM file should now contain the same information as the motif file already in use, we can safely remove it.
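
For illustration, here is a minimal Python sketch of how the symbol-to-ID mapping could be read from a MEME file alone. It assumes each `MOTIF` line carries the JASPAR ID as identifier and the gene symbol as alternate name, which is how JASPAR's own MEME downloads look; the function name `symbol_to_ids` and the sample motifs are hypothetical and not taken from the pipeline.

```python
from collections import defaultdict


def symbol_to_ids(meme_text):
    """Map upper-cased TF symbols to the motif IDs found on MOTIF lines."""
    mapping = defaultdict(list)
    for line in meme_text.splitlines():
        fields = line.split()
        # Assumed layout: "MOTIF <identifier> <alternate name>"
        if len(fields) >= 3 and fields[0] == "MOTIF":
            motif_id, symbol = fields[1], fields[2]
            # Upper-casing makes the lookup case-insensitive
            mapping[symbol.upper()].append(motif_id)
    return dict(mapping)


# Hypothetical sample input with two motifs
example = """MEME version 4

ALPHABET= ACGT

MOTIF MA0004.1 Arnt
letter-probability matrix: alength= 4 w= 2 nsites= 20 E= 0
 0.20 0.80 0.00 0.00
 0.95 0.00 0.05 0.00

MOTIF MA0006.1 Ahr::Arnt
letter-probability matrix: alength= 4 w= 1 nsites= 24 E= 0
 0.125 0.333 0.083 0.458
"""

print(symbol_to_ids(example))
```

If the MEME file produced by MOTIFS follows that layout, no separate PWM file is needed for the lookup.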

nictru · 2024-05-29T12:25:11Z

subworkflows/local/fimo.nf

-
-        FILTER_MOTIFS(JASPAR_MAPPING.out.jaspar_ids, JASPAR_DOWNLOAD.out.motifs)
+        JASPAR_MAPPING(tf_ranking, motifs_meme)
+        FILTER_MOTIFS(JASPAR_MAPPING.out.jaspar_ids, motifs_meme)


JASPAR_MAPPING and FILTER_MOTIFS aim to keep only the motifs for transcription factors found significant by the pipeline. Splitting this into two processes made sense in the original implementation with a dedicated download from JASPAR, but now this should be done in a single process.
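
As a rough sketch of what the merged step could do, the following Python function filters a MEME file in a single pass, keeping the header and only the motif blocks whose name matches a significant TF. The name `filter_meme`, the sample data, and the assumption that the TF symbol is the third field of each `MOTIF` line are illustrative; this is not the actual process script.

```python
def filter_meme(meme_text, significant_tfs):
    """Keep the MEME header plus the motif blocks of the given TFs."""
    wanted = {tf.upper() for tf in significant_tfs}
    kept = []
    keep = True  # header lines before the first MOTIF are always kept
    for line in meme_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "MOTIF":
            # Prefer the alternate name (assumed to be the TF symbol),
            # fall back to the identifier if no alternate name is given
            name = fields[2] if len(fields) > 2 else fields[1]
            keep = name.upper() in wanted
        if keep:
            kept.append(line)
    return "\n".join(kept) + "\n"


# Hypothetical sample input with two motifs
example = """MEME version 4

ALPHABET= ACGT

MOTIF MA0004.1 Arnt
letter-probability matrix: alength= 4 w= 2 nsites= 20 E= 0
 0.20 0.80 0.00 0.00
 0.95 0.00 0.05 0.00

MOTIF MA0006.1 Ahr::Arnt
letter-probability matrix: alength= 4 w= 1 nsites= 24 E= 0
 0.125 0.333 0.083 0.458
"""

filtered = filter_meme(example, ["arnt"])
print(filtered)
```

Because the name match and the block selection happen in the same loop, no intermediate list of JASPAR IDs has to be passed between processes.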

nictru linked an issue Apr 26, 2024 that may be closed by this pull request

Add more supported PWM formats #2

Closed

nictru added 5 commits April 29, 2024 10:50

Simplify ROSE script

123821f

Use bed file directly in ROSE

9874e1f

Use standard genepred file in ROSE

ae028dc

Remove traces of UCSC parameter

c7d3bb7

Remove refseq traces from ROSE script

06c77a2

nictru force-pushed the motif-files branch 2 times, most recently from 08b52b0 to 3409689 on April 29, 2024 08:57

nictru added 17 commits April 29, 2024 10:58

Fix GTF channel structure for ROSE

c4d2dac

Remove unnecessary igenomes values

b811dd1

Add CONVERT_MOTIFS process

723ad2b

Remove duplicate taxon id

2a284df

Implement motif subworkflow

704e9a9

Implement first version of psem creation

481d75f

Use template in TRANSFAC_TO_PSEM

0c92f44

Fix TRANSFAC_TO_PSEM environment

3758c3b

Improve TRANSFAC_TO_PSEM performance

2600722

Fix PSEM compatibility problems

64096ab

Remove PWM filtering in peaks subworkflow

d761ba4

Implement motif filtering in Motif subworkflow

90cf274

Implement motif version capture

bb3fd58

Implement jaspar motif fetching

e9ef113

Convert TF names to uppercase in JASPAR fetching

ce42260

Editorconfig

dc8691d

Editorconfig + Linting

761c1a5

nictru force-pushed the motif-files branch from 3409689 to 761c1a5 on April 29, 2024 08:59

Integrated MOTIFS output into FIMO

8aa6b93

switched from pwm to meme file

600cce2

nictru commented May 29, 2024

integrated JASPAR_MAPPING into FILTER_MOTIFS

2656f79

nictru marked this pull request as ready for review May 30, 2024 14:39

nictru merged commit 31b5c43 into dev May 30, 2024
4 checks passed

nictru deleted the motif-files branch May 30, 2024 14:43
