nf-core · JoseEspinosa · Sep 19, 2024
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -9,6 +9,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 - [[#327](https://github.com/nf-core/atacseq/issues/327)] - Consistently support `.csi` indices as alternative to `.bai` to allow SAMTOOLS_INDEX to be used with the `-c` flag.
 - [[#356](https://github.com/nf-core/atacseq/issues/356)] - Get rid of the `lib` folder and rearrange the pipeline accordingly.
+- [[#379](https://github.com/nf-core/atacseq/pull/379)] - Use macs3 instead of macs2.
 - Updated pipeline template to [nf-core/tools 2.14.1](https://github.com/nf-core/tools/releases/tag/2.14.1)
 - [[#359](https://github.com/nf-core/atacseq/issues/359)] - Fix `--save_unaligned` description in schema.
 - [[#344](https://github.com/nf-core/atacseq/issues/344)] - Fix memory issues when sorting merged replicates after `bedtools genomecov`.
@@ -25,6 +26,15 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 > **NB:** Parameter has been **added** if just the new parameter information is present.
 > **NB:** Parameter has been **removed** if parameter information isn't present.
 
+### Software dependencies
+
+Because the pipeline uses Nextflow DSL2, each process runs in its own [Biocontainer](https://biocontainers.pro/#/registry), so the pipeline may occasionally use different versions of the same tool. The overall software dependency changes since the last release are listed below for reference.
+
+| Dependency | Old version | New version |
+| ---------- | ----------- | ----------- |
+| `macs2`    | 2.2.7.1     |             |
+| `macs3`    |             | 3.0.1       |
+
 ## [[2.1.2](https://github.com/nf-core/atacseq/releases/tag/2.1.2)] - 2022-08-07
 
 ### Enhancements & fixes

diff --git a/CITATIONS.md b/CITATIONS.md
@@ -38,7 +38,7 @@
 
   > Heinz S, Benner C, Spann N, Bertolino E, Lin YC, Laslo P, Cheng JX, Murre C, Singh H, Glass CK. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol Cell. 2010 May 28;38(4):576-89. doi: 10.1016/j.molcel.2010.05.004. PubMed PMID: 20513432; PubMed Central PMCID: PMC2898526.
 
-- [MACS2](https://www.ncbi.nlm.nih.gov/pubmed/18798982/)
+- [MACS3](https://www.ncbi.nlm.nih.gov/pubmed/18798982/)
 
   > Zhang Y, Liu T, Meyer CA, Eeckhoute J, Johnson DS, Bernstein BE, Nusbaum C, Myers RM, Brown M, Li W, Liu XS. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 2008;9(9):R137. doi: 10.1186/gb-2008-9-9-r137. Epub 2008 Sep 17. PubMed PMID: 18798982; PubMed Central PMCID: PMC2592715.
 

diff --git a/README.md b/README.md
@@ -56,7 +56,7 @@ On release, automated continuous integration tests run the pipeline on a full-si
    4. Create normalised bigWig files scaled to 1 million mapped reads ([`BEDTools`](https://github.com/arq5x/bedtools2/), [`bedGraphToBigWig`](http://hgdownload.soe.ucsc.edu/admin/exe/))
    5. Generate gene-body meta-profile from bigWig files ([`deepTools`](https://deeptools.readthedocs.io/en/develop/content/tools/plotProfile.html))
    6. Calculate genome-wide enrichment (optionally relative to control) ([`deepTools`](https://deeptools.readthedocs.io/en/develop/content/tools/plotFingerprint.html))
-   7. Call broad/narrow peaks ([`MACS2`](https://github.com/macs3-project/MACS))
+   7. Call broad/narrow peaks ([`MACS3`](https://github.com/macs3-project/MACS))
    8. Annotate peaks relative to gene features ([`HOMER`](http://homer.ucsd.edu/homer/download.html))
    9. Create consensus peakset across all samples and create tabular file to aid in the filtering of the data ([`BEDTools`](https://github.com/arq5x/bedtools2/))
    10. Count reads in consensus peaks ([`featureCounts`](http://bioinf.wehi.edu.au/featureCounts/))
@@ -66,7 +66,7 @@ On release, automated continuous integration tests run the pipeline on a full-si
    1. Re-mark duplicates ([`picard`](https://broadinstitute.github.io/picard/))
    2. Remove duplicate reads ([`SAMtools`](https://sourceforge.net/projects/samtools/files/samtools/))
    3. Create normalised bigWig files scaled to 1 million mapped reads ([`BEDTools`](https://github.com/arq5x/bedtools2/), [`bedGraphToBigWig`](http://hgdownload.soe.ucsc.edu/admin/exe/))
-   4. Call broad/narrow peaks ([`MACS2`](https://github.com/macs3-project/MACS))
+   4. Call broad/narrow peaks ([`MACS3`](https://github.com/macs3-project/MACS))
    5. Annotate peaks relative to gene features ([`HOMER`](http://homer.ucsd.edu/homer/download.html))
    6. Create consensus peakset across all samples and create tabular file to aid in the filtering of the data ([`BEDTools`](https://github.com/arq5x/bedtools2/))
    7. Count reads in consensus peaks relative to merged library-level alignments ([`featureCounts`](http://bioinf.wehi.edu.au/featureCounts/))

diff --git a/assets/multiqc/merged_library_frip_score_header.txt b/assets/multiqc/merged_library_frip_score_header.txt
@@ -1,7 +1,7 @@
 #id: 'mlib_frip_score'
-#section_name: 'MERGED LIB: MACS2 peak FRiP score'
+#section_name: 'MERGED LIB: MACS3 peak FRiP score'
 #description: "is generated by calculating the fraction of all mapped reads that fall
-#              into the MACS2 called peak regions. A read must overlap a peak by at least 20% to be counted.
+#              into the MACS3 called peak regions. A read must overlap a peak by at least 20% to be counted.
 #              See <a href='https://www.encodeproject.org/data-standards/terms/' target='_blank'>FRiP score</a>."
 #plot_type: 'bargraph'
 #anchor: 'mlib_frip_score'
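
The FRiP description in this header can be made concrete with a small sketch. This is not the pipeline's implementation: it assumes reads and peaks are held in memory as `(chrom, start, end)` tuples with half-open coordinates, and that the 20% threshold is measured against the read length. The names `overlap` and `frip_score` are hypothetical.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Length of the overlap between two half-open intervals (0 if disjoint)."""
    return max(0, min(a_end, b_end) - max(a_start, b_start))


def frip_score(reads, peaks, min_fraction=0.20):
    """Fraction of reads that overlap any peak by at least min_fraction of the read.

    reads and peaks are lists of (chrom, start, end) tuples; this in-memory
    layout is an assumption made for illustration only.
    """
    if not reads:
        return 0.0
    in_peaks = 0
    for chrom, start, end in reads:
        read_len = end - start
        if read_len <= 0:
            continue  # skip malformed reads rather than count them
        for p_chrom, p_start, p_end in peaks:
            if p_chrom != chrom:
                continue
            if overlap(start, end, p_start, p_end) >= min_fraction * read_len:
                in_peaks += 1
                break  # count each read at most once
    return in_peaks / len(reads)
```

The linear scan over peaks keeps the sketch short; a real run over millions of reads would rely on an interval-aware tool instead.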

diff --git a/assets/multiqc/merged_library_peak_count_header.txt b/assets/multiqc/merged_library_peak_count_header.txt
@@ -1,7 +1,7 @@
 #id: 'mlib_peak_count'
-#section_name: 'MERGED LIB: MACS2 peak count'
+#section_name: 'MERGED LIB: MACS3 peak count'
 #description: "is calculated from total number of peaks called by
-#	       <a href='https://github.com/taoliu/MACS' target='_blank'>MACS2</a>"
+#	       <a href='https://github.com/macs3-project/MACS' target='_blank'>MACS3</a>"
 #plot_type: 'bargraph'
 #anchor: 'mlib_peak_count'
 #pconfig:

diff --git a/assets/multiqc/merged_replicate_frip_score_header.txt b/assets/multiqc/merged_replicate_frip_score_header.txt
@@ -1,7 +1,7 @@
 #id: 'mrep_frip_score'
-#section_name: 'MERGED REP: MACS2 peak FRiP score'
+#section_name: 'MERGED REP: MACS3 peak FRiP score'
 #description: "is generated by calculating the fraction of all mapped reads that fall
-#              into the MACS2 called peak regions. A read must overlap a peak by at least 20% to be counted.
+#              into the MACS3 called peak regions. A read must overlap a peak by at least 20% to be counted.
 #              See <a href='https://www.encodeproject.org/data-standards/terms/' target='_blank'>FRiP score</a>."
 #plot_type: 'bargraph'
 #anchor: 'mrep_frip_score'

diff --git a/assets/multiqc/merged_replicate_peak_count_header.txt b/assets/multiqc/merged_replicate_peak_count_header.txt
@@ -1,7 +1,7 @@
 #id: 'mrep_peak_count'
-#section_name: 'MERGED REP: MACS2 Peak count'
+#section_name: 'MERGED REP: MACS3 peak count'
 #description: "is calculated from total number of peaks called by
-#	       <a href='https://github.com/taoliu/MACS' target='_blank'>MACS2</a>"
+#	       <a href='https://github.com/macs3-project/MACS' target='_blank'>MACS3</a>"
 #plot_type: 'bargraph'
 #anchor: 'mrep_peak_count'
 #pconfig:

diff --git a/assets/multiqc_config.yml b/assets/multiqc_config.yml
@@ -71,7 +71,7 @@ module_order:
       anchor: "mlib_featurecounts"
       info: "This section of the report shows featureCounts results for the number of reads assigned to merged library consensus peaks."
       path_filters:
-        - "./macs2/merged_library/featurecounts/*.summary"
+        - "./macs3/merged_library/featurecounts/*.summary"
   - samtools:
       name: "MERGED REP: SAMTools"
       info: "This section of the report shows SAMTools results after merging replicates and filtering."
@@ -88,7 +88,7 @@ module_order:
       anchor: "mrep_featurecounts"
       info: "This section of the report shows featureCounts results for the number of reads assigned to merged replicate consensus peaks."
       path_filters:
-        - "./macs2/merged_replicate/featurecounts/*.summary"
+        - "./macs3/merged_replicate/featurecounts/*.summary"
 
 report_section_order:
   mlib_peak_count:

diff --git a/bin/macs2_merged_expand.py b/bin/macs3_merged_expand.py
@@ -17,15 +17,15 @@
 ############################################
 
 Description = "Add sample boolean files and aggregate columns from merged MACS narrow or broad peak file."
-Epilog = """Example usage: python macs2_merged_expand.py <MERGED_INTERVAL_FILE> <SAMPLE_NAME_LIST> <OUTFILE> --is_narrow_peak --min_replicates 1"""
+Epilog = """Example usage: python macs3_merged_expand.py <MERGED_INTERVAL_FILE> <SAMPLE_NAME_LIST> <OUTFILE> --is_narrow_peak --min_replicates 1"""
 
 argParser = argparse.ArgumentParser(description=Description, epilog=Epilog)
 
 ## REQUIRED PARAMETERS
-argParser.add_argument("MERGED_INTERVAL_FILE", help="Merged MACS2 interval file created using linux sort and mergeBed.")
+argParser.add_argument("MERGED_INTERVAL_FILE", help="Merged MACS3 interval file created using linux sort and mergeBed.")
 argParser.add_argument(
     "SAMPLE_NAME_LIST",
-    help="Comma-separated list of sample names as named in individual MACS2 broadPeak/narrowPeak output file e.g. SAMPLE_R1 for SAMPLE_R1_peak_1.",
+    help="Comma-separated list of sample names as named in individual MACS3 broadPeak/narrowPeak output file e.g. SAMPLE_R1 for SAMPLE_R1_peak_1.",
 )
 argParser.add_argument("OUTFILE", help="Full path to output directory.")
 
@@ -76,7 +76,7 @@ def makedir(path):
 ## sort -k1,1 -k2,2n <MACS_NARROWPEAK_FILE_LIST> | mergeBed -c 2,3,4,5,6,7,8,9,10 -o collapse,collapse,collapse,collapse,collapse,collapse,collapse,collapse,collapse > merged_peaks.txt
 
 
-def macs2_merged_expand(MergedIntervalTxtFile, SampleNameList, OutFile, isNarrow=False, minReplicates=1):
+def macs3_merged_expand(MergedIntervalTxtFile, SampleNameList, OutFile, isNarrow=False, minReplicates=1):
     makedir(os.path.dirname(OutFile))
 
     combFreqDict = {}
@@ -208,7 +208,7 @@ def macs2_merged_expand(MergedIntervalTxtFile, SampleNameList, OutFile, isNarrow
 ############################################
 ############################################
 
-macs2_merged_expand(
+macs3_merged_expand(
     MergedIntervalTxtFile=args.MERGED_INTERVAL_FILE,
     SampleNameList=args.SAMPLE_NAME_LIST.split(","),
     OutFile=args.OUTFILE,
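
The help text above says sample names are recovered from MACS3 peak names (for example `SAMPLE_R1` from `SAMPLE_R1_peak_1`), and the script description mentions per-sample boolean columns. A minimal sketch of that idea follows. It is an illustration under assumptions, not the code in `macs3_merged_expand.py`: the helper name `sample_booleans` is hypothetical, and the input is assumed to be the comma-collapsed peak-name column produced by `mergeBed -o collapse`.

```python
import re

# Matches the "_peak_<n>" suffix on a peak name, with an optional
# trailing letter (assumed here to cover sub-peak names such as "_peak_3a").
PEAK_SUFFIX = re.compile(r"_peak_\d+[a-z]*$")


def sample_booleans(collapsed_names, sample_names):
    """Map each sample name to True if it contributed a peak to the merged interval.

    collapsed_names: comma-separated peak names from one merged interval.
    sample_names: list of sample names to report on.
    """
    present = {PEAK_SUFFIX.sub("", name) for name in collapsed_names.split(",")}
    return {sample: sample in present for sample in sample_names}
```

Counting the `True` values per interval gives the number of samples that support it.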

diff --git a/bin/plot_macs2_qc.r b/bin/plot_macs3_qc.r
@@ -20,7 +20,7 @@ library(scales)
 option_list <- list(make_option(c("-i", "--peak_files"), type="character", default=NULL, help="Comma-separated list of peak files.", metavar="path"),
                     make_option(c("-s", "--sample_ids"), type="character", default=NULL, help="Comma-separated list of sample ids associated with peak files. Must be unique and in same order as peaks files input.", metavar="string"),
                     make_option(c("-o", "--outdir"), type="character", default='./', help="Output directory", metavar="path"),
-                    make_option(c("-p", "--outprefix"), type="character", default='macs2_peakqc', help="Output prefix", metavar="string"))
+                    make_option(c("-p", "--outprefix"), type="character", default='macs3_peakqc', help="Output prefix", metavar="string"))
 
 opt_parser <- OptionParser(option_list=option_list)
 opt <- parse_args(opt_parser)