Source code for ensembl.tools.anno.transcriptomic_annotation.minimap
-# See the NOTICE file distributed with this work for additional information
+# See the NOTICE file distributed with this work for additional information #pylint: disable=missing-module-docstring
# regarding copyright ownership.
#
# Licensed under the Apache License, Version 2.0 (the "License");
@@ -50,7 +50,6 @@ Source code for ensembl.tools.anno.transcriptomic_annotation.minimap
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
-
"""
Minimap2 is a pairwise sequence alignment algorithm designed for efficiently comparing nucleotide sequences.
The algorithm uses a versatile indexing strategy to quickly find approximate matches between sequences,
@@ -89,14 +88,23 @@ Source code for ensembl.tools.anno.transcriptomic_annotation.minimap
"""
Run Minimap2 to align long read data against genome file.
Default Minimap set for PacBio data.
- Args:
- output_dir : Working directory path.
- long_read_fastq_dir : Long read directory path.
- genome_file : Genome file path.
- minimap2_bin : Software path.
- paftools_bin : Software path.
- max_intron_length : The maximum intron size for alignments. Defaults to 100000.
- num_threads : Number of available threads.
+ :param output_dir: Working directory path.
+ :type output_dir: Path
+ :param long_read_fastq_dir: Long read directory path.
+ :type long_read_fastq_dir: Path
+ :param genome_file: Genome file path.
+ :type genome_file: Path
+ :param minimap2_bin: Software path.
+ :type minimap2_bin: Path, default minimap2
+ :param paftools_bin: Software path.
+ :type paftools_bin: Path, default paftools.js
+ :param max_intron_length: The maximum intron size for alignments. Defaults to 100000.
+ :type max_intron_length: int, default 100000
+ :param num_threads: Number of available threads.
+ :type num_threads: int, default 1
+
+ :return: None
+ :rtype: None
"""
check_exe(minimap2_bin)
check_exe(paftools_bin)
diff --git a/docs/build/_modules/ensembl/tools/anno/transcriptomic_annotation/scallop.html b/docs/build/_modules/ensembl/tools/anno/transcriptomic_annotation/scallop.html
index 6f09568..b36c5a4 100644
--- a/docs/build/_modules/ensembl/tools/anno/transcriptomic_annotation/scallop.html
+++ b/docs/build/_modules/ensembl/tools/anno/transcriptomic_annotation/scallop.html
@@ -92,12 +92,19 @@ Source code for ensembl.tools.anno.transcriptomic_annotation.scallop
"""
Run Scallop assembler on short read data after STAR alignment.
- Args:
- output_dir : Working directory path.
- scallop_bin : Software path.
- prlimit_bin : Software path.
- stringtie_bin : Software path.
- memory_limit : Memory limit Scallop command Defaults to 40*1024**3.
+ :param output_dir: Working directory path.
+ :type output_dir: Path
+ :param scallop_bin: Software path.
+ :type scallop_bin: Path, default scallop
+ :param prlimit_bin: Software path.
+ :type prlimit_bin: Path, default prlimit
+ :param stringtie_bin: Software path.
+ :type stringtie_bin: Path, default stringtie
+ :param memory_limit: Memory limit for the Scallop command. Defaults to 40*1024**3.
+ :type memory_limit: int
+
+ :return: None
+ :rtype: None
"""
check_exe(scallop_bin)
check_exe(stringtie_bin)
diff --git a/docs/build/_modules/ensembl/tools/anno/transcriptomic_annotation/star.html b/docs/build/_modules/ensembl/tools/anno/transcriptomic_annotation/star.html
index bc215f2..9fa9ac7 100644
--- a/docs/build/_modules/ensembl/tools/anno/transcriptomic_annotation/star.html
+++ b/docs/build/_modules/ensembl/tools/anno/transcriptomic_annotation/star.html
@@ -100,18 +100,31 @@ Source code for ensembl.tools.anno.transcriptomic_annotation.star
Run STAR alignment on list of short read data.
Args:
- genome_file : Genome file path.
- output_dir : Working directory path.
- short_read_fastq_dir : Short read directory path.
- delete_pre_trim_fastq : Delete the original fastq files after trimming. Defaults to False.
- trim_fastq : Trim short read files using TrimGalore. Defaults to False.
- max_reads_per_sample : Max number of reads per sample. Defaults to 0 (unlimited).
- max_intron_length : The maximum intron size for alignments. Defaults to 100000.
- num_threads : Number of available threads.
- star_bin : Software path.
- samtools_bin : Software path.
- trim_galore_bin : Software path.
-
+ :param genome_file: Genome file path.
+ :type genome_file: Path
+ :param output_dir: Working directory path.
+ :type output_dir: Path
+ :param short_read_fastq_dir: Short read directory path.
+ :type short_read_fastq_dir: Path
+ :param delete_pre_trim_fastq: Delete the original fastq files after trimming. Defaults to False.
+ :type delete_pre_trim_fastq: boolean, default False
+ :param trim_fastq: Trim short read files using TrimGalore. Defaults to False.
+ :type trim_fastq: boolean, default False
+ :param max_reads_per_sample: Max number of reads per sample. Defaults to 0 (unlimited).
+ :type max_reads_per_sample: int, default 0
+ :param max_intron_length: The maximum intron size for alignments. Defaults to 100000.
+ :type max_intron_length: int, default 100000
+ :param num_threads: Number of available threads.
+ :type num_threads: int, default 1
+ :param star_bin: Software path.
+ :type star_bin: Path, default star
+ :param samtools_bin: Software path.
+ :type samtools_bin: Path, default samtools
+ :param trim_galore_bin: Software path.
+ :type trim_galore_bin: Path, default trim_galore
+
+ :return: None
+ :rtype: None
"""
check_exe(star_bin)
# If trimming has been enabled then switch the path for
diff --git a/docs/build/_modules/ensembl/tools/anno/transcriptomic_annotation/stringtie.html b/docs/build/_modules/ensembl/tools/anno/transcriptomic_annotation/stringtie.html
index b951590..9adde40 100644
--- a/docs/build/_modules/ensembl/tools/anno/transcriptomic_annotation/stringtie.html
+++ b/docs/build/_modules/ensembl/tools/anno/transcriptomic_annotation/stringtie.html
@@ -83,10 +83,15 @@ Source code for ensembl.tools.anno.transcriptomic_annotation.stringtie
) -> None:
"""
StringTie assembler of short read data.
- Args:
- output_dir : Working directory path.
- stringtie_bin : Software path.
- num_threads : Number of available threads.
+ :param output_dir: Working directory path.
+ :type output_dir: Path
+ :param stringtie_bin: Software path.
+ :type stringtie_bin: Path, default stringtie
+ :param num_threads: Number of available threads.
+ :type num_threads: int, default 1
+
+ :return: None
+ :rtype: None
"""
check_exe(stringtie_bin)
stringtie_dir = create_dir(output_dir, "stringtie_output")
diff --git a/docs/build/_sources/index.rst.txt b/docs/build/_sources/index.rst.txt
index 6e24e41..add9008 100644
--- a/docs/build/_sources/index.rst.txt
+++ b/docs/build/_sources/index.rst.txt
@@ -20,7 +20,7 @@
========================================
Ensembl-anno
-===========================================
+========================================
Anno tool kit
diff --git a/docs/build/cpg.html b/docs/build/cpg.html
index 7afe15e..390af36 100644
--- a/docs/build/cpg.html
+++ b/docs/build/cpg.html
@@ -51,14 +51,59 @@
-
ensembl.tools.anno.simple_feature_annotation.cpg.run_cpg(genome_file: PathLike, output_dir: Path, cpg_bin: Path = PosixPath('cpg_lh'), cpg_min_length: int = 400, cpg_min_gc_content: int = 50, cpg_min_oe: float = 0.6, num_threads: int = 1) None [source]¶
-Run CpG islands on genomic slices
-:param genome_file: Genome file path.
-:param output_dir: Working directory path
-:param cpg_bin: CpG software path.
-:param cpg_min_length: Min length of CpG islands
-:param cpg_min_gc_content: Min GC frequency percentage
-:param cpg_min_oe: Min ratio of the observed to expected number of CpG (CpGo/e)
-:param num_threads: int, number of threads.
+Run CpG islands on genomic slices
+
+
+- param genome_file:
+Genome file path.
+
+- type genome_file:
+PathLike
+
+- param output_dir:
+Working directory path
+
+- type output_dir:
+Path
+
+- param cpg_bin:
+CpG software path.
+
+- type cpg_bin:
+Path
+
+- param cpg_min_length:
+Min length of CpG islands
+
+- type cpg_min_length:
+int
+
+- param cpg_min_gc_content:
+Min GC frequency percentage
+
+- type cpg_min_gc_content:
+int
+
+- param cpg_min_oe:
+Min ratio of the observed to expected number of CpG (CpGo/e)
+
+- type cpg_min_oe:
+float
+
+- param num_threads:
+int, number of threads.
+
+- type num_threads:
+int
+
+- return:
+None
+
+- rtype:
+None
+
+
+
diff --git a/docs/build/dust.html b/docs/build/dust.html
index 68adb79..ac6ee27 100644
--- a/docs/build/dust.html
+++ b/docs/build/dust.html
@@ -51,11 +51,41 @@
-
ensembl.tools.anno.repeat_annotation.dust.run_dust(genome_file: PathLike, output_dir: Path, dust_bin: Path = PosixPath('dustmasker'), num_threads: int = 1) None [source]¶
-Run Dust on genomic slices with mutiprocessing
-:param genome_file: Genome file path.
-:param output_dir: Working directory path.
-:param dust_bin: Dust software path.
-:param num_threads: Number of threads.
+
+- Run Dust on genomic slices with multiprocessing
+- param genome_file:
+Genome file path.
+
+- type genome_file:
+PathLike
+
+- param output_dir:
+Working directory path.
+
+- type output_dir:
+Path
+
+- param dust_bin:
+Dust software path.
+
+- type dust_bin:
+Path, default dustmasker
+
+- param num_threads:
+Number of threads.
+
+- type num_threads:
+int, default 1
+
+- return:
+None
+
+- rtype:
+None
+
+
+
+
@@ -74,10 +104,7 @@ Table of Contents
API Setup and installation
License
CpG Module Documentation
-DustMasker Module Documentation
-run_dust()
-
-
+DustMasker Module Documentation
Eponine Module Documentation
Genblast Module Documentation
Minimap2 Module Documentation
diff --git a/docs/build/eponine.html b/docs/build/eponine.html
index d0fefe0..3e90c10 100644
--- a/docs/build/eponine.html
+++ b/docs/build/eponine.html
@@ -52,12 +52,47 @@
-
ensembl.tools.anno.simple_feature_annotation.eponine.run_eponine(genome_file: PathLike, output_dir: Path, num_threads: int = 1, java_bin: Path = PosixPath('java'), eponine_bin: Path = PosixPath('/hps/software/users/ensembl/ensw/C8-MAR21-sandybridge/linuxbrew/opt/eponine/libexec/eponine-scan.jar'), eponine_threshold: float = 0.999) None [source]¶
-Run Eponine on genomic slices
-:param genome_file: Genome file path.
-:param output_dir: Working directory path.
-:param java_bin: Java path.
-:param eponine_bin: Eponine software path
-:param num_threads: Number of threads.
+
+- Run Eponine on genomic slices
+- param genome_file:
+Genome file path.
+
+- param genome_file:
+PathLike
+
+- param output_dir:
+Working directory path.
+
+- param output_dir:
+Path
+
+- param java_bin:
+Java path.
+
+- param java_bin:
+Path, default java
+
+- param eponine_bin:
+Eponine software path
+
+- param eponine_bin:
+Path
+
+- param num_threads:
+Number of threads.
+
+- param num_threads:
+int, default 1
+
+- return:
+None
+
+- rtype:
+None
+
+
+
+
@@ -77,10 +112,7 @@ Table of Contents
License
CpG Module Documentation
DustMasker Module Documentation
-Eponine Module Documentation
-
+Eponine Module Documentation
Genblast Module Documentation
Minimap2 Module Documentation
Red Module Documentation
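Reviewer note on the eponine page above: each type is rendered as a second `param` entry (for example `param genome_file:` followed by `PathLike`), which suggests the docstring used `:param` where `:type` was intended. Below is a minimal sketch of a check that would flag this, assuming plain Sphinx field lists. The names `field_problems`, `good`, and `bad` are hypothetical and exist only for illustration; they are not part of ensembl-anno.

```python
import inspect
import re
from typing import List

# Matches ":param name:" and ":type name:" at the start of a docstring line.
FIELD_RE = re.compile(r"^\s*:(param|type)\s+(\w+):", re.MULTILINE)


def field_problems(func) -> List[str]:
    """Compare a function's :param/:type fields against its signature."""
    doc = inspect.getdoc(func) or ""
    params = list(inspect.signature(func).parameters)
    seen = {"param": [], "type": []}
    for kind, name in FIELD_RE.findall(doc):
        seen[kind].append(name)

    problems = []
    # Fields that are repeated or that name no real parameter.
    for kind, names in seen.items():
        for name in sorted(set(names)):
            if names.count(name) > 1:
                problems.append(f"duplicate :{kind} {name}:")
            if name not in params:
                problems.append(f"unknown :{kind} {name}:")
    # Parameters that lack a description or a type field.
    for name in params:
        for kind in ("param", "type"):
            if name not in seen[kind]:
                problems.append(f"missing :{kind} {name}:")
    return problems


def good(genome_file, num_threads=1):
    """
    Hypothetical function with a well-formed field list.

    :param genome_file: Genome file path.
    :type genome_file: PathLike
    :param num_threads: Number of threads.
    :type num_threads: int, default 1
    """


def bad(genome_file):
    """
    Hypothetical function repeating :param where :type was meant.

    :param genome_file: Genome file path.
    :param genome_file: PathLike
    """


if __name__ == "__main__":
    print(field_problems(good))
    print(field_problems(bad))
```

Here `field_problems(good)` returns an empty list, while `field_problems(bad)` reports the duplicated `:param genome_file:` and the missing `:type genome_file:`. A check of this kind could run in CI before the docs build, though whether that fits this project's tooling is an open question.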
diff --git a/docs/build/genblast.html b/docs/build/genblast.html
index 67417d2..6e101eb 100644
--- a/docs/build/genblast.html
+++ b/docs/build/genblast.html
@@ -49,8 +49,8 @@
comparative genomics tasks and accurately identify homologs even when
the sequences have undergone significant evolutionary changes.
This capability makes it a valuable resource for researchers studying gene
-evolution, gene families, and gene function across diverse species.
-GenBlast has been widely used in various genomic analyses and is available as
+evolution, gene families, and gene function across diverse species.
+GenBlast has been widely used in various genomic analyses and is available as
a standalone command-line tool or as part of different bioinformatics pipelines.
Researchers in the field of comparative genomics and gene function analysis
often rely on GenBlast to perform sensitive homology searches and obtain
@@ -62,17 +62,79 @@
-
ensembl.tools.anno.protein_annotation.genblast.run_genblast(masked_genome: Path, output_dir: Path, protein_dataset: Path, max_intron_length: int, genblast_timeout_secs: int = 10800, genblast_bin: Path = PosixPath('genblast'), convert2blastmask_bin: Path = PosixPath('convert2blastmask'), makeblastdb_bin: Path = PosixPath('makeblastdb'), num_threads: int = 1, protein_set: str = ['uniprot', 'orthodb']) None [source]¶
-Executes GenBlast on genomic slices
-:param masked_genome: Masked genome file path.
-:param output_dir: Working directory path.
-:param protein_dataset: Protein dataset (Uniprot/OrthoDb) path.
-:param genblast_timeout_secs: Time for timeout (sec).
-:param max_intron_length: Maximum intron length.
-:param genblast_bin: Software path.
-:param convert2blastmask_bin: Software path.
-:param makeblastdb_bin: Software path.
-:param genblast_timeout: seconds
-:param num_threads: int, number of threads.
+
+- Executes GenBlast on genomic slices
+- param masked_genome:
+Masked genome file path.
+
+- type masked_genome:
+Path
+
+- param output_dir:
+Working directory path.
+
+- type output_dir:
+Path
+
+- param protein_dataset:
+Protein dataset (Uniprot/OrthoDb) path.
+
+- type protein_dataset:
+Path
+
+- param genblast_timeout_secs:
+Time for timeout (sec).
+
+- type genblast_timeout_secs:
+int, default 10800
+
+- param max_intron_length:
+Maximum intron length.
+
+- type max_intron_length:
+int
+
+- param genblast_bin:
+Software path.
+
+- type genblast_bin:
+Path, default genblast
+
+- param convert2blastmask_bin:
+Software path.
+
+- type convert2blastmask_bin:
+Path, default convert2blastmask
+
+- param makeblastdb_bin:
+Software path.
+
+- type makeblastdb_bin:
+Path, default makeblastdb
+
+- param genblast_timeout:
+seconds
+
+- type genblast_timeout:
+int, default 1
+
+- param num_threads:
+int, number of threads.
+
+
+:type num_threads:int, default 1
+:param protein_set: Source
+:type str: [“uniprot”, “orthodb”]
+
+- return:
+None
+
+- rtype:
+None
+
+
+
+
@@ -93,10 +155,7 @@ Table of Contents
CpG Module Documentation
DustMasker Module Documentation
Eponine Module Documentation
-Genblast Module Documentation
-
+Genblast Module Documentation
Minimap2 Module Documentation
Red Module Documentation
Repeatmasker Module Documentation
diff --git a/docs/build/index.html b/docs/build/index.html
index 0e1cabe..9f11b08 100644
--- a/docs/build/index.html
+++ b/docs/build/index.html
@@ -5,7 +5,7 @@
- Contents — ensembl-anno 0.1 documentation
+ Ensembl-anno — ensembl-anno 0.1 documentation
@@ -39,9 +39,11 @@
- Anno tool kit
+
+Ensembl-anno¶
+Anno tool kit
-Contents¶
+Contents¶
Check out installation section for further information on how
to install the project.
-Indices and tables¶
+Indices and tables¶
+
diff --git a/docs/build/install.html b/docs/build/install.html
index 6d87929..b2cf6cf 100644
--- a/docs/build/install.html
+++ b/docs/build/install.html
@@ -15,14 +15,14 @@
-
+