eQTLHap (V0.1) is a comprehensive eQTL analysis tool that scans the genome for three kinds of associations:
- Associations with a single SNP, as in standard eQTL analysis.
- Associations with a block of SNPs represented by their phased haplotypes.
- Associations with a block of SNPs represented by their genotypes.
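To make the difference between the two block representations concrete, here is a minimal sketch in R. It is not eQTLHap's internal code: the toy alleles, the string encoding and all variable names are assumptions chosen for illustration. It only shows that a block can be summarised either by the phased allele string carried on each chromosome copy (haplotype) or by per-SNP allele counts that ignore phase (genotype).

```r
# Toy phased data for one block of 3 SNPs and 4 individuals (hypothetical values).
# Rows are SNPs; each individual contributes two adjacent haplotype columns (0/1 alleles).
haps <- matrix(c(0, 0, 1, 1, 0, 0, 1, 0,
                 1, 1, 0, 0, 1, 1, 0, 1,
                 0, 0, 1, 1, 0, 0, 1, 1),
               nrow = 3, byrow = TRUE)
n_ind <- ncol(haps) / 2

# Haplotype representation: one allele string per chromosome copy.
hap_strings <- apply(haps, 2, paste, collapse = "")

# Genotype representation: per-SNP allele counts (0/1/2), which discard phase.
first  <- haps[, seq(1, ncol(haps), by = 2), drop = FALSE]
second <- haps[, seq(2, ncol(haps), by = 2), drop = FALSE]
geno_strings <- apply(first + second, 2, paste, collapse = "")

# Frequencies of each distinct haplotype and block genotype in the sample.
hap_freq  <- table(hap_strings) / length(hap_strings)
geno_freq <- table(geno_strings) / n_ind

print(hap_freq)
print(geno_freq)
```

In this toy block the fourth individual carries two different haplotypes, so its genotype string mixes heterozygous and homozygous sites, whereas the other individuals are homozygous for a single haplotype.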
eQTLHap is implemented in R and relies on matrix operations to calculate correlation coefficients. It adapts the ultra-fast Matrix eQTL (http://www.bios.unc.edu/research/genomic_software/Matrix_eQTL/) to handle blocks while maintaining high speed.
Significant associations (block haplotypes, block genotypes and SNPs) are available at https://drive.google.com/drive/u/3/folders/1gTUPCSeLGLLSRoKVSxowxN8p70LpA25Y
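The general idea of computing many correlations with a single matrix product can be sketched as follows. This is an illustration under stated assumptions, not the code used by eQTLHap or Matrix eQTL: the simulated data, the helper `standardise_rows` and the variable names are all hypothetical.

```r
set.seed(1)
n <- 60                                            # individuals
snps  <- matrix(rbinom(5 * n, 2, 0.3), nrow = 5)   # 5 SNPs x n genotype dosages (simulated)
genes <- matrix(rnorm(3 * n), nrow = 3)            # 3 genes x n expression values (simulated)

# Centre each row and scale it to unit length, so that the dot product
# of two rows equals their Pearson correlation.
standardise_rows <- function(m) {
  m <- m - rowMeans(m)
  m / sqrt(rowSums(m^2))
}

# One matrix product yields every SNP-gene correlation at once (5 x 3 matrix).
r <- tcrossprod(standardise_rows(snps), standardise_rows(genes))

# Convert correlations to t statistics and two-sided p-values.
df    <- n - 2
tstat <- r * sqrt(df / (1 - r^2))
pval  <- 2 * pt(-abs(tstat), df)

# Sanity check against R's built-in pairwise correlation.
print(all.equal(r, cor(t(snps), t(genes)), check.attributes = FALSE))
```

The appeal of this formulation is that the expensive step is a single call to optimised linear algebra rather than a loop over SNP-gene pairs.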
- R version 3.4.4 (2018-03-15) or later.
- dplyr and optparse (R libraries).
Rscript eQTLHap.R -f path_to_phased_haplotypes_in_shapeit_format -g path_to_gene_expression_file_bed_format -b path_to_haplotype_blocks_plink_det_format -o path_to_output_files
Please read documentation.pdf for more details and examples. Parameters and options can be listed with the help command:
Rscript eQTLHap.R --help
- -f or --haps: Phased haplotypes in SHAPEIT format (.haps/.sample). Provide the complete path of the .haps file; the .sample file must be in the same location and have the same name as the .haps file. Users can also provide a VCF file, but this requires the --vcf flag to be enabled. Files can be gzipped (.gz).
- -g or --genes: The path to the gene expression file in BED format.
- -b or --blocks: The path to the haplotype blocks file. It is mandatory when block assessment is required.
- -o or --out: The path for the output RData files.
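For readers unfamiliar with the SHAPEIT .haps layout, the sketch below reads two toy lines and derives genotype dosages from the phased alleles. It assumes the usual layout of five leading columns (chromosome, SNP id, position, first allele, second allele) followed by two 0/1 haplotype columns per individual. The SNP ids, positions and column names are hypothetical, and this is not how eQTLHap itself parses its input.

```r
# Two toy lines in SHAPEIT .haps layout (hypothetical SNPs, two individuals).
haps_text <- c("1 rs0001 10583 G A 0 1 1 1",
               "1 rs0002 10611 C G 0 0 1 0")
haps <- read.table(text = haps_text, header = FALSE, stringsAsFactors = FALSE)

# The first five columns describe the SNP; the names below are illustrative.
snp_info <- haps[, 1:5]
names(snp_info) <- c("chr", "id", "pos", "a0", "a1")

# The remaining columns hold phased alleles, two columns per individual.
alleles <- as.matrix(haps[, -(1:5)])

# Summing each individual's two haplotype columns gives the 0/1/2 dosage
# of the second allele.
odd    <- seq(1, ncol(alleles), by = 2)
dosage <- alleles[, odd, drop = FALSE] + alleles[, odd + 1, drop = FALSE]
rownames(dosage) <- snp_info$id

print(dosage)
```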
- -c or --cov: The path for covariates file.
- --chrm: The path for covariates file.
- --mtc: Multiple test correction approach. Takes any value from p.adjust.methods. The default is Benjamini-Hochberg (BH).
- -p or --permutation: The number of permutations for permutation-based multiple test correction. The default value is 1,000.
- -w or --window: Scanning window upstream and downstream of the transcription start site (TSS). The default is 1,000,000.
- -a or --assessment: Assessment type; takes any combination of the letters S, G and H, where S is single-SNP assessment, G is block genotype assessment and H is block haplotype assessment. The default is HSG.
- --vcf: Flag to process the phased haplotype file provided by -f as a VCF file. The default value is FALSE.
- --smf: SNP minimum frequency to be included in the analysis. The default value is 0.01.
- --hmf: Haplotype minimum frequency to be included in the analysis. The default value is 0.02.
- --gmf: Genotype minimum frequency to be included in the analysis. The default value is 0.02.
- --maxPval4Perm: Maximum p-value for an association to be passed to permutation-based multiple test correction. The default value is 0.
- --rmvIndividuals: When block assessment is applied and this option is enabled, individuals with rare haplotypes (freq < --hmf) or rare genotypes (freq < --gmf) are eliminated from the assessment. The default value is FALSE.
- --minIndividuals: Minimum number of individuals required to perform a statistical assessment. The default value is 50.
- --outSignifcancePval: Maximum p-value for the association to be reported in the output files. The default value is 0.05.
- --outSignifcanceQval: Maximum corrected p-value for the association to be reported in the output files. The default value is 1.
- --outSignifcancePerm: Maximum permutation p-value for the association to be reported in the output files. The default value is 1.
- --customBlocks: A flag to provide a custom block (a subset of the SNPs within the block) instead of considering the complete block (all SNPs). The default value is FALSE.
- --unphased: Needed when no haplotype-based eQTL analysis is requested and the input VCF file is unphased.
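The two kinds of multiple test correction mentioned in the options above can be illustrated with base R. This is a generic sketch, not eQTLHap's implementation: the p-values and simulated data are hypothetical, and the permutation p-value uses the common (count + 1) / (permutations + 1) convention, which is an assumption here and may differ from what the tool computes.

```r
# Benjamini-Hochberg correction of a handful of nominal p-values (hypothetical values).
pvals <- c(0.0002, 0.004, 0.03, 0.2, 0.6)
qvals <- p.adjust(pvals, method = "BH")
print(qvals)

# Permutation p-value for a single SNP-gene association on simulated data.
set.seed(42)
n    <- 80
snp  <- rbinom(n, 2, 0.4)           # genotype dosages
expr <- 0.5 * snp + rnorm(n)        # expression with a simulated effect

obs    <- abs(cor(snp, expr))       # observed test statistic
n_perm <- 1000                      # matches the documented default of -p
# Shuffling expression breaks any SNP-expression link, giving the null distribution.
perm   <- replicate(n_perm, abs(cor(snp, sample(expr))))
perm_p <- (sum(perm >= obs) + 1) / (n_perm + 1)
print(perm_p)
```

Any other method name listed in `p.adjust.methods` (for example "bonferroni" or "holm") can be passed to `p.adjust` in the same way.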
The comparison folder contains R scripts that compare the results obtained by eQTLHap with those obtained by:
- Matrix eQTL available at http://www.bios.unc.edu/research/genomic_software/Matrix_eQTL/.
- R's linear regression model (lm).
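As an illustration of why such a comparison is meaningful, the sketch below checks on simulated data that a correlation-based test and `lm` give the same t statistic and p-value for a single SNP without covariates. It is not one of the scripts in the comparison folder; the data and variable names are hypothetical.

```r
set.seed(7)
n    <- 100
snp  <- rbinom(n, 2, 0.35)          # simulated genotype dosages
expr <- 0.3 * snp + rnorm(n)        # simulated expression values

# Correlation-based test: t = r * sqrt((n - 2) / (1 - r^2)).
r     <- cor(snp, expr)
t_cor <- r * sqrt((n - 2) / (1 - r^2))
p_cor <- 2 * pt(-abs(t_cor), df = n - 2)

# The same association tested with simple linear regression.
fit  <- summary(lm(expr ~ snp))
t_lm <- fit$coefficients["snp", "t value"]
p_lm <- fit$coefficients["snp", "Pr(>|t|)"]

# Both routes agree up to floating-point tolerance.
stopifnot(isTRUE(all.equal(t_cor, t_lm)), isTRUE(all.equal(p_cor, p_lm)))
```

The agreement holds for simple regression with one predictor; with covariates, the correlation would have to be computed after regressing the covariates out of both variables.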
Al Bkhetan, Ziad, et al. "eQTLHap: a tool for comprehensive eQTL analysis considering haplotypic and genotypic effects." Briefings in Bioinformatics (2021).
Copyright 2020 Ziad Al Bkhetan
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
For any help or inquiries, please contact: [email protected]