diff --git a/README.md b/README.md
index 26b1fba..b4ce818 100644
--- a/README.md
+++ b/README.md
@@ -4,9 +4,9 @@ Randomized Haseman–Elston regression for Multi-variance Components
 
 ## Prerequisites
 
-The following packages are required on a linux machine to compile and use the software package.
+The following packages are required on a Linux machine to compile and use the software package.
 ```
-g++ (4.4.7)
+g++
 cmake
 make
 ```
@@ -23,51 +23,61 @@ make
 ```
 
 # Documentation for RHE-mc
-An executable file named RHEmc will be in build folder after the installation steps. Run RHE-mc as follows:
+An executable file named RHEmc will be in the build folder after the installation steps. Run RHE-mc as follows:
 ```
 ./RHEmc
 ```
-To run the highly memory efficient version : (-jn 1000 is recommended when there are many overlapping annotations or small annotations)
+To run the high memory efficient version : (-jn 1000 is recommended when there are many overlapping annotations or small annotations)
+
+The memory usage of RHEmc_mem does not depend on the number of jackknife blocks.
 
-The memory usage of RHEmc_mem does not depend on number of jackknife blocks.
 ```
 ./RHEmc_mem
 ```
 
-To estimate dominance heritability and additve heritability jointly run :
+To estimate dominance heritability and additive heritability jointly run :
 ```
 ./RHEmc_dom
 ```
 
+To run the multiple phenotypes version:
+
+```
+./RHEmc_mp
+
+```
+
+
+
 ## Parameters
 ```
-genotype (-g) : The path of genotype file
+genotype (-g): The path of genotype file
 phenotype (-p): The path of phenotype file
 covariate (-c): The path of covariate file
 annotation (-annot): The path of annotation file.
-num_vec (-k) : The number of random vectors (10 is recommended).
-num_block (-jn): The number of jackknife blocks. (100 is recommended). The higher number of jackknife blocks the higher memory usage.
-out_put (-o): The path of output file.
+num_vec (-k): The number of random vectors (10 is recommended).
+num_block (-jn): The number of jackknife blocks. (100 is recommended). The higher number of jackknife blocks the higher the memory usage.
+out_put (-o): The path of the output file.
 
 ```
 
 ## File formats
 ```
-Genotype : It must be in bed format.
-Phenotype: It must have a header in the following format: FID IID name_of_phenotype
+Genotype: It must be in bed format.
+Phenotype: It must have a header in the following format: FID IID name_of_phenotype. In case of multiple phenotypes: FID IID name_of_phen_1 name_of_phen_2 . . . name_of_phen_n
 Covariate: It must have a header in the following format: FID IID name_of_cov_1 name_of_cov_2 . . . name_of_cov_n
-Annotation: It has M rows (M=number of SNPs) and K columns (K=number of annotations). If SNP i belongs to annotation j, then there is "1" in row i and column j. Otherwise there is "0". (delimiter is " ")
+Annotation: It has M rows (M=number of SNPs) and K columns (K=number of annotations). If SNP i belongs to annotation j, then there is "1" in row i and column j. Otherwise, there is "0". (delimiter is " ")
 
-1) Number and order of individuals must be same in phenotype, gentype and covariate files.
-2) Number and order of SNPs must be same in bim file and annotation file.
+1) Number and order of individuals must be the same in phenotype, genotype, and covariate files.
+2) Number and order of SNPs must be the same in the bim file and annotation file.
 3) Annotation file does not have a header. The code supports overlapping annotations (e.g : functional annotation)
 4) SNPs with MAF=0 must be excluded from the genotype file.
 5) RHE-mc excludes individuals with NA values in the phenotype file from the analysis.
 ```
 
 ## Toy example
 
-To make sure that everything works well, sample files are provided in example directory. Look at test.sh file and run it :
+To make sure that everything works well, sample files are provided in the example directory. Look at test.sh file and run it :
 
 ```
 chmod +x test.sh