-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy pathIMPUTE2_splines.Rmd
94 lines (67 loc) · 2.11 KB
/
IMPUTE2_splines.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
---
title: "R Notebook"
output:
html_notebook: default
html_document: default
---
Genetic Map in IMPUTE2 format for all the chromosomes:
We will be extracting the positions
```{r}
chromosome1 <- read.delim2("/home/DB2/Imputation/Splines/chr1_Beagle.map", header = F, stringsAsFactors = F)
cM <- chromosome1[,3]
physical <- chromosome1[,4]
IMPUTE2 <- matrix(0, nrow = length(physical), ncol = 3) # IMPUTE2 genetic map format
colnames(IMPUTE2) <- c("position", "COMBINED_rate.cM.Mb.", "Genetic_Map.cM.")
IMPUTE2[,1] <- as.numeric(physical)
IMPUTE2[,3] <- as.numeric(cM)
```
Now we will calculate the Recombination rate for chromosome 1:
```{r}
for (i in 1:(dim(IMPUTE2)[1]-1)){
# CM between markers
cm <- as.numeric(IMPUTE2[i+1,3]) - as.numeric(IMPUTE2[i,3])
# Mb between markers
mb <- as.numeric(IMPUTE2[i+1,1]) - as.numeric(IMPUTE2[i,1])
# Recombination Rate
rr <- cm/(mb/1000000)
if (rr > 600){
IMPUTE2[i,2] <- 0
}else{
IMPUTE2[i,2] <- rr
}
}
```
Plotting the first chunk of the chromosome
```{r}
plot(IMPUTE2[1:25000,2])
```
Now calculate the map in IMPUTE2 format for everything
```{r}
setwd("/home/DB2/Imputation/Splines/")
for (j in 1:18) {
chromosome <- paste("chr", j, "_Beagle.map", sep = "")
chromosome1 <- read.delim2(chromosome, header = F, stringsAsFactors = F)
cM <- chromosome1[,3]
physical <- chromosome1[,4]
IMPUTE2 <- matrix(0, nrow = length(physical), ncol = 3) # IMPUTE2 genetic map format
colnames(IMPUTE2) <- c("position", "COMBINED_rate.cM.Mb.", "Genetic_Map.cM.")
IMPUTE2[,1] <- as.numeric(physical)
IMPUTE2[,3] <- as.numeric(cM)
for (i in 1:(dim(IMPUTE2)[1]-1)){
# CM between markers
cm <- as.numeric(IMPUTE2[i+1,3]) - as.numeric(IMPUTE2[i,3])
# Mb between markers
mb <- as.numeric(IMPUTE2[i+1,1]) - as.numeric(IMPUTE2[i,1])
# Recombination Rate
rr <- cm/(mb/1000000)
if (rr > 600){
IMPUTE2[i,2] <- 0
}else{
IMPUTE2[i,2] <- rr
}
}
output <- paste("chr",j,"_Impute2.map", sep = "")
# Export the data frame in Beagle format
write.table(IMPUTE2, file=output, sep="\t", quote = F, row.names = F, col.names = T)
}
```