# Double stranded DNA libraries
![Archibald by Tessa Zeibig](/assets/images/double-stranded/archibald.png){#fig-double-stranded-caricature height=400px}
```{r}
#| label: fig-double-stranded-smiley
#| fig-cap: Example of a smiley plot of a double stranded DNA library. Data taken from library COD076E1bL1 (ERR1943600-ERR1943602) of [@Star2017-cj]. Damage data generated using [DamageProfiler](/intro.qmd) and plotted using R and tidyverse packages [@Wickham2019-ot].
#| echo: false
#| warning: false
#| error: false
# Load the packages used for reading, reshaping, and plotting
library(readr)
library(tidyr)
library(dplyr)
library(ggplot2)
library(patchwork)
# Project scripts providing shared styling and the DamageProfiler helpers used below
source("assets/src/R/style.r")
source("assets/src/R/damage-profiler.r")
type_colours <- type_colours_dp
# Read the DamageProfiler misincorporation frequency tables for each read end
raw_5p <- readr::read_tsv("assets/data/double-stranded/5p_freq_misincorporations.txt", comment = "#")
raw_3p <- readr::read_tsv("assets/data/double-stranded/3p_freq_misincorporations.txt", comment = "#")
# Reshape each table to long format
long_5p <- pivot_longer_freqmisincorporat(raw_5p, reverse = FALSE, type_colours = type_colours)
long_3p <- pivot_longer_freqmisincorporat(raw_3p, reverse = TRUE, type_colours = type_colours)
# Build one panel per read end, then place them side by side with patchwork
smiley_5p <- plot_longer_freqmisincorporat(long_5p, reverse = FALSE, type_colours)
smiley_3p <- plot_longer_freqmisincorporat(long_3p, reverse = TRUE, type_colours)
smiley_5p + smiley_3p
```
This is the 'classical' ancient DNA plot that you will see most often in palaeogenomics.
You expect to see a smooth curve that falls from its highest point at the beginning of the read (position 1) to a flat line in the middle of the read (e.g. positions 10-25 in mapDamage plots).
At the 5' end the damage appears as C to T substitutions caused by the original cytosine deamination, whereas the 3' end of the molecule shows the complementary G to A substitutions.
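
The expected shape can be sketched with a toy model. This is only an illustration: the exponential decay and every parameter value (`terminal_freq`, `baseline`, `decay`) are assumptions chosen for the example, not values taken from the library shown above.

```r
# Toy model (assumption): misincorporation frequency decays exponentially
# from the read terminus toward a flat baseline in the read interior.
positions     <- 1:25
terminal_freq <- 0.30  # hypothetical C>T frequency at position 1
baseline      <- 0.01  # hypothetical plateau in the middle of the read
decay         <- 0.4   # hypothetical decay rate per position

# C>T frequency along the 5' end, position 1 first
ct_5p <- baseline + (terminal_freq - baseline) * exp(-decay * (positions - 1))

# G>A along the 3' end mirrors the 5' curve, so the last position is highest
ga_3p <- rev(ct_5p)

# Plotted side by side, the two curves form the two halves of the 'smiley'
head(round(ct_5p, 3))
tail(round(ga_3p, 3))
```

Plotting `ct_5p` on the left and `ga_3p` on the right gives the characteristic 'smiley' outline, high at both read ends and flat in between.
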
You only see the C to T deamination signal (and the complementary G to A) at one end of the molecule because, during typical double-stranded library construction protocols [@Meyer2010-qc], only _one_ end of the single-stranded overhangs of a DNA molecule is repaired by being 'filled in' (which is where the mis-reading of the deaminated C occurs).
Overhangs at the other end of the molecule (which may also hold cytosine deamination) are 'blunt-ended' by being trimmed off.
Both fill-in and blunt-ending reactions are performed to allow ligation of next-generation-sequencing adapters and/or internal barcodes to both ends of the molecules.
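
The fill-in step can be traced on a toy overhang. This is a minimal sketch under stated assumptions: the four-base overhang sequence is hypothetical, every cytosine in it is assumed to be deaminated to uracil, and `complement_base()` is a helper written for this example only.

```r
# Complement a base; uracil (a deaminated cytosine) pairs like thymine,
# so it templates an adenine during fill-in.
complement_base <- function(base) {
  unname(c(A = "T", C = "G", G = "C", T = "A", U = "A")[base])
}

# Hypothetical single-stranded 5' overhang, written 5' -> 3'
reference_overhang <- c("C", "A", "C", "G")

# Assumption: every cytosine in the overhang has been deaminated to uracil
damaged_overhang <- ifelse(reference_overhang == "C", "U", reference_overhang)

# The damaged strand is sequenced with U read as T: C>T at its 5' end
damaged_read <- ifelse(damaged_overhang == "U", "T", damaged_overhang)

# Fill-in synthesises the opposite strand from the damaged template;
# reverse it so that it is also written 5' -> 3'
filled_in <- rev(complement_base(damaged_overhang))
undamaged <- rev(complement_base(reference_overhang))

# Positions where each strand differs from what an undamaged molecule would give
which(reference_overhang == "C" & damaged_read == "T")  # C>T on the damaged strand
which(undamaged == "G" & filled_in == "A")              # G>A on the filled-in strand
```

In this toy case the damaged strand carries C to T at positions 1 and 3 of its 5' end, while the newly synthesised strand carries G to A at positions 2 and 4, which lie at its 3' end.
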
The highest frequency point of the curve can vary from 1% to 50% depending on the age and preservation of the sample.
If you get such a plot with smooth lines from ancient DNA double-stranded libraries, this is a good indication you have authentic ancient DNA!
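
The height of the curve can be read off a misincorporation table as follows. This is a hedged sketch: `toy_5p` is invented data, and its column names (`Pos`, `C>T`) are assumptions made for this example, so check them against your own damage profiling output before reusing the code.

```r
# Toy 5' misincorporation table; values and column names are illustrative assumptions
toy_5p <- data.frame(
  Pos   = 1:10,
  `C>T` = c(0.25, 0.14, 0.09, 0.06, 0.04, 0.03, 0.02, 0.02, 0.015, 0.015),
  check.names = FALSE
)

# Highest point of the curve, expected at the read terminus (position 1)
peak_row <- toy_5p[which.max(toy_5p[["C>T"]]), ]
peak_pct <- 100 * peak_row[["C>T"]]

# Mean frequency over the last few positions as a rough flat-line level
interior_pct <- 100 * mean(toy_5p[["C>T"]][toy_5p$Pos >= 8])

# A smooth curve should never rise when moving away from the terminus
is_smooth_decay <- all(diff(toy_5p[["C>T"]]) <= 0)

c(peak_position = peak_row$Pos, peak_pct = peak_pct, interior_pct = interior_pct)
```

Here the peak sits at position 1 and the curve never rises again further into the read, which is the pattern described above.
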
<!-- TODO: FINISH DESCRIPTION -->