Commit
Erik Wright authored and Erik Wright committed May 7, 2020
1 parent a697ca3, commit 369ce86
Showing 6 changed files with 18,315 additions and 3,344 deletions.
@@ -1,26 +1,34 @@
README for ncov analysis presented in:
"SARS-CoV-2 genome evolution exposes early human adaptations"

# AUTHOR
Erik S. Wright <[email protected]>

# DEPENDENCIES
R >= 3.15
DECIPHER >= 2.14
R >= 4.0
DECIPHER >= 2.16.1

# TO RUN
Open AnalyzeSequences_v5.R
Install R then DECIPHER
Open AnalyzeSequences_v6.R
Set working directory to path to ncov
Source the code in R

# FILES
AnalyzeSequences_v5.R = MAIN SCRIPT to perform all analyses
AnalyzeSequences_v6.R = MAIN SCRIPT to perform all analyses
coordinates_v1.R = Positions of features in reference genome
gisaid_cov2020_sequences-Apr14.fasta.gz = FASTA file with all genomes except reference
gisaid_cov2020_sequences-May2.fasta.gz = FASTA file with all genomes
map_v1.R = Function for mapping substitutions on the phylogenetic tree
movavg_v1.R = Function for performing center-point exponential moving averaging
NC_045512.2.fas = FASTA file with reference genome
results_v3.csv = Results of the analysis
results_v5.csv = Results of the analysis
metadata_v1.tsv = Tab delimited matrix of metadata and acknowledgements

# CHANGING THE DATASET
To change the dataset, simply change the file name of:
gisaid_cov2020_sequences-Apr14.fasta.gz
gisaid_cov2020_sequences-May2.fasta.gz
and:
metadata_v1.tsv
in:
AnalyzeSequences_v5.R
All analyses should (hopefully) work, but figures might need some adjustment.
AnalyzeSequences_v6.R
All analyses should (hopefully) still work, but figures will need some adjustment.
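
The TO RUN steps in the updated README could be scripted roughly as follows. This is only a sketch and is not part of the commit: the helper names `meets_minimum` and `run_analysis` and the `ncov_dir` argument are hypothetical. It assumes DECIPHER is already installed (it is normally obtained from Bioconductor).

```r
# Hypothetical helper (not in the repository): TRUE when the `installed`
# version string is at least the `required` version string.
meets_minimum <- function(installed, required) {
  package_version(installed) >= package_version(required)
}

# Hypothetical wrapper around the "TO RUN" steps. `ncov_dir` is assumed to be
# the local path to the ncov directory containing the script and data files.
run_analysis <- function(ncov_dir, script = "AnalyzeSequences_v6.R") {
  # Check the dependency versions listed under DEPENDENCIES
  if (!meets_minimum(as.character(getRversion()), "4.0"))
    stop("R >= 4.0 is required")
  if (!requireNamespace("DECIPHER", quietly = TRUE))
    stop("DECIPHER is not installed")
  if (!meets_minimum(as.character(packageVersion("DECIPHER")), "2.16.1"))
    stop("DECIPHER >= 2.16.1 is required")

  old_dir <- setwd(ncov_dir)  # set working directory to path to ncov
  on.exit(setwd(old_dir))     # restore the previous directory on exit
  source(script)              # source the code in R
}
```

Calling `run_analysis("path/to/ncov")` with a real local path would then run the main script. Running the checks first means a version mismatch is reported before the analysis starts.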
Binary file renamed (+34 MB): gisaid_cov2020_sequences-Apr14.fasta.gz → gisaid_cov2020_sequences-May2.fasta.gz
Binary file not shown.