Commit 805feda

Merge pull request #109 from HaidyGiratallah/patch-1
2 parents 10d21f8 + 1ed4dbf commit 805feda

1 file changed

+328
-2
lines changed

setup.md

Lines changed: 328 additions & 2 deletions
@@ -1,6 +1,332 @@
11
---
22
layout: page
33
title: Setup
4-
root: .
54
---
6-
FIXME
5+
This lesson is an additional lesson to [the genomics workshop](https://datacarpentry.org/genomics-workshop/). Below are detailed setup instructions for the main workshop, which can also be found on [the main setup page](https://datacarpentry.org/genomics-workshop/setup.html). If you are only here for the Intro to R and RStudio for Genomics lesson and do not wish to work on the cloud, choose Option B below: you will only need to [download the data files](https://figshare.com/articles/Data_Carpentry_Genomics_beta_2_0/7726454) to the local working directory in which you will create your R project.
6+
7+
# Genomics workshop setup directions
8+
# Overview
9+
10+
This workshop is designed to be run on pre-imaged Amazon Web Services (AWS)
11+
instances. With the exception of a spreadsheet program, all of the software and data used in the workshop are hosted on an Amazon
12+
Machine Image (AMI). Please follow the instructions below to prepare your computer for the workshop:
13+
14+
- Required additional software + Option A
15+
**OR**
16+
- Required additional software + Option B
17+
18+
## Required additional software
19+
20+
This lesson requires a working spreadsheet program. If you don't have a spreadsheet program already, you can use LibreOffice. It's a free, open source spreadsheet program. Directions to install are included for Windows, Mac OS X, and Linux systems below. On Windows, you will also need to install Git Bash, PuTTY, or the Ubuntu Subsystem.
21+
22+
> ## Windows
23+
> - Install LibreOffice by going to [the installation page](https://www.libreoffice.org/download/libreoffice-fresh/). The version for Windows should automatically be selected. Click Download Version X.X.X (whichever is the most recent version). You will go to a page that asks about a donation, but you don't need to make one. Your download should begin automatically.
24+
> - Once the installer is downloaded, double click on it and LibreOffice should install.
25+
> - Download the [Git for Windows installer](https://git-for-windows.github.io/). Run the installer and follow the steps below:
26+
> + Click on "Next" four times (two times if you've previously installed Git). You don't need to change anything in the Information, location, components, and start menu screens.
27+
> + **From the dropdown menu select "Use the Nano editor by default" (NOTE: you will need to scroll up to find it) and click on "Next".**
28+
> + On the page that says "Adjusting the name of the initial branch in new repositories", ensure that "Let Git decide" is selected. This will ensure the highest level of compatibility for our lessons.
29+
> + Ensure that "Git from the command line and also from 3rd-party software" is selected and click on "Next". (If you don't do this Git Bash will not work properly, requiring you to remove the Git Bash installation, re-run the installer and to select the "Git from the command line and also from 3rd-party software" option.)
30+
> + Ensure that "Use the native Windows Secure Channel Library" is selected and click on "Next".
31+
> + Ensure that "Checkout Windows-style, commit Unix-style line endings" is selected and click on "Next".
32+
> + **Ensure that "Use Windows' default console window" is selected and click on "Next".**
33+
> + Ensure that "Default (fast-forward or merge)" is selected and click on "Next".
34+
> + Ensure that "Git Credential Manager Core" is selected and click on "Next".
35+
> + Ensure that "Enable file system caching" is selected and click on "Next".
36+
> + Click on "Install".
37+
> + Click on "Finish".
38+
> + Check the settings for your "HOME" environment variable.
39+
> - If your "HOME" environment variable is not set (or you don't know what this is):
40+
> - Open command prompt (Open Start Menu then type `cmd` and press [Enter])
41+
> - Type the following line into the command prompt window exactly as shown: `setx HOME "%USERPROFILE%"`
42+
> - Press [Enter]. You should see `SUCCESS: Specified value was saved.`
43+
> - Quit command prompt by typing `exit` then pressing [Enter]
44+
> - An **alternative option** is to install PuTTY by going to [the installation page](https://www.chiark.greenend.org.uk/~sgtatham/putty/latest.html). For most newer computers, click on putty-64bit-X.XX-installer.msi to download the 64-bit version. If you have an older laptop, you may need to get the 32-bit version, putty-X.XX-installer.msi. If you aren't sure whether you need the 64-bit or 32-bit version, you can check your laptop version by following [the instructions here](https://support.microsoft.com/en-us/help/15056/windows-32-64-bit-faq). Once the installer is downloaded, double click on it, and PuTTY should install.
45+
> - **Another alternative option** is to use the Ubuntu Subsystem for Windows. This option is only available for Windows 10 - detailed [instructions are available here](https://docs.microsoft.com/en-us/windows/wsl/install-win10).
46+
{: .solution}
47+
> ## Mac OS X
48+
> - Install LibreOffice by going to [the installation page](https://www.libreoffice.org/download/libreoffice-fresh/). The version for Mac should automatically be selected. Click Download Version X.X.X (whichever is the most recent version). You will go to a page that asks about a donation, but you don't need to make one. Your download should begin automatically.
49+
> - Once the installer is downloaded, double click on it and LibreOffice should install.
50+
{: .solution}
51+
> ## Linux
52+
> - Install LibreOffice by going to [the installation page](https://www.libreoffice.org/download/libreoffice-fresh/). The version for Linux should automatically be selected. Click Download Version X.X.X (whichever is the most recent version). You will go to a page that asks about a donation, but you don't need to make one. Your download should begin automatically.
53+
> - Once the installer is downloaded, double click on it and LibreOffice should install.
54+
{: .solution}
55+
## Option A (**Recommended**): Using the lessons with Amazon Web Services (AWS)
56+
57+
If you are signed up to take a Genomics Data Carpentry workshop, you do *not* need to worry about setting up an AMI instance. The Carpentries
58+
staff will create an instance for you and this will be provided to you at no cost. This is true for both self-organized and centrally-organized workshops. Your Instructor will provide instructions for connecting to the AMI instance at the workshop.
59+
60+
If you would like to work through these lessons independently, outside of a workshop, you will need to start your own AMI instance.
61+
Follow these [instructions on creating an Amazon instance](https://datacarpentry.org/genomics-workshop/AMI-setup/). Use the AMI `ami-04b3bc83255f918b0` (Data Carpentry Genomics with R 4.0) listed on the Community AMIs page. Please note that you must set your location as `N. Virginia` in order to access this community AMI. You can change your location in the upper right corner of the main AWS menu bar. The cost of using this AMI for a few days, with the t2.medium instance type is very low (about USD $1.50 per user, per day). Data Carpentry has *no* control over AWS pricing structure and provides this
62+
cost estimate with no guarantees. Please read AWS documentation on pricing for up-to-date information.
63+
64+
If you're an Instructor or Maintainer or want to contribute to these lessons, please get in touch with us [[email protected]](mailto:[email protected]) and we will start instances for you.
65+
66+
## Option B: Using the lessons on your local machine
67+
68+
While not recommended, it is possible to work through the lessons on your local machine (i.e. without using
69+
AWS). To do this, you will need to install all of the software used in the workshop and obtain a copy of the
70+
dataset. Instructions for doing this are below.
71+
72+
### Data
73+
74+
The data used in this workshop is available on FigShare. Because this workshop works with real data, be aware that file sizes for the data are large. Please read the FigShare page linked below for information about the data and access to the data files.
75+
76+
[FigShare Data Carpentry Genomics Beta 2.0](https://figshare.com/articles/Data_Carpentry_Genomics_beta_2_0/7726454)
77+
78+
More information about these data will be presented in the [first lesson of the workshop](http://www.datacarpentry.org/organization-genomics/data/).
79+
80+
### Software
81+
82+
| Software | Version | Manual | Available for | Description |
83+
| -------- | ------------ | ------ | ------------- | ----------- |
84+
| [FastQC](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/) | 0.11.7 | [Link](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/Help/)| Linux, MacOS, Windows | Quality control tool for high throughput sequence data. |
85+
| [Trimmomatic](http://www.usadellab.org/cms/?page=trimmomatic) | 0.38 | [Link](http://www.usadellab.org/cms/uploads/supplementary/Trimmomatic/TrimmomaticManual_V0.32.pdf) | Linux, MacOS, Windows | A flexible read trimming tool for Illumina NGS data. |
86+
| [BWA](http://bio-bwa.sourceforge.net/) | 0.7.17 | [Link](http://bio-bwa.sourceforge.net/bwa.shtml) | Linux, MacOS | Mapping DNA sequences against reference genome. |
87+
| [SAMtools](http://samtools.sourceforge.net/) | 1.9 | [Link](http://www.htslib.org/doc/samtools.html) | Linux, MacOS | Utilities for manipulating alignments in the SAM format. |
88+
| [BCFtools](https://samtools.github.io/bcftools/) | 1.8 | [Link](https://samtools.github.io/bcftools/bcftools.html) | Linux, MacOS | Utilities for variant calling and manipulating VCFs and BCFs. |
89+
| [IGV](http://software.broadinstitute.org/software/igv/home) | [Link](https://software.broadinstitute.org/software/igv/download) | [Link](https://software.broadinstitute.org/software/igv/UserGuide) | Linux, MacOS, Windows | Visualization and interactive exploration of large genomics datasets. |
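
Before installing anything, it can help to see which of the tools in the table are already available. The following is a minimal sketch, not part of the original lesson: the helper name `check_tools` is hypothetical, and the command names (`fastqc`, `trimmomatic`, `bwa`, `samtools`, `bcftools`, `java`) are assumed from the installation sections on this page.

```shell
# Report whether each named command can be found on the PATH.
# Prints "found: <name>" or "missing: <name>" for every argument and
# returns non-zero if at least one command is missing.
check_tools() {
    status=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            status=1
        fi
    done
    return "$status"
}

# Command names are assumptions based on the tools listed in the table above.
check_tools fastqc trimmomatic bwa samtools bcftools java || true
```

Any tool reported as missing can then be installed with the instructions for that tool.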
90+
91+
### QuickStart Software Installation Instructions
92+
93+
These are the QuickStart installation instructions. They assume familiarity with the command line and with installation in general. As there are different operating systems and many different versions of operating systems and environments, these may not work on your computer. If an installation doesn't work for you, please refer to the user guide for the tool, listed in the table above.
94+
95+
We have installed software using [miniconda](https://docs.conda.io/en/latest/miniconda.html). Miniconda is a minimal installer for conda, a package manager that simplifies the installation process. Please first install miniconda3 (installation instructions below), and then proceed to the installation of individual tools.
96+
97+
### Miniconda3
98+
99+
> ## MacOS
100+
>
101+
>To install miniconda3, type:
102+
>
103+
>~~~
104+
>$ curl -O https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
105+
>$ bash Miniconda3-latest-MacOSX-x86_64.sh
106+
>~~~
107+
>{: .bash}
108+
> Then, follow the instructions that you are prompted with on the screen to install Miniconda3.
109+
{: .solution}
110+
111+
### FastQC
112+
113+
> ## MacOS
114+
>
115+
>To install FastQC, type:
116+
>
117+
> ~~~
118+
> $ conda install -c bioconda fastqc=0.11.7=5
119+
> ~~~
120+
>{: .bash}
121+
{: .solution}
122+
> ## FastQC Source Code Installation
123+
>
124+
> If you prefer to install from source, follow the directions below:
125+
>
126+
> ~~~
127+
> $ cd ~/src
128+
> $ curl -O http://www.bioinformatics.babraham.ac.uk/projects/fastqc/fastqc_v0.11.7.zip
129+
> $ unzip fastqc_v0.11.7.zip
130+
> ~~~
131+
> {: .bash}
132+
>
133+
> Link the fastqc executable to the ~/bin folder that
134+
> you have already added to the path.
135+
>
136+
> ~~~
137+
> $ ln -sf ~/src/FastQC/fastqc ~/bin/fastqc
138+
> ~~~
139+
> {: .bash}
140+
>
141+
> Due to what seems to be a packaging error,
142+
> the executable flag on the fastqc program is not set.
143+
> We need to set it ourselves.
144+
>
145+
> ~~~
146+
> $ chmod +x ~/bin/fastqc
147+
> ~~~
148+
> {: .bash}
149+
{: .solution}
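
The executable-flag step can also be written so that it only changes the file when needed and reports a missing file clearly. This is a sketch under assumptions: the helper name `ensure_executable` is hypothetical, and the demonstration uses a temporary file rather than a real FastQC installation.

```shell
# Make a file executable if it is not already.
# Returns non-zero (with a message on stderr) when the file does not exist.
ensure_executable() {
    if [ ! -e "$1" ]; then
        echo "missing: $1" >&2
        return 1
    fi
    [ -x "$1" ] || chmod +x "$1"
}

# Demonstration on a temporary file; for FastQC you would instead run
#   ensure_executable ~/bin/fastqc
demo_file=$(mktemp)
ensure_executable "$demo_file" && echo "now executable: $demo_file"
rm -f "$demo_file"
```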
150+
**Test your installation by running:**
151+
152+
~~~
153+
$ fastqc -h
154+
~~~
155+
{: .bash}
156+
157+
### Trimmomatic
158+
159+
> ## MacOS
160+
>
161+
> ~~~
162+
> $ conda install -c bioconda trimmomatic=0.38=0
163+
> ~~~
164+
>{: .bash}
165+
{: .solution}
166+
> ## Trimmomatic Source Code Installation
167+
>
168+
> If you prefer to install from source, follow the directions below:
169+
>
170+
> ~~~
171+
> $ cd ~/src
172+
> $ curl -O http://www.usadellab.org/cms/uploads/supplementary/Trimmomatic/Trimmomatic-0.38.zip
173+
> $ unzip Trimmomatic-0.38.zip
174+
> ~~~
175+
> {: .bash}
176+
>
177+
> The program can be invoked via:
178+
>
179+
> ~~~
180+
> $ java -jar ~/src/Trimmomatic-0.38/trimmomatic-0.38.jar
181+
> ~~~
> {: .bash}
182+
>
183+
> The ~/src/Trimmomatic-0.38/adapters/ directory contains
184+
> Illumina specific adapter sequences.
185+
>
186+
> ~~~
187+
> $ ls ~/src/Trimmomatic-0.38/adapters/
188+
> ~~~
189+
> {: .bash}
190+
{: .solution}
191+
**Test your installation by running:** (assuming things are installed in ~/src)
192+
193+
~~~
194+
$ java -jar ~/src/Trimmomatic-0.38/trimmomatic-0.38.jar
195+
~~~
196+
{: .bash}
197+
198+
199+
> ## Simplifying the Invocation, and Testing a miniconda3 Installation
200+
>
201+
> To simplify the invocation you could also create a script in the ~/bin folder:
202+
>
203+
> ~~~
204+
> $ echo '#!/bin/bash' > ~/bin/trimmomatic
205+
> $ echo 'java -jar ~/src/Trimmomatic-0.38/trimmomatic-0.38.jar "$@"' >> ~/bin/trimmomatic
206+
> $ chmod +x ~/bin/trimmomatic
207+
> ~~~
208+
> {: .bash}
209+
>
210+
> Test your script by running:
211+
>
212+
> ~~~
213+
> $ trimmomatic
214+
> ~~~
215+
> {: .bash}
216+
{: .solution}
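
A wrapper script like the one above passes its arguments on to `java`. The sketch below illustrates, with hypothetical helper functions and file names that are not part of the lesson, why wrappers usually forward arguments as `"$@"` (quoted): without the quotes, a file name containing a space is split into two arguments.

```shell
# Print how many arguments were received.
count_args() { echo "$#"; }

# Forward arguments the way a wrapper script would, with and without quotes.
forward_quoted()   { count_args "$@"; }
forward_unquoted() { count_args $@; }

forward_quoted   "my reads.fastq" trimmed.fastq   # prints 2
forward_unquoted "my reads.fastq" trimmed.fastq   # prints 3: the first name was split at the space
```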
217+
### BWA
218+
219+
> ## MacOS
220+
>
221+
>~~~
222+
>$ conda install -c bioconda bwa=0.7.17=ha92aebf_3
223+
>~~~
224+
>{: .bash}
225+
{: .solution}
226+
> ## BWA Source Code Installation
227+
>
228+
> If you prefer to install from source, follow the instructions below:
229+
>
230+
> ~~~
231+
> $ cd ~/src
232+
> $ curl -OL http://sourceforge.net/projects/bio-bwa/files/bwa-0.7.17.tar.bz2
233+
> $ tar jxvf bwa-0.7.17.tar.bz2
234+
> $ cd bwa-0.7.17
235+
> $ make
236+
> $ export PATH=~/src/bwa-0.7.17:$PATH
237+
> ~~~
238+
> {: .bash}
239+
{: .solution}
240+
**Test your installation by running:**
241+
242+
~~~
243+
$ bwa
244+
~~~
245+
{: .bash}
246+
247+
### SAMtools
248+
249+
> ## MacOS
250+
>
251+
>~~~
252+
>$ conda install -c bioconda samtools=1.9=h8ee4bcc_1
253+
>~~~
254+
>{: .bash}
255+
{: .solution}
256+
> ## SAMtools Versions
257+
> SAMtools has changed its command line invocation (for the better), which means that most of the tutorials
258+
> on the web indicate an older and obsolete usage.
259+
>
260+
> Using SAMtools version 1.9 is important to work with the commands we present in these lessons.
261+
{: .callout}
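
One way to confirm which version is installed is to compare the version banner with the expected number. This is a minimal sketch: the function name `version_matches` is hypothetical, and it assumes that the first line of `samtools --version` has the form `samtools 1.9`.

```shell
# Succeed when the first line of a version banner ends with the expected version.
# $1: full output of a command such as `samtools --version`
# $2: expected version, for example 1.9
version_matches() {
    first_line=$(printf '%s\n' "$1" | head -n 1)
    reported=${first_line##* }   # keep only the text after the last space
    [ "$reported" = "$2" ]
}

if command -v samtools >/dev/null 2>&1; then
    if version_matches "$(samtools --version)" 1.9; then
        echo "samtools 1.9 detected"
    else
        echo "samtools is installed, but it is not version 1.9"
    fi
else
    echo "samtools not found on PATH"
fi
```

The same function can be reused for BCFtools by passing the output of `bcftools --version` and `1.8`, under the same assumption about the banner format.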
262+
> ## SAMtools Source Code Installation
263+
>
264+
> If you prefer to install from source, follow the instructions below:
265+
>
266+
> ~~~
267+
> $ cd ~/src
268+
> $ curl -OkL https://github.com/samtools/samtools/releases/download/1.9/samtools-1.9.tar.bz2
269+
> $ tar jxvf samtools-1.9.tar.bz2
270+
> $ cd samtools-1.9
271+
> $ make
272+
> ~~~
273+
> {: .bash}
274+
>
275+
> Add directory to the path if necessary:
276+
>
277+
> ~~~
278+
> $ echo 'export PATH=~/src/samtools-1.9:$PATH' >> ~/.bashrc
279+
> $ source ~/.bashrc
280+
> ~~~
281+
> {: .bash}
282+
{: .solution}
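
Appending an `export PATH=...` line to `~/.bashrc` each time you repeat the installation adds the same directory again. As a hedged alternative, the sketch below defines a hypothetical helper, `add_to_path`, that prepends a directory to `PATH` only if it is not already listed; the directory name follows the source installation above.

```shell
# Prepend a directory to PATH unless it is already one of its entries.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present: leave PATH unchanged
        *) PATH="$1:$PATH" ;;
    esac
    export PATH
}

# Directory assumed from the source installation steps above.
add_to_path "$HOME/src/samtools-1.9"
```

Calling the function a second time with the same directory leaves `PATH` unchanged.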
283+
**Test your installation by running:**
284+
285+
~~~
286+
$ samtools
287+
~~~
288+
{: .bash}
289+
290+
291+
### BCFtools
292+
293+
> ## MacOS
294+
>
295+
>~~~
296+
>$ conda install -c bioconda bcftools=1.8=h4da6232_3
297+
>~~~
298+
>{: .bash}
299+
{: .solution}
300+
> ## BCFtools Source Code Installation
301+
>
302+
> If you prefer to install from source, follow the instructions below:
303+
>
304+
> ~~~
305+
> $ cd ~/src
306+
> $ curl -OkL https://github.com/samtools/bcftools/releases/download/1.8/bcftools-1.8.tar.bz2
307+
> $ tar jxvf bcftools-1.8.tar.bz2
308+
> $ cd bcftools-1.8
309+
> $ make
310+
> ~~~
311+
> {: .bash}
312+
>
313+
> Add directory to the path if necessary:
314+
>
315+
> ~~~
316+
> $ echo 'export PATH=~/src/bcftools-1.8:$PATH' >> ~/.bashrc
317+
> $ source ~/.bashrc
318+
> ~~~
319+
> {: .bash}
320+
{: .solution}
321+
**Test your installation by running:**
322+
323+
~~~
324+
$ bcftools
325+
~~~
326+
{: .bash}
327+
328+
329+
### IGV
330+
331+
- [Download the IGV installation files](https://software.broadinstitute.org/software/igv/download)
332+
- Install and run IGV using the [instructions for your operating system](https://software.broadinstitute.org/software/igv/download).
