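A gene family is characterized by encoded proteins that all share the same domain. In general, a domain determines a particular function, and a conserved domain sequence readily folds into a stable three-dimensional structure.
Protein domain: the basic unit of protein (tertiary) **structure**. Pfam and InterPro are both protein domain databases.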
Homolog: A gene related to a second gene by descent from a common ancestral DNA sequence. The term homolog may apply to the relationship between genes separated by the event of speciation (see ortholog) or to the relationship between genes separated by the event of genetic duplication (see paralog).
Ortholog: Orthologs are genes in different species that evolved from a common ancestral gene by speciation. Normally, orthologs retain the same function in the course of evolution. Identification of orthologs is critical for reliable prediction of gene function in newly sequenced genomes.
Paralog: Paralogs are genes related by duplication within a genome. Orthologs normally retain the same function in the course of evolution, whereas paralogs tend to evolve new functions, even if these are related to the original one.
Tools that place next-generation sequencing (NGS) short reads on a reference are called "short read aligners" or "short read mappers".
- A mapping is a region where a read sequence is placed.
- A mapping is regarded to be correct if it overlaps the true region.
- An alignment is the detailed placement of each base in a read.
- An alignment is regarded to be correct if each base is placed correctly.
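- Alignment algorithm: global, local, or semi-local
- How does the tool handle INDELs?
- Can the tool align across large regions?
- Can the tool filter alignments by a threshold, and how is that threshold set?
- Can the tool find chimeric alignments?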
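The mapping criterion above (a mapping is correct if it overlaps the true region) comes down to an interval-overlap test. Here is a minimal sketch, assuming 1-based inclusive coordinates on the same reference sequence; the function name `overlaps` and the coordinates are made up for illustration.

```shell
# overlaps START1 END1 START2 END2
# Succeeds (exit status 0) when the two closed intervals share at least one base.
overlaps() {
    [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
}

# Hypothetical read: it truly comes from 100-199, the tool mapped it to 150-249.
if overlaps 150 249 100 199; then
    mapping_result="correct"
else
    mapping_result="incorrect"
fi
echo "mapping is $mapping_result"
```

The alignment criterion is stricter: a read can be mapped to the right region and still have individual bases misplaced, for example around an INDEL.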
Burrows-Wheeler Aligner (BWA)
bwa mem: MEM stands for Maximal Exact Matches
mkdir -p ~/tmp/bwa_test/ref
# Download the reference genome (19 kb); this needs entrez-direct
conda install -c bioconda entrez-direct
ref=~/tmp/bwa_test/ref/1976.fa
efetch -db nuccore -format fasta -id AF086833 > $ref
# Build the BWA index
bwa index $ref
mkdir -p ~/tmp/bwa_test/raw && cd ~/tmp/bwa_test/raw
# Get the run information for all sequencing data in the Ebola virus project
esearch -db sra -query PRJNA257197 | efetch -format runinfo > runinfo.csv
# Pick SRR1972739 and download it (xargs appends the URL to wget, so no {} is needed)
grep SRR1972739 runinfo.csv | cut -d "," -f 10 | xargs -n1 wget
# Convert to FASTQ files [for testing, keep only the first 10,000 reads]
fastq-dump -X 10000 --split-files SRR1972739
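Picking the download path by column number works only while the runinfo layout stays the same. A slightly more defensive sketch looks the column up by its header name instead. The CSV below is a made-up three-column stand-in for runinfo.csv (the real file has many more columns), the run IDs and URLs are placeholders, and the approach assumes no field contains a quoted comma.

```shell
tmpdir=$(mktemp -d)
cat > "$tmpdir/runinfo.csv" <<'EOF'
Run,spots,download_path
SRR0000001,100,https://example.org/SRR0000001
SRR0000002,200,https://example.org/SRR0000002
EOF

# Find the index of the download_path column in the header line,
# then print that column for the requested run.
url=$(awk -F',' -v run="SRR0000002" '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "download_path") col = i; next }
    $1 == run { print $col }
' "$tmpdir/runinfo.csv")
echo "$url"
rm -rf "$tmpdir"
```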
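mkdir -p ~/tmp/bwa_test/align && cd ~/tmp/bwa_test/align
# Paths to the reference and to the paired FASTQ files written by fastq-dump --split-files
ref=~/tmp/bwa_test/ref/1976.fa
r1=~/tmp/bwa_test/raw/SRR1972739_1.fastq
r2=~/tmp/bwa_test/raw/SRR1972739_2.fastq
bwa mem $ref $r1 $r2 > bwa.sam
# Show the optional tags (column 12 onward), skipping the header lines
grep -v "^@" bwa.sam | cut -f 12-20 | head
bowtie2-build $ref ./1976 # the second argument is the index prefix
bowtie2 -x ./1976 -1 $r1 -2 $r2 > bowtie.sam
grep -v "^@" bowtie.sam | cut -f 12-20 | head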
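One quick way to compare the two SAM files is to count mapped and unmapped records from the FLAG column (column 2), where bit 0x4 means "segment unmapped". The sketch below runs on a toy SAM file with invented records, so it does not need bwa or bowtie2. It does the bit test with integer arithmetic so that it works in any awk.

```shell
tmpdir=$(mktemp -d)
sam="$tmpdir/toy.sam"
# Toy SAM content (hypothetical values): one header line, three records.
printf '@HD\tVN:1.6\n' > "$sam"
printf 'r1\t99\tAF086833\t100\t60\t10M\t=\t300\t210\tACGTACGTAC\tIIIIIIIIII\n' >> "$sam"
printf 'r2\t77\t*\t0\t0\t*\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII\n' >> "$sam"
printf 'r3\t0\tAF086833\t500\t60\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII\n' >> "$sam"

# int(FLAG / 4) % 2 extracts bit 0x4: 1 = unmapped, 0 = mapped.
mapped=$(awk -F'\t' '!/^@/ && int($2 / 4) % 2 == 0 { n++ } END { print n + 0 }' "$sam")
unmapped=$(awk -F'\t' '!/^@/ && int($2 / 4) % 2 == 1 { n++ } END { print n + 0 }' "$sam")
echo "mapped=$mapped unmapped=$unmapped"
rm -rf "$tmpdir"
```

Pointing `sam` at bwa.sam and then at bowtie.sam gives counts that can be compared directly between the two aligners.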
# Check the git version
git --version
# To update git
git clone https://github.com/git/git
# Open Terminal (on Mac) or Git Bash (on Windows)
# Create a local directory to hold the git files
mkdir ~/Git_Projects
cd ~/Git_Projects
# Then initialize it
git init # creates a hidden .git directory (check with ls -a)
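# Check whether an SSH key already exists
ls ~/.ssh
# Generate an RSA key pair
ssh-keygen -t rsa -C [email protected]
# Run ls ~/.ssh again; id_rsa.pub is the file you will need
# Add the SSH key to your GitHub account, then test the connection
ssh -T [email protected]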
mkdir -p ~/Git/GEO
cd ~/Git/GEO
git init # initialize the git repository
git config --list
# Then link the local git to GitHub
cd ~/.ssh # (note: this is a hidden directory, so use ls -la to see it)
ssh-keygen -t rsa -C [email protected] # just change the email address
Enter file in which to save the key (/YOUR/PATH/.ssh/id_rsa):
# Enter a passphrase you can remember (it can be the same as your GitHub password)
Enter passphrase (empty for no passphrase):
# Enter it again
Enter same passphrase again:
# The .ssh directory now contains id_rsa.pub
cat id_rsa.pub # then copy its content
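The prompts above can also be answered on the command line, which is handy in scripts. This sketch writes a throwaway key pair into a temporary directory so it does not touch ~/.ssh. The empty passphrase and the comment string `demo-key` are placeholders for the demonstration only.

```shell
keydir=$(mktemp -d)
# -f sets the output file, -N the passphrase (empty here, demo only),
# -C the comment (normally your email), -q suppresses progress output.
ssh-keygen -q -t rsa -b 4096 -N "" -C "demo-key" -f "$keydir/id_rsa"
ls "$keydir"
# The public half (id_rsa.pub) is the part that gets pasted into GitHub.
pubkey_type=$(cut -d ' ' -f 1 "$keydir/id_rsa.pub")
echo "$pubkey_type"
```

For a real key, drop `-N ""` so that ssh-keygen asks for a passphrase, and keep the default location under ~/.ssh.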
git status
git add *
git commit -m "describe your change"
# 5.1 Check the status first
git status
# 5.2 Write the commit message
git commit -m "added info of us"
# 5.3 View the commit log
git log
# 5.4 Create a repository on GitHub and get its address, e.g. https://github.com/YOUR_NAME/YOUR_CODE.git [the first push asks for a username and password]
echo "something" >> README.md
git add README.md
git commit -m "add README"
git remote add origin https://github.com/YOUR_NAME/YOUR_CODE.git
# 5.5 Now we can sync our code to this repository on GitHub
git push -u origin master
# Next is the upload step
# First look at the log to see the changes we have made
git log
# This prints the series of commits
# What we need is the hash that follows "commit"
# For example, to see the difference between the second-to-last change and the last one [HEAD stands for the last commit]
git diff e17d59584f5d812d2cfd7d60374c83721c9bdb31 HEAD
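Copying a hash out of git log is not the only way to name the previous commit: `HEAD~1` refers to the parent of `HEAD`. The sketch below builds a throwaway repository in a temporary directory to show this. The user name, email, and file name are placeholders.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
# Identity is set only for this throwaway repository (placeholder values).
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "first line" > notes.txt
git add notes.txt
git commit -q -m "first commit"
echo "second line" >> notes.txt
git commit -q -am "second commit"

# Compare the second-to-last commit with the last one.
git diff HEAD~1 HEAD
added_lines=$(git diff HEAD~1 HEAD | grep -c '^+second line')
```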
The three key areas in Git are the working directory, the staging area, and the repository.
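To watch a file move through these three areas, here is a small sketch in a throwaway repository. The user name, email, and file name are placeholders, and `git status --porcelain` is used because its output format is stable.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "hello" > a.txt
in_workdir=$(git status --porcelain)   # file exists only in the working directory
git add a.txt
in_staging=$(git status --porcelain)   # file is staged but not yet committed
git commit -q -m "add a.txt"
in_repo=$(git status --porcelain)      # committed: nothing left to report
printf '%s\n' "workdir: $in_workdir" "staging: $in_staging" "repo: [$in_repo]"
```

The status codes change from `??` (untracked) to `A` (added to the staging area), and the output is empty once the commit has stored the file in the repository.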
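Delete a file: git rm a.txt
Rename a file: git mv a.txt b.txt
List branches: git branch
Create a branch: git branch <branch-name>
Switch branches: git checkout <branch-name>
Delete a branch: git branch -d <branch-name>
Merge a branch: git merge <branch-to-merge>
Create and switch to a branch: git checkout -b <branch-name>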
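The branch commands above fit together into a typical cycle: branch off, commit, switch back, merge, delete. Here is a sketch in a throwaway repository. The branch name `feature`, the file name, and the identity are placeholders, and the default branch name is read from git rather than assumed.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "base" > file.txt
git add file.txt
git commit -q -m "base commit"
main_branch=$(git symbolic-ref --short HEAD)   # whatever the default branch is called

git checkout -q -b feature        # create the branch and switch to it
echo "feature work" >> file.txt
git commit -q -am "feature commit"

git checkout -q "$main_branch"    # switch back
git merge -q feature              # fast-forward: the default branch had no new commits
git branch -d feature             # safe delete, allowed because the branch is merged

remaining=$(git branch --list feature)
last_subject=$(git log -1 --format=%s)
echo "last commit on $main_branch: $last_subject"
```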
.gitignore rules:
- /mtk/ ignores the entire folder
- *.zip ignores all .zip files
- /mtk/dov ignores one specific file
- !index.php does not ignore this specific file
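These rules can be tried out in a throwaway repository: ignored files disappear from the untracked list. In this sketch the `*.php` line is a hypothetical extra rule, added only so that the `!index.php` exception has something to override. The file names are made up.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
cat > .gitignore <<'EOF'
/mtk/
*.zip
*.php
!index.php
EOF
mkdir mtk
touch mtk/data.txt archive.zip other.php index.php

# List every untracked file that is not ignored.
untracked=$(git status --porcelain --untracked-files=all)
echo "$untracked"
```

Only `.gitignore` itself and `index.php` should be listed. The order of the lines matters here: a `!` rule has to come after the pattern it overrides.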
If git push reports an error:
git pull
git config branch.master.remote origin
git config branch.master.merge refs/heads/master
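The two `git config` lines record which remote branch `master` tracks, which is the information `git pull` needs when it complains about missing tracking information. The sketch below reproduces this locally, with a bare repository in a temporary directory standing in for GitHub. All paths and the identity are placeholders.

```shell
base=$(mktemp -d)
git init -q --bare "$base/remote.git"       # stand-in for the GitHub repository
git init -q "$base/work"
cd "$base/work"
git symbolic-ref HEAD refs/heads/master      # make sure the branch is called master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "hello" > README.md
git add README.md
git commit -q -m "add README"
git remote add origin "$base/remote.git"
git push -q origin master                    # pushed without -u, so no upstream is recorded

git config branch.master.remote origin
git config branch.master.merge refs/heads/master
upstream=$(git rev-parse --abbrev-ref 'master@{upstream}')
echo "$upstream"
```

`git branch --set-upstream-to=origin/master master` writes the same two settings, and `git push -u origin master` does so at push time.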