PredcircRNA: computational classification of circular RNA from other long non-coding RNA using hybrid features
PredcircRNA distinguishes circular RNAs (circRNAs) from other lncRNAs using
multiple kernel learning. First, we extract discriminative features from several sources in each transcript: graph features, conservation information, sequence
composition, ALU and tandem repeats, SNP density, and open reading frames (ORFs). Second, to integrate features from these different sources, we
use a multiple kernel learning framework to fuse the heterogeneous features.
Dependencies:
- GraphProt: http://www.bioinf.uni-freiburg.de/Software/GraphProt/
- SHOGUN: http://www.shogun-toolbox.org/
- txCdsPredict: http://hgdownload.cse.ucsc.edu/admin/
- Tandem Repeats Finder (trf): http://tandem.bu.edu/trf/trf.download.html
Input BED file format (see test_bed for an example):
chr2 69304539 69318051 + gene1
chr7 138593736 138597206 - gene2
chr22 39134591 39137055 - gene3
NOTICE: the last column must contain a unique name for each transcript (here gene1, gene2, ...).
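Before running the tool, it can help to check that an input file follows the layout above. The following is a minimal sketch, not part of PredcircRNA: the function name check_bed_lines is hypothetical, and it assumes five whitespace-separated columns (chromosome, start, end, strand, name) as in the example lines.

```python
from collections import Counter


def check_bed_lines(lines):
    """Return a list of problems found in PredcircRNA-style input lines.

    Assumes five whitespace-separated columns per line:
    chromosome, start, end, strand, unique transcript name.
    """
    problems = []
    names = Counter()
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split()
        if len(fields) != 5:
            problems.append(
                f"line {number}: expected 5 columns, found {len(fields)}"
            )
            continue
        chrom, start, end, strand, name = fields
        if not (start.isdigit() and end.isdigit() and int(start) < int(end)):
            problems.append(
                f"line {number}: start and end must be integers with start < end"
            )
        if strand not in ("+", "-"):
            problems.append(f"line {number}: strand must be '+' or '-'")
        names[name] += 1
    # The last column must be unique across the whole file.
    for name, count in names.items():
        if count > 1:
            problems.append(f"name '{name}' appears {count} times; names must be unique")
    return problems
```

To check a file, pass an open file handle, for example check_bed_lines(open("test_bed")); an empty list means no problems were found under these assumptions.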
Usage: run the tool with the following command:
python PredcircRNA.py --inputfile=test_bed --outputfile=test_bed_out
The output file gives the predicted lncRNA type of each transcript in the last column.
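To summarize a finished run, the predicted types in the last column can be tallied. This is a minimal sketch under assumptions: the helper count_predicted_types is hypothetical, the output is assumed to be whitespace-delimited with one transcript per line, and the actual label strings written by the tool are not specified here.

```python
from collections import Counter


def count_predicted_types(lines):
    """Count how often each value occurs in the last column.

    Assumes whitespace-delimited lines; blank lines are skipped.
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        if fields:
            counts[fields[-1]] += 1
    return counts
```

For example, count_predicted_types(open("test_bed_out")) returns a Counter that maps each predicted type found in the file to the number of transcripts assigned to it.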
Webserver:
http://rth.dk/resources/webcircrna
You can also use our updated webserver to predict the circRNA potential for coding and non-coding RNAs.
Reference:
Xiaoyong Pan, Kai Xiong. PredcircRNA: computational classification of circular RNA from other long non-coding RNA using hybrid features. Mol. BioSyst., 2015, 11, 2219-2226