- make (version 3.81 or higher)
- g++ (GCC version 4.1.2 or higher)
- IBM ILOG CPLEX Optimization Studio
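Before building, it can help to confirm which versions of the build tools are on your `PATH`. The snippet below is a minimal sketch that only prints the first version line for `make` and `g++`. It does not compare the versions against the minimums listed above, and it does not detect CPLEX.

```shell
# Print the first line of each tool's version banner, or a notice if missing.
report=$(
  for tool in make g++; do
    if command -v "$tool" >/dev/null 2>&1; then
      "$tool" --version | head -n 1
    else
      echo "$tool: not found on PATH"
    fi
  done
)
echo "$report"
```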
In the `Makefile`, set `CPLEXROOT` to the path of your root CPLEX folder.
The `Makefile` is set up for GCC 6.2. If you are using GCC version 4.x, add the `-std=gnu++0x` flag to `CCC` in the `Makefile`.
Run `make` in the root cd-CAP folder to create the executables.
Usage:
./mcsc -n [network] -l [alteration profiles] -c [chromosome information; optional] -r [min number of colors in subnetwork] -x [exclude genes; optional] -s [maximum subnetwork size] -t [minimum subgraph recurrence] -k [number of subnetworks] -e [error; optional] -f [outputFolder] -d [threads] -h [time limit in seconds]
./mcsi -p (for p-value simulation; optional) -n [network] -l [alteration profiles] -r [color options in subnetwork] -s [maximum subnetwork size] -t [minimum subgraph recurrence] -e [error; optional]
Parameters | Description for MCSC | Description for MCSI |
---|---|---|
`-n` | network file | network file |
`-l` | alterations file | alterations file |
`-c` | (optional) gene-to-chromosome map | N/A |
`-x` | (optional) excluded genes | N/A |
`-f` | output folder name | N/A (mcsi has a single output file) |
`-r` | minimum number of colors in each subnetwork | color requirement of the maximum subnetwork |
`-s` | maximum subnetwork size | maximum subnetwork size |
`-t` | minimum sample recurrence | minimum sample recurrence |
`-k` | number of resulting subnetworks | N/A |
`-e` | (optional) allowed extension error rate | (optional) allowed extension error rate |
`-d` | number of threads used for ILP solver | N/A |
`-h` | time limit in seconds for ILP solver | N/A |
`-p` | N/A | (optional, without arguments) p-value simulation mode |
`-n`: This parameter is an edge collection file in which each row represents one edge as two node names separated by whitespace. All edges are treated as undirected. There is no header row. For example:
A1BG CRISP3
A1CF APOBEC1
A2M ABCA1
...
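As a quick sanity check before a long run, the edge file can be scanned for rows that do not have exactly two whitespace-separated fields. This is a sketch only: the temporary file and the deliberately malformed `BROKEN_ROW` line are hypothetical sample data, and `bad_rows` is an illustrative variable name, not part of cd-CAP.

```shell
# Hypothetical sample edge list; point "$edges" at your own file instead.
edges=$(mktemp)
printf 'A1BG CRISP3\nA1CF APOBEC1\nA2M ABCA1\nBROKEN_ROW\n' > "$edges"

# Report each row whose field count differs from 2 (blank lines count too).
awk 'NF != 2 { print "line " NR ": " $0 }' "$edges"

# Keep the number of malformed rows for later checks.
bad_rows=$(awk 'NF != 2 { n++ } END { print n + 0 }' "$edges")
echo "malformed rows: $bad_rows"
```

A count of 0 means every row has the two-column shape described above.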
`-l`: This parameter is a file describing the alterations in all input samples, one per row, in the form "SampleID Gene AltType". Currently, up to 64 different alteration types (the third column) are supported. There is no header row. For example:
T294 CCNL2 SNV
T294 PTCHD2 SNV
T294 COL16A1 SNV
...
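Because the number of alteration types is capped at 64, it may be worth counting the distinct values in the third column before running the tools. The following is a minimal sketch; the sample rows (including the `AMP` row) and the variable names are hypothetical and only illustrate the check.

```shell
# Hypothetical sample alterations file; point "$alts" at your own file instead.
alts=$(mktemp)
printf 'T294 CCNL2 SNV\nT294 PTCHD2 SNV\nT301 CCNL2 AMP\n' > "$alts"

# Count the distinct alteration types found in the third column.
n_types=$(awk '{ print $3 }' "$alts" | sort -u | wc -l | tr -d ' ')
echo "distinct alteration types: $n_types"

# Warn when the documented limit of 64 types is exceeded.
if [ "$n_types" -gt 64 ]; then
  echo "warning: more than 64 alteration types" >&2
fi
```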
`-c`: This optional parameter is a gene-chromosome map, allowing for more information in the output. For example:
Gene Chromosome KaryotypeBand
ADAM30 1 p12
HAO2 1 p12
HMGCS2 1 p12
...
`-x`: This optional parameter is a file listing the genes whose colors should be removed after the input is read. For example:
Gene1
Gene2
Gene3
...
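One way to catch typos in an exclusion list is to look for gene names that never occur in the network file. The sketch below assumes the two file layouts described above. The sample files, the `NOT_IN_NETWORK` entry, and the variable names are hypothetical, and how cd-CAP itself handles unknown gene names is not covered here.

```shell
# Hypothetical sample inputs; substitute your own network and exclusion files.
edges=$(mktemp)
excl=$(mktemp)
printf 'A1BG CRISP3\nA1CF APOBEC1\nA2M ABCA1\n' > "$edges"
printf 'A2M\nNOT_IN_NETWORK\n' > "$excl"

# First pass records every node name in the edge list; the second pass
# prints each excluded gene that was never seen as a node.
missing=$(awk 'NR == FNR { seen[$1]; seen[$2]; next }
               !($1 in seen) { print $1 }' "$edges" "$excl")
echo "excluded genes absent from the network: $missing"
```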
`-f`: This parameter is the name of the output folder for the run, in which all output files are stored. The values of the parameters below are appended to this folder name.
`-r`: This integer parameter controls the minimum required number of colors among the nodes of each resulting subnetwork, i.e. how "colorful" a subnetwork must be. Keep the value set to 1 for the default configuration.
`-s`: This integer parameter controls the maximum subnetwork size. When running the program on a new dataset for the first time, 10 could be a reasonable value.
`-t`: This integer parameter controls the minimum required sample recurrence of each resulting subnetwork.
`-k`: This integer parameter controls the number of subnetworks to detect.
`-e`: This optional floating-point parameter controls the maximum allowed error rate when extending subnetworks before the optimization. If not specified, it defaults to 0.
`-d`: This integer parameter specifies the number of threads used for the optimization.
`-h`: This integer parameter specifies the number of seconds that the optimization step is allowed to take before returning a solution.
`-p`: This optional flag takes no argument and enables p-value simulation mode (mcsi only).
./mcsc -n ../data/STRING10_HiConf_PPI.edges -l ../data/alteration_status_COAD_20171108.tsv -c ../data/string10_node_chromosome_map.tsv -r 1 -s 10 -t 138 -k 100 -e 0 -d 32 -h 36000 -f TCGA_COAD