Adding Directionality #120

Merged 50 commits on Dec 3, 2023
Changes from 35 commits
Commits
af6a590
documented research per algorithm as docstrings per class for the alg…
ntalluri Aug 23, 2023
3f173b5
added directionality for generate inputs
ntalluri Aug 29, 2023
a214cc8
Merge branch 'master' of github.com:ntalluri/spras into direction
ntalluri Aug 30, 2023
1e7fef9
added in parse_output directionality
ntalluri Aug 30, 2023
4fdaadb
removed directed from config file, testing with analysis all false
ntalluri Aug 30, 2023
cae057a
made updates to code and attempted to add testing for interactome
ntalluri Sep 4, 2023
319a5d3
precommit formatting
ntalluri Sep 5, 2023
bb55ee0
cleaned up code and finished interactome test
ntalluri Sep 5, 2023
7aed4ee
updated util to deal with the idea if someone is using an old config …
ntalluri Sep 5, 2023
0e15534
current ml repairs
ntalluri Sep 5, 2023
75be8b8
ml post processing pre-commit and config file
ntalluri Sep 6, 2023
550614a
fixed testing
ntalluri Sep 12, 2023
760b1b7
Merge branch 'master' into direction
ntalluri Sep 18, 2023
9e29ceb
updated summary.py/associated files and tests. updated interactome.py
ntalluri Sep 18, 2023
0b47bd9
Merge branch 'direction' of github.com:ntalluri/spras into direction
ntalluri Sep 18, 2023
713279a
pre-commit
ntalluri Sep 18, 2023
1198c9a
added generate inputs test, cleaned up code
ntalluri Sep 19, 2023
39b4748
Resolve merge conflicts
agitter Sep 22, 2023
af1b849
Update EGFR network with edge directions
agitter Sep 22, 2023
a34f832
added back graphspace to work for directed and undirected graphs only
ntalluri Sep 26, 2023
9b551d6
precommit
ntalluri Sep 26, 2023
b314751
automate test_prepare_inputs
ntalluri Sep 29, 2023
32be496
renamed the tests for creating the inputs
ntalluri Oct 4, 2023
57599f7
fix break in test
ntalluri Oct 4, 2023
e991e0a
added parse_output tests and still fixing generate inputs
ntalluri Oct 4, 2023
dd6c899
precommit
ntalluri Oct 4, 2023
ae276f5
added more information to step 5 of contributing guide
ntalluri Oct 4, 2023
df7da32
add cytoscape into workflow
ntalluri Oct 4, 2023
e5be5a0
cleaning up generate inputs/parse outputs test suites
ntalluri Oct 17, 2023
09c2913
clean up gen inputs and prase outputs
ntalluri Oct 17, 2023
8dd0990
Merge with master
agitter Oct 18, 2023
d9b0f0b
Fix ruff errors on GitHub actions
agitter Oct 18, 2023
6f631ef
made changes based on review
ntalluri Oct 26, 2023
f651376
made more changes based on review
ntalluri Oct 26, 2023
b764129
fixed error
ntalluri Oct 27, 2023
8eb9ebb
some of the comments
ntalluri Nov 27, 2023
5c0939f
more comments
ntalluri Nov 27, 2023
884fe4f
precommit
ntalluri Nov 27, 2023
19a4ad4
more comments
ntalluri Nov 27, 2023
99e6e6f
more comments resolved
ntalluri Nov 28, 2023
e8735ea
resolving parse output comments
ntalluri Nov 29, 2023
96c0482
precommit
ntalluri Nov 29, 2023
83262c7
add check to dataset.py
ntalluri Dec 1, 2023
7bc9e3e
Rename .csv to .txt in test directory
agitter Dec 3, 2023
d918c00
Add tests for invalid 4th edge column
agitter Dec 3, 2023
cd625d4
Remove self-edges from EGFR data
agitter Dec 3, 2023
ee58069
Update Cytoscape wrapper for directed edges
agitter Dec 3, 2023
da655b3
Remove directed from EGFR config and run more algs
agitter Dec 3, 2023
b4d6e1c
Systematic proofreading and formatting
agitter Dec 3, 2023
cccf5ce
Bump Cytoscape image version in workflow
agitter Dec 3, 2023
11 changes: 10 additions & 1 deletion CONTRIBUTING.md
@@ -188,6 +188,15 @@ Follow the example for any of the other pathway reconstruction algorithm.
First pull the image `<username>/local-neighborhood` from Docker Hub.
Then build the Docker image using the `Dockerfile` that was completed in Step 2.

+Modify Generate Inputs:
+1. Include a key-value pair in the algo_exp_file dictionary that links the specific algorithm to its expected network file.
+2. Create or obtain the expected network file and save it to `/test/GenerateInputs/expected`. Name it as: `{algorithm name}-{network file name}-expected.txt`.
+
+Modify Parse Outputs:
+1. Obtain the raw-pathway output (ex. from the run function in your wrapper by running the Snakemake workflow) and save it to `/test/ParseOutputs/input`. Name it as `{algorithm name}-raw-pathway.txt`.
+2. Create or obtain the expected universal output and save it to `test/ParseOutputs/expected` directory. Name it as: `{algorithm name}-pathway-expected.txt`.
+3. Add the new algorithm's name to the algorithms list in `test/ParseOutputs/test_parse_outputs.py`.
+
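The file-naming conventions in the Generate Inputs and Parse Outputs steps can be sketched in Python. This is a minimal illustration, not the code in the SPRAS test suite: the helper names, the `myalgorithm` algorithm name, and the `network` file stem are hypothetical. Only the directory names, the file-name patterns, and the `algo_exp_file` dictionary name come from the guide text.

```python
from pathlib import Path

# Directories named in the contributing guide.
GEN_INPUTS_EXPECTED = Path("test/GenerateInputs/expected")
PARSE_OUTPUTS_INPUT = Path("test/ParseOutputs/input")
PARSE_OUTPUTS_EXPECTED = Path("test/ParseOutputs/expected")


def expected_network_path(algorithm: str, network_file: str) -> Path:
    """Hypothetical helper: {algorithm name}-{network file name}-expected.txt"""
    return GEN_INPUTS_EXPECTED / f"{algorithm}-{network_file}-expected.txt"


def raw_pathway_path(algorithm: str) -> Path:
    """Hypothetical helper: {algorithm name}-raw-pathway.txt"""
    return PARSE_OUTPUTS_INPUT / f"{algorithm}-raw-pathway.txt"


def expected_pathway_path(algorithm: str) -> Path:
    """Hypothetical helper: {algorithm name}-pathway-expected.txt"""
    return PARSE_OUTPUTS_EXPECTED / f"{algorithm}-pathway-expected.txt"


# Assumed shape of the dictionary mentioned in step 1: algorithm name -> expected file.
algo_exp_file = {
    "myalgorithm": expected_network_path("myalgorithm", "network").name,
}
```

Keeping the patterns in small functions like these makes it harder for a test file to drift from the documented names, though the actual tests may simply hard-code the strings.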
### Step 6: Work with SPRAS maintainers to revise the pull request
Step 0 previously described how to create a `local-neighborhood` branch and create a pull request.
Make sure to commit all of the new and modified files and push them to the `local-neighborhood` branch on your fork.
@@ -205,7 +214,7 @@ The pull request will be closed so that the `master` branch of the fork stays sy
1. Import the new class in `src/runner.py` so the wrapper functions can be accessed
1. Document the usage of the Docker wrapper and the assumptions made when implementing the wrapper
1. Add example usage for the new algorithm and its parameters to the template config file
-1. Write test functions and provide example input data in a new test subdirectory `test/<algorithm>`
+1. Write test functions and provide example input data in a new test subdirectory `test/<algorithm>`. Provide example data and algorithm/expected files names to lists or dicts in `test/GenerateInputs` and `test/ParseOutputs`.
1. Extend `.github/workflows/test-spras.yml` to pull and build the new Docker image

When adding new algorithms, there are many other considerations that are not relevant with the simple Local Neighborhood example.
20 changes: 11 additions & 9 deletions Snakefile
@@ -60,7 +60,8 @@ def make_final_input(wildcards):
#TODO analysis could be parsed in the parse_config() function.
if config["analysis"]["summary"]["include"]:
# add summary output file for each pathway
-final_input.extend(expand('{out_dir}{sep}{dataset}-{algorithm_params}{sep}summary.txt',out_dir=out_dir,sep=SEP,dataset=dataset_labels,algorithm_params=algorithms_with_params))
+#TODO: reuse in the future once we make summary work for mixed graphs. See https://github.com/Reed-CompBio/spras/issues/128
+# final_input.extend(expand('{out_dir}{sep}{dataset}-{algorithm_params}{sep}summary.txt',out_dir=out_dir,sep=SEP,dataset=dataset_labels,algorithm_params=algorithms_with_params))
# add table summarizing all pathways for each dataset
final_input.extend(expand('{out_dir}{sep}{dataset}-pathway-summary.txt',out_dir=out_dir,sep=SEP,dataset=dataset_labels))

@@ -219,14 +220,15 @@ rule parse_output:
run:
runner.parse_output(wildcards.algorithm, input.raw_file, output.standardized_file)

+#TODO: reuse in the future once we make summary work for mixed graphs. See https://github.com/Reed-CompBio/spras/issues/128
# Collect summary statistics for a single pathway
-rule summarize_pathway:
-input:
-standardized_file = SEP.join([out_dir, '{dataset}-{algorithm}-{params}', 'pathway.txt'])
-output:
-summary_file = SEP.join([out_dir, '{dataset}-{algorithm}-{params}', 'summary.txt'])
-run:
-summary.run(input.standardized_file,output.summary_file,directed=algorithm_directed[wildcards.algorithm])
+# rule summarize_pathway:
+# input:
+# standardized_file = SEP.join([out_dir, '{dataset}-{algorithm}-{params}', 'pathway.txt'])
+# output:
+# summary_file = SEP.join([out_dir, '{dataset}-{algorithm}-{params}', 'summary.txt'])
+# run:
+# summary.run(input.standardized_file,output.summary_file)

# Write GraphSpace JSON graphs
rule viz_graphspace:
@@ -235,7 +237,7 @@ rule viz_graphspace:
graph_json = SEP.join([out_dir, '{dataset}-{algorithm}-{params}', 'gs.json']),
style_json = SEP.join([out_dir, '{dataset}-{algorithm}-{params}', 'gsstyle.json'])
run:
-graphspace.write_json(input.standardized_file,output.graph_json,output.style_json,directed=algorithm_directed[wildcards.algorithm])
+graphspace.write_json(input.standardized_file,output.graph_json,output.style_json)


# Write a Cytoscape session file with all pathways for each dataset
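The Snakefile changes drop the per-algorithm `directed=algorithm_directed[...]` argument, presumably because direction now travels with each edge in the standardized pathway file. The sketch below shows one way downstream code could interpret such a mixed graph. It assumes each row is `node1 node2 rank direction` with direction `U` (undirected) or `D` (directed); that row layout, the `D` value, and the function name `build_adjacency` are assumptions for illustration, not taken from this diff.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def build_adjacency(lines: Iterable[str]) -> Dict[str, Set[str]]:
    """Build successor sets from rows of 'node1 node2 rank direction' (assumed layout)."""
    adjacency: Dict[str, Set[str]] = defaultdict(set)
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        node1, node2, _rank, direction = fields
        if direction not in ("U", "D"):
            raise ValueError(f"Unexpected edge direction: {direction!r}")
        adjacency[node1].add(node2)
        if direction == "U":
            # An undirected edge can be traversed both ways.
            adjacency[node2].add(node1)
        else:
            # Make sure the target of a directed edge still appears as a node.
            adjacency[node2]
    return dict(adjacency)
```

Because every edge carries its own flag, a single function like this handles fully directed, fully undirected, and mixed pathways without a separate `directed` parameter.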
7 changes: 0 additions & 7 deletions config/config.yaml
@@ -29,14 +29,12 @@
 - name: "pathlinker"
 params:
 include: true
-directed: true
 run1:
 k: range(100,201,100)

 - name: "omicsintegrator1"
 params:
 include: true
-directed: false
 run1:
 r: [5]
 b: [5, 6]
@@ -47,7 +45,6 @@
 - name: "omicsintegrator2"
 params:
 include: true
-directed: false
 run1:
 b: [4]
 g: [0]
@@ -58,7 +55,6 @@
 - name: "meo"
 params:
 include: true
-directed: true
 run1:
 max_path_length: [3]
 local_search: ["Yes"]
@@ -67,20 +63,17 @@
 - name: "mincostflow"
 params:
 include: true
-directed: false
 run1:
 flow: [1] # The flow must be an int
 capacity: [1]

 - name: "allpairs"
 params:
 include: true
-directed: false

 - name: "domino"
 params:
 include: true
-directed: false
 run1:
 slice_threshold: [0.3]
 module_threshold: [0.05]
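Since `directed` is removed from every algorithm's `params`, config files written before this change will still carry the key. The following is only a sketch of how a loader could tolerate such files; it is not necessarily what this pull request implements. It assumes the algorithm list has already been loaded into plain Python dicts, and `drop_legacy_directed` is a hypothetical name.

```python
import copy
from typing import Any, Dict, List, Tuple


def drop_legacy_directed(
    algorithms: List[Dict[str, Any]],
) -> Tuple[List[Dict[str, Any]], List[str]]:
    """Return a copy of the algorithm list without 'directed' keys,
    plus the names of the algorithms that still had one."""
    cleaned = copy.deepcopy(algorithms)  # leave the caller's config untouched
    removed_from: List[str] = []
    for algorithm in cleaned:
        params = algorithm.get("params", {})
        if "directed" in params:
            del params["directed"]
            removed_from.append(algorithm["name"])
    return cleaned, removed_from


old_style = [
    {"name": "pathlinker",
     "params": {"include": True, "directed": True, "run1": {"k": "range(100,201,100)"}}},
    {"name": "allpairs", "params": {"include": True}},
]
cleaned, removed_from = drop_legacy_directed(old_style)
```

Returning the affected names lets the caller decide whether to warn the user or silently ignore the stale key.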
2 changes: 1 addition & 1 deletion docker-wrappers/Cytoscape/cytoscape_util.py
@@ -115,7 +115,7 @@ def load_pathways(pathways: List[str], output: str) -> None:
path, name = parse_name(pathway)
suid = p4c.networks.import_network_from_tabular_file(
file=path,
-column_type_list='s,t,x',
+column_type_list='s,t,x,ea',
delimiters='\t'
)
p4c.networks.rename_network(name, network=suid)
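The `column_type_list` argument uses py4cytoscape's short column-type codes, one per column of the tabular file. The small lookup below spells out what the old and new values mean; it runs without Cytoscape. The helper `describe_columns` is hypothetical, and reading the new fourth column as the edge direction is an inference from the other files in this pull request rather than something stated in this hunk.

```python
from typing import List

# Short codes accepted by py4cytoscape's import_network_from_tabular_file.
COLUMN_TYPE_CODES = {
    "s": "source node",
    "t": "target node",
    "i": "interaction type",
    "sa": "source node attribute",
    "ta": "target node attribute",
    "ea": "edge attribute",
    "x": "skipped column",
}


def describe_columns(column_type_list: str) -> List[str]:
    """Translate a comma-separated code list into readable column roles."""
    return [COLUMN_TYPE_CODES[code] for code in column_type_list.split(",")]


before = describe_columns("s,t,x")
after = describe_columns("s,t,x,ea")
```

With the added `ea` code, the fourth column is imported as an edge attribute instead of being left out of the column map.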
2 changes: 1 addition & 1 deletion input/README.md
@@ -46,7 +46,7 @@ The following files are very small toy datasets used to illustrate the supported
This dataset represents protein phosphorylation changes in response to epidermal growth factor (EGF) treatment.
The network includes protein-protein interactions from [iRefIndex](http://irefindex.org/) and kinase-substrate interactions from [PhosphoSitePlus](http://www.phosphosite.org/).
The files are originally from the [Temporal Pathway Synthesizer (TPS)](https://github.com/koksal/tps) repository.
-They have been lightly modified for SPRAS by lowering one edge weight that was greater than 1, removing a PSEUDONODE prize, adding a prize of 10.0 to EGF_HUMAN, and converting all edges to undirected edges.
+They have been lightly modified for SPRAS by lowering one edge weight that was greater than 1, removing a PSEUDONODE prize, and adding a prize of 10.0 to EGF_HUMAN.
The only source is EGF_HUMAN.
All proteins with phosphorylation-based prizes are also labeled as targets.
All nodes are considered active.
18 changes: 9 additions & 9 deletions input/alternative-network.txt
@@ -1,9 +1,9 @@
-A B 0.98
-B C 0.77
-A D 0.12
-C D 0.89
-C E 0.59
-C F 0.50
-F G 0.76
-G H 0.92
-G I 0.66
+A B 0.98 U
+B C 0.77 U
+A D 0.12 U
+C D 0.89 U
+C E 0.59 U
+C F 0.50 U
+F G 0.76 U
+G H 0.92 U
+G I 0.66 U
4 changes: 2 additions & 2 deletions input/network.txt
@@ -1,2 +1,2 @@
-A B 0.98
-B C 0.77
+A B 0.98 U
+B C 0.77 U
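Both toy networks gain a fourth column marking each edge as `U`. A minimal sketch of reading and validating that format follows. It is not the check added to `dataset.py`: the function name `parse_edges` is hypothetical, `D` as the directed marker is an assumption (only `U` appears in these files), and rows are split on any whitespace because the original delimiter is not visible here.

```python
from typing import Iterable, List, Tuple

VALID_DIRECTIONS = {"U", "D"}  # "D" is assumed; only "U" appears in the toy files

Edge = Tuple[str, str, float, str]


def parse_edges(lines: Iterable[str]) -> List[Edge]:
    """Parse rows of 'node1 node2 weight direction' and reject malformed rows."""
    edges: List[Edge] = []
    for line_number, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        if len(fields) != 4:
            raise ValueError(f"line {line_number}: expected 4 columns, found {len(fields)}")
        node1, node2, weight, direction = fields
        if direction not in VALID_DIRECTIONS:
            raise ValueError(f"line {line_number}: invalid direction {direction!r}")
        edges.append((node1, node2, float(weight), direction))
    return edges


toy_network = ["A B 0.98 U", "B C 0.77 U"]
edges = parse_edges(toy_network)
```

Rejecting three-column rows outright means a network file in the old format fails loudly instead of being silently treated as undirected.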