Adding Directionality #120

Merged
merged 50 commits into from
Dec 3, 2023
af6a590
documented research per algorithm as docstrings per class for the alg…
ntalluri Aug 23, 2023
3f173b5
added directionality for generate inputs
ntalluri Aug 29, 2023
a214cc8
Merge branch 'master' of github.com:ntalluri/spras into direction
ntalluri Aug 30, 2023
1e7fef9
added in parse_output directionality
ntalluri Aug 30, 2023
4fdaadb
removed directed from config file, testing with analysis all false
ntalluri Aug 30, 2023
cae057a
made updates to code and attempted to add testing for interactome
ntalluri Sep 4, 2023
319a5d3
precommit formatting
ntalluri Sep 5, 2023
bb55ee0
cleaned up code and finished interactome test
ntalluri Sep 5, 2023
7aed4ee
updated util to deal with the idea if someone is using an old config …
ntalluri Sep 5, 2023
0e15534
current ml repairs
ntalluri Sep 5, 2023
75be8b8
ml post processing pre-commit and config file
ntalluri Sep 6, 2023
550614a
fixed testing
ntalluri Sep 12, 2023
760b1b7
Merge branch 'master' into direction
ntalluri Sep 18, 2023
9e29ceb
updated summary.py/associated files and tests. updated interactome.py
ntalluri Sep 18, 2023
0b47bd9
Merge branch 'direction' of github.com:ntalluri/spras into direction
ntalluri Sep 18, 2023
713279a
pre-commit
ntalluri Sep 18, 2023
1198c9a
added generate inputs test, cleaned up code
ntalluri Sep 19, 2023
39b4748
Resolve merge conflicts
agitter Sep 22, 2023
af1b849
Update EGFR network with edge directions
agitter Sep 22, 2023
a34f832
added back graphspace to work for directed and undirected graphs only
ntalluri Sep 26, 2023
9b551d6
precommit
ntalluri Sep 26, 2023
b314751
automate test_prepare_inputs
ntalluri Sep 29, 2023
32be496
renamed the tests for creating the inputs
ntalluri Oct 4, 2023
57599f7
fix break in test
ntalluri Oct 4, 2023
e991e0a
added parse_output tests and still fixing generate inputs
ntalluri Oct 4, 2023
dd6c899
precommit
ntalluri Oct 4, 2023
ae276f5
added more information to step 5 of contributing guide
ntalluri Oct 4, 2023
df7da32
add cytoscape into workflow
ntalluri Oct 4, 2023
e5be5a0
cleaning up generate inputs/parse outputs test suites
ntalluri Oct 17, 2023
09c2913
clean up gen inputs and prase outputs
ntalluri Oct 17, 2023
8dd0990
Merge with master
agitter Oct 18, 2023
d9b0f0b
Fix ruff errors on GitHub actions
agitter Oct 18, 2023
6f631ef
made changes based on review
ntalluri Oct 26, 2023
f651376
made more changes based on review
ntalluri Oct 26, 2023
b764129
fixed error
ntalluri Oct 27, 2023
8eb9ebb
some of the comments
ntalluri Nov 27, 2023
5c0939f
more comments
ntalluri Nov 27, 2023
884fe4f
precommit
ntalluri Nov 27, 2023
19a4ad4
more comments
ntalluri Nov 27, 2023
99e6e6f
more comments resolved
ntalluri Nov 28, 2023
e8735ea
resolving parse output comments
ntalluri Nov 29, 2023
96c0482
precommit
ntalluri Nov 29, 2023
83262c7
add check to dataset.py
ntalluri Dec 1, 2023
7bc9e3e
Rename .csv to .txt in test directory
agitter Dec 3, 2023
d918c00
Add tests for invalid 4th edge column
agitter Dec 3, 2023
cd625d4
Remove self-edges from EGFR data
agitter Dec 3, 2023
ee58069
Update Cytoscape wrapper for directed edges
agitter Dec 3, 2023
da655b3
Remove directed from EGFR config and run more algs
agitter Dec 3, 2023
b4d6e1c
Systematic proofreading and formatting
agitter Dec 3, 2023
cccf5ce
Bump Cytoscape image version in workflow
agitter Dec 3, 2023
21 changes: 6 additions & 15 deletions CONTRIBUTING.md
@@ -189,22 +189,13 @@ First pull the image `<username>/local-neighborhood` from Docker Hub.
Then build the Docker image using the `Dockerfile` that was completed in Step 2.

Modify Generate Inputs:

Purpose: Update the references and expected files for the new algorithm.

Instructions:
1. Append the algorithm name to the existing algorithms list.
2. Include a key-value pair in the algo_exp_file dictionary that links the specific algorithm to its expected network file.
3. Store the expected network file at /test/GenerateInputs/expected using the naming convention: {algorithm name}-{network file name}-expected.txt.
1. Include a key-value pair in the algo_exp_file dictionary that links the specific algorithm to its expected network file.
2. Create or obtain the expected network file and save it to `/test/GenerateInputs/expected`. Name it as: `{algorithm name}-{network file name}-expected.txt`.

Modify Parse Outputs:

Purpose: Handle and verify the outputs of the new algorithm.

Instructions:
1. Save the raw-pathway output (from the run function in your wrapper) to /test/ParseOutputs/input. Name it as {algorithm name}-raw-pathway.txt.
2. Place the expected parsed output in the test/ParseOutputs/expected directory, following the naming pattern: {algorithm name}-pathway-expected.txt.
3. Add the new algorithm's name to the algorithms list.
1. Obtain the raw-pathway output (e.g., from the run function in your wrapper, by running the Snakemake workflow) and save it to `/test/ParseOutputs/input`. Name it as `{algorithm name}-raw-pathway.txt`.
2. Create or obtain the expected universal output and save it to the `test/ParseOutputs/expected` directory. Name it as: `{algorithm name}-pathway-expected.txt`.
3. Add the new algorithm's name to the algorithms list in `test/ParseOutputs/test_parse_outputs.py`.
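
The file naming conventions in the Generate Inputs and Parse Outputs instructions above can be sketched as small path helpers. This is only an illustration: the helper names (`expected_network_file`, `raw_pathway_file`, `expected_pathway_file`), the example algorithm name, and the example network file name are hypothetical and are not part of SPRAS. Paths are written relative to the repository root.

```python
from pathlib import Path

# Directories named in the contributing guide, relative to the repository root.
GENERATE_INPUTS_EXPECTED = Path("test/GenerateInputs/expected")
PARSE_OUTPUTS_INPUT = Path("test/ParseOutputs/input")
PARSE_OUTPUTS_EXPECTED = Path("test/ParseOutputs/expected")


def expected_network_file(algorithm: str, network_file_name: str) -> Path:
    """Hypothetical helper: {algorithm name}-{network file name}-expected.txt"""
    return GENERATE_INPUTS_EXPECTED / f"{algorithm}-{network_file_name}-expected.txt"


def raw_pathway_file(algorithm: str) -> Path:
    """Hypothetical helper: {algorithm name}-raw-pathway.txt"""
    return PARSE_OUTPUTS_INPUT / f"{algorithm}-raw-pathway.txt"


def expected_pathway_file(algorithm: str) -> Path:
    """Hypothetical helper: {algorithm name}-pathway-expected.txt"""
    return PARSE_OUTPUTS_EXPECTED / f"{algorithm}-pathway-expected.txt"


# Assumed example values, for illustration only.
algorithm = "localneighborhood"
print(expected_network_file(algorithm, "network").as_posix())
print(raw_pathway_file(algorithm).as_posix())
print(expected_pathway_file(algorithm).as_posix())
```

Deriving the three paths from the algorithm name in one place makes it easier to spot a test file that was saved under the wrong name or in the wrong directory.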

### Step 6: Work with SPRAS maintainers to revise the pull request
Step 0 previously described how to create a `local-neighborhood` branch and create a pull request.
@@ -223,7 +214,7 @@ The pull request will be closed so that the `master` branch of the fork stays sy
1. Import the new class in `src/runner.py` so the wrapper functions can be accessed
1. Document the usage of the Docker wrapper and the assumptions made when implementing the wrapper
1. Add example usage for the new algorithm and its parameters to the template config file
1. Write test functions and provide example input data in a new test subdirectory `test/<algorithm>`
1. Write test functions and provide example input data in a new test subdirectory `test/<algorithm>`. Add the example data and the algorithm and expected file names to the lists or dicts in `test/GenerateInputs` and `test/ParseOutputs`.
1. Extend `.github/workflows/test-spras.yml` to pull and build the new Docker image

When adding new algorithms, there are many other considerations that are not relevant with the simple Local Neighborhood example.
29 changes: 21 additions & 8 deletions spras/analysis/graphspace.py
@@ -8,10 +8,8 @@
from graphspace_python.graphs.classes.gsgraph import GSGraph


# remove all the directed = bool in the function names
def write_json(graph_file,out_graph,out_style) -> None:

print("graph file", graph_file)
# get GS Graph
graph_name = os.path.basename(out_graph) # name is the prefix specified.
G = get_gs_graph(graph_file, graph_name)
@@ -40,8 +38,14 @@ def post_graph(G:GSGraph,username:str,password:str) -> None:
return

def get_gs_graph(graph_file:str,graph_name:str) -> GSGraph:
"""
Creates a GraphSpace graph using the networkx graph or digraph and directionality returned from load_graph

If the graph is empty, that likely means the graph was mixed in directionality
- this will return an empty GraphSpace graph
"""
# read file as networkx graph
# returns a tuple, the graph and directionality
# returns a tuple, (the graph, directionality)
nxG, directed = load_graph(graph_file)

# convert networkx graph to GraphSpace object
@@ -52,15 +56,21 @@ def get_gs_graph(graph_file:str,graph_name:str) -> GSGraph:
G.add_node_style(n,color='#ACCE9A',shape='rectangle',width=30,height=30)
for u,v in nxG.edges():
if directed:
G.add_edge(u,v,directed=True,popup='Directed Edge %s-%s<br>Rank %d' % (u,v,nxG[u][v]['rank']))
G.add_edge(u,v,directed=True,popup='Directed Edge %s-%s<br>Rank %d' % (u,v,nxG[u][v]['Rank']))
G.add_edge_style(u,v,directed=True,width=2,color='#281D6A')
else:
G.add_edge(u,v,popup='Undirected Edge %s-%s<br>Rank %d' % (u,v,nxG[u][v]['rank']))
G.add_edge(u,v,popup='Undirected Edge %s-%s<br>Rank %d' % (u,v,nxG[u][v]['Rank']))
G.add_edge_style(u,v,width=2,color='#281D6A')
return G


def load_graph(path: str) -> nx.Graph:
"""
Returns a Graph or Digraph, accompanied by a boolean indicating if it's directed.

This code is compatible only with fully directed or undirected graphs.
If neither, an empty Graph will be returned
"""
G = nx.Graph()
directed = False

@@ -72,14 +82,17 @@ def load_graph(path: str) -> nx.Graph:
pathways.columns = ["Interactor1", "Interactor2", "Rank", "Direction"]
mask_u = pathways['Direction'] == 'U'
mask_d = pathways['Direction'] == 'D'
pathways = pathways.drop(columns=["Direction"])

if mask_u.all():
G = nx.read_edgelist(path,data=(('rank',float), ('Direction',str)))
G = nx.from_pandas_edgelist(pathways, "Interactor1", "Interactor2", ["Rank"])
directed = False

elif mask_d.all():
G = nx.read_edgelist(path,data=(('rank',float),('Direction',str)), create_using=nx.DiGraph)
G = nx.from_pandas_edgelist(pathways, "Interactor1", "Interactor2", ["Rank"], create_using=nx.DiGraph())
directed = True
else:
print("graphspace does not deal with mixed direction type graphs currently")
print(f"{path} could not be visualized. GraphSpace does not deal with mixed direction type graphs currently")


return G, directed
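
The all-undirected / all-directed / mixed decision that `load_graph` makes with `mask_u.all()` and `mask_d.all()` can be sketched without pandas or networkx. This is a standalone illustration, not the SPRAS implementation: `read_pathway` and `classify_direction` are hypothetical names, and the input is assumed to be headerless, tab-separated text with four columns (two nodes, a rank, and a `U` or `D` direction).

```python
import csv
from io import StringIO


def read_pathway(text):
    """Parse tab-separated pathway text into a list of rows, skipping blank lines."""
    return [row for row in csv.reader(StringIO(text), delimiter="\t") if row]


def classify_direction(rows):
    """Return 'undirected', 'directed', or 'mixed' from the fourth column of each row.

    An input with no rows is treated as undirected in this sketch.
    """
    directions = {row[3] for row in rows}
    if directions <= {"U"}:
        return "undirected"
    if directions <= {"D"}:
        return "directed"
    return "mixed"


directed_rows = read_pathway("A\tB\t1\tD\nB\tC\t2\tD\n")
mixed_rows = read_pathway("A\tB\t1\tD\nB\tC\t2\tU\n")
print(classify_direction(directed_rows))
print(classify_direction(mixed_rows))
```

A caller would build a directed graph for the first result, an undirected graph for `'undirected'`, and fall back to an empty graph with a warning for `'mixed'`, mirroring the three branches in the diff.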
9 changes: 7 additions & 2 deletions spras/analysis/ml.py
@@ -49,7 +49,7 @@ def summarize_networks(file_paths: Iterable[Union[str, PathLike]]) -> pd.DataFra
node2 = parts[1]
direction = str(parts[3]).strip()
if direction == "U":
edges.append(UNDIR_CONST.join([node1, node2]))
edges.append(UNDIR_CONST.join(sorted([node1, node2])))
elif direction == "D":
edges.append(DIR_CONST.join([node1, node2]))
else:
@@ -308,5 +308,10 @@ def ensemble_network(dataframe: pd.DataFrame, output_file: str):
"""
row_means = dataframe.mean(axis=1, numeric_only=True).reset_index()
row_means.columns = ['Edges', 'Frequency']

row_means['Direction'] = row_means['Edges'].apply(lambda edge: 'D' if '-->' in edge else 'U')
row_means['Node1'] = row_means['Edges'].apply(lambda edge: edge.split('-->')[0] if '-->' in edge else edge.split('---')[0])
row_means['Node2'] = row_means['Edges'].apply(lambda edge: edge.split('-->')[1] if '-->' in edge else edge.split('---')[1])

make_required_dirs(output_file)
row_means.to_csv(output_file, sep='\t', index=False, header=True)
row_means[['Node1', 'Node2', 'Frequency', "Direction"]].to_csv(output_file, sep='\t', index=False, header=True)
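
The two changes to `ml.py` fit together: undirected edges are keyed with their endpoints sorted, so `node5 node4 U` and `node4 node5 U` collapse into one row, and the key is later split back into `Node1`, `Node2`, and `Direction`. A minimal sketch of that round trip follows. The separator values are assumptions inferred from the expected test files (`A---B`, `A-->B`), `edge_key` and `split_edge_key` are hypothetical helper names, and raising `ValueError` for an unknown direction is an assumption because the original `else` branch is not shown.

```python
UNDIR_CONST = "---"  # assumed value, inferred from keys such as "A---B"
DIR_CONST = "-->"    # assumed value, inferred from keys such as "A-->B"


def edge_key(node1, node2, direction):
    """Build a canonical edge key; undirected endpoints are sorted."""
    if direction == "U":
        return UNDIR_CONST.join(sorted([node1, node2]))
    if direction == "D":
        return DIR_CONST.join([node1, node2])
    # Assumption: the handling of other values is not visible in the diff.
    raise ValueError(f"unknown direction: {direction}")


def split_edge_key(key):
    """Split a key back into (node1, node2, direction)."""
    if DIR_CONST in key:
        node1, node2 = key.split(DIR_CONST)
        return node1, node2, "D"
    node1, node2 = key.split(UNDIR_CONST)
    return node1, node2, "U"


# Both orientations of an undirected edge map to the same key.
print(edge_key("node5", "node4", "U"))
print(edge_key("node4", "node5", "U"))
# Directed edges keep their orientation.
print(edge_key("B", "A", "D"))
print(split_edge_key("A-->B"))
```

Splitting on the separator assumes that node names never contain `---` or `-->`. A name that does would need a different encoding.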
8 changes: 3 additions & 5 deletions test/ml/expected/expected-dataframe.csv
@@ -8,11 +8,9 @@ O---P,0,0,1,0,0,0,1,0
P---Q,0,0,1,0,0,0,0,0
node1---node2,0,0,0,1,0,0,0,0
node1---node3,0,0,0,1,1,0,0,0
node4---node5,0,0,0,1,0,0,0,0
TEST---LONGERNAMES,0,0,0,1,0,0,0,0
node3---node2,0,0,0,0,1,0,0,0
node5---node4,0,0,0,0,1,0,0,0
LONGERNAMES---TEST,0,0,0,0,1,0,0,0
node4---node5,0,0,0,1,1,0,0,0
LONGERNAMES---TEST,0,0,0,1,1,0,0,0
node2---node3,0,0,0,0,1,0,0,0
nodes with---spaces in name,0,0,0,0,0,0,1,0
A-->B,0,0,0,0,0,0,0,1
B-->A,0,0,0,0,0,0,0,1
20 changes: 10 additions & 10 deletions test/ml/expected/expected-ensemble-network.tsv
@@ -1,10 +1,10 @@
Edges Frequency
A---B 0.5
C---D 0.75
E---F 0.75
L---M 0.5
M---N 0.25
O---P 0.25
P---Q 0.25
A-->B 0.25
B-->A 0.25
Node1 Node2 Frequency Direction
A B 0.5 U
C D 0.75 U
E F 0.75 U
L M 0.5 U
M N 0.25 U
O P 0.25 U
P Q 0.25 U
A B 0.25 D
B A 0.25 D