Add RWR pathway reconstruction algorithm #148

Lyce24 · 2024-02-21T01:01:25Z

This pull request replaces the RWR part of pull request #92. The changes adapt the code to the updated, directionality-aware version of SPRAS.

annaritz

Some clarifications would be helpful, especially about:

- The logic of the sources/targets/prize files
- What happens if there are no sources (or no targets in source-target mode)
- Whether you want separate output files or want to keep all outputs in a single file

I will also test these changes locally.

annaritz · 2024-03-06T19:29:14Z

spras/rwr.py

+__all__ = ['RWR']
+
+"""
+RWR will construct a directed graph from the provided input file


It's good to note that RWR assumes a directed graph. "From the provided input file" is vague, especially if the user can input a network file and a node prizes file.
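To make the directionality point concrete, here is a minimal sketch of a textbook random walk with restart on a directed edge list. It is not the implementation in this PR: the function name, the `(tail, head)` tuple convention (walker moves tail to head), the `restart` default, and the handling of dead-end nodes are all assumptions for illustration.

```python
def random_walk_with_restart(edges, sources, restart=0.15, iterations=100):
    """Generic RWR by power iteration on a directed graph (illustrative only).

    edges: iterable of (tail, head) pairs; the walker moves tail -> head.
    sources: non-empty collection of node IDs the walker restarts from.
    """
    sources = set(sources)
    if not sources:
        raise ValueError("RWR needs at least one source node")
    nodes = sorted({n for edge in edges for n in edge} | sources)
    out_neighbors = {n: [] for n in nodes}
    for tail, head in edges:
        out_neighbors[tail].append(head)

    # Restart distribution: uniform over the sources, zero elsewhere.
    restart_vec = {n: (1.0 / len(sources) if n in sources else 0.0) for n in nodes}
    scores = dict(restart_vec)
    for _ in range(iterations):
        nxt = {n: restart * restart_vec[n] for n in nodes}
        for n in nodes:
            walk_mass = (1.0 - restart) * scores[n]
            if out_neighbors[n]:
                share = walk_mass / len(out_neighbors[n])
                for head in out_neighbors[n]:
                    nxt[head] += share
            else:
                # Assumption: a dead-end node sends its mass back to the sources.
                for s in nodes:
                    nxt[s] += walk_mass * restart_vec[s]
        scores = nxt
    return scores


# C points at A, but nothing points at C, so C is never visited from source A.
scores = random_walk_with_restart([("A", "B"), ("C", "A")], {"A"})
print(scores)
```

In an undirected reading of the same edges, C would receive a nonzero score, which is why the docstring should state the directed assumption up front.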

annaritz · 2024-03-06T19:29:48Z

spras/rwr.py

+- an edge is represented with a head and tail node, which represents the direction of the interation between two nodes
+- uses networkx Digraph() object
+
+Expected raw input format:


What does "edge flux" mean in the input file requirements? And this describes the network file, correct?

annaritz · 2024-03-06T19:30:35Z

spras/rwr.py

+- the 'type' column should be 1 for edges, 2 for nodes, and 3 for pathways as we want to keep information about nodes, edges, and pathways.
+- it can include repeated and bidirectional edges
+
+Expected raw input format for prizes:


Is the prizes file optional, or is it required and your code will generate a stub file with all nodes set to 1.0? Also, I think the description here is for the network file, not the node file.

This file should be renamed to focus on sources/targets, with prizes as a secondary attribute. RWR requires at least sources to run, even if there are no targets or prizes, right?

annaritz · 2024-03-06T19:40:11Z

spras/rwr.py

+        sources_targets = data.request_node_columns(["sources", "targets"])
+        if sources_targets is None:
+            if data.contains_node_columns('prize'):
+                sources_targets = data.request_node_columns(['prize'])


Here you are assuming that if there is NO sources file but there is a prize file, all the nodes with a prize are sources, right? Add comments to clearly describe the logic here.
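Something like the following sketch is the level of commenting I have in mind. It uses plain sets and dicts rather than the `data.request_node_columns` calls in the diff, the name `resolve_seed_nodes` is hypothetical, and raising an error when targets are given without sources is only one possible choice, not necessarily the behavior you want.

```python
def resolve_seed_nodes(sources, targets, prizes):
    """Decide which nodes act as sources (illustrative sketch).

    sources, targets: sets of node IDs, possibly empty.
    prizes: dict mapping node ID -> prize, possibly empty.
    """
    if sources:
        # Explicit sources always take priority over prizes.
        return set(sources)
    if targets:
        # Targets without sources: fail loudly instead of guessing.
        raise ValueError("targets were provided without any sources")
    if prizes:
        # No sources or targets: treat every node with a prize as a source.
        return set(prizes)
    raise ValueError("no sources, targets, or prizes were provided")
```

Spelling out each branch this way also documents which combinations of input files are considered invalid.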

annaritz · 2024-03-06T19:40:50Z

spras/rwr.py

+                raise ValueError(f"{input_type} filename is missing")
+
+        sources_targets = data.request_node_columns(["sources", "targets"])
+        if sources_targets is None:


What happens if there are targets but no sources?

annaritz · 2024-03-06T19:41:26Z

spras/rwr.py

+                node_df = data.request_node_columns(['prize'])
+                input_df = pd.merge(input_df, node_df, on='NODEID')
+            else:
+                #If there aren't prizes but are sources and targets, make prizes based on them


Minor - rephrase to be "If there aren't prizes but there are sources and targets, set their prize to be 1.0"
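As a rough illustration of what that comment describes (a plain-Python sketch, not the pandas code in the diff; `default_prizes` is a hypothetical helper):

```python
def default_prizes(sources, targets, default=1.0):
    """Give every source and target the same prize when no prizes are supplied."""
    return {node: default for node in sorted(set(sources) | set(targets))}


# A node that is both a source and a target appears only once.
print(default_prizes(["A", "B"], ["B", "C"]))
```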

annaritz · 2024-03-06T19:43:08Z

spras/rwr.py

+
+    # Skips parameter validation step
+    @staticmethod
+    def run(edges=None, prizes = None, output_file = None, single_source = None, df = None, w = None, f = None, threshold = None, container_framework="docker"):


Should single_source be a Boolean?
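If it is meant to be a Boolean, a small check along these lines would make that explicit. This is only a sketch: `check_single_source` is a hypothetical helper, and whether `None` should remain an accepted default is up to you.

```python
def check_single_source(single_source):
    """Reject anything that is not a real bool (illustrative sketch)."""
    if not isinstance(single_source, bool):
        raise TypeError(
            f"single_source must be True or False, got {single_source!r}"
        )
    return single_source
```

Note that `isinstance(1, bool)` is `False`, so integers and strings such as `"1"` are rejected rather than silently coerced.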

annaritz · 2024-03-06T19:47:33Z

spras/rwr.py

+        df_node = df_node.rename(columns={'Node1': 'Node', 'Node2': 'Pr', 'Edge Flux': 'R_Pr', 'Weight': 'Final_Pr', 'InNetwork' : 'InNetwork'})
+        df_node.to_csv(node_output_file, sep="\t", index=False, header=True)
+
+        df_pathway = df.loc[df['Type'] == 3]


I'm confused about what Type 3 is. Is this the subgraph after applying the selection function and threshold? A few more comments here (or at the top) about these output types would be helpful.
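For reference, this is how I currently read the three types, based on the docstring (1 for edges, 2 for nodes, 3 for pathways). The sketch below uses the standard `csv` module instead of pandas so it runs standalone; the sample rows and the `split_by_type` helper are made up for illustration, and the meaning attached to Type 3 is exactly the part I would like confirmed.

```python
import csv
import io

# Assumed meaning of the 'Type' codes, taken from the docstring in this PR.
TYPE_LABELS = {"1": "edges", "2": "nodes", "3": "pathway"}


def split_by_type(tsv_text):
    """Group the rows of a combined tab-separated output by their Type code."""
    groups = {label: [] for label in TYPE_LABELS.values()}
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        groups[TYPE_LABELS[row["Type"]]].append(row)
    return groups


# Hypothetical combined output: node rows reuse the Node1/Node2 columns.
sample = (
    "Node1\tNode2\tType\n"
    "A\tB\t1\n"
    "A\t0.6\t2\n"
    "B\t0.4\t2\n"
    "A\tB\t3\n"
)
groups = split_by_type(sample)
print({kind: len(rows) for kind, rows in groups.items()})
```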

annaritz · 2024-03-06T19:49:11Z

test/RWR/test_rwr.py

+
+TEST_DIR = 'test/RWR/'
+OUT_FILE_DEFAULT = TEST_DIR+'output/rwr-edges.txt'
+OUT_FILE_OPTIONAL = TEST_DIR+'output/rwr-edges-optional.txt'


Rename OPTIONAL to OPTIONS or OPTIONAL_ARGS - I think that's what you mean (the optional arguments)

annaritz · 2024-03-06T19:51:34Z

spras/rwr.py

+
+
+    @staticmethod
+    def parse_output(raw_pathway_file, standardized_pathway_file):


I now remember that we talked about how your RWR outputs everything as a single file, so you can parse pieces of it in parse_output(). Would you rather output three separate output files? That would be cleaner conceptually. We can ask the group about how to parse multiple output files.
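If you go the separate-files route, the writing side could look roughly like this sketch. The `write_separate_outputs` helper and the `rwr-<kind>.txt` naming are assumptions for illustration, not an existing SPRAS convention.

```python
import csv
from pathlib import Path


def write_separate_outputs(rows_by_kind, out_dir):
    """Write one tab-separated file per output kind and return the paths.

    rows_by_kind: dict such as {"edges": [...], "nodes": [...], "pathway": [...]},
    where each value is a list of dicts sharing the same keys.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = {}
    for kind, rows in rows_by_kind.items():
        path = out_dir / f"rwr-{kind}.txt"
        with path.open("w", newline="") as handle:
            if rows:  # an empty kind still produces an (empty) file
                writer = csv.DictWriter(
                    handle, fieldnames=list(rows[0]), delimiter="\t"
                )
                writer.writeheader()
                writer.writerows(rows)
        paths[kind] = path
    return paths
```

With this layout, `parse_output()` would only need to read the pathway file, and the node and edge scores would stay available for anyone who wants them.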

Erik_L added 3 commits February 18, 2024 05:19

Initial changes done.

928ad3f

Finishing writing test_rwr

85f8da2

Finishing up with the updated Docker Image

a288fc5

annaritz self-requested a review March 6, 2024 19:18

annaritz requested changes Mar 6, 2024
