RWR and TieDIE Integration #92
Closed
26 commits
ecec5b6
Create Docker Image for random-walk
34d3ef3
Implementing src/random_walk.py (questions about the output files)
aff0893
Updating Dockerfile and src/random_walk.py (new image and single raw_…
5755c62
Make RWWR working (change Dockerfile for current random-walk-with-res…
13dd12a
Make the program running (writing the tests); clean up the codes; upd…
2e6aaa5
Initiate dockerizing TieDIE repo
c214002
Dockerize TieDIE and integrate it into SPRAS
78036de
Fixing the format of TieDIE output pathway; need to consider RWwR
45d5733
Update RandomWalk and TieDIE (add user-defined threshold for RWwR and…
fafbff9
Complete the tests for RWR and TieDIE (test and GitHub actions); add …
c958aa3
Merge branch 'master' into RWR_and_TieDIE
agitter fe863e0
Fixing pre-commit hooks and updating the GitHub Action
62fc246
Fixing the GitHub Action
becf7bc
Updating the Docker Image for RWR and TieDIE
b7d2c94
Completing changes requested
8e94cf3
Renaming random_walk to rwr
f467db1
Fix a minor bug in workflows/test-spras.yaml (RandomWalk to RWR)
7833ea5
Resolved some issues suggested by Neha (updating configfile; adding p…
f7f2ba4
Merge branch 'Reed-CompBio:master' into RWR_and_TieDIE
Lyce24 53faeb0
Resolved some issues suggested by Tony (pushed the ReedCompBio Docker…
b7f007f
Merge branch 'RWR_and_TieDIE' of https://github.com/Lyce24/spras into…
269c14e
Pre-commit check
dd75af8
Merge branch 'Reed-CompBio:master' into RWR_and_TieDIE
Lyce24 1f5a233
Added single source mode to RWR, updated the input file format for RW…
b2d6a8e
Updated tests for RWR
4e25538
Minor updates for RWR (param updates; test updates)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
FROM python:3.10.7 | ||
|
||
WORKDIR /RandomWalk | ||
|
||
# installing essential packages | ||
RUN pip install networkx==2.8 | ||
RUN pip install numpy==1.24.3 | ||
RUN pip install scipy==1.10.1 | ||
|
||
RUN wget https://raw.githubusercontent.com/Reed-CompBio/random-walk-with-restart/ef6bd61e0c866c13205ae94c1301827817dc1abb/random_walk.py |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,29 @@ | ||
# RWwR Docker image | ||
|
||
A Docker image for the random-walk-with-restart algorithm that is available on [DockerHub](https://hub.docker.com/repository/docker/erikliu24/rwwr). | ||
|
||
To create the Docker image run: | ||
``` | ||
docker build -t erikliu24/rwwr -f Dockerfile . | ||
``` | ||
from this directory. | ||
|
||
To inspect the installed Python packages: | ||
``` | ||
winpty docker run erikliu24/rwwr pip list | ||
``` | ||
The `winpty` prefix is only needed on Windows. | ||
|
||
## Testing | ||
Test code is located in `test/RandomWalk`. | ||
The `input` subdirectory contains test files `source_nodes.txt`, `target_nodes.txt` and `edges.txt`. | ||
The Docker wrapper can be tested with `pytest`. | ||
|
||
Alternatively, to test the Docker image directly, run the following command from the root of the `spras` repository: | ||
``` | ||
docker run -w /data --mount type=bind,source=/${PWD},target=/data erikliu24/rwwr python /RandomWalk/random_walk.py \ | ||
--edges_file /data/test/RandomWalk/input/edges.txt --sources_file /data/test/RandomWalk/input/source_nodes.txt --targets_file /data/test/RandomWalk/input/target_nodes.txt --damping_factor 0.85 --selection_function min --threshold 0.001 --output_file /data/test/RandomWalk/output/output.txt | ||
``` | ||
This will run RWR on the test input files and write the output file to `test/RandomWalk/output/output.txt` in the `spras` repository. | ||
Windows users may need to escape the absolute paths so that `/data` becomes `//data`, etc. | ||
|
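As a rough illustration of how the container command is put together, the sketch below builds the same argument list in Python. It assumes the flag names used by the wrapper's `run` function in this pull request (`--edges_file`, `--sources_file`, `--targets_file`, and so on) and that the repository root is bind-mounted at `/data`; the helper name `build_rwr_command` is hypothetical and not part of SPRAS.

```python
from pathlib import PurePosixPath


def build_rwr_command(edges, sources, targets, output,
                      damping_factor=0.85, selection_function="min",
                      threshold=0.001, mount_point="/data"):
    """Return the argument list to run inside the container.

    Paths are given relative to the repository root, which is assumed to be
    bind-mounted at `mount_point`. This helper is hypothetical.
    """
    def in_container(relative_path):
        # Paths inside the container are always POSIX, even on a Windows host
        return str(PurePosixPath(mount_point) / relative_path)

    return [
        "python", "/RandomWalk/random_walk.py",
        "--edges_file", in_container(edges),
        "--sources_file", in_container(sources),
        "--targets_file", in_container(targets),
        "--damping_factor", str(damping_factor),
        "--selection_function", selection_function,
        "--threshold", str(threshold),
        "--output_file", in_container(output),
    ]


command = build_rwr_command(
    "test/RandomWalk/input/edges.txt",
    "test/RandomWalk/input/source_nodes.txt",
    "test/RandomWalk/input/target_nodes.txt",
    "test/RandomWalk/output/output.txt",
)
print(" ".join(command))
```

Building the list programmatically keeps the host-to-container path mapping in one place, which is the same job `prepare_volume` does in the wrapper.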
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
FROM python:2.7.15 | ||
|
||
WORKDIR /TieDIE | ||
|
||
COPY requirements.txt . | ||
RUN pip install -r requirements.txt && \ | ||
commit=c64ab5c4b4e0f6cfac4b5151c7d9f1d7ea331e65 && \ | ||
wget https://github.com/Reed-CompBio/TieDIE/tarball/$commit && \ | ||
tar -zxvf $commit && \ | ||
rm $commit && \ | ||
mv Reed-CompBio-TieDIE-*/* . |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
# TieDIE Docker image | ||
|
||
A Docker image for [TieDIE](https://github.com/epaull/TieDIE) that is available on [DockerHub](https://hub.docker.com/r/erikliu24/tiedie). | ||
|
||
|
||
To create the Docker image run: | ||
``` | ||
docker build -t erikliu24/tiedie -f Dockerfile . | ||
|
||
``` | ||
from this directory. | ||
|
||
To inspect the installed Python packages: | ||
``` | ||
winpty docker run erikliu24/tiedie pip list | ||
``` | ||
The `winpty` prefix is only needed on Windows. | ||
|
||
## Testing | ||
Test code is located in `test/TieDIE`. | ||
The `input` subdirectory contains test files `pathway.txt`, `target.txt` and `source.txt`. | ||
The Docker wrapper can be tested with `pytest`. | ||
|
||
|
|
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
E 1 + | ||
F 1 + | ||
G 1 + |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
A -a> D | ||
B -a> D | ||
C -a> D | ||
D -a> E | ||
D -a> F | ||
D -a> G |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
networkx==1.11 | ||
numpy==1.11.3 | ||
scipy==0.18.1 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
A 1 + | ||
B 1 + | ||
C 1 + |
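The small data files in this diff appear to be the TieDIE test inputs named in the README (`source.txt`, `target.txt`, and `pathway.txt`); the file names are not visible in this view, so that mapping is an assumption. The sketch below shows one way to read the two layouts: node files with `<node> <weight> <sign>` per line and a pathway file with `<source> <interaction> <target>` per line. The function names are hypothetical, and this is not TieDIE's own parser.

```python
def parse_node_file(text):
    """Parse lines of the form `<node> <weight> <sign>` into a dict."""
    nodes = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        name, weight, sign = parts
        nodes[name] = (float(weight), sign)
    return nodes


def parse_pathway_file(text):
    """Parse lines of the form `<source> <interaction> <target>` into tuples."""
    edges = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        source, interaction, target = parts
        edges.append((source, interaction, target))
    return edges


# Contents copied from the test files shown in this diff
sources = parse_node_file("A 1 +\nB 1 +\nC 1 +\n")
targets = parse_node_file("E 1 +\nF 1 +\nG 1 +\n")
edges = parse_pathway_file(
    "A -a> D\nB -a> D\nC -a> D\nD -a> E\nD -a> F\nD -a> G\n"
)

# Every source and target in this toy network is connected through node D
hub_neighbors = {s for s, _, t in edges if t == "D"} | {t for s, _, t in edges if s == "D"}
print(sorted(hub_neighbors))
```

In this toy network every source reaches every target only through `D`, so a correct run would be expected to include `D` in the predicted pathway.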
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,155 @@ | ||
import warnings | ||
from pathlib import Path | ||
|
||
import pandas as pd | ||
|
||
from src.prm import PRM | ||
from src.util import add_rank_column, prepare_volume, run_container | ||
|
||
__all__ = ['RandomWalk'] | ||
|
||
class RandomWalk(PRM): | ||
# we need edges (weighted), source set (with prizes), and target set (with prizes). | ||
required_inputs = ['edges', 'sources', 'targets'] | ||
|
||
@staticmethod | ||
def generate_inputs(data, filename_map): | ||
""" | ||
Access fields from the dataset and write the required input files | ||
@param data: dataset | ||
@param filename_map: a dict mapping file types in the required_inputs to the filename for that type | ||
@return: | ||
""" | ||
# ensure that every required input has an entry in the filename_map | ||
for input_type in RandomWalk.required_inputs: | ||
if input_type not in filename_map: | ||
raise ValueError(f"{input_type} filename is missing") | ||
|
||
# will take the sources and write them to files, and repeats with targets | ||
for node_type in ['sources', 'targets']: | ||
nodes = data.request_node_columns([node_type]) | ||
# check if the nodes have prizes or not | ||
if data.contains_node_columns('prize'): | ||
node_df = data.request_node_columns(['prize']) | ||
nodes = pd.merge(nodes, node_df, on='NODEID') | ||
# write the NODEID and prize columns, space-separated (pandas writes a header row by default) | ||
nodes.to_csv(filename_map[node_type], index=False, sep=" ", columns=['NODEID', 'prize']) | ||
else: | ||
# If there are no prizes but there are sources and targets, assign prizes to them | ||
nodes = data.request_node_columns([node_type]) | ||
# make all nodes have a prize of 1 | ||
nodes['prize'] = 1.0 | ||
# write the NODEID and prize columns, space-separated (pandas writes a header row by default) | ||
nodes.to_csv(filename_map[node_type], index=False, sep=" ", columns=['NODEID', 'prize']) | ||
|
||
# create the network of edges | ||
edges = data.get_interactome() | ||
|
||
# write the edges file: the two endpoint nodes of each edge, followed by the edge weight | ||
edges.to_csv(filename_map['edges'], sep=" ", index=False, columns=["Interactor1","Interactor2","Weight"]) | ||
|
||
|
||
# Skips parameter validation step | ||
@staticmethod | ||
def run(edges=None, sources=None, targets = None, output_file = None, df : float = 0.85, f : str = 'min' , threshold : float = 0.0001, singularity=False): | ||
""" | ||
Run RandomWalk with Docker | ||
@param edges: input edges file (required) | ||
@param sources: input source nodes file (required) | ||
@param targets: input target nodes file (required) | ||
@param output_file: path to the output pathway file (required) | ||
@param df: damping factor for restarting (default 0.85) (optional) | ||
@param f: selection function (default 'min') (optional) | ||
@param threshold: threshold for constructing the final pathway (default 0.0001) (optional) | ||
@param singularity: if True, run using the Singularity container instead of the Docker container | ||
""" | ||
|
||
if not edges or not sources or not targets or not output_file: | ||
raise ValueError('Required RandomWalk arguments are missing') | ||
|
||
work_dir = '/spras' | ||
|
||
# Each volume is a tuple (src, dest) - data generated by Docker | ||
volumes = list() | ||
|
||
bind_path, edges_file = prepare_volume(edges, work_dir) | ||
volumes.append(bind_path) | ||
|
||
bind_path, sources_file = prepare_volume(sources, work_dir) | ||
volumes.append(bind_path) | ||
|
||
bind_path, targets_file = prepare_volume(targets, work_dir) | ||
volumes.append(bind_path) | ||
|
||
|
||
out_dir = Path(output_file).parent | ||
# RandomWalk requires that the output directory exist | ||
out_dir.mkdir(parents=True, exist_ok=True) | ||
bind_path, mapped_out_dir = prepare_volume(str(out_dir), work_dir) | ||
volumes.append(bind_path) | ||
mapped_out_prefix = mapped_out_dir + '/out' # Use posix path inside the container | ||
|
||
|
||
command = ['python', | ||
'/RandomWalk/random_walk.py', | ||
'--edges_file', edges_file, | ||
'--sources_file', sources_file, | ||
'--targets_file', targets_file, | ||
'--damping_factor', str(df), | ||
'--selection_function', f, | ||
'--threshold', str(threshold), | ||
'--output_file', mapped_out_prefix] | ||
|
||
print('Running RandomWalk with arguments: {}'.format(' '.join(command)), flush=True) | ||
|
||
|
||
container_framework = 'singularity' if singularity else 'docker' | ||
out = run_container(container_framework, | ||
'erikliu24/rwwr', | ||
command, | ||
volumes, | ||
work_dir) | ||
print(out) | ||
|
||
output = Path(out_dir, 'out') | ||
output.rename(output_file) | ||
|
||
# From edge_output_file, construct a pathway file in the universal format | ||
# 1. Stop when the source and targets are connected. | ||
|
||
@staticmethod | ||
def parse_output(raw_pathway_file, standardized_pathway_file): | ||
""" | ||
Convert a predicted pathway into the universal format | ||
@param raw_pathway_file: pathway file produced by an algorithm's run function | ||
@param standardized_pathway_file: the same pathway written in the universal format | ||
""" | ||
print('Parsing random-walk-with-restart output') | ||
|
||
df = pd.read_csv(raw_pathway_file, sep="\t") | ||
|
||
pathway_output_file = standardized_pathway_file | ||
edge_output_file = standardized_pathway_file.replace('.txt', '') + '_edges.txt' | ||
node_output_file = standardized_pathway_file.replace('.txt', '') + '_nodes.txt' | ||
|
||
# get all rows where type is 1 | ||
df_edge = df.loc[df["Type"] == 1] | ||
|
||
# get rid of the placeholder column and output it to a file | ||
df_edge = df_edge.drop(columns=['Placeholder']) | ||
df_edge = df_edge.drop(columns=['Type']) | ||
df_edge.to_csv(edge_output_file, sep="\t", index=False, header=True) | ||
|
||
# get all rows where type is 2 (the node rows) | ||
df_node = df.loc[df['Type'] == 2] | ||
# rename the header to Node, Pr, R_Pr, Final_Pr | ||
df_node = df_node.drop(columns=['Type']) | ||
df_node = df_node.rename(columns={'Node1': 'Node', 'Node2': 'Pr', 'Weight': 'R_Pr', 'Placeholder': 'Final_Pr'}) | ||
df_node.to_csv(node_output_file, sep="\t", index=False, header=True) | ||
|
||
df_pathway = df.loc[df['Type'] == 3] | ||
df_pathway = df_pathway.drop(columns=['Placeholder']) | ||
df_pathway = df_pathway.drop(columns=['Type']) | ||
df_pathway = df_pathway.drop(columns=['Weight']) | ||
# add a column of 1 to represent the rank | ||
df_pathway = add_rank_column(df_pathway) | ||
df_pathway.to_csv(pathway_output_file, sep="\t", index=False, header=False) |
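To make the layout that `parse_output` relies on easier to follow, here is a minimal standard-library sketch of the same split. It assumes, based only on the column names used above, that the raw file is tab-separated with columns `Node1`, `Node2`, `Weight`, `Placeholder`, and `Type`, where `Type` 1 marks edge rows, 2 marks node rows, and 3 marks pathway rows. The sample values and the helper name `split_by_type` are hypothetical, and the constant rank of 1 follows the comment in `parse_output` rather than a confirmed description of `add_rank_column`.

```python
import csv
import io

# Hypothetical raw output; real files come from random_walk.py
HEADER = ["Node1", "Node2", "Weight", "Placeholder", "Type"]
ROWS = [
    ["A", "D", "0.5", "", "1"],       # edge row
    ["D", "E", "0.5", "", "1"],       # edge row
    ["A", "0.3", "0.2", "0.2", "2"],  # node row: Node, Pr, R_Pr, Final_Pr
    ["A", "D", "", "", "3"],          # pathway row
    ["D", "E", "", "", "3"],          # pathway row
]
RAW = "\n".join("\t".join(row) for row in [HEADER] + ROWS) + "\n"


def split_by_type(raw_text):
    """Group the rows of the raw file by the integer in their Type column."""
    sections = {1: [], 2: [], 3: []}
    reader = csv.DictReader(io.StringIO(raw_text), delimiter="\t")
    for row in reader:
        sections[int(row["Type"])].append(row)
    return sections


sections = split_by_type(RAW)

# Node rows reuse the generic columns, so relabel them as parse_output does
node_table = [
    {
        "Node": row["Node1"],
        "Pr": float(row["Node2"]),
        "R_Pr": float(row["Weight"]),
        "Final_Pr": float(row["Placeholder"]),
    }
    for row in sections[2]
]

# Pathway rows keep only the two endpoints plus a constant rank of 1
pathway = [(row["Node1"], row["Node2"], 1) for row in sections[3]]
print(pathway)
```

Packing three record types into one file keeps the container's output to a single path, at the cost of overloading column names for node rows.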
Adding a reminder to change these to `reedcompbio` after @annaritz pushes the containers to the organization account. Same goes for the steps below.

Yes, I will make changes to this file when Anna pushes the image to `reedcompbio`. The same changes will also be made in `src/random_walk.py` and `src/tiedie.py`, as the `run` functions in these two files are also pulling images from my personal account.

I built and pushed both of these to DockerHub so you can make the test workflow changes now: