Merge pull request #19 from mwang87/master
update
cmaceves committed Mar 18, 2020
2 parents 7dd7c32 + 37a84ba commit f4374dc
Showing 5 changed files with 97 additions and 22 deletions.
1 change: 0 additions & 1 deletion .github/workflows/productionintegration.yml
@@ -1,7 +1,6 @@
name: production-integration

on:
  push:
  schedule:
    - cron: '*/60 * * * *'

39 changes: 19 additions & 20 deletions README.md
@@ -9,23 +9,22 @@

**This is a community effort and everyone is encouraged to participate by submitting their own data and sample information (see the [instructions](https://mwang87.github.io/ReDU-MS2-Documentation/HowtoContribute)). The sharing of new applications (and code) that use ReDU is highly encouraged.**

## We are developing two main branches of functionality:

## Analyze Your Data
* [Compare Your Data to Public Data via Multivariate Analysis](https://mwang87.github.io/ReDU-MS2-Documentation/AnalyzeYourData_MultivariateComparisons) - Projection of your data onto a precalculated principal components analysis score plot of public data. <br>
* [Co-analyze Your Data with Public Data at GNPS](https://mwang87.github.io/ReDU-MS2-Documentation/AnalyzeYourData_CoAnalysis_at_GNPS) - Select files using sample information and assemble public data in groups as desired using the file selector. Launching an analysis loads the files from [MassIVE](https://massive.ucsd.edu/ProteoSAFe/static/massive.jsp) at which point users can add their own data. The following co-analyses can be launched:
  * Set Up Co-Analysis with GNPS Molecular Networking (requires GNPS Login)
  * Set Up Co-Analysis with GNPS Library Search (requires GNPS Login)
  * Launch PCA of Selected Files

## Analyze Public Data
* [Explore Multivariate Analysis of Public Data](https://mwang87.github.io/ReDU-MS2-Documentation/AnalyzePublicData_MultivariateComparisons) - Explore precalculated principal components analysis score plot of public data. <br>
* [Chemical Explorer](https://mwang87.github.io/ReDU-MS2-Documentation/AnalyzePublicData_ChemicalEnrichment) - Explore table of precalculated annotations in public data and default GNPS parameters. Find files and explore sample information associations. <br>
* [Re-analyze Public Data at GNPS](https://mwang87.github.io/ReDU-MS2-Documentation/PublicData_Reanalysis_at_GNPS) - Select files using sample information and assemble public data in groups as desired using the file selector. The following re-analyses can be launched:
  * Set Up Co-Analysis with GNPS Molecular Networking (requires GNPS Login)
  * Set Up Co-Analysis with GNPS Library Search (requires GNPS Login)
  * Launch Group Comparator
  * Launch Chemical Explorer

## Data Availability
All sample information can be downloaded from the ReDU-MS2 homepage by clicking "Download Database". The ReDU identification database is publicly available and accessible via GNPS/MassIVE (gnps.ucsd.edu), MSV000084206.
## Testing Procedure

Getting ReDU up and running on your local system should be as easy as running:

```
server-compose-interactive
```

## Updating ReDU Data Procedure

A key maintenance step in ReDU is updating the database to include the latest identifications for files within ReDU. The steps are as follows:

1. Download the batch template for GNPS from the ```/metabatchdump``` endpoint.
1. Run the batch workflow for Spectral Library Search.
1. Export the set of tasks as a TSV file and save it [here](https://github.com/mwang87/ReDU-MS2-GNPS/blob/refactor-read-me-for-developers/database/global_tasks.tsv).
1. Remove the database [here](https://github.com/mwang87/ReDU-MS2-GNPS/tree/refactor-read-me-for-developers/database).
1. Run XXX command to drop the identifications table.
1. Start ReDU back up and it will autopopulate.
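
As a rough illustration of step 1: judging by the endpoint added in `code/views.py`, the TSV returned by `/metabatchdump` has a single `filename` column, and each row holds one batch of file names joined with `;`. The sketch below splits such a file back into batches. It assumes the response body has already been downloaded into a string, and `parse_batch_template` is a hypothetical helper name, not part of ReDU.

```python
import csv
import io


def parse_batch_template(tsv_text):
    """Return a list of batches, each a list of file names.

    Hypothetical helper: assumes a tab-separated file with a single
    'filename' column whose cells are ';'-joined file names.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    batches = []
    for record in reader:
        # Each row holds one batch; drop empty fragments from stray ';'.
        names = [name for name in record["filename"].split(";") if name]
        batches.append(names)
    return batches
```

For example, `parse_batch_template("filename\na.mzML;b.mzML\nc.mzML\n")` yields two batches, the first with two file names and the second with one.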

16 changes: 15 additions & 1 deletion code/views.py
@@ -626,8 +626,21 @@ def ReDUValidator():
    return render_template('ReDUValidator.html')



# API End Points
@app.route('/metabatchdump', methods=['GET'])
def metabatchdump():
    df = pd.read_table(config.PATH_TO_ORIGINAL_MAPPING_FILE)
    filenames = df["filename"].tolist()
    batch_size = 1000
    batch_num = len(filenames) // batch_size
    row = []
    for x in range(batch_num):
        files = filenames[(batch_size * x):(batch_size * (x+1))]
        string_temp = ';'.join(files)
        row.append(string_temp)

    new_file = pd.DataFrame({"filename": row})
    return new_file.to_csv(sep="\t", index=False)
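
Note that `len(filenames) // batch_size` rounds down, so the endpoint as written appears to omit any trailing files that do not fill a complete batch of 1000. A minimal sketch of a batching helper that keeps the partial final batch follows. The name `batch_filenames` is hypothetical and is not part of this commit.

```python
def batch_filenames(filenames, batch_size=1000):
    """Join file names into ';'-separated batches, keeping a short final batch.

    Hypothetical helper for illustration; not part of the ReDU code base.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    rows = []
    # Step through the list in strides of batch_size. Slicing past the end
    # is safe in Python, so the last slice simply holds the remainder.
    for start in range(0, len(filenames), batch_size):
        rows.append(';'.join(filenames[start:start + batch_size]))
    return rows
```

With `batch_size=2`, the list `["a", "b", "c"]` becomes `["a;b", "c"]`, whereas the floor-division loop above would produce only `["a;b"]`.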

def allowed_file_metadata(filename):
    return '.' in filename and \
@@ -694,6 +707,7 @@ def validate():
#This displays global PCoA of public data as a web url
@app.route("/displayglobalmultivariate", methods = ["GET"])
def displayglobalmultivariate():

    if not (os.path.isfile(config.PATH_TO_ORIGINAL_PCA) and os.path.isfile(config.PATH_TO_EIGS)):
        print("Missing Global PCA Calculation, Calculating")
        if not os.path.isfile(config.PATH_TO_GLOBAL_OCCURRENCES):
24 changes: 24 additions & 0 deletions database/librarysearchparams.xml
@@ -0,0 +1,24 @@
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<parameters>
<parameter name="ANALOG_SEARCH">0</parameter>
<parameter name="FILTER_LIBRARY">1</parameter>
<parameter name="FILTER_PRECURSOR_WINDOW">1</parameter>
<parameter name="FILTER_SNR_PEAK_INT">0.0</parameter>
<parameter name="FILTER_STDDEV_PEAK_INT">0.0</parameter>
<parameter name="MAX_SHIFT_MASS">100.0</parameter>
<parameter name="MIN_MATCHED_PEAKS">6</parameter>
<parameter name="MIN_PEAK_INT">0.0</parameter>
<parameter name="SCORE_THRESHOLD">0.7</parameter>
<parameter name="SEARCH_LIBQUALITY">3</parameter>
<parameter name="TOP_K_RESULTS">1</parameter>
<parameter name="WINDOW_FILTER">1</parameter>
<parameter name="desc">GNPS All REDU LIBRARY SEARCH</parameter>
<parameter name="email">[email protected]</parameter>
<parameter name="library_on_server">d.speclibs;</parameter>
<parameter name="reanalyzed_datasets">MSV000083777</parameter>
<parameter name="spec_on_server">${filename}</parameter>
<parameter name="tolerance.Ion_tolerance">0.5</parameter>
<parameter name="tolerance.PM_tolerance">2.0</parameter>
<parameter name="workflow">MOLECULAR-LIBRARYSEARCH-V2</parameter>
<parameter name="workflow_version">release_10.1</parameter>
</parameters>
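
The `spec_on_server` parameter holds a `${filename}` placeholder, which presumably receives one `;`-joined batch of file names per submitted task. The commit does not show how GNPS performs that substitution, so the following is only a sketch of the idea using Python's standard library. The function name `fill_search_params` and the sample file names are hypothetical.

```python
from string import Template
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET


def fill_search_params(template_xml, batch):
    """Substitute one ';'-joined batch of file names for ${filename}.

    Hypothetical helper; the real GNPS batch mechanism may differ.
    """
    # escape() protects characters such as '&' that are special in XML.
    # safe_substitute leaves any other ${...} placeholders untouched.
    return Template(template_xml).safe_substitute(filename=escape(batch))


# A cut-down stand-in for database/librarysearchparams.xml.
template = (
    '<parameters>'
    '<parameter name="spec_on_server">${filename}</parameter>'
    '</parameters>'
)
filled = fill_search_params(template, "f.a.mzML;f.b.mzML")
root = ET.fromstring(filled)
print(root.find("parameter[@name='spec_on_server']").text)
```

Parsing the result with `ElementTree` afterwards is a cheap check that the filled template is still well-formed XML.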
39 changes: 39 additions & 0 deletions database/metabatchdump.tsv

Large diffs are not rendered by default.
