docs: tutorials for COMBINE25 #369
Documentation build overview
Show files changed (5 files in total): 📝 1 modified | ➕ 4 added | ➖ 0 deleted
This is a good outline. I added some detailed thoughts.
We will also need to write an abstract.
- ML

Slides and/or Information to add to docs directly:
For the larger dataset, teach a user how to turn data into the input structure we require
We can build the EGFR dataset if we find out how it was created. Any omics dataset that we know how to build would be good to use.
See the "Mass spectrometry data analysis and temporal phosphorylation significance" section of the Supplemental Experimental Procedures of https://doi.org/10.1016/j.celrep.2018.08.085 as well as
In our EGF response analysis, we computed the node prize for each protein using the minimum peptide p-value over all peptides that map to the protein. We computed prizes as −log10(p-value), yielding 701 protein prizes.
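As a rough illustration of the prize computation quoted above, here is a minimal Python sketch. It assumes the peptide-level results are already available as (protein, p-value) pairs. The function name `protein_prizes` and the example values are hypothetical and are not taken from the paper or from SPRAS.

```python
import math


def protein_prizes(peptide_pvalues):
    """Map each protein to -log10 of its minimum peptide p-value.

    peptide_pvalues: iterable of (protein_id, p_value) pairs, one per peptide.
    This input layout is an assumption made for illustration only.
    """
    min_p = {}
    for protein, p in peptide_pvalues:
        if not 0 < p <= 1:
            raise ValueError(f"p-value out of range for {protein}: {p}")
        # Keep the smallest p-value seen so far for this protein.
        if protein not in min_p or p < min_p[protein]:
            min_p[protein] = p
    # A smaller p-value gives a larger prize.
    return {protein: -math.log10(p) for protein, p in min_p.items()}


# Hypothetical peptide-level p-values; two peptides map to EGFR.
example = [("EGFR", 0.01), ("EGFR", 0.001), ("GRB2", 0.1)]
prizes = protein_prizes(example)
```

With this input, EGFR receives its prize from the smaller of its two peptide p-values, so its prize is larger than the prize for GRB2.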
docs/tutorial/introduction.rst
Outdated
Pathway reconstruction is a computational approach used in biology to rebuild biological pathways (such as signaling pathways) from high-throughput experimental data.

Curated pathway databases provide references to pathways, but they are often generalized and may not capture the context-specific details relevant to a particular disease or experimental condition.
To address this, pathway reconstruction algorithms (PRAs) help map molecules of interest (such as proteins, genes, or metabolites identified in omics experiments or that are known as points of reference) onto large-scale interaction networks, called interactomes (maps of molecular interactions in a cell).
I prefer to avoid the PRA acronym because it is not generally used
docs/tutorial/basic.rst
Outdated
- Bow Tie Builder
- ResponseNet

- Each algorithm has an include flag (true/false) to turn it on or off.
Link to our docs about these here?
docs/tutorial/basic.rst
Outdated
- data_dir: the path to where the input dataset files live
- other_files: a placeholder for potential future development needs

4. Gold Standards
Skip gold standards for the basic intro and introduce in medium?
docs/tutorial/basic.rst
Outdated
- Defines the filepath where reconstructed networks are saved (output directory by default)
- Basic housekeeping for how SPRAS organizes and stores results.

6. Analysis
How much of this do we cover here versus skip until medium? We may not need to explain everything that goes in the config file all at once.
docs/tutorial/basic.rst
Outdated
- egfr
- one algorithm
- three different preset combos
- have them make the configuration file?
I don't think that is necessary. The basic tutorial can have them start with a premade config, maybe modify it trivially, and make sure they understand what it did. A powerful example would be to run it, add one extra parameter, and run it again to see how much is cached.
I really like this style of explaining what each command does and showing the file tree produced.
Once beginner is done, try practicing it live. This first tutorial may end up being mostly beginner content, which is okay.
- mention parameter tuning
- say that parameters are not preset and need to be tuned for each dataset

CHTC integration
CHTC is local to our university. A better way to say it may be Snakemake integration with cloud and high-throughput computing resources, which we've prototyped on our local cluster. If we start testing in OSG, that would be different because many people are eligible for accounts.
docs/tutorial/beginner.rst
Outdated
Stores all results generated by SPRAS. Subfolders are created automatically for each run, and their structure can be controlled through the configuration file.

By default, the directories are set to config/, input/, and output/. The config/, input/, and output/ folders can be placed anywhere within the SPRAS repository. Their input/ and output/ locations can be updated in the configuration file, and the configuration file itself can be found by providing its path when running SPRAS.
Can we place these directories anywhere? They don't have to be subdirectories anymore, do they? Do absolute paths work?
I have never tried using an absolute path to put them elsewhere. I thought they were forced to be within spras, but I'm not sure anymore. I'll test it out.
Moving Tony's comment:
This is a reminder for all of us that we should freeze development for the 1-2 week period before the tutorial. For instance, #387 changes this syntax, and we don't want to have the tutorial files be slightly out of date. That would be quite confusing.
This tutorial and registration for the conference are due September 28, 2025.
Event: https://co.mbine.org/events/
Example tutorials: https://co.mbine.org/author/combine-2023/