This repository hosts the source code to reproduce the results presented in the paper "AI-driven Automated Discovery Tools Reveal Diverse Behavioral Competencies of Biological Networks".
Clone the repository on your local machine:
git clone https://github.com/flowersteam/curious-exploration-of-grn-competencies.git
The code comes with an Anaconda environment that has all the requirements pre-installed. You can create the environment using:
conda env create -f env.yaml # creating the environment
Then you can activate the environment and visualise the paper and the tutorials using Jupyter Lab:
conda activate curious_assistant
jupyter lab
The paper and tutorials are written in the form of Jupyter notebooks and are placed in the notebooks folder. The complete structure of the repository is as follows:
├── README.md
├── LICENSE
├── env.yaml # anaconda environment
│
├── notebook_to_html_export.ipynb # notebook generating the docs
├── docs # generated docs html
│
├── experimental_campaign # data and code for the experiments
│
└── notebooks # jupyter notebooks
├── figures
├── paper.ipynb # paper notebook
├── tuto1.ipynb # tutorial 1 notebook
└── tuto2.ipynb # tutorial 2 notebook
Before trying to regenerate the paper plots, please unzip the experiment_data_statistics.zip and evaluation_data_examples.zip files in the experimental_campaign/analysis/ folder. Then you can simply run the paper.ipynb notebook.
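If you prefer scripting the unzip step, here is a minimal Python sketch using the standard library. The paths in the commented-out example are taken from the repository layout described above; the helper name is illustrative.

```python
import zipfile
from pathlib import Path

def extract_archives(archive_paths, dest_dir):
    """Extract each zip archive into dest_dir, creating it if needed."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    for archive in archive_paths:
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(dest)

# Example (assuming you are at the repository root):
# extract_archives(
#     ["experimental_campaign/analysis/experiment_data_statistics.zip",
#      "experimental_campaign/analysis/evaluation_data_examples.zip"],
#     "experimental_campaign/analysis/",
# )
```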
Some of the tutorial cells can take a considerable amount of time to execute, so the tutorial notebooks come with an option to load already-executed data instead, which is much faster. This mode is recommended when running the tutorials for the first time:
nb_mode = "load" #@param ["run", "load"]
If you want to regenerate all the data and run the tutorial examples yourself, set nb_mode to:
nb_mode = "run" #@param ["run", "load"]
If you want to save the newly generated figures (so that they are also updated in paper.ipynb), set:
nb_save_figs = True #@param {type:"boolean"}
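Conceptually, the nb_mode flag gates each heavy cell between recomputing and loading cached results. A minimal sketch of that pattern (the helper function is illustrative, not part of the notebooks):

```python
nb_mode = "load"      #@param ["run", "load"]
nb_save_figs = False  #@param {type:"boolean"}

def get_results(run_fn, load_fn, mode=None):
    """Run the (potentially slow) computation, or load cached results."""
    mode = nb_mode if mode is None else mode
    if mode == "run":
        return run_fn()   # regenerate the data from scratch
    return load_fn()      # load already-executed data (fast)
```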
For running the experimental campaign, activate the conda environment and navigate into the corresponding folder:
conda activate curious_assistant
cd experimental_campaign/
The experimental_campaign folder structure is as follows:
├── resources # database creation
│   ├── *.py # python scripts to generate the database
│   ├── bio_models # folder containing the biological network models
│   └── *.csv and *.npy # statistics about the databases
├── experiments # running experiments
│   ├── experiment_000001 # python scripts to run the random exploration baseline
│   └── experiment_000003 # python scripts to run the curiosity-driven exploration and robustness tests
└── analysis # collect the data for analysis
    ├── *.py # python scripts to gather data necessary for notebooks
    └── *.pickle # analysis data
Please note that reproducing the whole experimental campaign over the 432 systems takes a long time to compute and can require a lot of disk space (>500 GB). The final analysis data (~350 MB) needed to reproduce the paper's main figures is already provided as pickle files in the experimental_campaign/analysis/ folder (after unzipping). To regenerate that data from scratch, you can follow steps 1-2-3 below, but we recommend running the campaign on a supercomputer if possible and then transferring only the analysis data back to your local computer.
The database of biological networks used in the paper is contained in the experimental_campaign/resources/ folder; navigate into it as follows:
cd resources/
You can delete the experimental_campaign/resources/bio_models folder as well as all the experimental_campaign/resources/*.csv files and regenerate them as follows:
python run_bio_models_preselection.py
python run_bio_models_nodes_selection.py
The database of random networks, as well as the analysis of their versatility (Figure 7 of the main paper), can also be reproduced by deleting the random_networks_versatility.npy file and regenerating it as follows:
python generate_random_models.py
To run the main experiments, navigate to the corresponding folder as follows:
cd ../experiments/
For running the curiosity-driven exploration algorithm on the database of biological networks do:
cd experiment_000003/
python run_experiment.py
When finished, you can run the robustness tests as follows:
python run_evaluation.py
You can also run the random search exploration baseline as follows:
cd experiment_000001/
python run_experiment.py
Finally, you can gather the "analysis data" from the data generated in step 2 as follows:
cd ../../analysis
python gather_experiment_data_statistics.py # exploration algorithms data
python gather_evaluation_data_statistics.py # robustness tests data
python gather_evaluation_data_examples.py # robustness tests examples for Figure 5
These final steps output pickle files that are used to run the paper notebook.
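If you want to inspect the generated analysis data outside the notebooks, it can be read back with Python's standard pickle module. A small sketch (the exact filenames are those in the experimental_campaign/analysis/ folder):

```python
import pickle

def load_analysis(path):
    """Load one of the generated analysis pickle files."""
    with open(path, "rb") as f:
        return pickle.load(f)

# e.g. data = load_analysis("experimental_campaign/analysis/<name>.pickle")
```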
If you have any questions about the notebooks or the paper feel free to contact me at the email: [email protected]