Commit

Biohackaton2022 genomes nb (#11)
* Add notebook for genome search

* input files used for genome nb

* add output files produced while running genome nb

* new version of the notebook

* last version of genome nb and outputs folder reorganised

* [wip] prepare genomes nb for merge

* fixes spark installation and some minor editorial on genomes notebook

* adds task shortcut to build docker locally

* use lighter base docker; upgrade jupyter (still <4 though)

* updates CI tests

* upgrade query params extension to new template

---------

Co-authored-by: Sandy Rogers <[email protected]>
vestalisvirginis and SandyRogers authored Jul 28, 2023
1 parent e535ec4 commit 9561e7d
Showing 54 changed files with 428,243 additions and 2,838 deletions.
5 changes: 3 additions & 2 deletions .github/workflows/test.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -38,7 +38,7 @@ jobs:
context: .
file: ./docker/Dockerfile
load: true
tags: quay.io/microbiome-informatics/emg-notebooks.dev
tags: quay.io/microbiome-informatics/emg-notebooks.dev:latest

- name: Run tests
working-directory: ./tests
Expand All @@ -50,4 +50,5 @@ jobs:
name: launching-jupyter-lab-screenshot
path: |
tests/launching_jl.png
tests/shiny_proxy_launched.png
tests/shiny_proxy_launched.png
tests/jl_launched.png
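
The workflow change above spells out the `:latest` tag on the image name. Docker resolves a reference with no tag to `latest`, so both spellings name the same image; the explicit form simply keeps CI, the Taskfile and the README consistent. A minimal sketch of that defaulting rule follows. The helper name `with_default_tag` is hypothetical and not part of this repository.

```python
def with_default_tag(image_ref: str, default_tag: str = "latest") -> str:
    """Return image_ref with an explicit tag, mimicking Docker's default."""
    # A digest-pinned reference (name@sha256:...) already identifies one image.
    if "@" in image_ref:
        return image_ref
    # Only a colon after the last slash marks a tag; an earlier colon
    # belongs to a registry port such as localhost:5000.
    last_component = image_ref.rsplit("/", 1)[-1]
    if ":" in last_component:
        return image_ref
    return f"{image_ref}:{default_tag}"


if __name__ == "__main__":
    print(with_default_tag("quay.io/microbiome-informatics/emg-notebooks.dev"))
```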
6 changes: 5 additions & 1 deletion .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -15,4 +15,8 @@ site_libs
src/docs/*.html
src/*.html
src/notebooks/**/*.html
src/*-listing.json
src/*-listing.json

*.parquet
!**/example-data/**/*.parquet
*.sig
15 changes: 7 additions & 8 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -76,14 +76,14 @@ git add depdencies/mgnify-cache.tgz
```

### Changing dependencies and Docker build
To add dependencies, edit the `dependencies/environment.yml` file.
To add dependencies, edit the `dependencies/{py|r}-environment.yml` files.

You can temporarily try things by opening a Terminal inside Jupyter Lab and `mamba install`ing the package(s).
But make sure you reflect everything in the conda environment file.
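
The reminder to reflect everything in the conda environment file can be checked mechanically. The sketch below is one possible approach, not a tool shipped with this repository: `parse_pins` and `find_drift` are hypothetical helpers, and the line-based parsing assumes entries of the form `- name==version`, as used in `dependencies/py-environment.yml`.

```python
from typing import Dict, Iterable, Optional, Tuple


def parse_pins(lines: Iterable[str]) -> Dict[str, str]:
    """Collect 'name==version' entries from environment-file lines."""
    pins = {}
    for line in lines:
        # Drop indentation and the leading YAML list dash.
        entry = line.strip().lstrip("-").strip()
        if "==" in entry:
            name, version = entry.split("==", 1)
            pins[name.strip().lower()] = version.strip()
    return pins


def find_drift(
    pins: Dict[str, str], installed: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], str]]:
    """Map each installed package that is unpinned, or pinned to a
    different version, to a (pinned, installed) pair."""
    drift = {}
    for name, version in installed.items():
        pinned = pins.get(name.lower())
        if pinned != version:
            drift[name] = (pinned, version)
    return drift
```

The `installed` mapping could be built from `importlib.metadata.distributions()` inside the running kernel; any package reported by `find_drift` was installed ad hoc and still needs an entry in the environment file.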

Then check the environment builds by (re)building the Docker:
```bash
docker build -f docker/Dockerfile -t quay.io/microbiome-informatics/emg-notebooks.dev:latest .
task build-notebook-docker
```

## Generating the documentation site
Expand Down Expand Up @@ -148,22 +148,21 @@ The configuration for this is in the `shiny-proxy` dir.

### Testing with a locally built Docker
```bash
docker build -f docker/Dockerfile -t quay.io/microbiome-informatics/emg-notebooks.dev .
task build-notebook-docker
```
(or just retag with `docker tag mgnify-nb-dev quay.io/microbiome-informatics/emg-notebooks.dev` if you already built it as above).

### Running ShinyProxy
- [Download the latest version of ShinyProxy](https://www.shinyproxy.io/downloads/) (>=2.6 is required) into this repo directory. It is a JAR, so you need Java installed.
- The `application.yml` file must be in the same directory as the location you launch Shiny Proxy from.
- If you want the currently deployed image instead of your local one... `docker pull quay.io/microbiome-informatics/emg-notebooks.dev`
- If you want the currently deployed image instead of your local one... `docker pull quay.io/microbiome-informatics/emg-notebooks.dev:latest`
- `cd shiny-proxy`, `java -jar shinyproxy-2.6.1.jar`
- Browse to the ShinyProxy URL, likely localhost:8080


## Jupyter Lab Extension, for deep-linking
`shiny_proxy_jlab_query_parms` contains a [JupyterLab Extension](https://jupyterlab.readthedocs.io/en/stable/user/extensions.html) to support deep-linking into JupyterLab, especially when running inside Shiny Proxy.
`jlab_query_params` contains a [JupyterLab Extension](https://jupyterlab.readthedocs.io/en/stable/user/extensions.html) to support deep-linking into JupyterLab, especially when running inside Shiny Proxy.

This extension was created using the [JupyterLab Extension Cookiecutter TS project](https://github.com/jupyterlab/extension-cookiecutter-ts), which is [BSD3 Licensed](https://github.com/jupyterlab/extension-cookiecutter-ts/blob/3.0/LICENSE).
This extension was created using the [JupyterLab Extension Template copier project](https://github.com/jupyterlab/extension-template), which is [CC0 licensed](https://github.com/jupyterlab/extension-template/blob/main/LICENSE).

This extension is needed because Shiny Proxy does not pass the URL path beyond an app's identifier down to the iframe running the app (JupyterLab).
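
Because the path is lost, the deep-link target and any variables have to travel in the query string instead. The following is a minimal sketch of that idea, not the extension's actual code: the parameter names `jlpath` and `jlvar_` and the function `parse_deep_link` are assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple
from urllib.parse import parse_qs, urlsplit

# Hypothetical parameter names, chosen for illustration only.
PATH_PARAM = "jlpath"
VAR_PREFIX = "jlvar_"


def parse_deep_link(url: str) -> Tuple[Optional[str], Dict[str, str]]:
    """Split a URL's query string into a notebook path and environment variables."""
    query = parse_qs(urlsplit(url).query)
    # parse_qs returns a list per key; keep the last value if a key repeats.
    path = query[PATH_PARAM][-1] if PATH_PARAM in query else None
    env = {
        key[len(VAR_PREFIX):]: values[-1]
        for key, values in query.items()
        if key.startswith(VAR_PREFIX)
    }
    return path, env
```

A server-side handler could then copy `env` into `os.environ` for new kernels and open `path` in the JupyterLab file browser.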

Expand All @@ -181,7 +180,7 @@ This is in the `mgnify_jupyter_lab_ui` folder.

## Testing
A small integration test suite is written using Jest-Puppeteer.
You need to have built or pulled the docker/Dockerfile (tagged as `quay.io/microbiome-informatics/emg-notebooks.dev`), and have Shiny Proxy downloaded first.
You need to have built or pulled the docker/Dockerfile (tagged as `quay.io/microbiome-informatics/emg-notebooks.dev:latest`), and have Shiny Proxy downloaded first.
The test suite runs Shiny Proxy, and makes sure Jupyter Lab opens, the deep-linking works, and variable insertion works in R and Python.

```bash
Expand Down
10 changes: 10 additions & 0 deletions Taskfile.yml
Original file line number Diff line number Diff line change
Expand Up @@ -45,6 +45,16 @@ tasks:
cmds:
- docker run -it -v $PWD/src/notebooks:/home/jovyan/mgnify-examples -p 8888:8888 quay.io/microbiome-informatics/emg-notebooks.dev:latest

build-notebook-docker:
summary: |
Builds a docker image with the Notebooks Server, locally.
Useful if you're developing the Jupyter extensions, installing new R/Python packages to the conda envs etc.
NOT needed if you're just editing/adding notebooks with no additional dependencies.
cmds:
- docker build -f docker/Dockerfile -t quay.io/microbiome-informatics/emg-notebooks.dev:latest .

build-static-docker:
summary: |
Builds a docker image with Quarto included, for statically rendering the notebook outputs.
Expand Down
10 changes: 9 additions & 1 deletion dependencies/py-environment.yml
Original file line number Diff line number Diff line change
@@ -1,13 +1,21 @@
# https://jupyter-docker-stacks.readthedocs.io/en/latest/using/selecting.html#jupyter-datascience-notebook
name: mgnify-py-env
channels:
- conda-forge
- nodefaults
dependencies:
- python==3.11
- pyspark==3.3.2
- openjdk
- pip
- pip:
- jsonapi-client==0.9.9
- pandas==1.5.3
- numpy==1.24.2
- ipykernel==6.21.3
- matplotlib==3.7.1
- seaborn==0.12.2
- plotly==5.13.1
- sourmash==4.1.2
- biopython==1.81
- pyarrow==11.0.0

23 changes: 12 additions & 11 deletions dependencies/r-environment.yml
Original file line number Diff line number Diff line change
@@ -1,15 +1,16 @@
# https://jupyter-docker-stacks.readthedocs.io/en/latest/using/selecting.html#jupyter-datascience-notebook
name: mgnify-r-env
channels:
- conda-forge
- bioconda
- nodefaults
dependencies:
- bioconda::bioconductor-phyloseq=1.42.0
- bioconda::bioconductor-metagenomeSeq=1.40.0
- bioconda::bioconductor-microbiomeMarker=1.4.0
- bioconda::bioconductor-siamcat=2.2.0
- conda-forge::r-reshape2=1.4.4
- conda-forge::r-vegan=2.6_4
- conda-forge::r-devtools=2.4.5
- conda-forge::r-irkernel=1.3.2
- conda-forge::r-tidyverse=2.0.0
- conda-forge::r-ggplot2=3.4.1
- bioconductor-phyloseq=1.42.0
- bioconductor-metagenomeSeq=1.40.0
- bioconductor-microbiomeMarker=1.4.0
- bioconductor-siamcat=2.2.0
- r-reshape2=1.4.4
- r-vegan=2.6_4
- r-devtools=2.4.5
- r-irkernel=1.3.2
- r-tidyverse=2.0.0
- r-ggplot2=3.4.1
19 changes: 11 additions & 8 deletions docker/Dockerfile
Original file line number Diff line number Diff line change
@@ -1,15 +1,17 @@
FROM jupyter/datascience-notebook:r-4.2.2@sha256:c8c75096f0efe1d7cad07ec97ecdf5c6328a119bf6295269c45065c90b0dde2a
FROM jupyter/minimal-notebook:lab-3.6.3@sha256:1443243e00caa7db61de7faaad626a173c4a51d8f15e48e07c6b111de73d0f38
LABEL maintainer="EMBL-EBI Microbiome Informatics (MGnify) Team <[email protected]>"
USER root
ENV CHOWN_HOME_OPTS='-R'
ENV CHOWN_HOME='yes'

# Install Python/R dependencies
COPY dependencies/r-environment.yml /tmp/r-environment.yml
COPY dependencies/py-environment.yml /tmp/py-environment.yml
COPY dependencies/dependencies.R /tmp/dependencies.R
RUN mamba install -y nb_conda_kernels
RUN conda config --system --prepend channels bioconda
COPY dependencies/r-environment.yml /tmp/r-environment.yml
RUN mamba env create -f /tmp/r-environment.yml
COPY dependencies/py-environment.yml /tmp/py-environment.yml
RUN mamba env create -f /tmp/py-environment.yml
COPY dependencies/dependencies.R /tmp/dependencies.R

SHELL ["conda", "run", "-n", "mgnify-r-env", "/bin/bash", "-c"]
RUN Rscript /tmp/dependencies.R
Expand All @@ -22,20 +24,21 @@ RUN Rscript /tmp/populate-mgnifyr-cache.R
SHELL ["/bin/bash", "-c"]

# Install JupyterLab extension to handle query parameter > env vars in ShinyProxy
COPY shiny_proxy_jlab_query_parms /tmp/shiny_proxy_jlab_query_parms
RUN pip install /tmp/shiny_proxy_jlab_query_parms
COPY jlab_query_params /tmp/jlab_query_params
RUN pip install /tmp/jlab_query_params

# Install Jupyter Lab extension providing MGnify-specific help
COPY mgnify_jupyter_lab_ui /tmp/mgnify_jupyter_lab_ui
RUN pip install /tmp/mgnify_jupyter_lab_ui

# Clean yarn cache else chown'ing home is very slow on container start
RUN jlpm cache clean
RUN rm -rf /home/jovyan/.yarn
RUN rm -rf /home/jovyan/.cache
RUN rm -rf /home/jovyan/.npm

# Clean tmp
RUN rm -rf /tmp/*

RUN jupyter kernelspec remove -y julia-1.8
COPY jupyter_config/custom.js /home/jovyan/.jupyter/custom/custom.js
COPY jupyter_config/jupyter_config.json /home/jovyan/.jupyter/jupyter_config.json
COPY src/notebooks /home/jovyan/mgnify-examples
Expand Down
20 changes: 20 additions & 0 deletions jlab_query_params/.copier-answers.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
# Changes here will be overwritten by Copier; NEVER EDIT MANUALLY
_commit: v4.0.10
_src_path: https://github.com/jupyterlab/extension-template
author_email: [email protected]
author_name: Sandy Rogers
data_format: string
file_extension: ''
has_binder: false
has_settings: false
kind: server
labextension_name: jlab_query_params
mimetype: ''
mimetype_name: ''
project_short_description: JupyterLab extension to set environment variables and support
deep-links via query parameters.
python_name: jlab_query_params
repository: ''
test: false
viewer_name: ''

Original file line number Diff line number Diff line change
@@ -1,10 +1,15 @@
*.bundle.*
lib/
node_modules/
*.log
.eslintcache
.stylelintcache
*.egg-info/
.ipynb_checkpoints
*.tsbuildinfo
shiny_proxy_jlab_query_parms/labextension
jlab_query_params/labextension
# Version file is handled by hatchling
jlab_query_params/_version.py

# Created by https://www.gitignore.io/api/python
# Edit at https://www.gitignore.io/?templates=python
Expand Down Expand Up @@ -56,6 +61,7 @@ htmlcov/
.coverage.*
.cache
nosetests.xml
coverage/
coverage.xml
*.cover
.hypothesis/
Expand Down Expand Up @@ -110,3 +116,6 @@ dmypy.json

# OSX files
.DS_Store

# Yarn cache
.yarn/
Original file line number Diff line number Diff line change
Expand Up @@ -2,4 +2,5 @@ node_modules
**/node_modules
**/lib
**/package.json
shiny_proxy_jlab_query_parms
!/package.json
jlab_query_params
3 changes: 3 additions & 0 deletions jlab_query_params/.yarnrc.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
enableImmutableInstalls: false

nodeLinker: node-modules
File renamed without changes.
29 changes: 29 additions & 0 deletions jlab_query_params/LICENSE
Original file line number Diff line number Diff line change
@@ -0,0 +1,29 @@
BSD 3-Clause License

Copyright (c) 2023, Sandy Rogers
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.

3. Neither the name of the copyright holder nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.