
Comparing changes

base repository: openproblems-bio/openproblems-v2
base: 6cdba41787f894aa4d1ff6f4b8d00d866fefb62e
head repository: openproblems-bio/openproblems-v2
compare: 45e9e434a7c99ac8beaee17af84a37d99a7e41e2
  • 11 commits
  • 116 files changed
  • 4 contributors

Commits on Dec 15, 2023

  1. openproblems_v1: Separate input_id from dataset_id (#311)

    * allow input ids to be different from the dataset ids
    
    * implement changes in the wf
    rcannood authored Dec 15, 2023 (commit 298dcb1)

Commits on Dec 16, 2023

  1. Add SIMLR (#312)

    * add SIMLR dimensionality reduction method
    
    * add description and reference
    
    * add SIMLR reference
    
    * change default n_dim and write output to file
    
    * Add SIMLR entry
    
    * Update documentation URL
    
    Co-authored-by: Kai Waldrant <[email protected]>
    
    * Reformat code
    
    * Use explicit namespaces
    
    ---------
    
    Co-authored-by: Kai Waldrant <[email protected]>
    sainirmayi and KaiWaldrant authored Dec 16, 2023 (commit bdbf261)
  2. rework dataset scripts (#310)

    * rework dataset scripts
    
    * fix scripts
    
    * Update src/tasks/dimensionality_reduction/resources_scripts/run_test.sh
    
    Co-authored-by: Kai Waldrant <[email protected]>
    
    * fix scripts
    
    * change dataset_id to input_id
    
    * update dataset_id to input_id
    
    * Update compute environment in resource scripts
    
    ---------
    
    Co-authored-by: Kai Waldrant <[email protected]>
    rcannood and KaiWaldrant authored Dec 16, 2023 (commit 35e065d)

Commits on Dec 19, 2023

  1. Add Neurips 2021 dataset loader (#309)

    * Add neurips2021 dataset loader
    
    * add test script
    
    * Add process_openproblems_neurips2021_bmmc workflow
    
    * Add resource_test script for processing NeurIPS 2021 BMMC dataset
    
    * Update predict_modality workflow and resource test script
    
    * Update neurips dataset loader
    
    * fix predict_modality to work with new data format
    
    * update neurips2021_bmmc.sh source path
    
    * force ci test
    
    * Add test resource file for openproblems_neurips2021_bmmc
    
    * download full dataset as tempfile
    
    * make fixes to the PM interface
    
    ---------
    
    Co-authored-by: Robrecht Cannoodt <[email protected]>
    KaiWaldrant and rcannood authored Dec 19, 2023 (commit cef0e51)
  2. Commit e233515
  3. Update benchmarking workflows (#313)

    * update denoising
    
    * WIP label_projection
    
    * update denoising process_datasets to store dataset normalization_id
    
    Co-authored-by: sainirmayi <[email protected]>
    
    * fix run_test typo denoising
    
    * fix wf label_proj and denoising
    
    * add label_proj test
    
    * fix label_projection wf
    
    * update dim_red
    
    * update match_modalities
    
    * update process datasets
    
    * update predict_modality
    
    ---------
    
    Co-authored-by: sainirmayi <[email protected]>
    KaiWaldrant and sainirmayi authored Dec 19, 2023 (commit 0a22803)
  4. Add components for extracting dataset info (#315)

    * add dataset comp
    
    * add get_dataset_info to workflow
    
    * commit
    
    * undo changes -- will be addressed in #314
    
    * move get_dataset_info component
    
    * remove unnecessary dependencies
    
    * add component for extracting the dataset info
    
    * fix script
    
    * fix typo
    
    * fix script
    
    * update script
    
    * fix get_dataset_info
    
    ---------
    
    Co-authored-by: Kai Waldrant <[email protected]>
    rcannood and KaiWaldrant authored Dec 19, 2023 (commit 17cc7cf)
  5. Fix dataset info components (#316)

    * simplify get_dataset_info component
    
    * fix script
    
    * change default into example
    
    * fix script
    
    * fix script
    rcannood authored Dec 19, 2023 (commit 4b7c085)
  6. Remove normalization from denoising (#318)

    * remove normalization from denoising task
    
    * remove unused api file
    rcannood authored Dec 19, 2023 (commit 1930eb1)
  7. remove multimodal starter dataset (#317)

    * remove multimodal starter dataset
    
    * fix indentation
    
    ---------
    
    Co-authored-by: Robrecht Cannoodt <[email protected]>
    KaiWaldrant and rcannood authored Dec 19, 2023 (commit 2cf2a73)
  8. update readmes (#319)

    rcannood authored Dec 19, 2023 (commit 45e9e43)
Showing with 2,350 additions and 1,500 deletions.
  1. +2 −0 CHANGELOG.md
  2. +32 −0 src/common/library.bib
  3. +20 −0 src/common/process_task_results/get_dataset_info/config.vsh.yaml
  4. +25 −0 src/common/process_task_results/get_dataset_info/script.R
  5. +0 −4 src/common/process_task_results/get_method_info/config.vsh.yaml
  6. +0 −4 src/common/process_task_results/get_metric_info/config.vsh.yaml
  7. +2 −2 src/common/process_task_results/get_results/script.R
  8. +1 −0 src/common/process_task_results/run/config.vsh.yaml
  9. +1 −2 src/common/process_task_results/run/main.nf
  10. +3 −3 src/common/process_task_results/run/run_nf_tower_test.sh
  11. +2 −2 src/common/process_task_results/run/run_test.sh
  12. +3 −8 src/common/process_task_results/yaml_to_json/script.py
  13. +1 −4 src/common/resources_test_scripts/task_metadata.sh
  14. +70 −0 src/datasets/loaders/openproblems_neurips2021_bmmc/config.vsh.yaml
  15. +106 −0 src/datasets/loaders/openproblems_neurips2021_bmmc/script.py
  16. +70 −0 src/datasets/loaders/openproblems_neurips2021_bmmc/test.py
  17. +6 −2 src/datasets/loaders/openproblems_v1/config.vsh.yaml
  18. +1 −1 src/datasets/loaders/openproblems_v1/script.py
  19. +5 −3 src/datasets/loaders/openproblems_v1/test.py
  20. +6 −2 src/datasets/loaders/openproblems_v1_multimodal/config.vsh.yaml
  21. +1 −1 src/datasets/loaders/openproblems_v1_multimodal/script.py
  22. +6 −4 src/datasets/loaders/openproblems_v1_multimodal/test.py
  23. +126 −6 src/datasets/resource_scripts/cellxgene_census.sh
  24. +32 −0 src/datasets/resource_scripts/cellxgene_census_test.sh
  25. +0 −145 src/datasets/resource_scripts/cellxgene_census_tower.sh
  26. +54 −0 src/datasets/resource_scripts/dataset_info.sh
  27. +55 −0 src/datasets/resource_scripts/openproblems_neurips2021_multimodal.sh
  28. +37 −28 src/datasets/resource_scripts/openproblems_v1.sh
  29. +14 −19 src/datasets/resource_scripts/openproblems_v1_multimodal.sh
  30. +0 −58 src/datasets/resource_scripts/openproblems_v1_multimodal_nf_tower.sh
  31. +45 −0 src/datasets/resource_scripts/openproblems_v1_multimodal_test.sh
  32. +0 −154 src/datasets/resource_scripts/openproblems_v1_nf_tower.sh
  33. +51 −0 src/datasets/resource_scripts/openproblems_v1_test.sh
  34. +0 −87 src/datasets/resource_test_scripts/bmmc_x_starter.sh
  35. +58 −0 src/datasets/resource_test_scripts/neurips2021_bmmc.sh
  36. +34 −0 src/datasets/workflows/extract_dataset_info/config.vsh.yaml
  37. +57 −0 src/datasets/workflows/extract_dataset_info/main.nf
  38. +32 −0 src/datasets/workflows/extract_dataset_info/run_test.sh
  39. +135 −0 src/datasets/workflows/process_openproblems_neurips2021_bmmc/config.vsh.yaml
  40. +171 −0 src/datasets/workflows/process_openproblems_neurips2021_bmmc/main.nf
  41. +5 −1 src/datasets/workflows/process_openproblems_v1/config.vsh.yaml
  42. +2 −1 src/datasets/workflows/process_openproblems_v1/main.nf
  43. +5 −1 src/datasets/workflows/process_openproblems_v1_multimodal/config.vsh.yaml
  44. +2 −1 src/datasets/workflows/process_openproblems_v1_multimodal/main.nf
  45. +0 −25 src/tasks/batch_integration/nf_tower_scripts/process_datasets.sh
  46. +0 −27 src/tasks/batch_integration/nf_tower_scripts/run_benchmark.sh
  47. +28 −22 src/tasks/batch_integration/resources_scripts/process_datasets.sh
  48. +22 −26 src/tasks/batch_integration/resources_scripts/run_benchmark.sh
  49. +5 −5 src/tasks/batch_integration/{nf_tower_scripts → resources_scripts}/run_test.sh
  50. +3 −1 src/tasks/denoising/README.md
  51. +0 −16 src/tasks/denoising/api/file_dataset.yaml
  52. +0 −25 src/tasks/denoising/nf_tower_scripts/process_datasets.sh
  53. +0 −28 src/tasks/denoising/nf_tower_scripts/run_benchmark.sh
  54. +29 −22 src/tasks/denoising/resources_scripts/process_datasets.sh
  55. +24 −16 src/tasks/denoising/resources_scripts/run_benchmark.sh
  56. +5 −4 src/tasks/denoising/{nf_tower_scripts → resources_scripts}/run_test.sh
  57. +25 −3 src/tasks/denoising/workflows/run_benchmark/config.vsh.yaml
  58. +55 −32 src/tasks/denoising/workflows/run_benchmark/main.nf
  59. +4 −4 src/tasks/denoising/workflows/run_benchmark/run_test.sh
  60. +3 −1 src/tasks/dimensionality_reduction/README.md
  61. +57 −0 src/tasks/dimensionality_reduction/methods/simlr/config.vsh.yaml
  62. +67 −0 src/tasks/dimensionality_reduction/methods/simlr/script.R
  63. +0 −25 src/tasks/dimensionality_reduction/nf_tower_scripts/process_datasets.sh
  64. +0 −27 src/tasks/dimensionality_reduction/nf_tower_scripts/run_benchmark.sh
  65. +29 −20 src/tasks/dimensionality_reduction/resources_scripts/process_datasets.sh
  66. +23 −26 src/tasks/dimensionality_reduction/resources_scripts/run_benchmark.sh
  67. +5 −4 src/tasks/dimensionality_reduction/{nf_tower_scripts → resources_scripts}/run_test.sh
  68. +25 −2 src/tasks/dimensionality_reduction/workflows/run_benchmark/config.vsh.yaml
  69. +63 −19 src/tasks/dimensionality_reduction/workflows/run_benchmark/main.nf
  70. +2 −2 src/tasks/dimensionality_reduction/workflows/run_benchmark/run_test.sh
  71. +0 −25 src/tasks/label_projection/nf_tower_scripts/process_datasets.sh
  72. +0 −25 src/tasks/label_projection/nf_tower_scripts/run_benchmark.sh
  73. +29 −22 src/tasks/label_projection/resources_scripts/process_datasets.sh
  74. +22 −27 src/tasks/label_projection/resources_scripts/run_benchmark.sh
  75. +25 −3 src/tasks/label_projection/workflows/run_benchmark/config.vsh.yaml
  76. +58 −18 src/tasks/label_projection/workflows/run_benchmark/main.nf
  77. +31 −0 src/tasks/label_projection/workflows/run_benchmark/run_test.sh
  78. +0 −25 src/tasks/match_modalities/nf_tower_scripts/process_datasets.sh
  79. +0 −25 src/tasks/match_modalities/nf_tower_scripts/run_benchmark.sh
  80. +29 −22 src/tasks/match_modalities/resources_scripts/process_datasets.sh
  81. +22 −27 src/tasks/match_modalities/resources_scripts/run_benchmark.sh
  82. +25 −3 src/tasks/match_modalities/workflows/run_benchmark/config.vsh.yaml
  83. +62 −19 src/tasks/match_modalities/workflows/run_benchmark/main.nf
  84. +31 −0 src/tasks/match_modalities/workflows/run_benchmark/run_test.sh
  85. +34 −24 src/tasks/predict_modality/README.md
  86. +2 −2 src/tasks/predict_modality/api/comp_control_method.yaml
  87. +2 −2 src/tasks/predict_modality/api/comp_method.yaml
  88. +2 −2 src/tasks/predict_modality/api/comp_metric.yaml
  89. +4 −4 src/tasks/predict_modality/api/comp_process_dataset.yaml
  90. +29 −5 src/tasks/predict_modality/api/file_common_dataset_other_mod.yaml
  91. +5 −5 src/tasks/predict_modality/api/file_common_dataset_rna.yaml
  92. +0 −67 src/tasks/predict_modality/api/file_dataset_other_mod.yaml
  93. +0 −43 src/tasks/predict_modality/api/file_dataset_rna.yaml
  94. +1 −1 src/tasks/predict_modality/api/file_prediction.yaml
  95. +1 −1 src/tasks/predict_modality/api/file_score.yaml
  96. +1 −1 src/tasks/predict_modality/api/file_test_mod1.yaml
  97. +1 −1 src/tasks/predict_modality/api/file_test_mod2.yaml
  98. +1 −1 src/tasks/predict_modality/api/file_train_mod1.yaml
  99. +1 −1 src/tasks/predict_modality/api/file_train_mod2.yaml
  100. +4 −4 src/tasks/predict_modality/control_methods/meanpergene/script.py
  101. +3 −3 src/tasks/predict_modality/control_methods/random_predict/script.R
  102. +1 −1 src/tasks/predict_modality/control_methods/solution/script.R
  103. +3 −3 src/tasks/predict_modality/control_methods/zeros/script.py
  104. +3 −3 src/tasks/predict_modality/methods/knnr_py/script.py
  105. +1 −3 src/tasks/predict_modality/methods/newwave_knnr/script.R
  106. +3 −3 src/tasks/predict_modality/metrics/correlation/script.R
  107. +3 −3 src/tasks/predict_modality/metrics/mse/script.py
  108. +0 −25 src/tasks/predict_modality/nf_tower_scripts/process_datasets.sh
  109. +0 −27 src/tasks/predict_modality/nf_tower_scripts/run_benchmark.sh
  110. +25 −21 src/tasks/predict_modality/process_dataset/script.R
  111. +22 −22 src/tasks/predict_modality/resources_scripts/process_datasets.sh
  112. +23 −26 src/tasks/predict_modality/resources_scripts/run_benchmark.sh
  113. +7 −7 src/tasks/predict_modality/resources_test_scripts/{bmmc_x_starter.sh → neurips2021_bmmc.sh}
  114. +25 −3 src/tasks/predict_modality/workflows/run_benchmark/config.vsh.yaml
  115. +54 −17 src/tasks/predict_modality/workflows/run_benchmark/main.nf
  116. +3 −3 src/tasks/predict_modality/workflows/run_benchmark/run_test.sh
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -307,6 +307,8 @@

* `metrics/trustworthiness` should be removed because it is already included in `metrics/coranking`.

* `methods/simlr`: Added new SIMLR method.


## match_modalities (PR #201)

32 changes: 32 additions & 0 deletions src/common/library.bib
@@ -964,6 +964,18 @@ @article{nestorowa2016single
url = {https://doi.org/10.1182/blood-2016-05-716480}
}

@inproceedings{neurips,
author = {Luecken, Malte and Burkhardt, Daniel and Cannoodt, Robrecht and Lance, Christopher and Agrawal, Aditi and Aliee, Hananeh and Chen, Ann and Deconinck, Louise and Detweiler, Angela and Granados, Alejandro and Huynh, Shelly and Isacco, Laura and Kim, Yang and Klein, Dominik and DE KUMAR, BONY and Kuppasani, Sunil and Lickert, Heiko and McGeever, Aaron and Melgarejo, Joaquin and Mekonen, Honey and Morri, Maurizio and M\"{u}ller, Michaela and Neff, Norma and Paul, Sheryl and Rieck, Bastian and Schneider, Kaylie and Steelman, Scott and Sterr, Michael and Treacy, Daniel and Tong, Alexander and Villani, Alexandra-Chloe and Wang, Guilin and Yan, Jia and Zhang, Ce and Pisco, Angela and Krishnaswamy, Smita and Theis, Fabian and Bloom, Jonathan M},
booktitle = {Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks},
editor = {J. Vanschoren and S. Yeung},
pages = {},
publisher = {Curran},
title = {A sandbox for prediction and integration of DNA, RNA, and proteins in single cells},
url = {https://datasets-benchmarks-proceedings.neurips.cc/paper_files/paper/2021/file/158f3069a435b314a80bdcb024f8e422-Paper-round2.pdf},
volume = {1},
year = {2021}
}


@string{nov = {Nov.}}
@@ -1348,6 +1360,26 @@ @article{wang2013target
}


@article{wang2017visualization,
title = {Visualization and analysis of single-cell {RNA}-seq data by kernel-based similarity learning},
volume = {14},
copyright = {2017 Springer Nature America, Inc.},
issn = {1548-7105},
url = {https://www.nature.com/articles/nmeth.4207},
doi = {10.1038/nmeth.4207},
abstract = {The SIMLR software identifies similarities between cells across a range of single-cell RNA-seq data, enabling effective dimension reduction, clustering and visualization.},
language = {en},
number = {4},
journal = {Nature Methods},
author = {Wang, Bo and Zhu, Junjie and Pierson, Emma and Ramazzotti, Daniele and Batzoglou, Serafim},
month = apr,
year = {2017},
publisher = {Nature Publishing Group},
keywords = {Gene expression, Genome informatics, Machine learning, Statistical methods},
pages = {414--416},
}


@article{welch2019single,
title = {Single-Cell Multi-omic Integration Compares and Contrasts Features of Brain Cell Identity},
author = {Joshua D. Welch and Velina Kozareva and Ashley Ferreira and Charles Vanderburg and Carly Martin and Evan Z. Macosko},
20 changes: 20 additions & 0 deletions src/common/process_task_results/get_dataset_info/config.vsh.yaml
@@ -0,0 +1,20 @@
__merge__: ../api/get_info.yaml
functionality:
  name: "get_dataset_info"
  description: "Extract dataset info and convert to expected format for website results"
  resources:
    - type: r_script
      path: script.R
  test_resources:
    - type: file
      path: /resources_test/common/task_metadata/dataset_info.yaml
      dest: test_file.yaml
platforms:
  - type: docker
    image: ghcr.io/openproblems-bio/base_r:1.0.2
    setup:
      - type: r
        cran: [ yaml, jsonlite ]
  - type: nextflow
    directives:
      label: [lowmem, lowtime, lowcpu]
25 changes: 25 additions & 0 deletions src/common/process_task_results/get_dataset_info/script.R
@@ -0,0 +1,25 @@
requireNamespace("jsonlite", quietly = TRUE)
requireNamespace("yaml", quietly = TRUE)

## VIASH START
par <- list(
  input = "resources_test/common/task_metadata/dataset_info.yaml",
  output = "output/dataset_info.json"
)
## VIASH END

datasets <- yaml::yaml.load_file(par$input)

# transform into format expected by website
datasets_formatted <- lapply(datasets, function(dataset) {
  dataset$data_url <- dataset$dataset_url
  dataset$data_reference <- dataset$dataset_reference
  dataset
})

jsonlite::write_json(
  datasets_formatted,
  par$output,
  auto_unbox = TRUE,
  pretty = TRUE
)
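For orientation, a component defined this way can normally be executed on its own with the viash CLI. The call below is a hedged sketch, not a command taken from this changeset: it assumes viash is installed and that the test resource referenced in the config is available locally.

# Hypothetical local run of the new get_dataset_info component (paths are illustrative)
viash run src/common/process_task_results/get_dataset_info/config.vsh.yaml -- \
  --input resources_test/common/task_metadata/dataset_info.yaml \
  --output output/dataset_info.json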
4 changes: 0 additions & 4 deletions src/common/process_task_results/get_method_info/config.vsh.yaml
@@ -15,10 +15,6 @@ platforms:
    setup:
      - type: r
        cran: [ purrr, dplyr, yaml, rlang, processx ]
      - type: apt
        packages: [ curl, default-jdk ]
      - type: docker
        run: "curl -fsSL dl.viash.io | bash && mv viash /usr/bin/viash"
  - type: nextflow
    directives:
      label: [lowmem, lowtime, lowcpu]
4 changes: 0 additions & 4 deletions src/common/process_task_results/get_metric_info/config.vsh.yaml
@@ -15,10 +15,6 @@ platforms:
    setup:
      - type: r
        cran: [ purrr, dplyr, yaml, rlang, processx ]
      - type: apt
        packages: [ curl, default-jdk ]
      - type: docker
        run: "curl -fsSL dl.viash.io | bash && mv viash /usr/bin/viash"
  - type: nextflow
    directives:
      label: [lowmem, lowtime, lowcpu]
4 changes: 2 additions & 2 deletions src/common/process_task_results/get_results/script.R
@@ -3,8 +3,8 @@ library(rlang)

## VIASH START
par <- list(
  input_scores = "output/v2/batch_integration/scores.yaml",
  input_execution = "output/v2/batch_integration/trace.txt",
  input_scores = "resources/batch_integration/results/scores.yaml",
  input_execution = "resources/batch_integration/results/trace.txt",
  output = "test.json"
)
## VIASH END
1 change: 1 addition & 0 deletions src/common/process_task_results/run/config.vsh.yaml
@@ -79,6 +79,7 @@ functionality:
    - name: common/process_task_results/get_results
    - name: common/process_task_results/get_method_info
    - name: common/process_task_results/get_metric_info
    - name: common/process_task_results/get_dataset_info
    - name: common/process_task_results/yaml_to_json
platforms:
  - type: nextflow
3 changes: 1 addition & 2 deletions src/common/process_task_results/run/main.nf
@@ -34,8 +34,7 @@ workflow run_wf {
}
)

| yaml_to_json.run(
key: "dataset_info",
| get_dataset_info.run(
fromState: [
"input": "input_dataset_info",
"output": "output_dataset_info"
6 changes: 3 additions & 3 deletions src/common/process_task_results/run/run_nf_tower_test.sh
@@ -1,9 +1,9 @@
#!/bin/bash

DATASETS_DIR="s3://openproblems-nextflow/output/v2/batch_integration"
DATASETS_DIR="s3://openproblems-data/resources/batch_integration/results/"

# try running on nf tower
cat > /tmp/params.yaml << HERE
cat > /tmp/params.yaml << 'HERE'
id: batch_integration_transform
input_scores: "$DATASETS_DIR/scores.yaml"
input_dataset_info: "$DATASETS_DIR/dataset_info.yaml"
@@ -33,6 +33,6 @@ tw launch https://github.com/openproblems-bio/openproblems-v2.git \
--pull-latest \
--main-script target/nextflow/common/workflows/transform_meta/main.nf \
--workspace 53907369739130 \
--compute-env 7IkB9ckC81O0dgNemcPJTD \
--compute-env 1pK56PjjzeraOOC2LDZvN2 \
--params-file /tmp/params.yaml \
--config /tmp/nextflow.config
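One detail of the diff above worth spelling out: quoting the heredoc delimiter (<< 'HERE' instead of << HERE) disables parameter expansion inside the heredoc, so $DATASETS_DIR is written to /tmp/params.yaml as a literal string rather than being substituted by the shell. A minimal standalone bash illustration of the difference (not part of this changeset):

#!/bin/bash
NAME="world"

# Unquoted delimiter: the shell expands variables inside the heredoc
cat << EOF
hello $NAME
EOF
# prints: hello world

# Quoted delimiter: the heredoc body is taken literally, no expansion
cat << 'EOF'
hello $NAME
EOF
# prints: hello $NAME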
4 changes: 2 additions & 2 deletions src/common/process_task_results/run/run_test.sh
@@ -10,8 +10,8 @@ set -e

# export TOWER_WORKSPACE_ID=53907369739130

DATASETS_DIR="output/v2/batch_integration"
OUTPUT_DIR="/home/kai/Documents/openroblems/website/results/batch_integration_feature/data"
DATASETS_DIR="resources/batch_integration/results"
OUTPUT_DIR="../website/results/batch_integration_feature/data"

if [ ! -d "$OUTPUT_DIR" ]; then
mkdir -p "$OUTPUT_DIR"
11 changes: 3 additions & 8 deletions src/common/process_task_results/yaml_to_json/script.py
@@ -1,21 +1,16 @@
from os import path
import yaml
import json

## VIASH START
par = {
    "input" : ".",
    "task_id" : "denoising",
    "input": ".",
    "task_id": "denoising",
    "output": "output/task.json",

}
meta = { "functionality" : "foo" }

## VIASH END

with open(par["input"], "r") as f:
    yaml_file = yaml.safe_load(f)


with open(par["output"], "w") as out:
    json.dump(yaml_file, out, indent=2)
    json.dump(yaml_file, out, indent=2)
5 changes: 1 addition & 4 deletions src/common/resources_test_scripts/task_metadata.sh
@@ -128,9 +128,6 @@ nextflow run . \
-entry auto \
--input_states "$DATASETS_DIR/**/state.yaml" \
--rename_keys 'input_dataset:output_dataset,input_solution:output_solution' \
--settings '{"output_scores": "scores.yaml", "output_dataset_info": "dataset_info.yaml", "output_method_configs": "method_configs.yaml", "output_metric_configs": "metric_configs.yaml"}' \
--settings '{"output_scores": "scores.yaml", "output_dataset_info": "dataset_info.yaml", "output_method_configs": "method_configs.yaml", "output_metric_configs": "metric_configs.yaml", "output_task_info": "task_info.yaml"}' \
--publish_dir "$OUTPUT_DIR" \
--output_state "state.yaml"

# Copy task info
cp src/tasks/batch_integration/api/task_info.yaml "$OUTPUT_DIR/task_info.yaml"
70 changes: 70 additions & 0 deletions src/datasets/loaders/openproblems_neurips2021_bmmc/config.vsh.yaml
@@ -0,0 +1,70 @@
functionality:
  name: "openproblems_neurips2021_bmmc"
  namespace: "datasets/loaders"
  description: "Fetch a dataset from the OpenProblems NeurIPS2021 competition"
  argument_groups:
    - name: Inputs
      arguments:
        - name: "--input"
          type: file
          description: Processed h5ad file published at https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE194122.
          required: true
          example: GSE194122_openproblems_neurips2021_cite_BMMC_processed.h5ad
        - name: "--mod1"
          type: string
          description: Name of the first modality.
          required: true
          example: GEX
        - name: "--mod2"
          type: string
          description: Name of the second modality.
          required: true
          example: ADT
    - name: Metadata
      arguments:
        - name: "--dataset_name"
          type: string
          description: Nicely formatted name.
          required: true
        - name: "--dataset_url"
          type: string
          description: Link to the original source of the dataset.
          required: false
        - name: "--dataset_reference"
          type: string
          description: Bibtex reference of the paper in which the dataset was published.
          required: false
        - name: "--dataset_summary"
          type: string
          description: Short description of the dataset.
          required: true
        - name: "--dataset_description"
          type: string
          description: Long description of the dataset.
          required: true
        - name: "--dataset_organism"
          type: string
          description: The organism of the dataset.
          required: false
    - name: Outputs
      arguments:
        - name: "--output_mod1"
          __merge__: ../../api/file_raw.yaml
          direction: "output"
        - name: "--output_mod2"
          __merge__: ../../api/file_raw.yaml
          direction: "output"
  resources:
    - type: python_script
      path: script.py
  test_resources:
    - type: python_script
      path: test.py
    - type: file
      path: /resources_test/common/openproblems_neurips2021/neurips2021_bmmc_cite.h5ad
platforms:
  - type: docker
    image: ghcr.io/openproblems-bio/base_python:1.0.2
  - type: nextflow
    directives:
      label: [ highmem, midcpu , midtime]
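As a rough sketch of how this loader could be exercised locally via viash (all values below are placeholders derived from the argument examples in the config; the output filenames are assumptions, not part of this changeset):

# Hypothetical invocation of the NeurIPS 2021 BMMC loader
viash run src/datasets/loaders/openproblems_neurips2021_bmmc/config.vsh.yaml -- \
  --input GSE194122_openproblems_neurips2021_cite_BMMC_processed.h5ad \
  --mod1 GEX \
  --mod2 ADT \
  --dataset_name "OpenProblems NeurIPS2021 CITE-seq BMMC" \
  --dataset_summary "BMMC CITE-seq data from the NeurIPS 2021 multimodal competition" \
  --dataset_description "Processed CITE-seq BMMC dataset (GSE194122) used in the NeurIPS 2021 benchmarking competition" \
  --output_mod1 raw_mod1.h5ad \
  --output_mod2 raw_mod2.h5ad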