* update to viash 0.9 and categorise datasets
* group workflows
* add api for spatial datasets
* add more metadata
* update publish dir path
* update project config
* update namespace
* fix id
* update example
* fix example
* update test resources
* update helper resources
* fix multiple separator

Co-authored-by: Robrecht Cannoodt <[email protected]>
Commit e7b3859 (1 parent: 8f49337)
Showing 117 changed files, with 2,985 additions and 2,761 deletions.
```diff
@@ -1,16 +1,15 @@
-functionality:
-  namespace: "datasets/loaders"
-  info:
-    type: dataset_loader
-    type_info:
-      label: Dataset loader
-      summary: A component which generates a "Common dataset".
-      description: |
-        A dataset loader will typically have an identifier (e.g. a GEO identifier)
-        or URL as input argument and additional arguments to define where the script needs to download a dataset from and how to process it.
-  arguments:
-    - name: "--output"
-      __merge__: file_raw.yaml
-      direction: "output"
-      required: true
-  test_resources: []
+# namespace: "datasets/loaders"
+info:
+  type: dataset_loader
+  type_info:
+    label: Dataset loader
+    summary: A component which generates a "Common dataset".
+    description: |
+      A dataset loader will typically have an identifier (e.g. a GEO identifier)
+      or URL as input argument and additional arguments to define where the script needs to download a dataset from and how to process it.
+arguments:
+  - name: "--output"
+    __merge__: file_raw.yaml
+    direction: "output"
+    required: true
+test_resources: []
```
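The loader contract above (an identifier in, a raw "Common dataset" out through `--output`) can be sketched using Viash's convention of injecting a `par` dict into the component script. Everything below the dict is placeholder logic: the identifier, the matrix sizes, and the fake counts are purely illustrative, not a real loader.

```python
# Hypothetical sketch of a dataset loader script. Viash injects a `par`
# dict with one entry per declared argument; a real component would
# download the dataset, build an AnnData object, and write it to
# par["output"].
import random

par = {"output": "dataset_raw.h5ad"}  # filled in by Viash at runtime

def load_raw_counts(identifier: str, n_obs: int = 100, n_var: int = 50, seed: int = 0):
    """Stand-in for downloading `identifier` and parsing it into a counts matrix."""
    random.seed(seed)
    return [[random.randint(0, 10) for _ in range(n_var)] for _ in range(n_obs)]

counts = load_raw_counts("GSE000000")  # hypothetical GEO identifier
```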
```diff
@@ -1,36 +1,35 @@
-functionality:
-  namespace: "datasets/normalization"
-  info:
-    type: dataset_normalization
-    type_info:
-      label: Dataset normalization
-      summary: |
-        A normalization method which processes the raw counts into a normalized dataset.
-      description:
-        A component for normalizing the raw counts as output by dataset loaders into a normalized dataset.
-  arguments:
-    - name: "--input"
-      __merge__: file_raw.yaml
-      direction: input
-      required: true
-    - name: "--output"
-      __merge__: file_normalized.yaml
-      direction: output
-      required: true
-    - name: "--normalization_id"
-      type: string
-      description: "The normalization id to store in the dataset metadata. If not specified, the functionality name will be used."
-      required: false
-    - name: "--layer_output"
-      type: string
-      default: "normalized"
-      description: The name of the layer in which to store the normalized data.
-    - name: "--obs_size_factors"
-      type: string
-      default: "size_factors"
-      description: In which .obs slot to store the size factors (if any).
-  test_resources:
-    - path: /resources_test/common/pancreas
-      dest: resources_test/common/pancreas
-    - type: python_script
-      path: /src/common/comp_tests/run_and_check_adata.py
+namespace: "datasets/normalization"
+info:
+  type: dataset_normalization
+  type_info:
+    label: Dataset normalization
+    summary: |
+      A normalization method which processes the raw counts into a normalized dataset.
+    description:
+      A component for normalizing the raw counts as output by dataset loaders into a normalized dataset.
+arguments:
+  - name: "--input"
+    __merge__: file_raw.yaml
+    direction: input
+    required: true
+  - name: "--output"
+    __merge__: file_normalized.yaml
+    direction: output
+    required: true
+  - name: "--normalization_id"
+    type: string
+    description: "The normalization id to store in the dataset metadata. If not specified, the functionality name will be used."
+    required: false
+  - name: "--layer_output"
+    type: string
+    default: "normalized"
+    description: The name of the layer in which to store the normalized data.
+  - name: "--obs_size_factors"
+    type: string
+    default: "size_factors"
+    description: In which .obs slot to store the size factors (if any).
+test_resources:
+  - path: /resources_test/common/pancreas
+    dest: resources_test/common/pancreas
+  - type: python_script
+    path: /common/component_tests/run_and_check_output.py
```
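The normalization contract above (raw counts in; a layer named by `--layer_output` plus per-cell size factors named by `--obs_size_factors` out) can be sketched in plain numpy. The log1p-CPM method used here is an illustrative choice, not an algorithm mandated by this API.

```python
import numpy as np

# Minimal sketch of a normalization component's core, assuming log1p-CPM:
# the returned arrays correspond to .layers["normalized"] and
# .obs["size_factors"] in the real AnnData output.
def normalize(raw_counts: np.ndarray):
    size_factors = raw_counts.sum(axis=1)           # total counts per cell
    cpm = raw_counts / size_factors[:, None] * 1e6  # counts per million
    normalized = np.log1p(cpm)                      # log-transform
    return normalized, size_factors

raw = np.array([[1.0, 9.0], [5.0, 15.0]])
norm, sf = normalize(raw)
```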
```diff
@@ -1,40 +1,39 @@
-functionality:
-  namespace: "datasets/processors"
-  info:
-    type: dataset_processor
-    type_info:
-      label: HVG
-      summary: |
-        Computes the highly variable genes scores.
-      description: |
-        The resulting AnnData will contain both a boolean 'hvg' column in 'var', as well as a numerical 'hvg_score' in 'var'.
-  arguments:
-    - name: "--input"
-      __merge__: file_normalized.yaml
-      required: true
-      direction: input
-    - name: "--input_layer"
-      type: string
-      default: "normalized"
-      description: Which layer to use as input.
-    - name: "--output"
-      direction: output
-      __merge__: file_hvg.yaml
-      required: true
-    - name: "--var_hvg"
-      type: string
-      default: "hvg"
-      description: "In which .var slot to store whether a feature is considered to be hvg."
-    - name: "--var_hvg_score"
-      type: string
-      default: "hvg_score"
-      description: "In which .var slot to store the gene variance score (normalized dispersion)."
-    - name: "--num_features"
-      type: integer
-      default: 1000
-      description: "The number of HVG to select"
-  test_resources:
-    - path: /resources_test/common/pancreas
-      dest: resources_test/common/pancreas
-    - type: python_script
-      path: /src/common/comp_tests/run_and_check_adata.py
+namespace: "datasets/processors"
+info:
+  type: dataset_processor
+  type_info:
+    label: HVG
+    summary: |
+      Computes the highly variable genes scores.
+    description: |
+      The resulting AnnData will contain both a boolean 'hvg' column in 'var', as well as a numerical 'hvg_score' in 'var'.
+arguments:
+  - name: "--input"
+    __merge__: file_normalized.yaml
+    required: true
+    direction: input
+  - name: "--input_layer"
+    type: string
+    default: "normalized"
+    description: Which layer to use as input.
+  - name: "--output"
+    direction: output
+    __merge__: file_hvg.yaml
+    required: true
+  - name: "--var_hvg"
+    type: string
+    default: "hvg"
+    description: "In which .var slot to store whether a feature is considered to be hvg."
+  - name: "--var_hvg_score"
+    type: string
+    default: "hvg_score"
+    description: "In which .var slot to store the gene variance score (normalized dispersion)."
+  - name: "--num_features"
+    type: integer
+    default: 1000
+    description: "The number of HVG to select"
+test_resources:
+  - path: /resources_test/common/pancreas
+    dest: resources_test/common/pancreas
+  - type: python_script
+    path: /common/component_tests/run_and_check_output.py
```
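Per the config above, an HVG component fills a boolean flag per gene (the `--var_hvg` slot) and a numerical score (`--var_hvg_score`), keeping the top `--num_features` genes. A plain-numpy stand-in using variance/mean dispersion, a simplification of the normalized dispersion scanpy computes, might look like:

```python
import numpy as np

# Sketch of the HVG contract: returns the boolean .var["hvg"] column and
# the numerical .var["hvg_score"] column. Dispersion = variance / mean,
# with zero-expression genes scored 0.
def hvg_scores(normalized: np.ndarray, num_features: int = 1000):
    mean = normalized.mean(axis=0)
    var = normalized.var(axis=0)
    dispersion = np.divide(var, mean, out=np.zeros_like(var), where=mean > 0)
    order = np.argsort(dispersion)[::-1]          # highest dispersion first
    hvg = np.zeros(normalized.shape[1], dtype=bool)
    hvg[order[:num_features]] = True
    return hvg, dispersion
```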
```diff
@@ -1,39 +1,38 @@
-functionality:
-  namespace: "datasets/processors"
-  info:
-    type: dataset_processor
-    type_info:
-      label: KNN
-      summary: |
-        Computes the k-nearest-neighbours for each cell.
-      description: |
-        The resulting AnnData will contain both the knn distances and the knn connectivities in 'obsp'.
-  arguments:
-    - name: "--input"
-      __merge__: file_pca.yaml
-      required: true
-      direction: input
-    - name: "--input_layer"
-      type: string
-      default: "normalized"
-      description: Which layer to use as input.
-    - name: "--output"
-      direction: output
-      __merge__: file_knn.yaml
-      required: true
-    - name: "--key_added"
-      type: string
-      default: "knn"
-      description: |
-        The neighbors data is added to `.uns[key_added]`,
-        distances are stored in `.obsp[key_added+'_distances']` and
-        connectivities in `.obsp[key_added+'_connectivities']`.
-    - name: "--num_neighbors"
-      type: integer
-      default: 15
-      description: "The size of local neighborhood (in terms of number of neighboring data points) used for manifold approximation."
-  test_resources:
-    - path: /resources_test/common/pancreas
-      dest: resources_test/common/pancreas
-    - type: python_script
-      path: /src/common/comp_tests/run_and_check_adata.py
+namespace: "datasets/processors"
+info:
+  type: dataset_processor
+  type_info:
+    label: KNN
+    summary: |
+      Computes the k-nearest-neighbours for each cell.
+    description: |
+      The resulting AnnData will contain both the knn distances and the knn connectivities in 'obsp'.
+arguments:
+  - name: "--input"
+    __merge__: file_pca.yaml
+    required: true
+    direction: input
+  - name: "--input_layer"
+    type: string
+    default: "normalized"
+    description: Which layer to use as input.
+  - name: "--output"
+    direction: output
+    __merge__: file_knn.yaml
+    required: true
+  - name: "--key_added"
+    type: string
+    default: "knn"
+    description: |
+      The neighbors data is added to `.uns[key_added]`,
+      distances are stored in `.obsp[key_added+'_distances']` and
+      connectivities in `.obsp[key_added+'_connectivities']`.
+  - name: "--num_neighbors"
+    type: integer
+    default: 15
+    description: "The size of local neighborhood (in terms of number of neighboring data points) used for manifold approximation."
+test_resources:
+  - path: /resources_test/common/pancreas
+    dest: resources_test/common/pancreas
+  - type: python_script
+    path: /common/component_tests/run_and_check_output.py
```
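The KNN contract above takes a PCA embedding and produces two cell-by-cell matrices in `.obsp`: distances and connectivities. A plain-numpy sketch follows; the 0/1 connectivities here are a deliberate simplification of the fuzzy-graph weights a real component (e.g. via scanpy's neighbors routine) would produce.

```python
import numpy as np

# Sketch of the KNN contract: for each cell, keep the `num_neighbors`
# nearest other cells. Returns sparse-style dense matrices matching
# .obsp["knn_distances"] and .obsp["knn_connectivities"].
def knn_graph(embedding: np.ndarray, num_neighbors: int = 15):
    diff = embedding[:, None, :] - embedding[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1))           # pairwise Euclidean distances
    n = embedding.shape[0]
    distances = np.zeros((n, n))
    connectivities = np.zeros((n, n))
    for i in range(n):
        nearest = np.argsort(dist[i])[1 : num_neighbors + 1]  # [0] is the cell itself
        distances[i, nearest] = dist[i, nearest]
        connectivities[i, nearest] = 1.0          # simplified binary adjacency
    return distances, connectivities
```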
```diff
@@ -1,49 +1,48 @@
-functionality:
-  namespace: "datasets/processors"
-  info:
-    type: dataset_processor
-    type_info:
-      label: PCA
-      summary: |
-        Computes a PCA embedding of the normalized data.
-      description:
-        The resulting AnnData will contain an embedding in obsm, as well as optional loadings in 'varm'.
-  arguments:
-    - name: "--input"
-      __merge__: file_hvg.yaml
-      required: true
-      direction: input
-    - name: "--input_layer"
-      type: string
-      default: "normalized"
-      description: Which layer to use as input.
-    - name: "--input_var_features"
-      type: string
-      description: Column name in .var matrix that will be used to select which genes to run the PCA on.
-      default: hvg
-    - name: "--output"
-      direction: output
-      __merge__: file_pca.yaml
-      required: true
-    - name: "--obsm_embedding"
-      type: string
-      default: "X_pca"
-      description: "In which .obsm slot to store the resulting embedding."
-    - name: "--varm_loadings"
-      type: string
-      default: "pca_loadings"
-      description: "In which .varm slot to store the resulting loadings matrix."
-    - name: "--uns_variance"
-      type: string
-      default: "pca_variance"
-      description: "In which .uns slot to store the resulting variance objects."
-    - name: "--num_components"
-      type: integer
-      example: 25
-      description: Number of principal components to compute. Defaults to 50, or 1 - minimum dimension size of selected representation.
-  test_resources:
-    - path: /resources_test/common/pancreas
-      dest: resources_test/common/pancreas
-    - type: python_script
-      path: /src/common/comp_tests/run_and_check_adata.py
+namespace: "datasets/processors"
+info:
+  type: dataset_processor
+  type_info:
+    label: PCA
+    summary: |
+      Computes a PCA embedding of the normalized data.
+    description:
+      The resulting AnnData will contain an embedding in obsm, as well as optional loadings in 'varm'.
+arguments:
+  - name: "--input"
+    __merge__: file_hvg.yaml
+    required: true
+    direction: input
+  - name: "--input_layer"
+    type: string
+    default: "normalized"
+    description: Which layer to use as input.
+  - name: "--input_var_features"
+    type: string
+    description: Column name in .var matrix that will be used to select which genes to run the PCA on.
+    default: hvg
+  - name: "--output"
+    direction: output
+    __merge__: file_pca.yaml
+    required: true
+  - name: "--obsm_embedding"
+    type: string
+    default: "X_pca"
+    description: "In which .obsm slot to store the resulting embedding."
+  - name: "--varm_loadings"
+    type: string
+    default: "pca_loadings"
+    description: "In which .varm slot to store the resulting loadings matrix."
+  - name: "--uns_variance"
+    type: string
+    default: "pca_variance"
+    description: "In which .uns slot to store the resulting variance objects."
+  - name: "--num_components"
+    type: integer
+    example: 25
+    description: Number of principal components to compute. Defaults to 50, or 1 - minimum dimension size of selected representation.
+test_resources:
+  - path: /resources_test/common/pancreas
+    dest: resources_test/common/pancreas
+  - type: python_script
+    path: /common/component_tests/run_and_check_output.py
+
```
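The PCA contract above produces three outputs: an embedding for the `--obsm_embedding` slot, a loadings matrix for `--varm_loadings`, and explained variance for `--uns_variance`. A plain-numpy sketch via SVD, standing in for the scanpy-based implementation a real component would use:

```python
import numpy as np

# Sketch of the PCA contract. Note the loadings are returned as
# (components x genes); a real component would store the transpose in
# .varm["pca_loadings"], which is (genes x components).
def pca(X: np.ndarray, num_components: int = 50):
    # mirror the config's fallback: at most min(dims) - 1 components
    num_components = min(num_components, min(X.shape) - 1)
    Xc = X - X.mean(axis=0)                       # center each gene
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    embedding = U[:, :num_components] * S[:num_components]   # -> .obsm["X_pca"]
    loadings = Vt[:num_components]                           # -> .varm (transposed)
    variance = S[:num_components] ** 2 / (X.shape[0] - 1)    # -> .uns["pca_variance"]
    return embedding, loadings, variance
```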