Template

A one-sentence summary of the task's purpose and methodology, used for creating an overview table.

Description

Provide a clear and concise description of your task, detailing the specific problem it aims to solve. Outline the input data types, the expected output, and any assumptions or constraints. Be sure to explain any terminology or concepts that are essential for understanding the task.

Explain the motivation behind your proposed task. Describe the biological or computational problem you aim to address and why it’s important. Discuss the current state of research in this area and any gaps or challenges that your task could help address. This section should convince readers of the significance and relevance of your task.

Authors & contributors

name	roles
John Doe	author, maintainer

API

flowchart LR
  file_common_dataset("Common Dataset")
  comp_data_processor[/"Data processor"/]
  file_solution("Solution")
  file_test_h5ad("Test data")
  file_train_h5ad("Training data")
  comp_control_method[/"Control Method"/]
  comp_metric[/"Metric"/]
  comp_method[/"Method"/]
  file_prediction("Predicted data")
  file_score("Score")
  file_common_dataset---comp_data_processor
  comp_data_processor-->file_solution
  comp_data_processor-->file_test_h5ad
  comp_data_processor-->file_train_h5ad
  file_solution---comp_control_method
  file_solution---comp_metric
  file_test_h5ad---comp_control_method
  file_test_h5ad---comp_method
  file_train_h5ad---comp_control_method
  file_train_h5ad---comp_method
  comp_control_method-->file_prediction
  comp_metric-->file_score
  comp_method-->file_prediction
  file_prediction---comp_metric
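
The flowchart above implies an execution order: a component can only run once every file it consumes has been produced. As a minimal sketch (not part of the template itself), the edges can be written down as a plain dictionary and sorted topologically with the standard library. The dictionary name `COMPONENTS` and the helper `execution_order` are hypothetical; the component and file names are shortened from the node identifiers in the flowchart.

```python
from graphlib import TopologicalSorter

# component -> (input files, output files), read off the flowchart above.
COMPONENTS = {
    "data_processor": (["common_dataset"], ["solution", "test_h5ad", "train_h5ad"]),
    "control_method": (["solution", "test_h5ad", "train_h5ad"], ["prediction"]),
    "method": (["test_h5ad", "train_h5ad"], ["prediction"]),
    "metric": (["solution", "prediction"], ["score"]),
}


def execution_order(components):
    """Return component names ordered so that producers run before consumers."""
    # Map every file to the set of components that write it.
    producers = {}
    for name, (_, outputs) in components.items():
        for file_name in outputs:
            producers.setdefault(file_name, set()).add(name)
    # A component depends on every producer of each of its input files.
    graph = {
        name: {p for file_name in inputs for p in producers.get(file_name, ())}
        for name, (inputs, _) in components.items()
    }
    return list(TopologicalSorter(graph).static_order())


order = execution_order(COMPONENTS)
print(order)
```

The data processor always comes first and the metric last; the relative order of the method and the control method is not fixed, since neither depends on the other.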

File format: Common Dataset

A subset of the common dataset.

Example file: resources_test/common/pancreas/dataset.h5ad

Format:

AnnData object
 obs: 'cell_type', 'batch'
 var: 'hvg', 'hvg_score'
 obsm: 'X_pca'
 layers: 'counts', 'normalized'
 uns: 'dataset_id', 'dataset_name', 'dataset_url', 'dataset_reference', 'dataset_summary', 'dataset_description', 'dataset_organism', 'normalization_id'

Data structure:

Slot	Type	Description
`obs["cell_type"]`	`string`	Cell type information.
`obs["batch"]`	`string`	Batch information.
`var["hvg"]`	`boolean`	Whether or not the feature is considered to be a ‘highly variable gene’.
`var["hvg_score"]`	`double`	A ranking of the features by hvg.
`obsm["X_pca"]`	`double`	The resulting PCA embedding.
`layers["counts"]`	`integer`	Raw counts.
`layers["normalized"]`	`double`	Normalized expression values.
`uns["dataset_id"]`	`string`	A unique identifier for the dataset.
`uns["dataset_name"]`	`string`	Nicely formatted name.
`uns["dataset_url"]`	`string`	(Optional) Link to the original source of the dataset.
`uns["dataset_reference"]`	`string`	(Optional) Bibtex reference of the paper in which the dataset was published.
`uns["dataset_summary"]`	`string`	Short description of the dataset.
`uns["dataset_description"]`	`string`	Long description of the dataset.
`uns["dataset_organism"]`	`string`	(Optional) The organism of the sample in the dataset.
`uns["normalization_id"]`	`string`	Which normalization was used.
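
A slot table like the one above can be turned into a small presence check. The following is a minimal sketch, not the validation used by the template: `COMMON_DATASET_SLOTS` and `missing_slots` are hypothetical names, slots marked (Optional) are left out of the required set, and the normalization id `log_cp10k` is only an illustrative value. The check relies on the `in` operator alone, so it works on a stand-in object here and should also work on an object loaded with `anndata.read_h5ad`, assuming the `anndata` package is installed.

```python
from types import SimpleNamespace

# Required slots copied from the "Common Dataset" table above.
# Slots marked (Optional) in the table are intentionally left out.
COMMON_DATASET_SLOTS = {
    "obs": ["cell_type", "batch"],
    "var": ["hvg", "hvg_score"],
    "obsm": ["X_pca"],
    "layers": ["counts", "normalized"],
    "uns": [
        "dataset_id",
        "dataset_name",
        "dataset_summary",
        "dataset_description",
        "normalization_id",
    ],
}


def missing_slots(adata, required):
    """Return 'slot["key"]' strings for every required entry that is absent."""
    missing = []
    for slot, keys in required.items():
        container = getattr(adata, slot, None)
        for key in keys:
            if container is None or key not in container:
                missing.append(f'{slot}["{key}"]')
    return missing


# Stand-in object for demonstration purposes only (values are made up).
demo = SimpleNamespace(
    obs={"cell_type": [], "batch": []},
    var={"hvg": []},
    obsm={"X_pca": []},
    layers={"counts": [], "normalized": []},
    uns={"dataset_id": "pancreas", "normalization_id": "log_cp10k"},
)
print(missing_slots(demo, COMMON_DATASET_SLOTS))
```

The same helper can be reused for the other file formats in this document by passing a different dictionary of required slots.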

Component type: Data processor

A data processor.

Arguments:

Name	Type	Description
`--input`	`file`	A subset of the common dataset.
`--output_train`	`file`	(Output) The training data in h5ad format.
`--output_test`	`file`	(Output) The subset of molecules used for the test dataset.
`--output_solution`	`file`	(Output) The solution for the test data.
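
The three outputs differ mainly in which per-cell annotations they expose. The sketch below illustrates that idea on plain Python lists. It assumes a random split by cell and assumes that `cell_type` in the common dataset becomes `label` in the outputs; neither is stated in this document, and `split_dataset`, `test_fraction` and `seed` are hypothetical names.

```python
import random


def split_dataset(obs, test_fraction=0.2, seed=0):
    """Split per-cell annotations into train, test and solution tables.

    `obs` is a list of dicts with 'cell_type' and 'batch' keys, standing in
    for the obs table of the common dataset.
    """
    rng = random.Random(seed)
    indices = list(range(len(obs)))
    rng.shuffle(indices)
    n_test = max(1, round(len(obs) * test_fraction))
    test_idx = sorted(indices[:n_test])
    train_idx = sorted(indices[n_test:])

    # Training data keeps the labels (assumed rename: 'cell_type' -> 'label').
    train = [{"label": obs[i]["cell_type"], "batch": obs[i]["batch"]} for i in train_idx]
    # Test data exposes the batch only, never the label.
    test = [{"batch": obs[i]["batch"]} for i in test_idx]
    # The solution holds the ground truth for the same test cells, in the same order.
    solution = [{"label": obs[i]["cell_type"], "batch": obs[i]["batch"]} for i in test_idx]
    return train, test, solution


# Made-up annotations for ten cells.
demo_obs = [
    {"cell_type": "alpha" if i % 2 else "beta", "batch": f"b{i % 3}"}
    for i in range(10)
]
train, test, solution = split_dataset(demo_obs)
print(len(train), len(test), len(solution))
```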

File format: Solution

The solution for the test data.

Example file: resources_test/task_template/pancreas/solution.h5ad

Format:

AnnData object
 obs: 'label', 'batch'
 var: 'hvg', 'hvg_score'
 obsm: 'X_pca'
 layers: 'counts', 'normalized'
 uns: 'dataset_id', 'dataset_name', 'dataset_url', 'dataset_reference', 'dataset_summary', 'dataset_description', 'dataset_organism', 'normalization_id'

Data structure:

Slot	Type	Description
`obs["label"]`	`string`	Ground truth cell type labels.
`obs["batch"]`	`string`	Batch information.
`var["hvg"]`	`boolean`	Whether or not the feature is considered to be a ‘highly variable gene’.
`var["hvg_score"]`	`double`	A ranking of the features by hvg.
`obsm["X_pca"]`	`double`	The resulting PCA embedding.
`layers["counts"]`	`integer`	Raw counts.
`layers["normalized"]`	`double`	Normalized counts.
`uns["dataset_id"]`	`string`	A unique identifier for the dataset.
`uns["dataset_name"]`	`string`	Nicely formatted name.
`uns["dataset_url"]`	`string`	(Optional) Link to the original source of the dataset.
`uns["dataset_reference"]`	`string`	(Optional) Bibtex reference of the paper in which the dataset was published.
`uns["dataset_summary"]`	`string`	Short description of the dataset.
`uns["dataset_description"]`	`string`	Long description of the dataset.
`uns["dataset_organism"]`	`string`	(Optional) The organism of the sample in the dataset.
`uns["normalization_id"]`	`string`	Which normalization was used.

File format: Test data

The subset of molecules used for the test dataset.

Example file: resources_test/task_template/pancreas/test.h5ad

Format:

AnnData object
 obs: 'batch'
 var: 'hvg', 'hvg_score'
 obsm: 'X_pca'
 layers: 'counts', 'normalized'
 uns: 'dataset_id', 'normalization_id'

Data structure:

Slot	Type	Description
`obs["batch"]`	`string`	Batch information.
`var["hvg"]`	`boolean`	Whether or not the feature is considered to be a ‘highly variable gene’.
`var["hvg_score"]`	`double`	A ranking of the features by hvg.
`obsm["X_pca"]`	`double`	The resulting PCA embedding.
`layers["counts"]`	`integer`	Raw counts.
`layers["normalized"]`	`double`	Normalized counts.
`uns["dataset_id"]`	`string`	A unique identifier for the dataset.
`uns["normalization_id"]`	`string`	Which normalization was used.

File format: Training data

The training data in h5ad format.

Example file: resources_test/task_template/pancreas/train.h5ad

Format:

AnnData object
 obs: 'label', 'batch'
 var: 'hvg', 'hvg_score'
 obsm: 'X_pca'
 layers: 'counts', 'normalized'
 uns: 'dataset_id', 'normalization_id'

Data structure:

Slot	Type	Description
`obs["label"]`	`string`	Ground truth cell type labels.
`obs["batch"]`	`string`	Batch information.
`var["hvg"]`	`boolean`	Whether or not the feature is considered to be a ‘highly variable gene’.
`var["hvg_score"]`	`double`	A ranking of the features by hvg.
`obsm["X_pca"]`	`double`	The resulting PCA embedding.
`layers["counts"]`	`integer`	Raw counts.
`layers["normalized"]`	`double`	Normalized counts.
`uns["dataset_id"]`	`string`	A unique identifier for the dataset.
`uns["normalization_id"]`	`string`	Which normalization was used.

Component type: Control Method

Quality control methods for verifying the pipeline.

Arguments:

Name	Type	Description
`--input_train`	`file`	The training data in h5ad format.
`--input_test`	`file`	The subset of molecules used for the test dataset.
`--input_solution`	`file`	The solution for the test data.
`--output`	`file`	(Output) A predicted dataset as output by a method.
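
Unlike a regular method, a control method also receives the solution, so it can produce predictions with a known expected quality. The two functions below are a hedged illustration of what such controls might look like, one that copies the ground truth and one that guesses at random. Both function names are hypothetical and are not taken from the template.

```python
import random


def true_labels_control(solution_labels):
    """Positive control: return the ground truth as the prediction."""
    return list(solution_labels)


def random_labels_control(train_labels, n_test, seed=0):
    """Negative control: draw each prediction at random from the training labels."""
    rng = random.Random(seed)
    choices = sorted(set(train_labels))
    return [rng.choice(choices) for _ in range(n_test)]


# Made-up labels for demonstration.
train_labels = ["alpha", "beta", "alpha", "delta"]
solution_labels = ["beta", "alpha", "alpha"]

print(true_labels_control(solution_labels))
print(random_labels_control(train_labels, n_test=len(solution_labels)))
```

A metric applied to the positive control should give its best possible value, which makes the pair useful for checking that the pipeline is wired correctly.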

Component type: Metric

A task template metric.

Arguments:

Name	Type	Description
`--input_solution`	`file`	The solution for the test data.
`--input_prediction`	`file`	A predicted dataset as output by a method.
`--output`	`file`	(Output) File indicating the score of a metric.
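
A metric compares `obs["label"]` from the solution with `obs["label_pred"]` from the prediction. As an example only, the sketch below computes plain accuracy on two lists of labels. The choice of accuracy and the function name `accuracy` are assumptions; this document does not prescribe a specific metric.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of test cells whose predicted label equals the true label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("solution and prediction must cover the same cells")
    if not true_labels:
        raise ValueError("cannot score an empty prediction")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Made-up labels: three of the four predictions match.
value = accuracy(["a", "b", "a", "c"], ["a", "b", "c", "c"])
print(value)
```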

Component type: Method

A method.

Arguments:

Name	Type	Description
`--input_train`	`file`	The training data in h5ad format.
`--input_test`	`file`	The subset of molecules used for the test dataset.
`--output`	`file`	(Output) A predicted dataset as output by a method.
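
A method only sees the training and test data, never the solution. The following baseline is a minimal sketch of that contract: it predicts the most frequent training label for every test cell. The function name `majority_vote_predict` is hypothetical, and the baseline is an illustration rather than a method shipped with the template.

```python
from collections import Counter


def majority_vote_predict(train_labels, n_test):
    """Predict the most frequent training label for each of the n_test cells."""
    if not train_labels:
        raise ValueError("training labels must not be empty")
    most_common_label, _ = Counter(train_labels).most_common(1)[0]
    return [most_common_label] * n_test


# Made-up training labels: 'alpha' occurs most often.
label_pred = majority_vote_predict(["alpha", "beta", "alpha"], n_test=2)
print(label_pred)
```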

File format: Predicted data

A predicted dataset as output by a method.

Example file: resources_test/task_template/pancreas/prediction.h5ad

Format:

AnnData object
 obs: 'label_pred'
 uns: 'dataset_id', 'normalization_id', 'method_id'

Data structure:

Slot	Type	Description
`obs["label_pred"]`	`string`	Predicted labels for the test cells.
`uns["dataset_id"]`	`string`	A unique identifier for the dataset.
`uns["normalization_id"]`	`string`	Which normalization was used.
`uns["method_id"]`	`string`	A unique identifier for the method.
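
The prediction file carries the dataset and normalization identifiers alongside a method identifier. The sketch below assembles those fields as plain dictionaries, assuming the identifiers are copied over from the test data; `build_prediction` and the example identifier values are hypothetical.

```python
def build_prediction(test_uns, label_pred, method_id):
    """Assemble the obs and uns content of a prediction from its parts."""
    return {
        "obs": {"label_pred": list(label_pred)},
        "uns": {
            # Assumed to be copied unchanged from the test data.
            "dataset_id": test_uns["dataset_id"],
            "normalization_id": test_uns["normalization_id"],
            "method_id": method_id,
        },
    }


# Made-up identifiers for demonstration.
prediction = build_prediction(
    test_uns={"dataset_id": "pancreas", "normalization_id": "log_cp10k"},
    label_pred=["alpha", "beta"],
    method_id="majority_vote",
)
print(prediction["uns"])
```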

File format: Score

File indicating the score of a metric.

Example file: resources/score.h5ad

Format:

AnnData object
 uns: 'dataset_id', 'normalization_id', 'method_id', 'metric_ids', 'metric_values'

Data structure:

Slot	Type	Description
`uns["dataset_id"]`	`string`	A unique identifier for the dataset.
`uns["normalization_id"]`	`string`	Which normalization was used.
`uns["method_id"]`	`string`	A unique identifier for the method.
`uns["metric_ids"]`	`string`	One or more unique metric identifiers.
`uns["metric_values"]`	`double`	The metric values obtained for the given prediction. Must have the same length as ‘metric_ids’.
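
The one structural constraint stated in the table is that `metric_ids` and `metric_values` have the same length. A minimal sketch of enforcing it when assembling the score metadata is shown below; `build_score_uns` and the example identifier values are hypothetical.

```python
def build_score_uns(prediction_uns, metric_ids, metric_values):
    """Assemble the uns content of a score, checking the length constraint."""
    if len(metric_ids) != len(metric_values):
        raise ValueError("metric_ids and metric_values must have the same length")
    return {
        "dataset_id": prediction_uns["dataset_id"],
        "normalization_id": prediction_uns["normalization_id"],
        "method_id": prediction_uns["method_id"],
        "metric_ids": list(metric_ids),
        "metric_values": [float(v) for v in metric_values],
    }


# Made-up identifiers and value for demonstration.
score_uns = build_score_uns(
    {"dataset_id": "pancreas", "normalization_id": "log_cp10k", "method_id": "majority_vote"},
    metric_ids=["accuracy"],
    metric_values=[0.75],
)
print(score_uns)
```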

openproblems-bio/test
