frictionless-collab

Frictionless JSON Data Packages for Life Science

Table layouts

Here is a demo of how dataflows:

converts a wide table to a long table
converts a long table to a wide table
automatically generates a package descriptor in both cases

$ make install
$ python3 layouts/wide.py
$ python3 layouts/long.py
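
To make the two conversions concrete, here is a minimal plain-Python sketch. It is not the code in layouts/wide.py or layouts/long.py (those rely on dataflows); the function names, the "sample" and "value" field names, and the tiny example table are all assumptions chosen for illustration.

```python
from collections import defaultdict


def wide_to_long(rows, id_field, var_name="sample", value_name="value"):
    """Unpivot: emit one record per (identifier, sample column) pair."""
    long_rows = []
    for row in rows:
        for field, value in row.items():
            if field == id_field:
                continue  # the identifier is repeated on every long record
            long_rows.append(
                {id_field: row[id_field], var_name: field, value_name: value}
            )
    return long_rows


def long_to_wide(rows, id_field, var_name="sample", value_name="value"):
    """Pivot: gather all records sharing an identifier into a single row."""
    wide = defaultdict(dict)
    for row in rows:
        wide[row[id_field]][row[var_name]] = row[value_name]
    return [{id_field: key, **values} for key, values in wide.items()]


# Hypothetical two-gene, two-sample table (made-up values).
wide_table = [
    {"gene": "g1", "s1": 10, "s2": 12},
    {"gene": "g2", "s1": 0, "s2": 3},
]
long_table = wide_to_long(wide_table, "gene")

# Converting back recovers the original wide table.
assert long_to_wide(long_table, "gene") == wide_table
```

The round trip only holds when each (identifier, sample) pair occurs once in the long table; duplicates would silently overwrite each other in this sketch.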

Gene expression data examples

GSE60450_Lactation-GenewiseCounts.csv: gene expression matrix with a simple layout. The first 2 fields are molecular entity descriptors; the remaining fields hold read counts per sample. Source: GSE60450.
P1-SARS-CoV2_Virus_FPKM.csv: gene expression matrix with a simple layout. The first field is the molecular entity identifier; the remaining fields hold the FPKM measure per sample (file produced during the Elixir BioHackathon on COVID19).
GSE52778_All_Sample_FPKM_Matrix.csv: gene expression matrix with a more complex structure. The first 9 fields (columns[A-I]) are molecular entity descriptors, followed by 4 sets of 3 fields matching the 4 experimental conditions and 3 quantitation types, including FPKM measures (columns[J-Y]), then the individual experimental conditions per cell line (columns[Z-AO]). From NCBI GEO experiment GSE52778.

Feature Requests to Transformation:

allow 'flattening of multi-row headers' as documented by @lilwinfree

!pip install tabulator

then:

from tabulator import Stream

with Stream('oxford.csv', headers=[1,2], multiline_headers_joiner='.') as stream:
    # Header rows 1 and 2 are joined with '.' into one flattened header row
    print(stream.headers)

# Output:
# ['Gene Name', 'Sample_id1.mean', 'Sample_id1.standard dev', 'Sample_id2.mean', 'Sample_id2.standard dev']

pivot/unpivot operation: expand on the existing code provided by @roll to enable an offset parameter (fixing the number of fields associated with molecular entity descriptors) and 2 additional parameters: one to obtain the number of experimental conditions or unique samples, and another listing the number of quantitation types.
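
One way to picture the requested offset parameter is the plain-Python sketch below. It is not @roll's existing code and not a dataflows API: the function name unpivot_with_offset, its parameters, the output field names, and the example row are hypothetical. It assumes value columns are named "<sample><separator><quantitation type>", as in the flattened headers shown earlier.

```python
def unpivot_with_offset(header, rows, offset, separator="."):
    """Unpivot every column after the first `offset` descriptor columns.

    Returns the long records plus the unique samples and quantitation
    types found in the value-column names.
    """
    descriptor_fields = header[:offset]
    # Split each value-column name once into (sample, quantitation type).
    value_fields = []
    for name in header[offset:]:
        sample, _, qtype = name.partition(separator)
        value_fields.append((sample, qtype))

    samples = sorted({sample for sample, _ in value_fields})
    quant_types = sorted({qtype for _, qtype in value_fields})

    records = []
    for row in rows:
        descriptors = dict(zip(descriptor_fields, row[:offset]))
        for (sample, qtype), value in zip(value_fields, row[offset:]):
            records.append(
                {**descriptors, "sample": sample,
                 "quantitation_type": qtype, "value": value}
            )
    return records, samples, quant_types


header = ["Gene Name", "Sample_id1.mean", "Sample_id1.standard dev",
          "Sample_id2.mean", "Sample_id2.standard dev"]
rows = [["geneA", 5.0, 0.4, 6.1, 0.3]]  # made-up values

records, samples, quant_types = unpivot_with_offset(header, rows, offset=1)
print(len(samples), samples)          # 2 ['Sample_id1', 'Sample_id2']
print(len(quant_types), quant_types)  # 2 ['mean', 'standard dev']
```

Here the two counts are derived from the header rather than passed in; an implementation could equally accept them as parameters and use them to validate the table.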

:important: possibly agree on a conventional separator to detect flattened headers, as in Sample_id1.standard dev. The separator could be selected from [".","|","__"]
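
As a rough sketch of what detecting flattened headers could look like, the helper below tries each candidate separator in turn. The function name detect_separator and its offset parameter are assumptions for illustration, not part of tabulator or dataflows.

```python
CANDIDATE_SEPARATORS = [".", "|", "__"]


def detect_separator(headers, offset=0, candidates=CANDIDATE_SEPARATORS):
    """Return the first candidate present in every header after `offset`.

    Returns None when no candidate occurs in all value headers.
    """
    value_headers = headers[offset:]
    if not value_headers:
        return None
    for separator in candidates:
        if all(separator in name for name in value_headers):
            return separator
    return None


headers = ["Gene Name", "Sample_id1.mean", "Sample_id1.standard dev",
           "Sample_id2.mean", "Sample_id2.standard dev"]
assert detect_separator(headers, offset=1) == "."
assert detect_separator(["id", "a|mean", "b|mean"], offset=1) == "|"
assert detect_separator(["id", "a", "b"], offset=1) is None
```

A check like this is ambiguous when sample names themselves contain a candidate (for example a "." inside a sample identifier), which is one argument for settling on a single conventional separator.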
