Count file
Keiran Raine edited this page Jan 24, 2022
·
3 revisions
The counts file (*.counts.tsv.gz) is based around the minimal guide library file as indicated here.
In the examples, ... indicates that data has been truncated.
Core fields plus:
- unique_guide - 0/1 indicates if guide is unique.
- reads_SAMPLE - SAMPLE replaced with value provided during execution or from header of BAM/CRAM.
##Command: pycroquet single-guide -g ...
##Version: 1.3.0
#id sgrna_ids sgrna_seqs gene_pair_id unique_guide reads_SAMPLE
...
11023 ACAP3_CCDS19.2_ex10_1:1233213-1233236:+_5-3 CTGTCAGGGCTCTCGCGGT ACAP3 1 1
...
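A minimal sketch of reading this layout, assuming fields are tab-separated (as the .tsv extension suggests), that ## lines carry "Key: value" metadata, and that the single # line names the columns. The function name parse_counts and the in-memory sample are illustrative only and are not part of pycroquet.

```python
import io


def parse_counts(handle):
    """Split a counts file into (metadata, columns, rows).

    Assumes tab-separated fields, '##Key: value' metadata lines and a
    single '#'-prefixed column line, as in the example above.
    """
    meta = {}
    columns = []
    rows = []
    for line in handle:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("##"):
            # Metadata line, e.g. '##Version: 1.3.0'
            key, _, value = line[2:].partition(": ")
            meta[key] = value
        elif line.startswith("#"):
            # Column header line, e.g. '#id<TAB>sgrna_ids<TAB>...'
            columns = line[1:].split("\t")
        else:
            # Data row, keyed by column name; values are left as strings
            rows.append(dict(zip(columns, line.split("\t"))))
    return meta, columns, rows


# Hypothetical in-memory sample built from the example row above.
SAMPLE = (
    "##Command: pycroquet single-guide -g ...\n"
    "##Version: 1.3.0\n"
    "#id\tsgrna_ids\tsgrna_seqs\tgene_pair_id\tunique_guide\treads_SAMPLE\n"
    "11023\tACAP3_CCDS19.2_ex10_1:1233213-1233236:+_5-3\t"
    "CTGTCAGGGCTCTCGCGGT\tACAP3\t1\t1\n"
)

meta, columns, rows = parse_counts(io.StringIO(SAMPLE))
```

For a real file, the same function can be given a text handle from `gzip.open(path, "rt")`.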
The merged format extends this further. reads_SAMPLE becomes the sum of the input counts.
Each count input file adds a new numbered meta-data header line (Count-col-#N) incorporating:
- md5 of input file
- original command from input file header
- version from input file header
For each header item, a corresponding numbered column follows reads_SAMPLE with the original counts from the input files:
##Command: pycroquet merge-counts -o ...
##Version: 1.3.0
##Count-col-#1: md5: acdc800d36e38641995137678a9727c1; Version: 1.3.0; Command: pycroquet single-guide -g ...
##Count-col-#2: md5: 8ea9dce29a685e3f1db0bb8a44da9853; Version: 1.3.0; Command: pycroquet single-guide -g ...
#id sgrna_ids sgrna_seqs gene_pair_id unique_guide reads_SAMPLE 1 2
...
11023 ACAP3_CCDS19.2_ex10_1:1233213-1233236:+_5-3 CTGTCAGGGCTCTCGCGGT ACAP3 1 2 1 1
...
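The relationship between the summed column and the numbered columns can be checked with a short sketch. It assumes tab-separated fields and the Count-col-#N header layout exactly as shown in the example above. The helper names parse_count_col and check_merged_row are hypothetical and are not part of pycroquet.

```python
import re

# Pattern inferred from the example header lines above (an assumption,
# not a published specification of the format).
COUNT_COL = re.compile(
    r"^##Count-col-#(\d+): md5: (\S+); Version: (\S+); Command: (.*)$"
)


def parse_count_col(line):
    """Return the parts of a Count-col-#N header line, or None if no match."""
    match = COUNT_COL.match(line.rstrip("\n"))
    if match is None:
        return None
    return {
        "column": match.group(1),   # name of the numbered count column
        "md5": match.group(2),      # md5 of the input file
        "version": match.group(3),  # version from the input file header
        "command": match.group(4),  # command from the input file header
    }


def check_merged_row(columns, fields):
    """True if the reads_* total equals the sum of the numbered columns."""
    row = dict(zip(columns, fields))
    total_col = next(c for c in columns if c.startswith("reads_"))
    input_cols = [c for c in columns if c.isdigit()]
    return int(row[total_col]) == sum(int(row[c]) for c in input_cols)


header = (
    "##Count-col-#1: md5: acdc800d36e38641995137678a9727c1; "
    "Version: 1.3.0; Command: pycroquet single-guide -g ..."
)
columns = "id sgrna_ids sgrna_seqs gene_pair_id unique_guide reads_SAMPLE 1 2".split()
fields = (
    "11023\tACAP3_CCDS19.2_ex10_1:1233213-1233236:+_5-3\t"
    "CTGTCAGGGCTCTCGCGGT\tACAP3\t1\t2\t1\t1"
).split("\t")

info = parse_count_col(header)
row_ok = check_merged_row(columns, fields)
```

For the example row, reads_SAMPLE is 2 and columns 1 and 2 each hold 1, so the check passes.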