
Statistics file

Keiran Raine edited this page Jan 24, 2022 · 3 revisions

The statistics file (*.stats.json) contains detailed statistics relating to the counts.

Data fields

These fields concern execution, options and the raw input data:

  • command
    • Command executed to generate this file
  • version
    • Version of pyCROQUET used to generate this file
  • sample_name
    • Sample name, taken from the command line or from the BAM/CRAM header
  • reversed_reads
    • Whether reads were reversed before counting was attempted
  • total_guides
    • Total number of guides in library
  • total_reads
    • Total number of reads in input data
  • total_pairs
    • Number of read pairs (dual-guide mode only)

Filter related fields

Fields that indicate volume of data excluded for various reasons:

  • length_excluded_reads
    • Number of reads excluded for being shorter than the minimum length
  • vendor_failed_reads
    • Number of reads excluded due to the vendor failed flag (when this filter is selected)

Read count fields

  • mapped_to_guide_reads
    • Number of reads mapped to a guide
  • multimap_reads
    • Number of reads mapping to multiple guides equally well
  • unmapped_reads
    • Number of reads failing to map to a guide
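As a rough illustration of how the read count fields might be used, the sketch below loads a statistics file and expresses each read category as a fraction of total_reads. It assumes the fields listed on this page are top-level keys of the JSON object and that total_reads is a sensible denominator; neither is confirmed here. The helper names load_stats and read_fractions are hypothetical and are not part of pyCROQUET.

```python
import json

READ_KEYS = ("mapped_to_guide_reads", "multimap_reads", "unmapped_reads")


def load_stats(path):
    """Load a *.stats.json file into a dict (assumes plain, uncompressed JSON)."""
    with open(path) as handle:
        return json.load(handle)


def read_fractions(stats):
    """Return each read count field as a fraction of total_reads.

    Assumption: the fields are top-level keys and total_reads is the
    appropriate denominator. Missing fields are treated as zero.
    """
    total = stats.get("total_reads", 0)
    if not total:
        # Avoid division by zero for empty inputs.
        return {key: 0.0 for key in READ_KEYS}
    return {key: stats.get(key, 0) / total for key in READ_KEYS}
```

For example, read_fractions(load_stats("sample.stats.json")) would give a quick view of how much of the input mapped to a guide, where sample.stats.json is a placeholder file name.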

Guide count fields

  • low_count_guides_lt_15
    • Number of guides with fewer than 15 reads
  • low_count_guides_lt_30
    • Number of guides with fewer than 30 reads
  • low_count_guides_user
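    • User-definable threshold. When active, this contains a structure such as {"count": 101064, "lt": 5}, where lt is the threshold and count is the number of guides with fewer reads than that threshold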
  • mean_count_per_guide
    • Mean reads per guide
  • zero_count_guides
    • Number of guides with zero reads
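The low-count fields mix plain integers with the structured low_count_guides_user entry, so a consumer may want to normalise them into one mapping. The sketch below is one way to do that, assuming the fields are top-level keys and that an inactive user threshold appears as a missing or null value (this page does not say which). The function name low_count_summary is hypothetical.

```python
def low_count_summary(stats):
    """Map each low-count threshold to the number of guides below it.

    Assumptions: fields are top-level keys; when the user threshold is
    inactive, low_count_guides_user is absent or null.
    """
    summary = {}
    # Fixed thresholds are stored as plain integers.
    for threshold in (15, 30):
        key = f"low_count_guides_lt_{threshold}"
        if key in stats:
            summary[threshold] = stats[key]
    # The user threshold carries its own cut-off in the "lt" member.
    user = stats.get("low_count_guides_user")
    if isinstance(user, dict):
        summary[user["lt"]] = user["count"]
    return summary
```

With the structure shown above for low_count_guides_user, the result would gain an entry of 5: 101064 alongside the entries for 15 and 30.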

pair_classifications

Pair classification is only defined for dual-guide mode. Details can be found here.

merged_from

When files are merged, the complete statistics from each input file are stored in this field as an array, in the same order as the samples in the header of the *.counts.tsv.gz file.

Example can be seen here.
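Because each element of merged_from is described as a complete set of statistics, the per-input values can be listed in merge order. The sketch below assumes each element is a JSON object carrying the same top-level fields as the parent file, and that the field is absent or null when nothing was merged; the function name merged_read_totals is hypothetical.

```python
def merged_read_totals(stats):
    """Return (sample_name, total_reads) for each merged input, in array order.

    Assumption: every merged_from element has the same top-level fields
    as the enclosing statistics object.
    """
    merged = stats.get("merged_from") or []
    return [(entry.get("sample_name"), entry.get("total_reads")) for entry in merged]
```

The returned list follows the array order, so it should line up with the column order in the *.counts.tsv.gz header.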
