chicViewpoint hangs indefinitely when reference file is a BED file with (non-unique) gene names in 4th column #880

Open
@mtekman

Main command

version: hicexplorer==3.7.3 (micromamba install)

chicViewpoint -m 3_hicmatrices/*_matrix.h5 \
                --averageContactBin 5 --range 1000000 1000000 \
                -rp <reference_points.bed> \
                -bmf 4_background_model/bg.txt \
                --outFileName 5_calc_all_interactions/all_interactions.ugenes.hdf5 --fixateRange 500000 --threads 30

Problem

If the -rp (--referencePoints) file is a BED file, chicViewpoint hangs when the values in the 4th column are not unique:

Example 1: a reference point BED file with these entries makes chicViewpoint hang indefinitely at 100% CPU; it ran for several days until killed.

11      108810997       108811117       Axin2
11      108919925       108920044       Axin2
16      45044224        45044344        Btla
16      45044616        45044736        Btla
8       107329854       107329974       Cdh1

Example 2: a reference point BED file with these entries runs near instantly without problems:

11      108810997       108811117       Axin2-1
11      108919925       108920044       Axin2-2
16      45044224        45044344        Btla-1
16      45044616        45044736        Btla-2
8       107329854       107329974       Cdh1-1

(generated via awk '{gmap[$4] = gmap[$4] + 1; print $0"-"gmap[$4];}' <original_rp.bed>)
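
For anyone who prefers to check a file before running chicViewpoint, here is a rough Python sketch of the same workaround. It is not part of HiCExplorer; the function names `duplicate_names` and `make_names_unique` are made up for illustration, and it assumes whitespace-separated columns with the name in the 4th field.

```python
from collections import Counter


def duplicate_names(bed_lines):
    """Return names (4th column) that occur more than once, in first-seen order."""
    counts = Counter()
    for line in bed_lines:
        fields = line.split()
        if len(fields) >= 4:
            counts[fields[3]] += 1
    return [name for name, n in counts.items() if n > 1]


def make_names_unique(bed_lines):
    """Append a running per-name counter to the 4th column (Axin2 -> Axin2-1, Axin2-2)."""
    seen = Counter()
    out = []
    for line in bed_lines:
        fields = line.split()
        if len(fields) < 4:
            # No name column: pass the line through unchanged.
            out.append(line.rstrip("\n"))
            continue
        name = fields[3]
        seen[name] += 1
        fields[3] = f"{name}-{seen[name]}"
        # Output is tab-separated, as is conventional for BED files.
        out.append("\t".join(fields))
    return out


# The five entries from the hanging example above.
rows = [
    "11      108810997       108811117       Axin2",
    "11      108919925       108920044       Axin2",
    "16      45044224        45044344        Btla",
    "16      45044616        45044736        Btla",
    "8       107329854       107329974       Cdh1",
]

dups = duplicate_names(rows)       # names that would trigger the hang
fixed = make_names_unique(rows)    # same rows with unique names
```

Unlike the awk one-liner, which appends the counter to the end of the line, this sketch rewrites the 4th field itself, so it should also behave sensibly on BED files with more than four columns.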

I think a short note mentioning this in the help text of the --referencePoints parameter would be enough.

Cheers!
