Interaction with IDR (or other IDR-like) framework? #4

michaelbale · 2022-10-03T18:36:43Z

The output peak file is a 3-column BED file without a score or signal.value field, which rules out using this peak caller together with something like IDR when we have replicates. Are there any plans, or is there a way, to produce output that can be used with such a tool?

gartician · 2022-11-21T18:19:31Z

Hello @michaelbale,

Sorry for the delay in addressing this issue. We understand the importance of adding a fourth column for compatibility with IDR. However, GoPeaks doesn't test the whole peak for significance, but rather the individual bins that make up a peak. One way to get one p-value per peak (similar to the output of macs2) is to transform or combine the series of p-values in a peak into one value. We have yet to find an approach that does this, but we are open to suggestions.
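To make the "bins, not peaks" point concrete, here is a minimal sketch of how per-bin p-values could be carried along while significant bins are merged into peaks. This is not GoPeaks's actual code: the `Bin` tuple layout, the `alpha` cutoff, and the function name `merge_significant_bins` are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical layout: (chrom, start, end, p_value), half-open coordinates.
Bin = Tuple[str, int, int, float]


def merge_significant_bins(bins: List[Bin], alpha: float = 0.05) -> list:
    """Merge touching or overlapping significant bins into peaks.

    Each peak keeps the list of p-values of the bins it was built from,
    so that they can later be combined into a single per-peak value.
    """
    peaks = []
    for chrom, start, end, p in sorted(bins):
        if p >= alpha:
            continue  # a non-significant bin never joins a peak
        last = peaks[-1] if peaks else None
        if last and last["chrom"] == chrom and start <= last["end"]:
            # Bin touches or overlaps the current peak: extend it.
            last["end"] = max(last["end"], end)
            last["p_values"].append(p)
        else:
            # Gap or new chromosome: start a new peak.
            peaks.append({"chrom": chrom, "start": start, "end": end, "p_values": [p]})
    return peaks
```

Each returned peak then holds a series of p-values, which is exactly the input that a combination method would need.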

michaelbale · 2022-11-22T15:50:16Z

Hi @gartician, thanks for the reply! So basically the method identifies the boundaries of "significant islands" through co-dependent bins? One could potentially use something like the harmonic mean p-value (HMP) or Brown's extension of Fisher's method (given that the p-values are dependent). But I'm not sure how to check the validity of the combined test here. I've used these methods in other settings, but the p-values there were much better behaved.
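For reference, a minimal standard-library sketch of two of the statistics mentioned: the unweighted harmonic mean of a set of p-values, and Fisher's combined p-value as an independence baseline. The function names are hypothetical, and the inputs are assumed to lie strictly in (0, 1]. Note that the raw harmonic mean is a statistic rather than a calibrated p-value, so it would still need the adjustment that accompanies the HMP method before being reported as one. Brown's extension is not shown because it additionally needs an estimate of the covariance between the bin-level terms.

```python
import math
from typing import Sequence


def harmonic_mean_p(p_values: Sequence[float]) -> float:
    """Unweighted harmonic mean of the p-values (the raw HMP statistic)."""
    # Assumes every p-value is strictly greater than 0.
    return len(p_values) / sum(1.0 / p for p in p_values)


def fisher_combined_p(p_values: Sequence[float]) -> float:
    """Fisher's method; only valid when the p-values are independent."""
    k = len(p_values)
    # half_stat is X / 2, where X = -2 * sum(ln p) is Fisher's statistic.
    half_stat = -sum(math.log(p) for p in p_values)
    # For 2k degrees of freedom the chi-square survival function is
    # exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!, so no SciPy is needed.
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total
```

Applied to the p-values of the bins in one peak, either function returns a single number per peak; whether that number is valid under the dependence structure discussed in this thread is the open question.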

gartician · 2022-11-22T18:57:21Z

Hi @michaelbale, GoPeaks does identify the boundaries of significant bins, but I'm unsure whether the bins are co-dependent. To reiterate, the HMP and Brown's extension of Fisher's method seem very interesting, and both allow for dependent p-values. My question is this: biologically speaking, the significance of a bin certainly depends on genome position (adjacent bins usually form a peak) and on the biological system, but the binomial distribution assumes independent tests among bins. I am not sure which interpretation is more correct, and I wonder whether these perspectives violate or align with the assumptions of the HMP and Brown's method?

michaelbale · 2022-11-23T01:43:03Z

I guess it would depend on the bin size and on how GoPeaks counts reads, i.e. whether a read spanning multiple bins is counted in each bin it spans or only the read start/end is counted. I would also imagine that there is dependence, given that clusters of significant bins in a local neighborhood are the definition of a 'peak' in this instance. To be clear, I'd say the tests are independent, but the p-values are not.

gartician · 2022-11-23T17:49:13Z

In GoPeaks, if a read (technically a fragment, which is the content between R1 and R2) spans multiple bins, then it is counted in each of those bins. Independent tests with dependent p-values is an interesting perspective and kind of makes sense. I can take a stab at implementing it in December, and I would also gladly accept a PR, but that's not required. Thank you for the constructive feedback!
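The counting rule described above can be sketched as follows. This is only an illustration under stated assumptions, not the GoPeaks implementation: it assumes fixed-width, non-overlapping bins indexed from 0 and half-open fragment coordinates, and the names `bins_overlapped` and `count_fragments` are hypothetical.

```python
from typing import Dict, Iterable, Tuple


def bins_overlapped(frag_start: int, frag_end: int, bin_size: int) -> range:
    """Indices of the bins overlapped by the half-open fragment [frag_start, frag_end)."""
    if frag_end <= frag_start:
        raise ValueError("fragment must have positive length")
    return range(frag_start // bin_size, (frag_end - 1) // bin_size + 1)


def count_fragments(fragments: Iterable[Tuple[int, int]], bin_size: int) -> Dict[int, int]:
    """Add one count to every bin that a fragment overlaps."""
    counts: Dict[int, int] = {}
    for start, end in fragments:
        for b in bins_overlapped(start, end, bin_size):
            counts[b] = counts.get(b, 0) + 1
    return counts
```

With 100 bp bins, a fragment covering positions 90 to 210 contributes to bins 0, 1, and 2. Because one fragment raises the counts of several neighboring bins at once, the resulting bin-level p-values would be expected to be positively correlated, which matches the "independent tests, dependent p-values" view above.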

michaelbale · 2023-04-15T17:54:12Z

Hi @gartician, I was wondering whether you or anyone else has had a chance to look into this recently. I hadn't realized you mentioned accepting a PR. As much as I'd love to try, my familiarity with the Go language is... not great.

martinezvbs · 2024-08-08T05:35:42Z

Hi,

I have been using GoPeaks for CUT&RUN/ATAC-seq data. I would like to know whether, since the last update, there has been a chance to continue this conversation (testing different methods). Thanks!
