Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

APR PR main-post-v1p5p0 #33

Open
wants to merge 202 commits into
base: main
Choose a base branch
from

Conversation

alexpreynolds
Copy link

This PR for the HiGlass higlass-pileup track includes Pixi resource memory leak fixes, PNG/SVG export fixes, more graceful error handling for null Pixi objects, and feature enhancements for mouseover tooltips.

This includes support for extracting rendering and clustering methylation event data from fibertools BAM datasets, as well as BAMs derived from ENCODE Index DHS datasets and some basic high-density interval sets (e.g. TFBSs). This includes webpack patches for building and incorporating a newer version of the bam-js library that retrieves BAM data faster and supports case-sensitive BAM tag data (such as is found in data from fiberseq and PacBio pipelines).

This also integrates fixes that have been merged into the upstream (base) HiGlass repository in the main branch (currently post v1.5.0).

  • Update the version number in unpkg link in README.md if building a new release

@pkerpedjiev
Copy link
Member

This is a pretty massive PR. Do you have a screenshot of the changes?

Is it rebased on the latest main? I've introduced some large-ish changes in the past few weeks for rendering paired end reads.

@alexpreynolds
Copy link
Author

I haven't rebased against the most recent changes. I have made some changes to the Webpack and Babel configuration in order to bring in a current version of the bam-js library. I just need to be careful about integration of your changes to prevent reverting them.

I will need to document the following in the README, but in the meantime, as an example, here is a screenshot of an app I am working on, which demonstrates different subtypes of a higlass-pileup track (outside of use for rendering a generic sequencing read BAM file):

image

(Via: https://explore.altius.org/fsv.dev/?interval=chr2:10121353-10123373&mode=CD3+_D2_HAP1_Stim_052824)

In this figure, the following pileup track variants are displayed:

  • 6mA and 5mC/5hmC base modification data ("Stimulated HAP1")
  • FIRE regulatory element calls of varying significance from fibertools ("Stimulated HAP1 FIRE")
  • Generic BED element sets (TFBS 1e-6, Repeat Elements, Footprints)
  • Index DHS regions (Meuleman 2020, Encode 2024)

Each of these is a pileup track defined in the view config via properties stored under options, including: methylation, fire, tfbs, genericBed, and indexDHS, respectively.

Each of these properties contains various subtype options specific to that subtype. For example, methylation has a probabilityThresholdRange property that renders modified bases if their ML probability falls within the specified range. Other properties exist and will be documented once I update the README.

(The remaining tracks in this example are bigWig (Coverage, DNase I, Perc Acc) and transcripts (Gencode v38) rendered with core HiGlass and higlass-transcripts; data are retrieved via higlass-bigwig-datafetcher and higlass-tabix-datafetcher.)

@pkerpedjiev
Copy link
Member

I'll look at this in more depth a little later but I think the ideal outcome here would be to define a standard interface for what's being rendered: segments (single or paired end).

I don't know how committed you are to this, but perhaps we can define a spec that defines a generic datatype, write the renderer and then create an adapter to convert BAM reads or methylation data or whatever we want rendered to the standard datatype.

For a generic datatype I'm thinking of something not too different from the "BED custom track with multiple blocks" in the UCSC genome browser. I think we'd actually only need to extend that to specify custom colors for each "thick" block.

The other challenge will be providing custom mouseover texts when hovering.

I think if we did something like that, we could eventually port some of the other tracks to use this performant renderer.

@alexpreynolds
Copy link
Author

Having a more abstract BAM segment renderer makes perfect sense to me. At the moment, everything I do to calculate the appearance of a segment is specific to the track subtype (generic SE or PE BAM read, base-modification segment, regulatory element, etc.).

Perhaps we could have a wrapper function that each track subtype would just call to generate the segment coordinates and apply metadata.

Coordinates would be the usual xy space, while metadata could include — based on what I have used the pileup track to render so far — various display characteristics for each block of a given segment, such as its relative or fractional block height, block appearance type (a rectangle, a line, or perhaps an arrowhead or other simple primitive, say), block color and transparency, and block mouseover data.

One reason this PR is bulky is that the methylation track type includes code for hierarchical clustering, redrawing the layout when a particular clustering function is applied over a specified subregion.

For performance, I think it makes sense to have the clustering or generic reordering code tied closely to how pileups are calculated before rendering, but it does make the overall code more complicated and I don't know how you feel about that.

I also have code throughout that listens for tab-specific broadcast channel (BC) events from the calling frontend to, for example, trigger re-rendering/reordering upon clustering of a methylation track. Or the track communicates an event back to the frontend, when rendering or other steps are finished.

These events have been useful, for instance, when clustering is performed on a methylation track, when it has an associated FIRE regulatory element track in the same fiberscope browser instance.

FIRE element segments are calculated from m6A segment data by the fibertools kit, and so these two tracks have an association when they are shown together. A subsequent BC event fires to tell a FIRE regulatory element track to re-render its segments in the same order and pileup layout as those in the recently-clustered methylation track.

I'm not sure how you feel about the pileup track getting too complicated, or how you feel about using BCs to communicate commands and state between the track and the parent frontend. But I did want to take a second to explain why this PR might seem to be larger than a typical request.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants