Waterfall plot for Entity Occurrences #130

Open
tillenglert opened this issue Aug 7, 2024 · 2 comments
Labels
enhancement Improvement for existing functionality

Comments

@tillenglert
Collaborator

tillenglert commented Aug 7, 2024

Description of feature

An idea for downstream visualisation of comparison between conditions:

A waterfall plot of the peptides unique to each of 2 conditions, together with the overlapping (non-unique) peptides. An example figure is attached; in this case the y axis is sample prevalence. As we predict the peptides, we can compare the number of entities (e.g. bins, contigs, taxids) that each peptide is contained in.
[Image: Screenshot 2024-08-07 at 11 15 06]
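
To make the idea concrete, here is a minimal sketch of how the plot data could be prepared. It assumes the predictions are available as `(peptide, condition, entity_id)` tuples; that layout and the function names are hypothetical and not taken from the pipeline.

```python
from collections import defaultdict


def entity_occurrences(records):
    """Count distinct entities per (peptide, condition).

    `records` is an iterable of (peptide, condition, entity_id) tuples.
    This input layout is an assumption made for illustration.
    """
    seen = defaultdict(set)
    for peptide, condition, entity in records:
        seen[(peptide, condition)].add(entity)
    return {key: len(entities) for key, entities in seen.items()}


def waterfall_order(records, cond_a, cond_b):
    """Return (peptide, count_a, count_b) rows sorted for a waterfall plot.

    Rows are ordered by count_a - count_b, descending, so peptides that
    occur mostly in cond_a come first and those mostly in cond_b come last.
    Ties are broken alphabetically to keep the order deterministic.
    """
    counts = entity_occurrences(records)
    peptides = {pep for pep, cond in counts if cond in (cond_a, cond_b)}
    rows = [
        (pep, counts.get((pep, cond_a), 0), counts.get((pep, cond_b), 0))
        for pep in peptides
    ]
    rows.sort(key=lambda row: (-(row[1] - row[2]), row[0]))
    return rows
```

A peptide with a zero count in one condition is unique to the other; plotting `count_a` upwards and `count_b` downwards in this order would give the waterfall shape.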

@tillenglert tillenglert added the enhancement Improvement for existing functionality label Aug 7, 2024
@tillenglert
Collaborator Author

Bin the entity occurrences to plot fewer data points (one bin == one entity occurrence).
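
A rough sketch of that binning, under the assumption that "one bin == one entity occurrence" means one bar per distinct occurrence value. The input mapping and the function name are hypothetical.

```python
from collections import Counter


def bin_by_occurrence(occurrence_per_peptide):
    """Collapse per-peptide counts into one bar per occurrence value.

    `occurrence_per_peptide` maps peptide -> number of entities it occurs in
    (an assumed input). Returns sorted (occurrence, n_peptides) pairs.
    """
    histogram = Counter(occurrence_per_peptide.values())
    return sorted(histogram.items())
```

The number of bars is then bounded by the number of distinct occurrence values rather than by the number of peptides.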

@tillenglert
Collaborator Author

A waterfall plot makes sense if it shows high-frequency peptides. Frequency in this case would mean predicted occurrences within one bin/taxon. For these input types, plot generation should be relatively straightforward? (Maybe we need to filter more stringently or, as I posted above, bin the frequency to reduce the number of bars and thus the resources needed.) For assemblies, on the other hand, it will not be that straightforward, as we have no information about the taxa or the binning into taxa. The plot could still be generated from the frequency within contigs, which can be used to compare between communities, but that frequency also depends on read depth etc. and may be biased. So either plots are generated only for bins/taxa, or the user is warned that the plot may not show the expected result.
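
The "warn the user" option could look roughly like this. The input type names (`"bins"`, `"taxa"`) and the function are placeholders for illustration, not existing pipeline parameters.

```python
import warnings

# Assumed labels for input types that carry bin/taxon information.
ENTITY_LEVEL_INPUTS = {"bins", "taxa"}


def check_waterfall_input(input_type):
    """Return True if frequencies are per bin/taxon, else warn and return False."""
    if input_type in ENTITY_LEVEL_INPUTS:
        return True
    warnings.warn(
        f"Input type '{input_type}' has no bin/taxon information; "
        "contig-level frequencies depend on read depth and may be biased, "
        "so the waterfall plot may not show the expected result.",
        UserWarning,
        stacklevel=2,
    )
    return False
```

The caller can then decide whether to skip the plot or generate it anyway with the warning attached.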

Still, for large datasets (e.g. the full test), the plot would contain 2 million bars. It may therefore be more helpful to reduce the plot to a table of the results; this could also be a table that lists the highest-scoring unique predicted binders (by binding score) for each condition.
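
A minimal sketch of such a table, assuming predictions come as `(peptide, condition, score)` tuples and that a higher score means a stronger binder. Both are assumptions, and the function name is hypothetical.

```python
from collections import defaultdict


def top_unique_binders(predictions, n=10):
    """Return the n best-scoring condition-unique peptides per condition.

    `predictions` is an iterable of (peptide, condition, score) tuples
    (assumed layout). A peptide counts as unique if it is predicted in
    exactly one condition. Higher score = stronger binder (assumption).
    """
    conditions_per_peptide = defaultdict(set)
    best_score = {}
    for peptide, condition, score in predictions:
        conditions_per_peptide[peptide].add(condition)
        key = (peptide, condition)
        best_score[key] = max(score, best_score.get(key, score))

    table = defaultdict(list)
    for (peptide, condition), score in best_score.items():
        if len(conditions_per_peptide[peptide]) == 1:
            table[condition].append((peptide, score))

    # Sort by score descending, then peptide for a deterministic order.
    return {
        condition: sorted(rows, key=lambda row: (-row[1], row[0]))[:n]
        for condition, rows in table.items()
    }
```

If the binding score is rank-based, where lower is better, the sort key would need to be flipped.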
