MEME output #1828

elodieaudemar · 2025-03-24T15:16:14Z

Dear Sergei,

I'm working with the MEME method and have some questions I couldn't find answers to.
My HyPhy version is 2.5.64.

-In one of your exchanges, you said that "# branches" is the number of branches with EBF > 100; however, I can't find the EBF anywhere (neither in the output nor in the JSON). I only have the LRT.
-Is the omega in the "### Improving branch lengths, nucleotide substitution biases, and global dN/dS ratios under a full codon model" section the global omega for all the branches that were detected in "# branches"?
-I have a surprisingly high omega+ (1677). Should I be concerned when omega+ is this high, e.g. >100 or >1000? I would say yes, but I would prefer a confirmation.
-I gave an unlabelled tree; how can I find out which nodes it has created?
-Most of my sequences have around 1 to 12-15% of N and gaps, but some have a higher percentage (30 to 60% or more). Above what percentage can you no longer guarantee the results?
-Is MEME sensitive if our sequences diverge only slightly?

Thank you very much for your help.

Best,
Elodie

spond · 2025-03-24T17:31:32Z

Dear @elodieaudemar,

In one of your exchanges, you said that "# branches" is the number of branches with EBF > 100; however, I can't find the EBF anywhere (neither in the output nor in the JSON). I only have the LRT.

EBF is stored in the JSON, and can be viewed using https://observablehq.com/@spond/meme

If you are interested in accessing EBF for Branch/Site pairs, it can be done via Python or anything else that reads JSON files.
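A minimal Python sketch of that kind of access follows. It assumes only that EBF values are stored under keys whose names contain the string "EBF"; the exact layout of the MEME JSON is not assumed here and should be checked against your own file. The function names are hypothetical helpers, not part of HyPhy.

```python
import json


def find_ebf_entries(node, path=()):
    """Yield (path, value) for every key whose name contains 'EBF'.

    Walks nested dicts and lists, so it does not depend on where
    in the JSON the EBF values actually live (an assumption-free search).
    """
    if isinstance(node, dict):
        for key, value in node.items():
            here = path + (str(key),)
            if "EBF" in str(key):
                yield here, value
            else:
                yield from find_ebf_entries(value, here)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_ebf_entries(value, path + (str(index),))


def load_ebf_entries(json_path):
    """Read a HyPhy JSON file and return all EBF-like entries as a list."""
    with open(json_path) as handle:
        return list(find_ebf_entries(json.load(handle)))
```

Calling `load_ebf_entries("gene.MEME.json")` (a hypothetical file name) and printing each path shows which branch and site every value belongs to, which is enough to see how your version of HyPhy organizes them.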

Is the omega in the "### Improving branch lengths, nucleotide substitution biases, and global dN/dS ratios under a full codon model" section the global omega for all the branches that were detected in "# branches"?

This is the global ω for all the tested branches (if you supplied --branches), otherwise it's for all branches.

I have a surprisingly high omega+ (1677). Should I be concerned when omega+ is this high, e.g. >100 or >1000? I would say yes, but I would prefer a confirmation.

No; this is effectively ∞. You should not trust point estimates of site-level ω; they are likely to be very noisy. The reportable outcome is the p-value for the LRT of positive selection.
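To report sites by p-value rather than by ω, something like the following sketch could be used. It assumes the per-site table is available as a list of headers and a list of rows (in the MEME JSON this is expected under `MLE` → `headers` and `MLE` → `content` → `"0"`, but verify this in your own file) and that one column is named `p-value`. The function name and the 0.1 threshold are illustrative choices.

```python
def significant_sites(headers, rows, alpha=0.1, column="p-value"):
    """Return (site_number, p_value) for rows with p-value <= alpha.

    headers: list of column names, or of [name, description] pairs.
    rows:    list of per-site rows, in alignment order.
    Sites are numbered from 1.
    """
    names = [h[0] if isinstance(h, (list, tuple)) else h for h in headers]
    p_index = names.index(column)  # raises ValueError if the column is absent
    return [
        (site, row[p_index])
        for site, row in enumerate(rows, start=1)
        if row[p_index] <= alpha
    ]
```

With the JSON loaded into `data`, the call would look like `significant_sites(data["MLE"]["headers"], data["MLE"]["content"]["0"])`, under the layout assumption stated above.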

I gave an unlabelled tree; how can I find out which nodes it has created?

You mean NodeXXX? You can see what they are in the tree viewer (check the "show internal" box), and also in the MEME JSON output, as part of the Newick tree string.
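As a rough illustration of reading the node names from a Newick string, the sketch below maps each labelled internal node to the leaves beneath it. It is a simplified parser (no quoted labels) written for this purpose, not HyPhy code, and the location of the tree string in the JSON (expected under `input` → `trees` → `"0"`) is an assumption to check in your own file.

```python
import re


def clades_by_internal_node(newick):
    """Map each labelled internal node to the sorted leaf names below it."""
    tokens = re.findall(r"[(),;]|[^(),;]+", newick.strip())
    stack = []          # one list of leaf names per currently open '('
    clades = {}
    just_closed = None  # leaves of the group closed by the previous ')'
    for token in tokens:
        if token == "(":
            stack.append([])
            just_closed = None
        elif token == ")":
            just_closed = stack.pop()
            if stack:
                stack[-1].extend(just_closed)
        elif token in ",;":
            just_closed = None
        else:
            # Drop branch lengths (':...') and annotations ('{...}', '[...]').
            label = re.split(r"[:{\[]", token.strip(), maxsplit=1)[0]
            if just_closed is not None:
                if label:  # a label right after ')' names that internal node
                    clades[label] = sorted(just_closed)
                just_closed = None
            elif label and stack:
                stack[-1].append(label)
    return clades
```

For a tree such as `((A:0.1,B:0.2)Node1:0.05,(C:0.1,D:0.1)Node2:0.05,E:0.3);` this gives `Node1` as the ancestor of A and B, and `Node2` as the ancestor of C and D.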

Most of my sequences have around 1 to 12-15% of N and gaps, but some have a higher percentage (30 to 60% or more). Above what percentage can you no longer guarantee the results?

As the famous saying goes, "but in this world nothing can be said to be certain, except death and taxes" (https://en.wikipedia.org/wiki/Death_and_taxes_(idiom))

MEME should fail safe, i.e. if you have no data, the power to detect anything will decay. One area of concern is that alignments with many N and - can be of poor quality.
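To see which sequences carry most of the N and gap characters before running the analysis, a small check along these lines may help. It is a sketch with hypothetical helper names; the 0.3 cutoff is an arbitrary illustration, not a HyPhy recommendation, and the alignment is assumed to be already loaded as a dict of name to sequence string.

```python
def missing_fraction(sequence, missing="N-?"):
    """Fraction of characters in a sequence that are N, gap or '?'."""
    sequence = sequence.upper()
    if not sequence:
        return 0.0
    return sum(sequence.count(ch) for ch in missing) / len(sequence)


def flag_sparse_sequences(alignment, threshold=0.3):
    """Return {name: fraction} for sequences above the missing-data threshold."""
    flagged = {}
    for name, sequence in alignment.items():
        fraction = missing_fraction(sequence)
        if fraction > threshold:
            flagged[name] = fraction
    return flagged
```

Sequences flagged this way are worth inspecting in the alignment itself, since the concern raised above is alignment quality rather than the missing characters as such.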

Is MEME sensitive if our sequences diverge only slightly?

Hard to say. Generally, you have lower power for low-divergence sequences. You can try the --resample 100 option to enable the parametric bootstrap (higher sensitivity); it will be much slower. How many sequences do you have?

Best,
Sergei

elodieaudemar · 2025-03-25T08:57:05Z

Dear @spond,

Thank you so much for your clear and quick answers, and I didn't know about this famous saying haha.
I'm working on 3200 genes, and each gene has the same 8 species (but one is of bad quality); only the CDS changes, so I'm running loops on a cluster.
And if I understood correctly, "# branches" is not an important variable. If it says 2 at my codon 12, that just means that I have 2 branches under selection for this codon. And can we find out which ones they are from the .json?

Best,
Elodie

spond · 2025-03-27T17:27:26Z

Dear @elodieaudemar,

For 8 species, I don't expect MEME to have great power to detect anything (you typically need 20+ sequences AND decent divergence). Typically you would either use BUSTED to detect genes (as units) under selection, or aBSREL to look for branches under selection.

Best,
Sergei

elodieaudemar · 2025-04-04T08:33:05Z

Dear @spond,

Thank you very much for your help, because yes, MEME detected weird codons (now I'm sure they were false positives).
I use aBSREL as well (I'm writing up questions about this method too), but I wanted information about the codons and not the gene, so I chose FUBAR after reading that it works well with both few and many sequences, and on the same genes it does much better than MEME.

Thank you again for your help.

Best,
Elodie AUDEMAR

spond · 2025-04-04T12:43:17Z

Dear @elodieaudemar,

aBSREL outputs some indications of where the selection signal comes from. For example:

hyphy absrel tests/hbltests/libv3/data/CD2.nex

Then load the .json into the visualization module at https://observablehq.com/@spond/absrel (you can also get this information from the JSON programmatically).

Here's a heat map of individual codons contributing selection signal

And here's a sorted list of specific codons with high empirical bayes factors on one of the branches found to be under selection by aBSREL, CAT.

Best,
Sergei

elodieaudemar · 2025-04-10T12:22:26Z

Hi @spond,

Thank you very much for your help.

Best,
Elodie AUDEMAR
