low sample size #267

Closed
shenhaizhongdechanrao opened this issue Dec 27, 2024 · 1 comment


shenhaizhongdechanrao commented Dec 27, 2024

Hi,

  1. I am using WES data from 14 samples to extract mutational signatures de novo. I understand that the sample size is small (these samples are difficult to collect), but I would like to know which parameters should be adjusted to accommodate small sample sizes, and I would appreciate specific parameter recommendations.
  2. How can I determine whether the results are truly reliable? For example, which parameters in the SBS96_selection_plot or All_solutions_stat files can be used to assess this?

[Two attached images, not reproduced here]

marcos-diazg (Member) commented

Dear @shenhaizhongdechanrao,

I hope you are doing well, and thanks so much for your interest in our tools! As you are aware, extracting mutational signatures from a small number of samples, and especially from a small number of mutations per sample (as is the case with WES data), is challenging. There are several options, but they relate more to experimental design and analysis than to specific tool parameters. For SigProfilerExtractor, the suggestion is to use a low-resolution mutational classification such as SBS-96 and to set the exome=True parameter, which also controls the statistics used to determine the optimal number of signatures. Happy to discuss other potential analyses by email at [email protected].
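For readers who want to see what that suggestion could look like in practice, here is a minimal sketch of a SigProfilerExtractor call restricted to SBS-96 with exome=True. The input path, output folder, reference genome, input type, and the signature range are placeholders chosen for illustration, not recommendations from the maintainers; the helper function names are hypothetical.

```python
def build_extraction_kwargs(input_data, output_dir):
    """Collect the keyword arguments for sigProfilerExtractor in one place."""
    return {
        "input_type": "vcf",           # assumption: input_data is a folder of VCF files
        "output": output_dir,          # placeholder output folder
        "input_data": input_data,      # placeholder input path
        "reference_genome": "GRCh37",  # assumption: change to match your genome build
        "context_type": "96",          # restrict the extraction to SBS-96
        "exome": True,                 # the setting suggested in the reply for WES data
        "minimum_signatures": 1,
        "maximum_signatures": 5,       # illustrative value only, not a recommendation
    }


def run_extraction(kwargs):
    """Launch the extraction (requires SigProfilerExtractor and a reference genome)."""
    # Imported here so the sketch can be inspected without the package installed.
    from SigProfilerExtractor import sigpro as sig

    sig.sigProfilerExtractor(**kwargs)


kwargs = build_extraction_kwargs("path/to/vcf_folder", "extraction_output")
print(kwargs)
# run_extraction(kwargs)  # uncomment to start the (long-running) extraction
```

Keeping the arguments in a dictionary makes it easy to record exactly which settings were used for a given run.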

Regarding your second question, the tool automatically selects the optimal number of signatures from your extraction, taking into account the average and minimum stability of the clustering process, as explained in our main publication (Islam et al. 2022, Cell Genomics). In your case, the selection is very clear because of the drastic decrease in average stability shown in the selection plot.
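To make the idea of a stability-based selection concrete, here is a simplified toy sketch. It is not the tool's actual selection procedure, which is described in the publication cited above; the threshold values and the stability numbers are invented for illustration, and the function name is hypothetical.

```python
def pick_stable_solution(stats, avg_threshold=0.8, min_threshold=0.2):
    """Return the largest signature count whose stabilities clear both thresholds.

    stats maps a number of signatures to (average stability, minimum stability).
    The thresholds are assumed values for illustration only.
    """
    stable = [
        k
        for k, (avg_stability, min_stability) in stats.items()
        if avg_stability >= avg_threshold and min_stability >= min_threshold
    ]
    return max(stable) if stable else None


# Invented toy values: stability drops sharply after two signatures.
toy_stats = {
    1: (1.00, 1.00),
    2: (0.95, 0.90),
    3: (0.45, 0.10),
    4: (0.30, -0.05),
}

chosen = pick_stable_solution(toy_stats)
print(chosen)  # 2
```

A sharp drop in average stability, like the one between two and three signatures in these toy values, is the kind of pattern that makes the selection unambiguous.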

I hope this helps, and thanks again for your interest!
