Obtaining batch effect estimates at cluster level #199

Open
wmacnair opened this issue Jul 26, 2023 · 1 comment
Labels
question Further information is requested


@wmacnair

Hi

First of all, thanks for developing and maintaining a fantastic tool 🥳 It's great to have a conceptually simple, sensibly motivated and fast tool to integrate single cell data iteratively.

I'm interested in understanding which genes Harmony identifies as being "batchy". Following this tutorial, I tried the following code:

batch_adj_mat = Loadings(seu_obj, reduction = "pca") %*% t(hmny_obj$W)

which I think gives me the estimated effect on each gene, averaged for all cells in each cluster. Is that right?

batch_adj_mat is a matrix that is n_genes * n_samples. What I would like to get to is something like batch_adj_tensor that is n_genes * n_samples * n_clusters. I would then average this across some known celltype labels I have, so I would end up with n_genes * n_samples * n_celltypes. I'm curious about how much the Harmony adjustment differs across the different clusters.
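
As a rough illustration of the averaging step described above, here is a minimal base-R sketch on simulated data. Everything in it is an assumption made for illustration: the tensor is random, the sizes are arbitrary, and `clust_by_type` is a hypothetical cluster-by-celltype weight matrix (for example, it might be derived from the soft cluster assignments of the cells carrying each label). It only shows how a weighted average over the cluster axis could be computed once such a tensor exists.

```r
# Simulated stand-in for an n_genes x n_samples x n_clusters tensor.
set.seed(1)
n_genes <- 5; n_samples <- 3; n_clusters <- 4
batch_adj_tensor <- array(rnorm(n_genes * n_samples * n_clusters),
                          dim = c(n_genes, n_samples, n_clusters))

# Hypothetical weights: rows are clusters, columns are celltypes,
# and each column sums to 1 so the result is a weighted average.
clust_by_type <- matrix(c(0.5, 0.5, 0,    0,
                          0,   0,   0.25, 0.75),
                        nrow = n_clusters,
                        dimnames = list(NULL, c("typeA", "typeB")))

# Flatten the gene and sample axes into rows, multiply along the
# cluster axis, then restore the three-dimensional shape.
flat <- matrix(batch_adj_tensor, nrow = n_genes * n_samples, ncol = n_clusters)
by_type <- array(flat %*% clust_by_type,
                 dim = c(n_genes, n_samples, ncol(clust_by_type)))

dim(by_type)  # n_genes, n_samples, n_celltypes
```

With real data, the weights would need to be checked so that every celltype column sums to 1; otherwise the result is a weighted sum rather than an average.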

Is this somehow achievable...? 😅 I'm hoping that all I have to do is add one line to what I used above, but I don't see an obvious way to do it so far...

Thanks!
Will

@pati-ni pati-ni added the question Further information is requested label Sep 21, 2023
@pati-ni pati-ni closed this as completed Oct 6, 2023
@pati-ni pati-ni reopened this Oct 6, 2023
@pati-ni
Collaborator

pati-ni commented Oct 6, 2023

I just saw your mention of cell types and clusters. My previous comments are not relevant, so I will delete them. Under normal operation, W retains information for the last cluster, so the information you are looking for is not there. I believe what you are asking for can be found in a different function that returns a cell x gene x number-of-clusters array, so you can project your own covariates.

To access this information, you have to re-run the last part of the integration by calling:

hmny_obj$moe_ridge_get_betas_cpp()
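
A possible next step, sketched in base R on simulated data, is to project per-cluster coefficients back into gene space with the PCA loadings, in the same way as the matrix product in the original question. The shape used for `betas` here, (1 + n_batches) x n_pcs x n_clusters with the first row as an intercept, is an assumption and is not confirmed in this thread; check `dim()` of the object you actually get back before adapting this. `loadings` stands in for the result of `Loadings(seu_obj, reduction = "pca")`.

```r
# Simulated stand-ins; all sizes and values are illustrative only.
set.seed(42)
n_genes <- 6; n_pcs <- 3; n_batches <- 2; n_clusters <- 4
loadings <- matrix(rnorm(n_genes * n_pcs), n_genes, n_pcs)

# ASSUMPTION: (1 + n_batches) x n_pcs x n_clusters, row 1 = intercept.
betas <- array(rnorm((n_batches + 1) * n_pcs * n_clusters),
               dim = c(n_batches + 1, n_pcs, n_clusters))

# For each cluster, drop the assumed intercept row and map the
# batch coefficients from PC space to gene space.
batch_adj_tensor <- vapply(
  seq_len(n_clusters),
  function(k) {
    beta_k <- matrix(betas[-1, , k], nrow = n_batches)  # n_batches x n_pcs
    loadings %*% t(beta_k)                              # n_genes x n_batches
  },
  FUN.VALUE = matrix(0, n_genes, n_batches)
)

dim(batch_adj_tensor)  # n_genes, n_batches, n_clusters
```

If the real object has its axes in a different order, `aperm()` can rearrange them before the loop.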

Hope this is more helpful than my previous responses.
