Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add simple T/BCR doublet detector, use markdown for roxygen, and slight refactoring of combineExpression #417

Merged
merged 20 commits into from
Oct 30, 2024

Conversation

Qile0317
Copy link
Collaborator

@Qile0317 Qile0317 commented Oct 14, 2024

This partially resolves #388. See NEWS.md for the changes added. Most importantly please let me know whether the doublet detection change as of right now makes any sense. (see the R/getContigDoublets.R file) If so I'll proceed to add additional features related to doublet detection.

The vast majority of the files changed are from the automatic conversion of regular roxygen to roxygenmd. If you don't mind this change in convention (which I personally prefer a lot more over regular roxygen while being compatible) you can pretty much ignore the vast majority of the files here as I don't believe the actual Rd documents changed much if at all. I may be wrong but I don't think so.

@Qile0317
Copy link
Collaborator Author

@ncborcherding also configured PRs to dev to run R-CMD-check

@Qile0317 Qile0317 mentioned this pull request Oct 14, 2024
@ncborcherding ncborcherding marked this pull request as ready for review October 14, 2024 13:07
@Qile0317
Copy link
Collaborator Author

@ncborcherding I see that you've marked this ready for review - just wanted to reinforce that I'm actually not completely sure the code is actually fully correct and will be adding more tests and manually verifying things and I'll ping again when I'm confident.

@ncborcherding
Copy link
Owner

Yep - that is why I am holding for now. I had to mark it to be able to see some testing.

@Qile0317
Copy link
Collaborator Author

@ncborcherding Ok, so after locally testing on a real life dataset with TCRs and BCRs (let me know if you want the notebook) I am pretty confident with the doublet detector (afterall the core idea is to just filter for rows where the barcode is in the intersection of TCR and BCR lists) but I had slight trouble coming up with what exactly to do after in a way that is generalizable to a function.

I have two ideas to make this something practical for users:

  1. add a vignette showcasing ways to look at whether these doublets are of concern (i.e. if they are confounded with certain variables in the metadata by using glm(), overlaying them on the UMAP, how to remove them, etc.)
  2. Some function to do pairwise glm regression of other variables in the metadata all at once in some aggregate view to look at potential confounding.

Please let me know what you think. And also lemme know if the output of getContigDoublets is awkward - I'm afraid some users may feel so.

@Qile0317
Copy link
Collaborator Author

@ncborcherding if you want to leave the previous comment potentially for another future release/PR and dont see a need for changes - this is ready to merge now.

The function is marked with the "experimental" badge so that we can feel better about making larger changes on it or just keeping it as an internal function / removing it, etc.

@ncborcherding ncborcherding merged commit a256239 into ncborcherding:dev Oct 30, 2024
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants