Problem with j in getG #55

Open
kennaas opened this issue Mar 23, 2021 · 2 comments

kennaas commented Mar 23, 2021

Hi,

I'm using a genomic data set with 180K SNPs to create G matrices. I want to exclude around 3K of these SNPs from the calculation, so I used the j argument to include all columns other than these 3K. This gave a wildly different answer than when not excluding columns, with entries that are generally much larger than what we should be seeing.

This problem persisted even when I excluded just a single randomly selected SNP column (again, out of 180K SNPs, so this should not have a large effect, right?). This also shows that the problem is not specific to the columns I wanted to exclude, but occurs whenever columns are excluded with j at all.

Luckily I was able to get around this issue by instead making a new BGData object where the unwanted SNP columns were not included in the first place, and then not specifying j in getG. This gave the expected/right answer, so there seems to be some problem with using j.

Thank you for the package, it is very helpful.
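
To put the single-SNP observation in perspective, here is a minimal base R sketch on simulated data. It assumes the common definition G = ZZ'/p with column-standardized genotypes Z; that definition, the simulated genotypes, and the helper `compute_g` are illustrative assumptions and are not taken from BGData's source. Under that assumption, removing one column out of thousands should shift each entry of G only slightly.

```r
# Base R sketch, assuming G = Z Z' / p with column-standardized Z.
# This is NOT BGData's implementation; compute_g is a hypothetical helper.
set.seed(1)
n <- 20    # individuals
p <- 2000  # SNPs (a small stand-in for the 180K in the report)

# Simulated genotypes coded 0/1/2; purely illustrative data.
X <- matrix(rbinom(n * p, size = 2, prob = 0.3), nrow = n, ncol = p)

compute_g <- function(x) {
  # Drop monomorphic columns, which cannot be standardized.
  x <- x[, apply(x, 2, var) > 0, drop = FALSE]
  z <- scale(x, center = TRUE, scale = TRUE)
  tcrossprod(z) / ncol(z)
}

G_all <- compute_g(X)

# Exclude one randomly chosen SNP by subsetting before the computation.
drop <- sample(p, 1)
G_minus_one <- compute_g(X[, -drop, drop = FALSE])

# Largest absolute change in any entry of G.
max_change <- max(abs(G_all - G_minus_one))
print(max_change)
```

With this definition the change per entry is on the order of 1/p, so a much larger difference would point at how the columns are selected rather than at the excluded SNP itself.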

@agrueneberg
Member

Hi,

I cannot reproduce the problem on my machine. Can you try to show me what you did with the example BGData object that is bundled with the package?

```r
library(BGData)
DATA <- BGData:::loadExample()
X <- geno(DATA)
G1 <- getG(X)
G2 <- getG(X, j = [...])
```

Alternatively, could you make code and dataset available to me?

Thank you,
Alex
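
One way the requested comparison might look on the bundled example is sketched below. It contrasts excluding a column through `j` with subsetting the genotype matrix first, which is the workaround described in the report. The index vector `keep`, the choice of a single random column, and the helper `max_abs_diff` are assumptions made for illustration; the BGData part is skipped when the package is not installed.

```r
# Hypothetical helper (not part of BGData): largest absolute entrywise difference.
max_abs_diff <- function(a, b) max(abs(a - b))

if (requireNamespace("BGData", quietly = TRUE)) {
  library(BGData)
  DATA <- BGData:::loadExample()
  X <- geno(DATA)

  set.seed(1)
  drop <- sample(ncol(X), 1)       # one arbitrary SNP to exclude (assumption)
  keep <- seq_len(ncol(X))[-drop]  # column indices to retain

  G_j <- getG(X, j = keep)         # exclusion via the j argument
  G_subset <- getG(X[, keep])      # exclusion by subsetting first

  # If j behaves like plain column subsetting, this should be close to zero.
  print(max_abs_diff(G_j, G_subset))
}
```

A difference well above floating-point noise on the real data set, but not on the example, would suggest looking at what distinguishes the two inputs, such as the number of columns relative to chunkSize.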

kennaas commented Mar 25, 2021

I am also unable to recreate the error on the example data. As for sharing the dataset, I will have to get back to you. If I cannot release it now, it should be available within a few months.

An off-topic PS: it might be nice to add a check on the value of chunkSize. I had accidentally set it to 0 when working with the example data, which resulted in a confusing error.
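
A check of the kind suggested here could be sketched as follows. `validate_chunk_size` is a hypothetical helper, not an existing BGData function, and it assumes that chunkSize is meant to be a single positive whole number; the package may accept other values that this sketch would reject.

```r
# Hypothetical validator; not BGData's actual code.
# Assumes chunkSize should be a single, finite, positive whole number.
validate_chunk_size <- function(chunkSize) {
  ok <- is.numeric(chunkSize) && length(chunkSize) == 1L &&
    is.finite(chunkSize) && chunkSize >= 1 && chunkSize == floor(chunkSize)
  if (!ok) {
    stop("'chunkSize' must be a single positive whole number", call. = FALSE)
  }
  invisible(as.integer(chunkSize))
}

# A valid value passes through silently.
validate_chunk_size(5000)

# An accidental 0 now fails early with an explicit message.
msg <- tryCatch(validate_chunk_size(0), error = function(e) conditionMessage(e))
print(msg)
```

Calling such a check at the top of a function that takes chunkSize would turn the confusing downstream error into an immediate, descriptive one.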
