Deal with older vcf formats and samples with missing data #147

Open
eniktab wants to merge 1 commit into master

Conversation

@eniktab eniktab commented Jan 10, 2025

No description provided.

@sbslee sbslee (Owner) commented Jan 12, 2025

Hi @eniktab,

Thank you for the PR. However, I can't merge it for several reasons:

  1. This PR targets master, but it should target a dev branch such as 0.26.0-dev.
  2. I don't understand the motivation behind the PR. I'd greatly appreciate it if you could explain why you wrote it and what you are trying to fix or improve. In the future, I'd strongly recommend opening an issue in this repository before opening a PR. The same goes for the other PR you wrote for one of my other packages, fuc (Deal with duplicates and different formats fuc#76).
  3. I intentionally designed PyPGx so that, when there are two different sample sets, it won't allow the process to complete when calling genotypes. When this occurs, it's better for the user to go back and work out why their datasets have different sample sets than to blindly move forward with only the overlapping samples. This may be obvious to you, but many PyPGx users are not that savvy in bioinformatics.

pypgx/pypgx/api/genotype.py

Lines 622 to 624 in f1dd988

_diff = set(cnv_calls.data.index)- set(alleles.data.index)
print(_diff)
print("Are missing and will be droped")
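For contrast with the snippet above, the strict behaviour described in point 3 could be sketched roughly as follows. This is only an illustration on plain Python collections: `check_same_samples` is a hypothetical helper, not a PyPGx function, and the real check in PyPGx may look different.

```python
def check_same_samples(cnv_samples, allele_samples):
    """Return the shared sample names, or raise if the two sets differ.

    Hypothetical helper for illustration only; not part of PyPGx.
    """
    cnv = set(cnv_samples)
    alleles = set(allele_samples)
    if cnv != alleles:
        # Report both directions so the user can see which input is incomplete.
        only_cnv = sorted(cnv - alleles)
        only_alleles = sorted(alleles - cnv)
        raise ValueError(
            f"Sample sets differ: only in CNV calls {only_cnv}, "
            f"only in alleles {only_alleles}"
        )
    return sorted(cnv)
```

Failing loudly like this forces the user to look at the mismatch, whereas printing the difference and continuing quietly narrows the analysis to the overlapping samples.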

  4. Could you provide more details on why you had to make the following changes:

pypgx/pypgx/api/utils.py

Lines 1303 to 1306 in f1dd988

regions = [
item for item in regions
if item.split(':')[0].isdigit() or item.split(':')[0] in {'X', 'Y'}
]
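To make the effect of this filter concrete, here is a small sketch that applies the same condition to a made-up list of region strings. The wrapper name `keep_unprefixed_regions` and the example regions are assumptions for illustration; only the condition itself comes from the PR.

```python
def keep_unprefixed_regions(regions):
    """Apply the PR's condition: keep contigs that are all digits, 'X', or 'Y'."""
    return [
        item for item in regions
        if item.split(':')[0].isdigit() or item.split(':')[0] in {'X', 'Y'}
    ]


# Made-up example regions, not taken from any real dataset.
regions = ["1:100-200", "chr1:100-200", "X:5-10", "chrX:5-10", "MT:1-50"]
kept = keep_unprefixed_regions(regions)
print(kept)  # ['1:100-200', 'X:5-10']
```

Note that, as written, the condition silently discards 'chr'-prefixed contigs as well as any other contig name, such as MT in this example.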
