Work with mm39 genome #1

Open
FerriolCalvet opened this issue Aug 12, 2024 · 5 comments

Comments

@FerriolCalvet
Contributor

Since there is a new version of the mouse genome (mm39) and it is already part of bgdata and bgsignature, it would be great to add it to the list of genomes available in OncodriveCLUSTL.

This is where the options are listed:
https://bitbucket.org/bbglab/oncodriveclustl/src/2b3842ef45fef12f35b3615a0636ef62910f6350/oncodriveclustl/main.py?at=master#lines-47

And this is where the information from the genome is used:
https://bitbucket.org/bbglab/oncodriveclustl/src/2b3842ef45fef12f35b3615a0636ef62910f6350/oncodriveclustl/utils/run.py?at=master#lines-200

Since this is done through bgreference, I guess it should be fine as long as we make sure that the installation requirements of oncodriveclustl install the latest version of bgreference!

(happy to provide test data from deepCSA)
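
As a rough illustration of what "adding mm39 to the list of available genomes" could look like, here is a minimal sketch using `argparse`. It is not the code in `oncodriveclustl/main.py`: the names `AVAILABLE_GENOMES`, `build_parser` and `parse_genome`, the other builds in the list, and the default value are all hypothetical. Only the `-g` flag and the `mm10`/`mm39` names come from this thread.

```python
import argparse

# Hypothetical list of genome builds; the real list lives in
# oncodriveclustl/main.py and may contain different entries.
AVAILABLE_GENOMES = ["hg19", "hg38", "mm10", "mm39"]


def build_parser():
    """Build a tiny parser whose -g option only accepts known builds."""
    parser = argparse.ArgumentParser(prog="example")
    parser.add_argument(
        "-g",
        "--genome",
        choices=AVAILABLE_GENOMES,
        default=AVAILABLE_GENOMES[0],  # arbitrary default for this sketch
        help="genome build to use",
    )
    return parser


def parse_genome(argv):
    """Return the genome build selected on the command line."""
    return build_parser().parse_args(argv).genome
```

With a list like this, supporting a new build is a one-line change, and an unknown name such as `mm38` is rejected at argument parsing time instead of failing later inside the run.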

@FerriolCalvet
Contributor Author

Miguel added mm39 to the list of available genomes, but this will only take effect if the installed bgreference is the latest version.
When making this change, the installation requirements of oncodriveclustl should require at least bgreference v0.7:
https://bitbucket.org/bgframework/bgreference/commits/603424267d019a67b2415c8fe4e9904d1056b6a7
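
A quick way to check whether an environment already satisfies this is sketched below. It assumes the package is installed under the distribution name `bgreference` and that its version string is a plain dotted number such as `0.7`; the helper names (`version_tuple`, `meets_minimum`, `bgreference_is_recent_enough`) are made up for this example. In the installation requirements the constraint would presumably be written as `bgreference>=0.7`.

```python
from importlib.metadata import PackageNotFoundError, version


def version_tuple(text):
    """Turn '0.7.1' into (0, 7, 1); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed, minimum="0.7"):
    """Compare two dotted versions, padding the shorter one with zeros."""
    have, need = version_tuple(installed), version_tuple(minimum)
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


def bgreference_is_recent_enough(minimum="0.7"):
    """Return False if bgreference is missing or older than `minimum`."""
    try:
        return meets_minimum(version("bgreference"), minimum)
    except PackageNotFoundError:
        return False
```

This only looks at package metadata, so it does not download any genome data.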

@XiaoYan000

Hi, I am wondering whether I can use mm38 as the reference, but I don't know how to prepare the file. Can you give me some guidance? Thank you very much.

@FerriolCalvet
Contributor Author

Hi,
mm38 is not a commonly used name for mouse genomes.
If you want to use GRCm38, this is the same as mm10, which is already available in oncodriveclustl.
See: reference for mm10 genome name
I hope this helps, let me know otherwise.

@XiaoYan000

Hi. Thank you very much for your reply. I set -g mm10, but the run got stuck, so I ran bgdata:

bgdata get datasets/genomereference/mm10
2024-10-05 00:17:16 bgdata.manager INFO -- Tag "master" for "datasets/genomereference/mm10" resolved as 20180109
2024-10-05 00:17:16 bgdata.manager INFO -- Package "datasets/genomereference/mm10-20180109" not found in local repo
2024-10-05 00:17:17 bgdata.repository INFO -- Downloading datasets/genomereference/mm10-20180109
100% 564.2 MiB 301.9 KiB/s 0:00:00 ETA
2024-10-05 00:49:12 bgdata.package INFO -- Extracting package.tar.xz
2024-10-05 00:50:21 bgdata.repository INFO -- Package datasets/genomereference/mm10-20180109 ready
2024-10-05 00:50:21 bgdata INFO -- Dataset downloaded
/home/xiaoxinyi/.bgdata/datasets/genomereference/mm10-20180109
It seems that the file downloaded successfully.

However, when I run OncodriveCLUSTL, it fails with the error "Sequence 'GL456210.1' not found in genome build 'mm10'". I have already filtered the chromosomes:
chromosomes = set(map(str, list(range(1, 19)) + ['X', 'Y']))
gencode['CHROMOSOME_FILTER'] = gencode.apply(lambda x: 'PASS' if x['CHROMOSOME'] in chromosomes else 'FAIL', axis=1)
gencode = gencode.loc[gencode['CHROMOSOME_FILTER'] == 'PASS'].copy()
gencode.drop(['CHROMOSOME_FILTER'], axis=1)

Can you give me some advice for this bug?
Thanks a lot!

@FerriolCalvet
Contributor Author

Hi
I am not one of the developers of the tool, so I can't help much with this. I find it odd that this sequence raises an error if you have filtered the input, so I would double-check that it is really not there. Apart from that, I am afraid I can't help much.
Can you send the details of how you run oncodriveclustl after filtering these variants?

Ferriol
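
One way to double-check the filtering is sketched below in plain Python. It rests on some assumptions: that any input file which still contains a scaffold such as `GL456210.1` could trigger the error (the snippet above only filters the `gencode` table, so the mutations file may be worth checking too), and that chromosome names may or may not carry a `chr` prefix. The helper names are made up for this example. Note also that the mouse genome has 19 autosomes, so `range(1, 19)` stops at 18 and drops chromosome 19, and that `gencode.drop(...)` returns a new table, so its result has to be assigned for the column to actually be removed.

```python
# Canonical mouse chromosomes: 19 autosomes plus X and Y.
# range(1, 20) is needed here; range(1, 19) would stop at chromosome 18.
MOUSE_CHROMOSOMES = {str(n) for n in range(1, 20)} | {"X", "Y"}


def normalize(name):
    """Strip whitespace and an optional 'chr' prefix from a chromosome name."""
    name = str(name).strip()
    return name[3:] if name.lower().startswith("chr") else name


def unexpected_contigs(chromosome_values, allowed=MOUSE_CHROMOSOMES):
    """Return the sorted, de-duplicated names that are not in `allowed`."""
    return sorted({str(v) for v in chromosome_values if normalize(v) not in allowed})


def keep_canonical(rows, column="CHROMOSOME", allowed=MOUSE_CHROMOSOMES):
    """Keep only rows (dicts) whose chromosome is in `allowed`."""
    return [row for row in rows if normalize(row[column]) in allowed]
```

Running `unexpected_contigs` on the chromosome column of every file passed to the tool should return an empty list if the filtering worked. With pandas, the column values can be passed in directly, for example `unexpected_contigs(gencode['CHROMOSOME'])`.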
