An error occurred while generating random regions. #14
Comments
Hi, from the error traceback it looks like no regions were selected. Could you please run the command again with the --debug flag and share the resulting output? Please make sure that you provide the same reference version (e.g. hg19/hg38) that the BAM file was mapped to. Additionally, if possible, it would be great if you could share your BAM file and your whole software environment. That way I could see why no regions were created. One possible cause of this behavior is that your reference 2bit file and your BAM file do not share common chromosome names, although this should be handled by an automatically created mapping between these files.
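To check the chromosome-name idea by hand, a small comparison like the following may help. This is only a sketch and not part of the tool: `compare_chromosome_names` and `load_names` are hypothetical helper names, and `load_names` assumes that pysam and py2bit are installed in the environment.

```python
def compare_chromosome_names(bam_names, twobit_names):
    """Return the names shared by both files and the names unique to each."""
    bam_set, twobit_set = set(bam_names), set(twobit_names)
    return {
        "shared": sorted(bam_set & twobit_set),
        "bam_only": sorted(bam_set - twobit_set),
        "twobit_only": sorted(twobit_set - bam_set),
    }


def load_names(bam_path, twobit_path):
    """Read chromosome names from a BAM header and from a 2bit file."""
    # Imported lazily so that the comparison helper works without these packages.
    import py2bit
    import pysam

    with pysam.AlignmentFile(bam_path, "rb") as bam:
        bam_names = list(bam.references)
    twobit = py2bit.open(twobit_path)
    try:
        twobit_names = list(twobit.chroms().keys())
    finally:
        twobit.close()
    return bam_names, twobit_names


# Illustrative names only: an Ensembl-style BAM against a UCSC-style 2bit file.
example = compare_chromosome_names(["1", "2", "MT"], ["chr1", "chr2", "chrM"])
print(example["shared"])  # an empty list means the two files share no names
```

If `shared` comes back empty for the real files, the naming schemes differ and random regions drawn from the reference may not match any contig in the BAM file.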
Thank you for your enthusiastic response. Your reply provided me with ideas on how to solve the problem, and I successfully ran the computeGCBias_background.py script by replacing the reference .2bit file as you suggested. However, I encountered the following issues in subsequent tests.
The above exception was the direct cause of the following exception: Traceback (most recent call last):
Hi, thank you for using the software and reporting the issues you ran into. I'll try to answer your questions as well as I can.
Related to the issue, could you please run the script on one of your bam files with the options
Otherwise, it's hard to determine what causes the memory demand. I spent some time optimizing the resource requirements and had no problems on reasonable hardware. One possible factor is the number of cores. You can think of it as spawning workers that each read many chunks from your BAM file. If many of them are open at the same time, the memory footprint increases. If you are comfortable doing so, you could profile the memory usage with a memory profiler (I have had good experiences with scalene). I hope this helps.
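As a rough first check before reaching for a full profiler, the peak memory of a run can be read from the operating system. The sketch below is an assumption-laden illustration, not part of the tool: `peak_child_memory_mb` is a hypothetical helper, it only works on Unix-like systems, and it reports the largest single child process rather than the sum over all workers, so it is a lower bound when several cores are used.

```python
import resource
import subprocess
import sys


def peak_child_memory_mb(command):
    """Run a command and return the peak resident set size of its processes in MiB."""
    subprocess.run(command, check=True)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return usage.ru_maxrss / divisor


if __name__ == "__main__":
    # Pass the full command line to measure, e.g. the script call with -p 1 and then -p 2.
    if len(sys.argv) > 1:
        print(f"peak RSS of largest child: {peak_child_memory_mb(sys.argv[1:]):.1f} MiB")
```

Running the same command with different core counts and comparing the numbers gives a first indication of whether the worker count drives the memory demand.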
Does this software have any other requirements for BAM files? I ran it on a sorted BAM file with the following command:
python /public/home/yjq/tools/cfDNA_GCcorrection/cfDNA_GCcorrection/computeGCBias_background.py \
-b /public/home/yjq/projects/PA_projects/data/NBT_WGS/bamfilter/SRR17478154_filter.sorted.bam \
-g /public/home/yjq/genome_anno/hg19/hg19_UCSC.2bit \
-p 2 \
-i \
--output /public/home/yjq/projects/PA_projects/data/NBT_WGS/GC_correction/background/ \
--debug
The following error occurs:
Traceback (most recent call last):
File "/public/home/yjq/tools/cfDNA_GCcorrection/cfDNA_GCcorrection/computeGCBias_background.py", line 596, in <module>
main()
File "/public/home/yjq/.local/lib/python3.8/site-packages/click/core.py", line 1161, in __call__
return self.main(*args, **kwargs)
File "/public/home/yjq/.local/lib/python3.8/site-packages/click/core.py", line 1082, in main
rv = self.invoke(ctx)
File "/public/home/yjq/.local/lib/python3.8/site-packages/click/core.py", line 1443, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/public/home/yjq/.local/lib/python3.8/site-packages/click/core.py", line 788, in invoke
return __callback(*args, **kwargs)
File "/public/home/yjq/tools/cfDNA_GCcorrection/cfDNA_GCcorrection/computeGCBias_background.py", line 549, in main
regions = get_regions(
File "/public/home/yjq/tools/cfDNA_GCcorrection/cfDNA_GCcorrection/computeGCBias_background.py", line 125, in get_regions
random_regions.to_dataframe(
File "/public/home/yjq/miniconda3/envs/celfeer_env/lib/python3.8/site-packages/pybedtools/bedtool.py", line 3762, in to_dataframe
return pandas.read_csv(self.fn, *args, sep="\t", **kwargs) # type: ignore
File "/public/home/yjq/miniconda3/envs/celfeer_env/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 912, in read_csv
return _read(filepath_or_buffer, kwds)
File "/public/home/yjq/miniconda3/envs/celfeer_env/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 577, in _read
parser = TextFileReader(filepath_or_buffer, **kwds)
File "/public/home/yjq/miniconda3/envs/celfeer_env/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 1407, in __init__
self._engine = self._make_engine(f, self.engine)
File "/public/home/yjq/miniconda3/envs/celfeer_env/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 1679, in _make_engine
return mapping[engine](f, **self.options)
File "/public/home/yjq/miniconda3/envs/celfeer_env/lib/python3.8/site-packages/pandas/io/parsers/c_parser_wrapper.py", line 93, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "pandas/_libs/parsers.pyx", line 557, in pandas._libs.parsers.TextReader.__cinit__
pandas.errors.EmptyDataError: No columns to parse from file.
I am sure that I have successfully installed pandas and pybedtools. My deeptools version is 3.5.5, pandas version is 2.0.3, bedtools version is v2.31.1, pybedtools version is 0.11.0, and the Python version is 3.8.19.
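The traceback ends in pandas.read_csv being called on the BedTool's file (self.fn), and EmptyDataError is what pandas raises when that file has no content. A quick way to confirm this is to count the records in the file before it is converted. The following is a minimal sketch using only the standard library; `count_bed_records` is a hypothetical helper and not a function of pybedtools or of this tool.

```python
import os


def count_bed_records(path):
    """Count non-empty lines in a BED-like file, skipping comment and header lines."""
    if not os.path.exists(path):
        return 0
    count = 0
    with open(path) as handle:
        for line in handle:
            stripped = line.strip()
            # Skip blank lines and the usual BED header prefixes.
            if stripped and not stripped.startswith(("#", "track", "browser")):
                count += 1
    return count
```

Called on the path of the generated random regions (for a pybedtools BedTool this is its fn attribute), a result of 0 would confirm that no regions were created, so the problem lies upstream of pandas.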