KeyError: 'Coding sequence' #255
Comments
Could be. Did you run sarek beforehand to obtain the VEP annotation?
I did not use Sarek. I used Sentieon for variant calling and vcf2maf for VEP annotation (and conversion to MAF).
I reprocessed the VCF with SAREK, annotation only, but I still get an error at the same step
Caused by:
Command executed:
# create folder for MHCflurry downloads to avoid permission problems when running pipeline with docker profile and mhcflurry selected
mkdir -p mhcflurry-data
# specify MHCflurry release for which to download models, need to be updated here as well when MHCflurry will be updated
export MHCFLURRY_DOWNLOADS_CURRENT_RELEASE=1.4.0
# Add non-free software to the PATH
shopt -s nullglob
epaa.py --identifier 661T.GATK_sn_L.gene_region.WES.output-tnhap2_snpEff_VEP.ann.chr15 --alleles 'A03:01;A03:01;B07:02;B15:01;C04:01;C07:02' --tools 'syfpeithi' --max_length 11 --min_length 8 --versions versions.csv --genome_reference 'https://www.ensembl.org' --somatic_mutation 661T.GATK_sn_L.gene_region.WES.output-tnhap2_snpEff_VEP.ann.chr15.vcf
cat <<-END_VERSIONS > versions.yml
Command exit status:
Command output:
Command error:
Can you go into @christopher-mohr Have you seen this error before?
Description of the bug
Hi,
I am getting the following error. Can you please help ?
I am quite new to this sort of analysis and I am not very familiar with the tools. From what I understand, I may be missing some fields in my VEP annotated VCF.
Thanks
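One way to test the "missing fields" hypothesis is to look at which VEP sub-fields the VCF header declares. The sketch below assumes the annotation is stored in the standard VEP `CSQ` INFO field, whose header description ends with `Format: ...`. The helper names `vep_csq_fields` and `open_vcf` and the toy header are made up for illustration; which sub-fields epaa.py actually needs is not stated in this thread.

```python
import gzip
import io
import re


def vep_csq_fields(handle):
    """Return the VEP CSQ sub-fields declared in a VCF header, or [] if none."""
    for line in handle:
        if not line.startswith("##"):
            break  # meta-information lines are over
        if line.startswith("##INFO=<ID=CSQ,"):
            match = re.search(r'Format: ([^"]+)"', line)
            return match.group(1).strip().split("|") if match else []
    return []


def open_vcf(path):
    """Open a plain or gzip-compressed VCF as text (hypothetical helper)."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "r")


# Toy header for illustration only, not taken from the VCF in this issue.
example = io.StringIO(
    "##fileformat=VCFv4.2\n"
    '##INFO=<ID=CSQ,Number=.,Type=String,Description="Consequence annotations '
    'from Ensembl VEP. Format: Allele|Consequence|SYMBOL|Feature">\n'
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
)
fields = vep_csq_fields(example)
print(fields)
```

On a real file this would be called as `vep_csq_fields(open_vcf("my.vcf"))`. An empty list would suggest the VCF carries no VEP `CSQ` annotation at all.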
ERROR ~ Error executing process > 'NFCORE_EPITOPEPREDICTION:EPITOPEPREDICTION:EPYTOPE_PEPTIDE_PREDICTION_VAR (4)'
Caused by:
Process
NFCORE_EPITOPEPREDICTION:EPITOPEPREDICTION:EPYTOPE_PEPTIDE_PREDICTION_VAR (4)
terminated with an error exit status (1)
Command executed:
# create folder for MHCflurry downloads to avoid permission problems when running pipeline with docker profile and mhcflurry selected
mkdir -p mhcflurry-data
export MHCFLURRY_DATA_DIR=./mhcflurry-data
# specify MHCflurry release for which to download models, need to be updated here as well when MHCflurry will be updated
export MHCFLURRY_DOWNLOADS_CURRENT_RELEASE=1.4.0
# Add non-free software to the PATH
shopt -s nullglob
IFS=',' read -r -a netmhc_paths_string <<< ""
for p in "${netmhc_paths_string[@]}"; do
export PATH="$(realpath -s "$p"):$PATH";
done
shopt -u nullglob
epaa.py --identifier 649T.GATK_sn_L.gene_region.WES.output-tnhap2.vep.chr12 --alleles 'A01:01;A68:01;B27:05;B57:01;C02:02;C06:02' --tools 'mhcflurry' --max_length 11 --min_length 8 --versions versions.csv --genome_reference 'https://www.ensembl.org' --somatic_mutation 649T.GATK_sn_L.gene_region.WES.output-tnhap2.vep.chr12.vcf
cat <<-END_VERSIONS > versions.yml
"NFCORE_EPITOPEPREDICTION:EPITOPEPREDICTION:EPYTOPE_PEPTIDE_PREDICTION_VAR":
python: $(python --version 2>&1 | sed 's/Python //g')
epytope: $(python -c "import pkg_resources; print(pkg_resources.get_distribution('epytope').version)")
pandas: $(python -c "import pkg_resources; print(pkg_resources.get_distribution('pandas').version)")
pyvcf: $(python -c "import pkg_resources; print(pkg_resources.get_distribution('PyVCF3').version)")
mhcflurry: $(mhcflurry-predict --version 2>&1 | sed 's/^mhcflurry //; s/ .*$ //')
mhcnuggets: $(python -c "import pkg_resources; print(pkg_resources.get_distribution('mhcnuggets').version)")
END_VERSIONS
Command exit status:
1
Command output:
2025-01-09 13:45:07,311 - main - INFO - Running Epitope Prediction And Annotation version: 1.1
2025-01-09 13:45:07,312 - main - INFO - Starting predictions at 2025-01-09 13:45:07
2025-01-09 13:45:07,312 - main - INFO - Running epaa for variants...
2025-01-09 13:45:07,338 - main - WARNING - FORMAT entry PID not defined for 649N. Skipping.
2025-01-09 13:45:07,338 - main - WARNING - FORMAT entry PGT not defined for 649N. Skipping.
2025-01-09 13:45:07,338 - main - WARNING - FORMAT entry PS not defined for 649N. Skipping.
2025-01-09 13:45:07,339 - main - WARNING - FORMAT entry PID not defined for 649T. Skipping.
2025-01-09 13:45:07,339 - main - WARNING - FORMAT entry PGT not defined for 649T. Skipping.
2025-01-09 13:45:07,339 - main - WARNING - FORMAT entry PS not defined for 649T. Skipping.
2025-01-09 13:45:07,339 - main - WARNING - FORMAT entry PID not defined for 649N. Skipping.
2025-01-09 13:45:07,339 - main - WARNING - FORMAT entry PGT not defined for 649N. Skipping.
2025-01-09 13:45:07,339 - main - WARNING - FORMAT entry PS not defined for 649N. Skipping.
2025-01-09 13:45:07,339 - main - WARNING - FORMAT entry PID not defined for 649T. Skipping.
2025-01-09 13:45:07,339 - main - WARNING - FORMAT entry PGT not defined for 649T. Skipping.
2025-01-09 13:45:07,339 - main - WARNING - FORMAT entry PS not defined for 649T. Skipping.
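These warnings are consistent with records in which the phasing-related FORMAT keys (PGT, PID, PS) carry no value for a sample, although that reading has not been checked against epaa.py. A minimal sketch for listing such keys per sample in one VCF data line follows; the function name, the record, and the sample values are invented for illustration.

```python
def missing_format_entries(record_line, sample_names):
    """Map each sample name to the FORMAT keys that have no value in this record."""
    columns = record_line.rstrip("\n").split("\t")
    format_keys = columns[8].split(":")  # column 9 of a VCF data line is FORMAT
    missing = {}
    for name, sample_column in zip(sample_names, columns[9:]):
        values = sample_column.split(":")
        # The VCF format allows trailing sample fields to be dropped,
        # so pad the values with the missing-value marker ".".
        values += ["."] * (len(format_keys) - len(values))
        missing[name] = [key for key, value in zip(format_keys, values) if value == "."]
    return missing


# Hypothetical record: phasing fields dropped for one sample, set to "." for the other.
record = "\t".join([
    "chr12", "1000", ".", "A", "T", ".", "PASS", ".",
    "GT:AD:PGT:PID:PS",
    "0/0:30,0",
    "0/1:20,10:.:.:.",
])
result = missing_format_entries(record, ["649N", "649T"])
print(result)
```

If the real records look like this, the warnings would only reflect unphased variants and would not by themselves explain the failure reported further down.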
Command error:
INFO:main:Running epaa for variants...
WARNING:main:FORMAT entry PID not defined for 649N. Skipping.
WARNING:main:FORMAT entry PGT not defined for 649N. Skipping.
WARNING:main:FORMAT entry PS not defined for 649N. Skipping.
WARNING:main:FORMAT entry PID not defined for 649T. Skipping.
WARNING:main:FORMAT entry PGT not defined for 649T. Skipping.
WARNING:main:FORMAT entry PS not defined for 649T. Skipping.
WARNING:main:FORMAT entry PID not defined for 649N. Skipping.
WARNING:main:FORMAT entry PGT not defined for 649N. Skipping.
WARNING:main:FORMAT entry PS not defined for 649N. Skipping.
WARNING:main:FORMAT entry PID not defined for 649T. Skipping.
WARNING:main:FORMAT entry PGT not defined for 649T. Skipping.
WARNING:main:FORMAT entry PS not defined for 649T. Skipping.
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 3361, in get_loc
return self._engine.get_loc(casted_key)
File "pandas/_libs/index.pyx", line 76, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 5198, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 5206, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'Coding sequence'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/myhome/.nextflow/assets/nf-core/epitopeprediction/bin/epaa.py", line 1310, in <module>
main()
File "/home/myhome/.nextflow/assets/nf-core/epitopeprediction/bin/epaa.py", line 1146, in main
transcriptProteinTable,
File "/home/myhome/.nextflow/assets/nf-core/epitopeprediction/bin/epaa.py", line 722, in make_predictions_from_variants
generator.generate_transcripts_from_variants(variants_all, martsadapter, ID_SYSTEM_USED)
File "/home/myhome/.nextflow/assets/nf-core/epitopeprediction/bin/epaa.py", line 720, in
p
File "/usr/local/lib/python3.7/site-packages/epytope/Core/Generator.py", line 406, in generate_proteins_from_transcripts
for t in transcripts:
File "/usr/local/lib/python3.7/site-packages/epytope/Core/Generator.py", line 350, in generate_transcripts_from_variants
query = dbadapter.get_transcript_information(tId, type=id_type, _db=db)
File "/usr/local/lib/python3.7/site-packages/epytope/IO/MartsAdapter.py", line 462, in get_transcript_information
if result.empty or 'Sequence unavailable' in result.at[0, attributes["coding"]]:
File "/usr/local/lib/python3.7/site-packages/pandas/core/indexing.py", line 2275, in __getitem__
return super().__getitem__(key)
File "/usr/local/lib/python3.7/site-packages/pandas/core/indexing.py", line 2222, in __getitem__
return self.obj._get_value(*key, takeable=self._takeable)
File "/usr/local/lib/python3.7/site-packages/pandas/core/frame.py", line 3568, in _get_value
series = self._get_item_cache(col)
File "/usr/local/lib/python3.7/site-packages/pandas/core/frame.py", line 3884, in _get_item_cache
loc = self.columns.get_loc(item)
File "/usr/local/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 3363, in get_loc
raise KeyError(key) from err
KeyError: 'Coding sequence'
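The traceback shows the lookup failing on the column 'Coding sequence' in the table that `get_transcript_information` builds from its BioMart query. That means the table was not empty but had no such column. Why the column is missing is not visible in the log; one possibility, stated here as an assumption, is that the server returned something other than the expected tab-separated table. The sketch below is not epytope's code. It uses only the standard library, and the function name, the second column name, and the sample responses are hypothetical; it only illustrates checking the header before reading the value.

```python
import csv
import io


def coding_sequence_or_none(response_text, column="Coding sequence"):
    """Return the first row's coding sequence, or None if it cannot be read."""
    reader = csv.DictReader(io.StringIO(response_text), delimiter="\t")
    if reader.fieldnames is None or column not in reader.fieldnames:
        return None  # header lacks the expected column: the KeyError case above
    first_row = next(reader, None)
    if first_row is None:
        return None  # header present but no data rows
    value = first_row.get(column) or ""
    if not value or "Sequence unavailable" in value:
        return None
    return value


# Hypothetical responses for illustration only.
good = "Coding sequence\tTranscript ID\nATGGCC\tENST00000000001\n"
bad = "unexpected server message\n"

print(coding_sequence_or_none(good))
print(coding_sequence_or_none(bad))
```

With a check like this, an unexpected response would surface as a missing result for that transcript rather than as a KeyError deep inside pandas.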
Command used and terminal output
Relevant files
No response
System information
Nextflow/24.04.4
HPC
slurm
Singularity
Linux EL 8.8
epitopeprediction v2.3.1