pipeline-runner error #4574
I am not sure what |
Hi Hana. Moreover, I tried the original scripts load_data.sh and seqr_loading.py and got the identical error message. Can you help fix this?
Dear Hana,
Since yesterday I have been getting the following error message when I run the pipeline-runner command.
Can you help?
Thanks.
docker-compose exec pipeline-runner load_data2.sh 37 WES mend2022_00075 GRCh37/MEND2022_00075.GATK.final.vep.vcf.gz
/usr/local/lib/python3.7/site-packages/luigi/parameter.py:347: UserWarning: OptionalParameter "source_path" with value "<luigi.parameter.Parameter object at 0x7f9ba2a942e8>" is not of type string or None.
param_name, param_value))
/usr/local/lib/python3.7/site-packages/luigi/parameter.py:283: UserWarning: Parameter "es_password" with value "None" is not of type string.
warnings.warn('Parameter "{}" with value "{}" is not of type string.'.format(param_name, param_value))
/usr/local/lib/python3.7/site-packages/luigi/parameter.py:283: UserWarning: Parameter "hgmd_ht_path" with value "None" is not of type string.
warnings.warn('Parameter "{}" with value "{}" is not of type string.'.format(param_name, param_value))
/usr/local/lib/python3.7/site-packages/elasticsearch/connection/base.py:190: ElasticsearchDeprecationWarning: Elasticsearch built-in security features are not enabled. Without authentication, your cluster could be accessible to anyone. See https://www.elastic.co/guide/en/elasticsearch/reference/7.16/security-minimal-setup.html to enable security.
warnings.warn(message, category=ElasticsearchDeprecationWarning)
DEBUG: Checking if SeqrMTToESTask(source_path=<luigi.parameter.Parameter object at 0x7f9ba2a942e8>, use_temp_loading_nodes=True, es_host=elasticsearch, es_port=9200, es_index=mend2022_00075, es_username=pipeline, es_password=None, es_index_min_num_shards=1, source_paths=/input_vcfs/GRCh37/MEND2022_00075.GATK.final.vep.vcf.gz, dest_path=/input_vcfs/GRCh37/MEND2022_00075.mt, genome_version=37, vep_runner=VEP, reference_ht_path=/seqr-reference-data/GRCh37/combined_reference_data_grch37.ht, clinvar_ht_path=/seqr-reference-data/GRCh37/clinvar.GRCh37.ht, hgmd_ht_path=None, sample_type=WES, dont_validate=False, dataset_type=VARIANTS, remap_path=, subset_path=, vep_config_json_path=/vep_configs/vep-GRCh37-loftee2.json, grch38_to_grch37_ref_chain=gs://hail-common/references/grch38_to_grch37.over.chain.gz) is complete
DEBUG: Checking if SeqrVCFToMTTask(source_paths=/input_vcfs/GRCh37/MEND2022_00075.GATK.final.vep.vcf.gz, dest_path=/input_vcfs/GRCh37/MEND2022_00075.mt, genome_version=37, vep_runner=VEP, reference_ht_path=/seqr-reference-data/GRCh37/combined_reference_data_grch37.ht, clinvar_ht_path=/seqr-reference-data/GRCh37/clinvar.GRCh37.ht, hgmd_ht_path=None, sample_type=WES, dont_validate=False, dataset_type=VARIANTS, remap_path=, subset_path=, vep_config_json_path=/vep_configs/vep-GRCh37-loftee2.json, grch38_to_grch37_ref_chain=gs://hail-common/references/grch38_to_grch37.over.chain.gz) is complete
INFO: Informed scheduler that task SeqrMTToESTask__seqr_reference__VARIANTS__input_vcfs_GRCh_9c8512a26a has status PENDING
DEBUG: Checking if VcfFile(filename=/input_vcfs/GRCh37/MEND2022_00075.GATK.final.vep.vcf.gz) is complete
INFO: Informed scheduler that task SeqrVCFToMTTask__seqr_reference__VARIANTS__input_vcfs_GRCh_a2208fd55d has status PENDING
INFO: Informed scheduler that task VcfFile__input_vcfs_GRCh_a7e1bfa30b has status DONE
INFO: Done scheduling tasks
INFO: Running Worker with 1 processes
DEBUG: Asking scheduler for work...
DEBUG: Pending tasks: 2
INFO: [pid 6307] Worker Worker(salt=578279483, workers=1, host=f6fb3cf91633, username=root, pid=6307) running SeqrVCFToMTTask(source_paths=/input_vcfs/GRCh37/MEND2022_00075.GATK.final.vep.vcf.gz, dest_path=/input_vcfs/GRCh37/MEND2022_00075.mt, genome_version=37, vep_runner=VEP, reference_ht_path=/seqr-reference-data/GRCh37/combined_reference_data_grch37.ht, clinvar_ht_path=/seqr-reference-data/GRCh37/clinvar.GRCh37.ht, hgmd_ht_path=None, sample_type=WES, dont_validate=False, dataset_type=VARIANTS, remap_path=, subset_path=, vep_config_json_path=/vep_configs/vep-GRCh37-loftee2.json, grch38_to_grch37_ref_chain=gs://hail-common/references/grch38_to_grch37.over.chain.gz)
Initializing Hail with default parameters...
2025-01-08 11:22:09 WARN NativeCodeLoader:60 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
2025-01-08 11:22:09 WARN Hail:43 - This Hail JAR was compiled for Spark 3.1.2, running with Spark 3.1.3.
Compatibility is not guaranteed.
Running on Apache Spark version 3.1.3
SparkUI available at http://f6fb3cf91633:4040
Welcome to
     __  __     <>__
    / /_/ /__  __/ /
   / __  / _ `/ / /
  /_/ /_/\_,_/_/_/   version 0.2.95-513139587f57
LOGGING: writing to /hail-20250108-1022-0.2.95-513139587f57.log
ERROR: [pid 6307] Worker Worker(salt=578279483, workers=1, host=f6fb3cf91633, username=root, pid=6307) failed SeqrVCFToMTTask(source_paths=/input_vcfs/GRCh37/MEND2022_00075.GATK.final.vep.vcf.gz, dest_path=/input_vcfs/GRCh37/MEND2022_00075.mt, genome_version=37, vep_runner=VEP, reference_ht_path=/seqr-reference-data/GRCh37/combined_reference_data_grch37.ht, clinvar_ht_path=/seqr-reference-data/GRCh37/clinvar.GRCh37.ht, hgmd_ht_path=None, sample_type=WES, dont_validate=False, dataset_type=VARIANTS, remap_path=, subset_path=, vep_config_json_path=/vep_configs/vep-GRCh37-loftee2.json, grch38_to_grch37_ref_chain=gs://hail-common/references/grch38_to_grch37.over.chain.gz)
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/luigi/worker.py", line 199, in run
new_deps = self._run_get_new_deps()
File "/usr/local/lib/python3.7/site-packages/luigi/worker.py", line 141, in _run_get_new_deps
task_gen = self.task.run()
File "/hail-elasticsearch-pipelines/luigi_pipeline/seqr_loading2.py", line 84, in run
if self.grch38_to_grch37_ref_chain: check_if_path_exists(self.grch38_to_grch37_ref_chain, "grch38_to_grch37_ref_chain")
File "/hail-elasticsearch-pipelines/luigi_pipeline/seqr_loading2.py", line 22, in check_if_path_exists
if (path.startswith("gs://") and not hl.hadoop_exists(path)) or (not path.startswith("gs://") and not os.path.exists(path)):
File "/usr/local/lib/python3.7/site-packages/hail/utils/hadoop_utils.py", line 146, in hadoop_exists
return Env.fs().exists(path)
File "/usr/local/lib/python3.7/site-packages/hail/fs/hadoop_fs.py", line 55, in exists
return self._jfs.exists(path)
File "/usr/local/lib/python3.7/site-packages/py4j/java_gateway.py", line 1305, in __call__
answer, self.gateway_client, self.target_id, self.name)
File "/usr/local/lib/python3.7/site-packages/hail/backend/py4j_backend.py", line 31, in deco
raise fatal_error_from_java_error_triplet(deepest, full, error_id) from None
hail.utils.java.FatalError: TokenResponseException: 400 Bad Request
{
"error" : "invalid_grant",
"error_description" : "Invalid JWT Signature."
}
Java stack trace:
java.io.IOException: Error accessing gs://hail-common/references/grch38_to_grch37.over.chain.gz
at com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl.getObject(GoogleCloudStorageImpl.java:1945)
at com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl.getItemInfo(GoogleCloudStorageImpl.java:1851)
at com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.gcsio.GoogleCloudStorageFileSystem.getFileInfoInternal(GoogleCloudStorageFileSystem.java:1148)
at com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.gcsio.GoogleCloudStorageFileSystem.getFileInfo(GoogleCloudStorageFileSystem.java:1116)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.getFileStatus(GoogleHadoopFileSystemBase.java:1121)
at is.hail.io.fs.HadoopFS.fileStatus(HadoopFS.scala:166)
at is.hail.io.fs.FS.exists(FS.scala:184)
at is.hail.io.fs.FS.exists$(FS.scala:182)
at is.hail.io.fs.HadoopFS.exists(HadoopFS.scala:72)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.lang.Thread.run(Thread.java:750)
com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.auth.oauth2.TokenResponseException: 400 Bad Request
{
"error" : "invalid_grant",
"error_description" : "Invalid JWT Signature."
}
at com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.auth.oauth2.TokenResponseException.from(TokenResponseException.java:105)
at com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.auth.oauth2.TokenRequest.executeUnparsed(TokenRequest.java:326)
at com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.auth.oauth2.TokenRequest.execute(TokenRequest.java:346)
at com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.util.CredentialFactory$GoogleCredentialWithRetry.executeRefreshToken(CredentialFactory.java:162)
at com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.auth.oauth2.Credential.refreshToken(Credential.java:494)
at com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.auth.oauth2.Credential.intercept(Credential.java:217)
at com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.http.HttpRequest.execute(HttpRequest.java:897)
at com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:499)
at com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:432)
at com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:549)
at com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl.getObject(GoogleCloudStorageImpl.java:1939)
at com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl.getItemInfo(GoogleCloudStorageImpl.java:1851)
at com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.gcsio.GoogleCloudStorageFileSystem.getFileInfoInternal(GoogleCloudStorageFileSystem.java:1148)
at com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.gcsio.GoogleCloudStorageFileSystem.getFileInfo(GoogleCloudStorageFileSystem.java:1116)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.getFileStatus(GoogleHadoopFileSystemBase.java:1121)
at is.hail.io.fs.HadoopFS.fileStatus(HadoopFS.scala:166)
at is.hail.io.fs.FS.exists(FS.scala:184)
at is.hail.io.fs.FS.exists$(FS.scala:182)
at is.hail.io.fs.HadoopFS.exists(HadoopFS.scala:72)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.lang.Thread.run(Thread.java:750)
Hail version: 0.2.95-513139587f57
Error summary: TokenResponseException: 400 Bad Request
{
"error" : "invalid_grant",
"error_description" : "Invalid JWT Signature."
}
DEBUG: 1 running tasks, waiting for next task to finish
INFO: Informed scheduler that task SeqrVCFToMTTask__seqr_reference__VARIANTS__input_vcfs_GRCh_a2208fd55d has status FAILED
DEBUG: Asking scheduler for work...
DEBUG: Done
DEBUG: There are no more tasks to run at this time
DEBUG: There are 2 pending tasks possibly being run by other workers
DEBUG: There are 2 pending tasks unique to this worker
DEBUG: There are 2 pending tasks last scheduled by this worker
INFO: Worker Worker(salt=578279483, workers=1, host=f6fb3cf91633, username=root, pid=6307) was stopped. Shutting down Keep-Alive thread
INFO:
===== Luigi Execution Summary =====
Scheduled 3 tasks of which:
This progress looks :( because there were failed tasks
===== Luigi Execution Summary =====
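Reading the traceback, the task seems to fail before any VCF processing starts: check_if_path_exists is called on grch38_to_grch37_ref_chain, and because that value starts with gs:// the check goes through hl.hadoop_exists, which is where the token request is rejected. Below is a minimal sketch of that branching, reconstructed only from the single condition line visible in the traceback. The gs_exists argument, the ValueError, the failing_gcs_lookup stub and the demo helper are hypothetical names added for illustration and are not part of the pipeline. It shows only that a non-gs:// path never reaches the remote lookup; whether pointing the parameter at a local copy of the chain file is acceptable for the real pipeline is an assumption I have not verified.

```python
import os
import tempfile


def check_if_path_exists(path, label="", gs_exists=None):
    """Sketch of the check seen in the traceback (not the pipeline's actual code).

    gs:// paths are handed to a remote lookup; every other path is checked on
    the local filesystem. `gs_exists` is a hypothetical injection point so the
    sketch runs without Hail; by default it falls back to hl.hadoop_exists.
    """
    if path.startswith("gs://"):
        if gs_exists is None:
            import hail as hl  # only imported when a gs:// path is checked
            gs_exists = hl.hadoop_exists
        found = gs_exists(path)  # this is the call that needs valid credentials
    else:
        found = os.path.exists(path)
    if not found:
        # Assumption: the real function's error type is not visible in the log.
        raise ValueError(f"{label} path not found: {path}")
    return path


def failing_gcs_lookup(path):
    """Stand-in for a remote lookup whose token request is rejected."""
    raise RuntimeError("TokenResponseException: 400 Bad Request (invalid_grant)")


def demo():
    # A local file passes the check without touching the remote lookup.
    with tempfile.NamedTemporaryFile(suffix=".over.chain.gz") as local_copy:
        local_result = check_if_path_exists(
            local_copy.name, "grch38_to_grch37_ref_chain", gs_exists=failing_gcs_lookup
        )
    # The gs:// path from the log reaches the lookup and fails there.
    try:
        check_if_path_exists(
            "gs://hail-common/references/grch38_to_grch37.over.chain.gz",
            "grch38_to_grch37_ref_chain",
            gs_exists=failing_gcs_lookup,
        )
        remote_failed = False
    except RuntimeError:
        remote_failed = True
    return local_result, remote_failed


if __name__ == "__main__":
    print(demo())
```

If that reading is right, the "Invalid JWT Signature" response would point at the credentials the container uses for Google Cloud Storage rather than at the VCF or the loading scripts, which would also fit the original load_data.sh and seqr_loading.py failing the same way.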