From 0198714110238330fb654ae0cafe291c2715c177 Mon Sep 17 00:00:00 2001 From: <> Date: Tue, 23 Jan 2024 21:47:34 +0000 Subject: [PATCH] Deployed 9b8cab6 with MkDocs version: 1.1.2 --- search/search_index.json | 2 +- sitemap.xml | 72 ++++++++-------- sitemap.xml.gz | Bin 472 -> 471 bytes .../job-router-overview/index.html | 4 +- .../optional-configuration/index.html | 2 +- .../writing-job-routes/index.html | 8 +- v23/installation/central-collector/index.html | 2 +- v23/installation/htcondor-ce/index.html | 2 +- v23/troubleshooting/common-issues/index.html | 6 +- .../debugging-tools/index.html | 79 +++++++++++++----- v23/troubleshooting/logs/index.html | 2 +- 11 files changed, 109 insertions(+), 70 deletions(-) diff --git a/search/search_index.json b/search/search_index.json index ad42d685f..803e4d193 100644 --- a/search/search_index.json +++ b/search/search_index.json @@ -1 +1 @@ -{"config":{"lang":["en"],"min_search_length":3,"prebuild_index":false,"separator":"[\\s\\-]+"},"docs":[{"location":"","text":"HTCondor-CE \u00b6 The HTCondor-CE software is a Compute Entrypoint (CE) based on HTCondor for sites that are part of a larger computing grid (e.g. European Grid Infrastructure , The OSG Consortium ). As such, HTCondor-CE serves as a \"door\" for incoming resource allocation requests (RARs) \u2014 it handles authorization and delegation of these requests to a grid site's local batch system. Supported batch systems include Grid Engine , HTCondor , LSF , PBS Pro / Torque , and Slurm . For an introduction to HTCondor-CE, watch our recorded webinar from the EGI Community Webinar Programme: What is a Compute Entrypoint? \u00b6 A Compute Entrypoint (CE) is the door for remote organizations to submit requests to temporarily allocate local compute resources. These resource allocation requests are submitted as pilot jobs that create an environment for end-user jobs to match and ultimately run within the pilot job. 
CEs are made up of a thin layer of software that you install on a machine that already has the ability to submit and manage jobs in your local batch system. What is HTCondor-CE? \u00b6 HTCondor-CE is a special configuration of the HTCondor software designed as a Compute Entrypoint. It is configured to use the HTCondor Job Router daemon to delegate resource allocation requests by transforming and submitting them to the site\u2019s batch system. Benefits of running the HTCondor-CE: Scalability: HTCondor-CE is capable of supporting ~16k concurrent RARs Debugging tools: HTCondor-CE offers many tools to help troubleshoot issues with RARs Routing as configuration: HTCondor-CE\u2019s mechanism to transform and submit RARs is customized via configuration variables, which means that customizations will persist across upgrades and will not involve modification of software internals to route jobs Getting HTCondor-CE \u00b6 Learn how to get and install HTCondor-CE through our documentation . Contact Us \u00b6 HTCondor-CE is developed and maintained by the Center for High Throughput Computing . If you have questions or issues regarding HTCondor-CE, please see the HTCondor support page for how to contact us.","title":"Overview"},{"location":"#htcondor-ce","text":"The HTCondor-CE software is a Compute Entrypoint (CE) based on HTCondor for sites that are part of a larger computing grid (e.g. European Grid Infrastructure , The OSG Consortium ). As such, HTCondor-CE serves as a \"door\" for incoming resource allocation requests (RARs) \u2014 it handles authorization and delegation of these requests to a grid site's local batch system. Supported batch systems include Grid Engine , HTCondor , LSF , PBS Pro / Torque , and Slurm . 
For an introduction to HTCondor-CE, watch our recorded webinar from the EGI Community Webinar Programme:","title":"HTCondor-CE"},{"location":"#what-is-a-compute-entrypoint","text":"A Compute Entrypoint (CE) is the door for remote organizations to submit requests to temporarily allocate local compute resources. These resource allocation requests are submitted as pilot jobs that create an environment for end-user jobs to match and ultimately run within the pilot job. CEs are made up of a thin layer of software that you install on a machine that already has the ability to submit and manage jobs in your local batch system.","title":"What is a Compute Entrypoint?"},{"location":"#what-is-htcondor-ce","text":"HTCondor-CE is a special configuration of the HTCondor software designed as a Compute Entrypoint. It is configured to use the HTCondor Job Router daemon to delegate resource allocation requests by transforming and submitting them to the site\u2019s batch system. Benefits of running the HTCondor-CE: Scalability: HTCondor-CE is capable of supporting ~16k concurrent RARs Debugging tools: HTCondor-CE offers many tools to help troubleshoot issues with RARs Routing as configuration: HTCondor-CE\u2019s mechanism to transform and submit RARs is customized via configuration variables, which means that customizations will persist across upgrades and will not involve modification of software internals to route jobs","title":"What is HTCondor-CE?"},{"location":"#getting-htcondor-ce","text":"Learn how to get and install HTCondor-CE through our documentation .","title":"Getting HTCondor-CE"},{"location":"#contact-us","text":"HTCondor-CE is developed and maintained by the Center for High Throughput Computing . 
If you have questions or issues regarding HTCondor-CE, please see the HTCondor support page for how to contact us.","title":"Contact Us"},{"location":"architecture/","text":"How Jobs Run \u00b6 Once an incoming pilot job is authorized, it is placed into HTCondor-CE\u2019s scheduler where the Job Router creates a transformed copy (called the routed job ) and submits the copy to the batch system (called the batch system job ). After submission, HTCondor-CE monitors the batch system job and communicates its status to the original pilot job, which in turn notifies the original submitter (e.g., job factory) of any updates. When the batch system job completes, files are transferred along the same chain: from the batch system to the CE, then from the CE to the original submitter. On HTCondor batch systems \u00b6 For a site with an HTCondor batch system , the Job Router uses HTCondor protocols to place a transformed copy of the pilot job directly into the batch system\u2019s scheduler, meaning that the routed job is also the batch system job. Thus, there are three representations of your job, each with its own ID (see diagram below): Submitter: the HTCondor job ID in the original queue HTCondor-CE: the incoming pilot job\u2019s ID HTCondor batch system: the routed job\u2019s ID In an HTCondor-CE/HTCondor setup, file transfer is handled natively between the two sets of daemons by the underlying HTCondor software. If you are running HTCondor as your batch system, you will have two HTCondor configurations side-by-side (one residing in /etc/condor/ and the other in /etc/condor-ce ) and will need to make sure to differentiate the two when modifying any configuration. On other batch systems \u00b6 For non-HTCondor batch systems, the Job Router transforms the pilot job into a routed job on the CE and the routed job submits a job into the batch system via a process called the BLAHP. 
Thus, there are four representations of your job, each with its own ID (see diagram below): Submitter: the HTCondor job ID in the original queue HTCondor-CE: the incoming pilot job\u2019s ID and the routed job\u2019s ID Non-HTCondor batch system: the batch system\u2019s job ID Although the following figure specifies the PBS case, it applies to all non-HTCondor batch systems: With non-HTCondor batch systems, HTCondor-CE cannot use internal HTCondor protocols to transfer files so its \"spool\" directory must be exported to a shared file system that is mounted on the batch system\u2019s worker nodes. Hosted CE over SSH \u00b6 The Hosted CE is designed to be an HTCondor-CE as a Service offered by a central grid operations team. Hosted CEs submit jobs to remote clusters over SSH, providing a simple starting point for opportunistic resource owners that want to start contributing to a computing grid with minimal effort. If your site intends to run over 10,000 concurrent pilot jobs, you will need to host your own HTCondor-CE because the Hosted CE has not yet been optimized for such loads. How the CE is Customized \u00b6 Aside from the [basic configuration] required in the CE installation, there are two main ways to customize your CE (if you decide any customization is required at all): Deciding which Virtual Organizations (VOs) are allowed to run at your site: HTCondor-CE leverages HTCondor's built-in ability to authenticate incoming jobs based on their OAuth token credentials. How to filter and transform the pilot jobs to be run on your batch system: Filtering and transforming pilot jobs (i.e., setting site-specific attributes or resource limits), requires configuration of your site\u2019s job routes. For examples of common job routes, consult the job router configuration pages. How Security Works \u00b6 In the grid, security depends on a PKI infrastructure involving Certificate Authorities (CAs) where CAs sign and issue certificates. 
When these clients and hosts wish to communicate with each other, the identity of each party is confirmed by cross-checking their certificates with the signing CA and establishing trust. In its default configuration, HTCondor-CE supports token-based authentication and authorization to the remote submitter's credentials.","title":"Architecture"},{"location":"architecture/#how-jobs-run","text":"Once an incoming pilot job is authorized, it is placed into HTCondor-CE\u2019s scheduler where the Job Router creates a transformed copy (called the routed job ) and submits the copy to the batch system (called the batch system job ). After submission, HTCondor-CE monitors the batch system job and communicates its status to the original pilot job, which in turn notifies the original submitter (e.g., job factory) of any updates. When the batch system job completes, files are transferred along the same chain: from the batch system to the CE, then from the CE to the original submitter.","title":"How Jobs Run"},{"location":"architecture/#on-htcondor-batch-systems","text":"For a site with an HTCondor batch system , the Job Router uses HTCondor protocols to place a transformed copy of the pilot job directly into the batch system\u2019s scheduler, meaning that the routed job is also the batch system job. Thus, there are three representations of your job, each with its own ID (see diagram below): Submitter: the HTCondor job ID in the original queue HTCondor-CE: the incoming pilot job\u2019s ID HTCondor batch system: the routed job\u2019s ID In an HTCondor-CE/HTCondor setup, file transfer is handled natively between the two sets of daemons by the underlying HTCondor software. 
If you are running HTCondor as your batch system, you will have two HTCondor configurations side-by-side (one residing in /etc/condor/ and the other in /etc/condor-ce ) and will need to make sure to differentiate the two when modifying any configuration.","title":"On HTCondor batch systems"},{"location":"architecture/#on-other-batch-systems","text":"For non-HTCondor batch systems, the Job Router transforms the pilot job into a routed job on the CE and the routed job submits a job into the batch system via a process called the BLAHP. Thus, there are four representations of your job, each with its own ID (see diagram below): Submitter: the HTCondor job ID in the original queue HTCondor-CE: the incoming pilot job\u2019s ID and the routed job\u2019s ID Non-HTCondor batch system: the batch system\u2019s job ID Although the following figure specifies the PBS case, it applies to all non-HTCondor batch systems: With non-HTCondor batch systems, HTCondor-CE cannot use internal HTCondor protocols to transfer files so its \"spool\" directory must be exported to a shared file system that is mounted on the batch system\u2019s worker nodes.","title":"On other batch systems"},{"location":"architecture/#hosted-ce-over-ssh","text":"The Hosted CE is designed to be an HTCondor-CE as a Service offered by a central grid operations team. Hosted CEs submit jobs to remote clusters over SSH, providing a simple starting point for opportunistic resource owners that want to start contributing to a computing grid with minimal effort. 
If your site intends to run over 10,000 concurrent pilot jobs, you will need to host your own HTCondor-CE because the Hosted CE has not yet been optimized for such loads.","title":"Hosted CE over SSH"},{"location":"architecture/#how-the-ce-is-customized","text":"Aside from the [basic configuration] required in the CE installation, there are two main ways to customize your CE (if you decide any customization is required at all): Deciding which Virtual Organizations (VOs) are allowed to run at your site: HTCondor-CE leverages HTCondor's built-in ability to authenticate incoming jobs based on their OAuth token credentials. How to filter and transform the pilot jobs to be run on your batch system: Filtering and transforming pilot jobs (i.e., setting site-specific attributes or resource limits), requires configuration of your site\u2019s job routes. For examples of common job routes, consult the job router configuration pages.","title":"How the CE is Customized"},{"location":"architecture/#how-security-works","text":"In the grid, security depends on a PKI infrastructure involving Certificate Authorities (CAs) where CAs sign and issue certificates. When these clients and hosts wish to communicate with each other, the identity of each party is confirmed by cross-checking their certificates with the signing CA and establishing trust. In its default configuration, HTCondor-CE supports token-based authentication and authorization to the remote submitter's credentials.","title":"How Security Works"},{"location":"v23/operation/","text":"Operating an HTCondor-CE \u00b6 To verify that you have a working installation of HTCondor-CE, ensure that all the relevant services are started and enabled then perform the validation steps below. Managing HTCondor-CE services \u00b6 In addition to the HTCondor-CE job gateway service itself, there are a number of supporting services in your installation. 
The specific services are: Software Service name Your batch system condor or pbs_server or \u2026 HTCondor-CE condor-ce (Optional) APEL uploader condor-ce-apel and condor-ce-apel.timer Start and enable the services in the order listed and stop them in reverse order. As a reminder, here are common service commands (all run as root ): To... On EL7, run the command... Start a service systemctl start Stop a service systemctl stop Enable a service to start on boot systemctl enable Disable a service from starting on boot systemctl disable Validating HTCondor-CE \u00b6 To validate an HTCondor-CE, perform the following steps: Verify that local job submissions complete successfully from the CE host. For example, if you have a Slurm cluster, run sbatch from the CE and verify that it runs and completes with scontrol and sacct . Verify that all the necessary daemons are running with condor_ce_status -any . Verify the CE's network configuration using condor_ce_host_network_check . Verify that jobs can complete successfully using condor_ce_trace . Draining an HTCondor-CE \u00b6 To drain an HTCondor-CE of jobs, perform the following steps: Set CONDORCE_MAX_JOBS = 0 in /etc/condor-ce/config.d Run condor_ce_reconfig to apply the configuration change Use condor_ce_rm as needed to stop and remove any jobs that should stop running Once draining is completed, don't forget to restore the value of CONDORCE_MAX_JOBS to its previous value before trying to operate the HTCondor-CE again. Checking User Authentication \u00b6 The authentication method for submitting jobs to an HTCondor-CE is SciTokens. To see which authentication method and identity were used to submit a particular job (or modify existing jobs), you can look in /var/log/condor-ce/AuditLog . 
If SciTokens authentication was used, you'll see a set of lines like this: 10/15/21 17:54:08 (cid:130) (D_AUDIT) Command=QMGMT_WRITE_CMD, peer=<172.17.0.2:37869> 10/15/21 17:54:08 (cid:130) (D_AUDIT) AuthMethod=SCITOKENS, AuthId=https://demo.scitokens.org,htcondor-ce-dev, CondorId=testuser@users.htcondor.org 10/15/21 17:54:08 (cid:130) (D_AUDIT) Submitting new job 2.0 Lines pertaining to the same client request will have the same cid value. Lines from different client requests may be interleaved. Getting Help \u00b6 If any of the above validation steps fail, consult the troubleshooting guide . If that still doesn't resolve your issue, please contact us for assistance.","title":"Operation"},{"location":"v23/operation/#operating-an-htcondor-ce","text":"To verify that you have a working installation of HTCondor-CE, ensure that all the relevant services are started and enabled then perform the validation steps below.","title":"Operating an HTCondor-CE"},{"location":"v23/operation/#managing-htcondor-ce-services","text":"In addition to the HTCondor-CE job gateway service itself, there are a number of supporting services in your installation. The specific services are: Software Service name Your batch system condor or pbs_server or \u2026 HTCondor-CE condor-ce (Optional) APEL uploader condor-ce-apel and condor-ce-apel.timer Start and enable the services in the order listed and stop them in reverse order. As a reminder, here are common service commands (all run as root ): To... On EL7, run the command... Start a service systemctl start Stop a service systemctl stop Enable a service to start on boot systemctl enable Disable a service from starting on boot systemctl disable ","title":"Managing HTCondor-CE services"},{"location":"v23/operation/#validating-htcondor-ce","text":"To validate an HTCondor-CE, perform the following steps: Verify that local job submissions complete successfully from the CE host. 
For example, if you have a Slurm cluster, run sbatch from the CE and verify that it runs and completes with scontrol and sacct . Verify that all the necessary daemons are running with condor_ce_status -any . Verify the CE's network configuration using condor_ce_host_network_check . Verify that jobs can complete successfully using condor_ce_trace .","title":"Validating HTCondor-CE"},{"location":"v23/operation/#draining-an-htcondor-ce","text":"To drain an HTCondor-CE of jobs, perform the following steps: Set CONDORCE_MAX_JOBS = 0 in /etc/condor-ce/config.d Run condor_ce_reconfig to apply the configuration change Use condor_ce_rm as needed to stop and remove any jobs that should stop running Once draining is completed, don't forget to restore the value of CONDORCE_MAX_JOBS to its previous value before trying to operate the HTCondor-CE again.","title":"Draining an HTCondor-CE"},{"location":"v23/operation/#checking-user-authentication","text":"The authentication method for submitting jobs to an HTCondor-CE is SciTokens. To see which authentication method and identity were used to submit a particular job (or modify existing jobs), you can look in /var/log/condor-ce/AuditLog . If SciTokens authentication was used, you'll see a set of lines like this: 10/15/21 17:54:08 (cid:130) (D_AUDIT) Command=QMGMT_WRITE_CMD, peer=<172.17.0.2:37869> 10/15/21 17:54:08 (cid:130) (D_AUDIT) AuthMethod=SCITOKENS, AuthId=https://demo.scitokens.org,htcondor-ce-dev, CondorId=testuser@users.htcondor.org 10/15/21 17:54:08 (cid:130) (D_AUDIT) Submitting new job 2.0 Lines pertaining to the same client request will have the same cid value. Lines from different client requests may be interleaved.","title":"Checking User Authentication"},{"location":"v23/operation/#getting-help","text":"If any of the above validation steps fail, consult the troubleshooting guide . 
If that still doesn't resolve your issue, please contact us for assistance.","title":"Getting Help"},{"location":"v23/reference/","text":"Reference \u00b6 Configuration \u00b6 The following directories contain the configuration for HTCondor-CE. The directories are parsed in the order presented and thus configuration within the final directory will override configuration specified in the previous directories. Location Comment /usr/share/condor-ce/config.d/ Configuration defaults (overwritten on package updates) /etc/condor-ce/config.d/ Files in this directory are parsed in alphanumeric order (e.g., 99-local.conf will override values in 01-ce-auth.conf ) For a detailed order of the way configuration files are parsed, run the following command: user@host $ condor_ce_config_val -config Users \u00b6 The following users are needed by HTCondor-CE at all sites: User Comment condor The HTCondor-CE runs as root but performs most of its operations as the condor user. Certificates \u00b6 File User that owns certificate Path to certificate Host certificate root /etc/grid-security/hostcert.pem Host key root /etc/grid-security/hostkey.pem Networking \u00b6 Service Name Protocol Port Number Inbound Outbound Comment HTCondor-CE tcp 9619 X HTCondor-CE shared port Allow inbound and outbound network connections to all internal site servers, such as the batch system head node; only ephemeral outgoing ports are necessary.","title":"Reference"},{"location":"v23/reference/#reference","text":"","title":"Reference"},{"location":"v23/reference/#configuration","text":"The following directories contain the configuration for HTCondor-CE. The directories are parsed in the order presented and thus configuration within the final directory will override configuration specified in the previous directories. 
Location Comment /usr/share/condor-ce/config.d/ Configuration defaults (overwritten on package updates) /etc/condor-ce/config.d/ Files in this directory are parsed in alphanumeric order (e.g., 99-local.conf will override values in 01-ce-auth.conf ) For a detailed order of the way configuration files are parsed, run the following command: user@host $ condor_ce_config_val -config","title":"Configuration"},{"location":"v23/reference/#users","text":"The following users are needed by HTCondor-CE at all sites: User Comment condor The HTCondor-CE runs as root but performs most of its operations as the condor user.","title":"Users"},{"location":"v23/reference/#certificates","text":"File User that owns certificate Path to certificate Host certificate root /etc/grid-security/hostcert.pem Host key root /etc/grid-security/hostkey.pem","title":"Certificates"},{"location":"v23/reference/#networking","text":"Service Name Protocol Port Number Inbound Outbound Comment HTCondor-CE tcp 9619 X HTCondor-CE shared port Allow inbound and outbound network connections to all internal site servers, such as the batch system head node; only ephemeral outgoing ports are necessary.","title":"Networking"},{"location":"v23/releases/","text":"Releases \u00b6 HTCondor-CE 23 is distributed via RPM and is available from the following Yum repositories: HTCondor LTS and Feature Releases The OSG Consortium Known Issues \u00b6 Known bugs affecting HTCondor-CEs can be found in Jira Updating to HTCondor-CE 23 \u00b6 Updating from HTCondor-CE < 6 If updating to HTCondor-CE 23 from HTCondor-CE < 6, be sure to also consult the HTCondor-CE 6 upgrade instructions . Finding relevant configuration changes When updating HTCondor-CE RPMs, .rpmnew and .rpmsave files may be created containing new defaults that you should merge or new defaults that have replaced your customizations, respectively. 
To find these files for HTCondor-CE, run the following command: root@host # find /etc/condor-ce/ -name '*.rpmnew' -o -name '*.rpmsave' HTCondor-CE 23 is very close in functionality to HTCondor-CE 6. As such, upgrading should be very easy. HTCondor-CE 23 Version History \u00b6 This section contains release notes for each version of HTCondor-CE 23. Full HTCondor-CE version history can be found on GitHub . 23.0.3 \u00b6 This release includes the following new features: Ensure that jobs requesting GPUs land on HTCondor EPs with GPUs 23.0.1 \u00b6 This release includes the following new features: Add condor_ce_test_token command 23.0.0 \u00b6 This release includes the following new features: Add grid CA and host certificate/key locations to default SSL search paths Verifies that HTCondor-CE can access the local HTCondor's SPOOL directory Can use condor_ce_trace without SciToken to test batch system integration condor_ce_upgrade_check checks compatibility with HTCondor 23.0 Adds deprecation warnings for old job router configuration syntax Getting Help \u00b6 If you have any questions about the release process or run into issues with an upgrade, please contact us for assistance.","title":"Releases"},{"location":"v23/releases/#releases","text":"HTCondor-CE 23 is distributed via RPM and is available from the following Yum repositories: HTCondor LTS and Feature Releases The OSG Consortium","title":"Releases"},{"location":"v23/releases/#known-issues","text":"Known bugs affecting HTCondor-CEs can be found in Jira","title":"Known Issues"},{"location":"v23/releases/#updating-to-htcondor-ce-23","text":"Updating from HTCondor-CE < 6 If updating to HTCondor-CE 23 from HTCondor-CE < 6, be sure to also consult the HTCondor-CE 6 upgrade instructions . Finding relevant configuration changes When updating HTCondor-CE RPMs, .rpmnew and .rpmsave files may be created containing new defaults that you should merge or new defaults that have replaced your customizations, respectively. 
To find these files for HTCondor-CE, run the following command: root@host # find /etc/condor-ce/ -name '*.rpmnew' -o -name '*.rpmsave' HTCondor-CE 23 is very close in functionality to HTCondor-CE 6. As such, upgrading should be very easy.","title":"Updating to HTCondor-CE 23"},{"location":"v23/releases/#htcondor-ce-23-version-history","text":"This section contains release notes for each version of HTCondor-CE 23. Full HTCondor-CE version history can be found on GitHub .","title":"HTCondor-CE 23 Version History"},{"location":"v23/releases/#2303","text":"This release includes the following new features: Ensure that jobs requesting GPUs land on HTCondor EPs with GPUs","title":"23.0.3"},{"location":"v23/releases/#2301","text":"This release includes the following new features: Add condor_ce_test_token command","title":"23.0.1"},{"location":"v23/releases/#2300","text":"This release includes the following new features: Add grid CA and host certificate/key locations to default SSL search paths Verifies that HTCondor-CE can access the local HTCondor's SPOOL directory Can use condor_ce_trace without SciToken to test batch system integration condor_ce_upgrade_check checks compatibility with HTCondor 23.0 Adds deprecation warnings for old job router configuration syntax","title":"23.0.0"},{"location":"v23/releases/#getting-help","text":"If you have any questions about the release process or run into issues with an upgrade, please contact us for assistance.","title":"Getting Help"},{"location":"v23/remote-job-submission/","text":"Submitting Jobs Remotely to an HTCondor-CE \u00b6 This document outlines how to submit jobs to an HTCondor-CE from a remote client using two different methods: With dedicated tools for quickly verifying end-to-end job submission, and From an existing HTCondor submit host, useful for developing pilot submission infrastructure If you are the administrator of an HTCondor-CE, consider verifying your HTCondor-CE using the administrator-focused documentation . 
Before Starting \u00b6 Before attempting to submit jobs to an HTCondor-CE as documented below, ensure the following: The HTCondor-CE administrator has independently verified their HTCondor-CE The HTCondor-CE administrator has added your credential information (e.g. SciToken or grid proxy) to the HTCondor-CE authentication configuration Your credentials are valid and unexpired Submission with Debugging Tools \u00b6 The HTCondor-CE client contains debugging tools designed to quickly test an HTCondor-CE. To use these tools, install the RPM package from the relevant Yum repository : root@host # yum install htcondor-ce-client Verify end-to-end submission \u00b6 The HTCondor-CE client package includes a debugging tool called condor_ce_trace that tests end-to-end job submission. To submit a diagnostic job with condor_ce_trace , run the following command: user@host $ condor_ce_trace --debug Replacing with the hostname of the CE you wish to test. On success, you will see Job status: Completed and the job's environment on the worker node where it ran. If you do not see the expected output, refer to the troubleshooting guide . CONDOR_CE_TRACE_ATTEMPTS For a busy site cluster, it may take longer than the default 5 minutes to test end-to-end submission. To extend the length of time that condor_ce_trace waits for the job to complete, prepend the command with _condor_CONDOR_CE_TRACE_ATTEMPTS=