WIP to add lammps #2

Open
wants to merge 2 commits into
base: main
vsoch
Member

@vsoch vsoch commented Jun 19, 2023

I'm still figuring out how this works with MPI. When I don't set the job's universe to "parallel", it seems to run (I see a .out and a .err file), but there are errors about a binary that lives in the MPI bin directory (which is on the path). When I do set the universe to parallel, the job seems to hang in the IDLE state, and all I see is the .log file. Still debugging. I am new to HTCondor, so I'm trying to get my feet wet!

What I think (maybe?) is happening is that after I ask for this universe, it tries to fall back to some kind of token auth and then fails. E.g., I see this in one of the logs:

TrustDomain = "htcondor-sample-manager-0-0.htc-service.htcondor-operator.svc.cluster.local"
06/19/23 04:48:31 (pid:48) attempt to connect to <10.244.0.69:9618> failed: timed out after 20 seconds.
06/19/23 04:48:31 (pid:48) ERROR: SECMAN:2003:TCP connection to collector htcondor-sample-manager-0-0.htc-service.htcondor-operator.svc.cluster.local failed.
06/19/23 04:48:31 (pid:48) Failed to start non-blocking update to <10.244.0.69:9618>.
06/19/23 04:48:31 (pid:48) SECMAN: FAILED: Received "DENIED" from server for user condor_pool@ using method IDTOKENS.
06/19/23 04:48:31 (pid:48) Failed to send RESCHEDULE to negotiator htcondor-sample-manager-0-0.htc-service.htcondor-operator.svc.cluster.local: SECMAN:2010:Received "DENIED" from server for user condor_pool@ using method IDTOKENS.
06/19/23 04:49:15 (pid:48) Received a superuser command
06/19/23 04:49:15 (pid:48) Number of Active Workers 0

So I'm wondering if a logical next step is to try to set up the token auth.
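To make it easier to tell connection timeouts apart from auth denials while debugging, here is a rough sketch of a script that summarizes the SECMAN entries in a log like the one above. The line format is assumed from this excerpt only, and the names `summarize_secman`, `LINE_RE`, `SECMAN_CODE_RE`, and `DENIED_RE` are made up for illustration; none of this is an HTCondor API.

```python
import re
from collections import Counter

# Assumed line shape, taken only from the excerpt above:
#   06/19/23 04:48:31 (pid:48) <message>
LINE_RE = re.compile(
    r"^\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} \(pid:\d+\) (?P<message>.*)$"
)
# Numeric SECMAN codes such as "SECMAN:2003:" or "SECMAN:2010:".
SECMAN_CODE_RE = re.compile(r"SECMAN:(\d+):")
# Denials that name a user and an authentication method.
DENIED_RE = re.compile(
    r'Received "DENIED" from server for user (\S+) using method (\w+)'
)


def summarize_secman(lines):
    """Count SECMAN error codes and collect (user, method) pairs that were denied."""
    codes = Counter()
    denied = set()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            # Skip lines without a timestamp prefix (e.g. the TrustDomain line).
            continue
        message = match.group("message")
        codes.update(SECMAN_CODE_RE.findall(message))
        denied.update(DENIED_RE.findall(message))
    return codes, denied


# A few lines copied from the log in this comment.
SAMPLE = [
    '06/19/23 04:48:31 (pid:48) ERROR: SECMAN:2003:TCP connection to collector '
    'htcondor-sample-manager-0-0.htc-service.htcondor-operator.svc.cluster.local failed.',
    '06/19/23 04:48:31 (pid:48) SECMAN: FAILED: Received "DENIED" from server '
    'for user condor_pool@ using method IDTOKENS.',
    '06/19/23 04:48:31 (pid:48) Failed to send RESCHEDULE to negotiator '
    'htcondor-sample-manager-0-0.htc-service.htcondor-operator.svc.cluster.local: '
    'SECMAN:2010:Received "DENIED" from server for user condor_pool@ using method IDTOKENS.',
]

if __name__ == "__main__":
    codes, denied = summarize_secman(SAMPLE)
    for code, count in sorted(codes.items()):
        print(f"SECMAN:{code} seen {count} time(s)")
    for user, method in sorted(denied):
        print(f"DENIED for user {user} using method {method}")
```

If the only denials that show up are for IDTOKENS, that would at least be consistent with the token auth guess, though the 20 second timeout to the collector may be a separate networking problem.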
