fix: override default PluginDir location for when slurmd is used in configless mode #18

Merged: 4 commits, Jun 11, 2024

NucciTheBoss
Member

Description

This pull request fixes configless mode for slurmd by patching Slurm's src/common/Makefile.am file to set the default PluginDir and sysconfig (slurm.conf) locations to snap-specific locations under $SNAP_COMMON.

Fixes #17

Changes

  • Add patch for src/common/Makefile.am and patch part in snapcraft.yaml.
    • Add --authinfo to configure munge authentication when running in configless mode.
  • Remove -f flag from slurmd command when running in configless mode.
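
As a rough illustration of the last two changes, the sketch below prints the slurmd command line for each mode. It is not the snap's actual launcher: the function name, the controller address, and the socket path are hypothetical, and the `--authinfo` value assumes a munge `socket=` setting.

```shell
#!/bin/sh
# Sketch only: print the slurmd command line for each mode.
# Arguments:
#   $1  controller address for --conf-server (empty means local config)
#   $2  value for --authinfo, used only in configless mode
#   $3  path to a local slurm.conf, used only outside configless mode
slurmd_command() {
    conf_server="$1"
    auth_info="$2"
    conf_file="$3"
    if [ -n "$conf_server" ]; then
        # Configless: slurm.conf is fetched from the controller, so no -f.
        printf 'slurmd --conf-server %s --authinfo %s\n' "$conf_server" "$auth_info"
    else
        # Local config: point slurmd at the on-disk slurm.conf.
        printf 'slurmd -f %s\n' "$conf_file"
    fi
}

# Hypothetical host, port, and socket path, for illustration.
slurmd_command "controller-0:6817" "socket=/tmp/munged.socket.2" ""
slurmd_command "" "" "/etc/slurm/slurm.conf"
```

Keeping the two modes in separate branches makes it harder to pass `-f` by accident when the configuration is meant to come from the controller.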

Misc.

Updates the Slurm version to bugfix version 23.11.7.

Also removed the '-f' flag, which points to the system location of slurm.conf,
when running slurmd in configless mode. Slurm supports "included" configuration,
where configuration is split across multiple files, but it does not support
overloading configuration. This means that if we're running in configless mode
and pulling slurm.conf from the Slurm controller, '-f /path/to/slurm.conf' is ignored.

Signed-off-by: Jason C. Nucciarone <[email protected]>
Switch part plugin from autotools -> nil and embed configure
parameters into the override-build section.

Signed-off-by: Jason C. Nucciarone <[email protected]>
@jedel1043
Contributor

Will test on a live cluster and report the results.

@jedel1043
Contributor

jedel1043 commented Jun 11, 2024

[image attached]

Apparently it works, but still needs the SlurmctldHost and ClusterName options. Is that the correct behaviour?

@NucciTheBoss
Member Author

Per further investigation, the issue is that we set SLURM_CONF to $SNAP_COMMON/etc/slurm/slurm.conf in snapcraft.yaml. This is fine and dandy until we're running Slurm in configless mode, where the config file is actually located under $SNAP_COMMON/var/spool/slurmd/conf-cache/slurm.conf. This breaks all the Slurm CLI commands, as they read the SLURM_CONF environment variable from their execution environment, which still points to $SNAP_COMMON/etc/slurm/slurm.conf.

Need to fix this before this pull request can land 🔧
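
One possible way to handle the mismatch described above, sketched under assumptions: prefer the conf-cache copy when it exists and fall back to the static file otherwise. This is not what the pull request implements. The function name is hypothetical, and the fallback `/var/snap/slurm/common` assumes the snap is named `slurm`.

```shell
#!/bin/sh
# Sketch only: pick the slurm.conf that the Slurm CLI commands should read.
# Prefers the configless cache written by slurmd, else the static file.
locate_slurm_conf() {
    common="$1"   # value of $SNAP_COMMON
    cached="$common/var/spool/slurmd/conf-cache/slurm.conf"
    static="$common/etc/slurm/slurm.conf"
    if [ -f "$cached" ]; then
        printf '%s\n' "$cached"
    else
        printf '%s\n' "$static"
    fi
}

# Usage: resolve and export SLURM_CONF before invoking a CLI command.
# The default prefix is an assumption about the snap name.
SLURM_CONF="$(locate_slurm_conf "${SNAP_COMMON:-/var/snap/slurm/common}")"
export SLURM_CONF
```

A wrapper like this would have to run at command time rather than at build time, since the cache only exists once slurmd has fetched the configuration.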

Contributor

@jedel1043 jedel1043 left a comment

Looks good to me, we can fix the spool problem later.

@NucciTheBoss
Member Author

Yep, it seems like setting runstatedir and localstatedir when compiling Slurm doesn't have any effect, so we'll need to override the PID and spool dir locations directly in the slurm.conf file. Won't be too bad, as we can just do this in the install hook when we fix #19.

Might need some fancy logic for locating the slurm.conf file when in configless mode, but we'll cross that bridge 🌉
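
For illustration, the slurm.conf overrides for the PID and spool dir locations mentioned above might look roughly like what this sketch prints. `SlurmctldPidFile`, `SlurmdPidFile`, and `SlurmdSpoolDir` are standard slurm.conf parameters, but the function name and the `run/` layout under $SNAP_COMMON are assumptions, not something this pull request defines.

```shell
#!/bin/sh
# Sketch only: emit slurm.conf lines that pin the PID files and the
# slurmd spool directory under a snap-writable prefix.
# The run/ directory layout is assumed for illustration.
render_state_overrides() {
    common="$1"   # value of $SNAP_COMMON
    cat <<EOF
SlurmctldPidFile=$common/run/slurmctld.pid
SlurmdPidFile=$common/run/slurmd.pid
SlurmdSpoolDir=$common/var/spool/slurmd
EOF
}

# Usage: print the overrides for a hypothetical prefix.
render_state_overrides "${SNAP_COMMON:-/var/snap/slurm/common}"
```

An install hook could append this output to the generated slurm.conf, which avoids relying on compile-time defaults.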

@NucciTheBoss NucciTheBoss merged commit 2ae816e into main Jun 11, 2024
3 checks passed
@NucciTheBoss NucciTheBoss deleted the fix-configless branch June 25, 2024 17:14
Linked issue: Slurm configless mode is broken because of bad PluginDir