Slurm - RPM Binary Packages

SchedMD RPM Spec & Distribution Packages

HPC
Published

March 15, 2023

Modified

April 8, 2024

Packages

If RPM packages are the method of choice for deployment 1

  • …multiple sources for packages are available
  • …many clusters use RPM packages for deployment 2
  • …RPMs built from the official RPM Spec files

RPM Spec

Find SchedMD's official slurm.spec 3 on GitHub…

  • …defines a common denominator for all RPM-based environments
  • …allows building Slurm in a required version for any target platform
  • …includes Systemd service unit files and Slurm configuration examples

System integration is distribution specific and not included in the RPM spec. The following configurations are assumed to be done by other means (see the sketch after this list):

  • …add a slurm group and user to the system
  • …create and set permissions for directories: /etc/slurm/, /var/{lib,log,spool}/slurm
  • …create a systemd tmpfiles configuration for /run/slurm
  • …add log-rotation with /etc/logrotate.d/slurm
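
A minimal sketch of these integration steps in shell, covering the list above …ownership, modes, and the logrotate policy are assumptions to adapt to local conventions:

# ...add a slurm group and user to the system
groupadd --system slurm
useradd --system --gid slurm --home-dir /etc/slurm --shell /sbin/nologin slurm

# ...create and set permissions for the directories
mkdir -p /etc/slurm /var/lib/slurm /var/log/slurm /var/spool/slurm
chown slurm:slurm /var/lib/slurm /var/log/slurm /var/spool/slurm

# ...systemd tmpfiles configuration for /run/slurm
echo 'd /run/slurm 0755 slurm slurm -' > /etc/tmpfiles.d/slurm.conf
systemd-tmpfiles --create /etc/tmpfiles.d/slurm.conf

# ...log-rotation for the Slurm daemon logs
cat > /etc/logrotate.d/slurm <<EOF
/var/log/slurm/*.log {
    weekly
    rotate 4
    compress
    missingok
    notifempty
}
EOF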

Distributions

Fedora

SchedMD recommends not to use RPMs from EPEL! 4

Fedora Slurm packages are built from a custom slurm.spec 5 to follow Fedora policies:

  • …builds are limited to Slurm dependency versions supported in Fedora
  • …does not support Slurm dependencies like GPU libraries
  • System integration in the Fedora RPM packages 6
    • tmpfiles.d config to create /run/slurm
    • /var/log/slurm and .../logrotate.d/slurm
    • /var/run/slurm, /var/spool/slurm….PID files…
    • slurm{dbd,ctld,d,restd}.service systemd units

Security patches and major version upgrades:

  • …security patches are supplied to recent Fedora releases
  • …the state of security patches for EPEL packages is unclear
  • …EPEL does not get major version updates after release
  • …new major versions are typically only built for Fedora Rawhide

OpenSUSE

The OpenSUSE Slurm package is built from a custom slurm.spec 7:

  • …OpenSUSE Tumbleweed uses a normal naming schema like slurm-23.02.7
  • …OpenSUSE Leap uses package names like slurm_23_02-23.02.7
    • …codifies version in the name
    • …likely to distinguish multiple versions on a platform

Debian

For reference… Debian/Ubuntu packages (since 23.11) 8

  • …see slurm-23.11/debian …packages will be under a common slurm-smd-* prefix
  • …avoids conflicts with the existing mix of slurm-wlm/slurm-llnl packages
  • …package layout aligned with the RPM layout from slurm.spec …not the existing unofficial Slurm Debian packages

Prerequisites

Install the build environment locally:

# ...install DNF plugins package and the EPEL repository
dnf install -y dnf-plugins-core epel-release @development rpm-build
# ...enable the PowerTools Repository
dnf config-manager --set-enabled powertools

Create an Apptainer container with the build environment…

# Store build artifacts to a temporary directory...
cd $(mktemp -d /var/tmp/$USER-apptainer-XXXXXX)
export APPTAINER_TMPDIR=$PWD && export APPTAINER_CACHEDIR=$PWD

# Create an Apptainer definition file...
cat > apptainer-el8.def <<EOF
Bootstrap: docker
From: quay.io/rockylinux/rockylinux:8

%environment
    export LC_ALL=C

%post
    dnf install -y dnf-plugins-core epel-release @development rpm-build
    dnf config-manager --set-enabled powertools
EOF


# ...and build the base container
apptainer build el8.sif apptainer-el8.def

Dependencies

Read the list of dependencies 9 collected by SchedMD.

List of RPM packages for Slurm dependencies…

  • hwloc-devel …task/cgroup plugin
  • hdf5-devel …HDF5 Job Profiling
  • man2html …HTML Man Pages
  • freeipmi-devel …acct_gather_energy/ipmi accounting plugin
  • rdma-core-devel …acct_gather_interconnect/ofed plugin
  • libjwt-devel …for JWT authentication
  • lua-devel …Lua API support
  • munge-devel …auth/munge plugin
  • mariadb-devel …MySQL support for accounting
  • pam-devel …PAM support
  • numactl-devel …task/affinity plugin
  • readline-devel …Readline support in scontrol and sacctmgr
  • rrdtool-devel …ext_sensors/rrd plugin
  • http-parser-devel & json-c-devel …slurmrestd REST API

# ...install Slurm dependencies
dnf install -y \
      bzip2-devel \
      freeipmi-devel \
      glib2-devel gtk2-devel \
      hdf5-devel http-parser-devel hwloc hwloc-devel json-c-devel \
      libcurl-devel libibmad libibumad libssh2-devel libjwt-devel \
      lua lua-devel lz4-devel \
      ncurses-devel numactl numactl-devel \
      man2html mariadb-server mariadb-devel munge munge-libs munge-devel \
      openmpi openssl openssl-devel \
      pam-devel pmix-devel perl-Switch perl-ExtUtils-MakeMaker python3 \
      readline-devel rdma-core-devel rrdtool-devel \
      ucx ucx-devel ucx-ib \
      zlib-devel

OpenMPI & UCX

If possible use the latest versions of both OpenPMIx and UCX…

  • PMIx compatibility should not be an issue anymore 10
  • OpenMPI (and UCX) is available from Nvidia in the MLNX OFED distribution 11

Install the following packages for support…

  • ucx, ucx-devel and ucx-ib for the UCX communication layer
  • pmix-devel for PMIx support in MPI launches …enable with the RPM build option --with pmix (see the check after this list)
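
To verify PMIx support after installing the built packages, list the MPI plugin types known to the launcher on a node of the cluster …pmix should appear in the output:

# ...list the available MPI plugin types
srun --mpi=list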

Hardware Support

Support for specific hardware features (GPUs, network topology, etc.) and their corresponding interface libraries…

NVIDIA GPUs require the libnvidia-ml development library

# ...CUDA on EL 8
dnf config-manager --add-repo \
    https://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64/cuda-rhel8.repo
dnf -y install cuda-nvml-devel-12-2 cuda-12-2
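
The build only needs the NVML development files, not a GPU. A quick sanity check that the header the configure script looks for is in place …the path below is an assumption for CUDA 12.2:

# ...verify the NVML header is available (path is CUDA version specific)
ls /usr/local/cuda-12.2/targets/x86_64-linux/include/nvml.h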

AMD Instinct GPUs 12 require the ROCm development library…

  • …build configuration option to enable is --with-rsmi
  • …configuration script searches for rocm_smi.h
  • …since Fedora EPEL 9 13 a rocm-smi-devel package is available (see the sketch after this list)
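
A short sketch for EL/Fedora systems, using the package query to locate the header mentioned above:

# ...install the ROCm SMI development package (Fedora EPEL 9)
dnf install -y rocm-smi-devel

# ...locate the rocm_smi.h header searched by the configure script
rpm -ql rocm-smi-devel | grep rocm_smi.h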

Package Build

Mock

Build default RPM packages for a target platform from the SchedMD source code archive 14

cd $(mktemp -d /tmp/slurm-XXXXX)

wget https://download.schedmd.com/slurm/slurm-23.11.1.tar.bz2
tar -xf slurm-23.11.1.tar.bz2

# ...create the SRPM package first
mock --root rocky+epel-8-x86_64 --install perl
mock --root rocky+epel-8-x86_64 --no-clean \
     --sources slurm-23.11.1.tar.bz2 \
     --spec slurm-23.11.1/slurm.spec \
     --buildsrpm --resultdir=$PWD

# ...build the RPM packages from the source package
mock --root rocky+epel-8-x86_64 \
     --rebuild slurm-23.11.1-1.el8.src.rpm \
     --resultdir=$PWD

Configure build options …in case of issues check the build.log

# ...install dependencies
mock --root rocky+epel-8-x86_64 --install \
      bzip2-devel \
      freeipmi-devel \
      glib2-devel gtk2-devel \
      hdf5-devel http-parser-devel hwloc hwloc-devel json-c-devel \
      libcurl-devel libibmad libibumad libssh2-devel libjwt-devel \
      lua lua-devel lz4-devel \
      ncurses-devel numactl numactl-devel \
      man2html mariadb-server mariadb-devel munge munge-libs munge-devel \
      openmpi openssl openssl-devel \
      pam-devel pmix-devel perl-Switch perl-ExtUtils-MakeMaker python3 \
      readline-devel rdma-core-devel rrdtool-devel \
      ucx ucx-devel ucx-ib \
      zlib-devel

# ...build the Slurm packages
mock --root rocky+epel-8-x86_64 --no-clean \
     --with hdf5 --with hwloc --with lua \
     --with mysql --with numa --with pmix \
     --with slurmrestd --with ucx \
     --without debug --without x11 \
     --rebuild slurm-23.11.1-1.el8.src.rpm \
     --resultdir=$PWD

Apptainer

Create an Apptainer container with all Slurm build dependencies using the following definition file apptainer.def:

Bootstrap: localimage
From: el8.sif

%post
   dnf install -y \
      bzip2-devel \
      freeipmi-devel \
      glib2-devel gtk2-devel \
      hdf5-devel http-parser-devel hwloc hwloc-devel json-c-devel \
      libcurl-devel libibmad libibumad libssh2-devel libjwt-devel \
      lua lua-devel lz4-devel \
      ncurses-devel numactl numactl-devel \
      man2html mariadb-server mariadb-devel munge munge-libs munge-devel \
      openmpi openssl openssl-devel \
      pam-devel pmix-devel perl-Switch perl-ExtUtils-MakeMaker python3 \
      readline-devel rdma-core-devel rrdtool-devel \
      ucx ucx-devel ucx-ib \
      zlib-devel

Build the container from the definition file above…

apptainer build build-container.sif apptainer.def

…and build Slurm packages within this container:

# Download the required version of Slurm...
export VERSION=23.11.1
wget https://download.schedmd.com/slurm/slurm-$VERSION.tar.bz2

# ...and build the package in the Apptainer container
apptainer exec build-container.sif \
  rpmbuild -ta slurm-$VERSION.tar.bz2 \
    --with hdf5 \
    --with hwloc \
    --with lua \
    --with mysql \
    --with numa \
    --with pmix \
    --with slurmrestd \
    --with ucx \
    --without debug \
    --without x11 \
  | tee build.log

# ...packages should be available in
ls -1 ~/rpmbuild/{RPMS,SRPMS}/**/slurm*.rpm

Extend the apptainer.def file above to include RPM files from your home-directory ~/rpmbuild/ path, which may store previously built dependency packages:

#...
%files
    ${HOME}/rpmbuild/RPMS/x86_64/* /localrepo/

%post
    #...
    dnf install -y createrepo_c
    cd /localrepo && createrepo_c .
    echo -e "[local-repo]\nname=local-repo\nbaseurl=file:///localrepo\nenabled=1\nmetadata_expire=1d\ngpgcheck=0" > /etc/yum.repos.d/local_repo.repo

Customize

Export environment variables to customize the build process…

  • VERSION …version of Slurm to build
  • DOMAIN …unique identifier for your environment …to distinguish packages from Fedora EPEL 15

export VERSION=23.02.0
# ...derive domain name from the host
export DOMAIN=$(hostname -d | cut -d. -f1)

Modify the RPM slurm.spec configuration in the source archive…

# ...download the source archive from SchedMD
wget https://download.schedmd.com/slurm/slurm-$VERSION.tar.bz2

# ...extract the source archive
tar -xf slurm-$VERSION.tar.bz2
mv slurm-$VERSION $DOMAIN-slurm-$VERSION

# ...modify the Slurm RPM Spec configuration according to your needs
$EDITOR $DOMAIN-slurm-$VERSION/slurm.spec

Prefix the package-name to distinguish it from other Fedora EPEL packages:

Name: %{domain}-slurm
Conflicts: slurm-contribs,slurm-devel,slurm-doc,slurm-gui,slurm-libs,slurm-nss_slurm,slurm-openlava,slurm-pam_slurm,slurm-perlapi,slurm-pmi,slurm-pmi-devel,slurm-rrdtool,slurm-slurmctld,slurm-slurmd,slurm-slurmdbd,slurm-slurmrestd,slurm-torque
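
The rename can be scripted instead of editing the spec by hand …a sketch using sed on the extracted source tree, inserting the %{domain} macro that is defined later on the rpmbuild command line (the Conflicts line above still needs to be added separately):

# ...prefix the package name in the RPM spec
sed -i 's|^Name:.*|Name: %{domain}-slurm|' $DOMAIN-slurm-$VERSION/slurm.spec

# ...verify the change
grep '^Name:' $DOMAIN-slurm-$VERSION/slurm.spec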

Create a new source archive…

tar -cjf $DOMAIN-slurm-$VERSION.tar.bz2 $DOMAIN-slurm-$VERSION
rm -rf $DOMAIN-slurm-$VERSION

Build new Slurm packages …build options described in slurm.spec:

# ...run the RPM build command
rm -rf ~/rpmbuild \
  && rpmbuild -ta $DOMAIN-slurm-$VERSION.tar.bz2 \
    --define "%domain $DOMAIN" \
    --with hdf5 \
    --with hwloc \
    --with lua \
    --with mysql \
    --with numa \
    --with pmix \
    --with slurmrestd \
    --with ucx \
    --without debug \
    --without x11 \
  | tee rpmbuild.log

# ...copy RPM packages from the build directory
cp ~/rpmbuild/{RPMS,SRPMS}/**/*.rpm .

# ...list all files in the packages
rpm -qlp $DOMAIN-slurm*.rpm

Modify the installation prefix with an RPM macro file 16:

# ...unconventional file locations
cat >> ~/.rpmmacros <<EOF
%_prefix /opt/slurm/$VERSION
%_slurm_sysconfdir %{_prefix}/etc/slurm
%_defaultdocdir %{_prefix}/doc
EOF

Test

First tests are usually performed in a virtual environment…

  • …one option is to use Vagrant to set up a test-environment 17
  • …otherwise create an accessible RPM package repository and use it from a dedicated test infrastructure (a sketch follows)
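
A minimal sketch of such a repository, served ad-hoc over HTTP to the test infrastructure …paths and port are placeholders, not suitable for production:

# ...collect the built packages and create the repository metadata
mkdir -p /srv/repo && cp ~/rpmbuild/RPMS/x86_64/slurm*.rpm /srv/repo
createrepo_c /srv/repo

# ...serve the repository over HTTP (testing only)
cd /srv/repo && python3 -m http.server 8080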

Upgrades

At the Slurm User Group (SUG) meetings the following details are presented on the topic of upgrades 18:

  • Upgrade Slurm to take advantage of…
    • …security patches
    • …performance improvements
    • …new features (e.g. support for recent hardware)
  • SchedMD developers provide bug fixes only for the recent releases
  • Support contracts are required to stay on deprecated releases

Release Cycle

Major releases move on a six-month cycle (since 2024/Q2) 19

  • Supports upgrades from the prior three releases (beginning with 24.11)
  • 18-month support cycle by SchedMD 20
  • Slurm release 21 naming convention…
    • …major releases <year>.<month> for example 24.05 (May 2024)
    • …minor releases suffixed by a numeric counter for example 23.11.5
  • Minor maintenance releases with a six-week cadence

Compatibility

Changes to RPCs (remote procedure calls) and state files…

  • …are only made when the major release number changes
  • …may require rebuilding applications…
    • …using Slurm MPI libraries
    • …locally developed Slurm plugins

Slurm daemons will support RPCs and state files…

  • …from the two previous major releases
  • …means that upgrading at least once each year is recommended

Recommendations

SchedMD discourages the use of packages for deployment …it makes upgrades of the software more difficult 22 23:

“RPMs do make this process difficult to do with the system live.” …“While we ship and support the slurm.spec file, we do not actually recommend using RPMs to install Slurm.”

“We suggest structuring installs in version-specific directories, and using sym-links and/or module files to manage versions.” …this makes rolling upgrades much simpler.

Usually a slide is included with the following proposed Slurm deployment structure, hosted on a shared network file-system (typically NFS):

./configure --prefix=/apps/slurm/21.08.0/ --sysconfdir=/apps/slurm/etc/
ln -s /apps/slurm/21.08.0 /apps/slurm/dbd
ln -s /apps/slurm/21.08.0 /apps/slurm/ctld
ln -s /apps/slurm/21.08.0 /apps/slurm/d
ln -s /apps/slurm/21.08.0 /apps/slurm/current

  • Use the appropriate symlink in each service file, and add the /apps/slurm/current symlink into $PATH (through /etc/profile.d/ or a module file) …a sketch follows.
  • This makes a rolling upgrade much simpler: just move the symlink when ready to move that component forward onto the newer release.
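
A sketch of the corresponding /etc/profile.d/ snippet for the structure above:

# ...add the current symlink to the login shell environment
cat > /etc/profile.d/slurm.sh <<'EOF'
export PATH=/apps/slurm/current/bin:$PATH
export MANPATH=/apps/slurm/current/share/man:$MANPATH
EOF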

Live-Upgrades

“Live-Upgrade” in this context means to restart slurmdbd (including database migration) and slurmctld within a small enough time frame to not interrupt service on the cluster. The tolerances for service interrupts are defined by SlurmctldTimeout and SlurmdTimeout. If the Slurm daemons are down for longer than the specified timeout during an upgrade, nodes may be marked DOWN and their jobs killed. To further clarify: if the slurmd daemons are not able to contact slurmctld within the specified tolerance, the running workload of the entire cluster will unavoidably be killed.
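
Check the configured tolerances before scheduling the upgrade window:

# ...show the timeouts relevant for a live upgrade
scontrol show config | grep -E 'SlurmctldTimeout|SlurmdTimeout'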

The first time slurmdbd is started after an upgrade it will take some time to update existing records in the database. If slurmdbd is started with systemd, systemd may consider slurmdbd unresponsive and kill the process when it reaches its timeout value, which causes problems with the upgrade. We recommend starting slurmdbd by calling the command directly rather than using systemd when performing an upgrade.
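
A sketch of running the migration in the foreground …the slurmdbd options -D (do not daemonize) and -v (verbose logging) are standard:

# ...stop the service before the package upgrade, then run the migration interactively
systemctl stop slurmdbd
slurmdbd -D -vvv

# ...once the migration has finished, terminate with Ctrl-C and start the service again
systemctl start slurmdbd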

The run-time of the database migration depends on the size of the accounting database. Therefore it is relevant to have a reasonable estimate of the migration run-time beforehand. Unfortunately it is difficult to find relevant references about this particular issue. Making a dry-run database upgrade, as described for the Niflheim cluster (Denmark), may be interesting to gain experience with the production environment on site.
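
A sketch of such a dry run …the default accounting database name slurm_acct_db and a scratch MariaDB instance on a test host are assumptions:

# ...dump the production accounting database
mysqldump slurm_acct_db > slurm_acct_db.sql

# ...replay the dump into a scratch database on the test host
mysql -e 'CREATE DATABASE slurm_acct_db'
mysql slurm_acct_db < slurm_acct_db.sql

# ...time the migration with the new slurmdbd version in the foreground
time slurmdbd -D -vvv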

A common approach when performing upgrades is to install the new version of Slurm to a unique directory and use a symbolic link to point the directory in your PATH to the version of Slurm you would like to use. This allows you to install the new version before you are in a maintenance period as well as easily switch between versions should you need to roll back for any reason. It also avoids potential problems with library conflicts that might arise from installing different versions to the same directory.

Most sites do the upgrade only after draining the cluster. If you want to perform a live upgrade, please open a support request with SchedMD (note that this is only possible for paying customers).

Typically the upgrades of (MariaDB,) slurmdbd, and slurmctld (as well as the underlying service nodes) are independent operations performed in sequential order. Once these services are back in production, the slurmd services on all compute nodes are upgraded incrementally, rolling over the nodes in groups …a sketch follows.
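
A sketch of one such iteration over a group of nodes …the node names are placeholders, and pdsh is just one of several options for parallel execution:

# ...drain a group of nodes before the upgrade
scontrol update nodename=node[001-010] state=drain reason=slurm-upgrade

# ...wait until running jobs have finished, then upgrade and restart slurmd
pdsh -w node[001-010] 'dnf upgrade -y "slurm*" && systemctl restart slurmd'

# ...return the nodes to production
scontrol update nodename=node[001-010] state=resume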

Footnotes

  1. Slurm Quick Start Administrator Guide, SchedMD
    https://slurm.schedmd.com/quickstart_admin.html↩︎

  2. Slurm Installation and Upgrading, Niflheim Cluster
    https://wiki.fysik.dtu.dk/Niflheim_system/Slurm_installation/#build-slurm-rpms↩︎

  3. Slurm RPM Spec, SchedMD, GitHub
    https://github.com/SchedMD/slurm/blob/master/slurm.spec↩︎

  4. Slurm Community BoF, SchedMD SC23, 2023/11
    https://slurm.schedmd.com/SC23/Slurm-SC23-BOF.pdf↩︎

  5. slurm.spec, Fedora Project
    https://src.fedoraproject.org/rpms/slurm/blob/rawhide/f/slurm.spec↩︎

  6. slurm.spec line 294, Fedora Project
    https://src.fedoraproject.org/rpms/slurm/blob/rawhide/f/slurm.spec#_294↩︎

  7. slurm.spec, OpenSUSE Build Service
    https://build.opensuse.org/package/view_file/network:cluster/slurm/slurm.spec↩︎

  8. Slurm Community BoF, SchedMD SC23, 2023/11
    https://slurm.schedmd.com/SC23/Slurm-SC23-BOF.pdf↩︎

  9. Slurm Dependencies, SchedMD
    https://slurm.schedmd.com/download.html↩︎

  10. PMIx Slurm Compatibility Matrix, OpenPMIx Project
    https://openpmix.github.io/support/how-to/slurm-support.html↩︎

  11. MLNX_OFED Linux drivers, NVIDIA
    https://network.nvidia.com/products/infiniband-drivers/linux/mlnx_ofed/↩︎

  12. AMD Instinct MI Series, AMD
    https://www.amd.com/en/support/server-accelerators/amd-instinct/amd-instinct-mi-series/instinct-mi100↩︎

  13. AMD ROCm Packages, Fedora Project
    https://packages.fedoraproject.org/search?query=rocm↩︎

  14. Slurm Download, SchedMD
    https://www.schedmd.com/downloads.php↩︎

  15. Fedora EPEL Slurm Packages, Fedora Project
    https://src.fedoraproject.org/rpms/slurm↩︎

  16. RPM Macros, Fedora Project
    https://docs.fedoraproject.org/en-US/packaging-guidelines/RPMMacros↩︎

  17. Vagrant Test Environment, GitHub
    https://github.com/vpenso/vagrant-playground/tree/master/slurm/packages↩︎

  18. Field Notes From the Frontlines of Support, SUG 2021
    https://slurm.schedmd.com/SLUG21/Field_Notes_5.pdf
    https://www.youtube.com/watch?v=-YAW-PBvLJc↩︎

  19. Slurm releases move to a six-month cycle, SchedMD Blog
    https://www.schedmd.com/slurm-releases-move-to-a-six-month-cycle/↩︎

  20. Slurm Support, SchedMD
    https://www.schedmd.com/slurm-support/↩︎

  21. Slurm Releases, SchedMD
    https://www.schedmd.com/download-slurm
    https://github.com/SchedMD/slurm/releases↩︎

  22. Field Notes From the Frontlines of Support, SUG 2020
    https://slurm.schedmd.com/SLUG20/Field_Notes.pdf
    https://www.youtube.com/watch?v=F8CZaqOQ4Sk↩︎

  23. Field Notes From the Frontlines of Support, SUG 2021
    https://slurm.schedmd.com/SLUG21/Field_Notes_5.pdf
    https://www.youtube.com/watch?v=-YAW-PBvLJc↩︎