Slurm: RPM Binary Packages

SchedMD RPM Spec & Distribution Packages


March 15, 2023


April 8, 2024


If RPM packages are the method of choice for deployment 1

  • …multiple sources for packages are available
  • …many clusters use RPM packages for deployment 2
  • …RPMs are built from the official RPM Spec files

RPM Spec

Find the official slurm.spec 3 of SchedMD on GitHub.…

  • …defines a common denominator for all RPM based environments
  • …allows building Slurm in the required version for any target platform
  • …includes Systemd service unit files and Slurm configuration examples

System integration is distribution specific and not included in the RPM spec. The following configuration is assumed to be done by other means:

  • …add a slurm group and user to the system
  • …create and set permissions for the directories /etc/slurm/ and /var/{lib,log,spool}/slurm
  • …create a systemd tmpfiles.d configuration for /run/slurm
  • …add a log-rotation configuration in /etc/logrotate.d/slurm
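The integration steps listed above could be sketched roughly as follows. This is a minimal illustration, not a distribution-approved recipe: `DESTDIR` is a hypothetical staging root so that nothing on the live system is touched, and the directory mode, owner, and logrotate policy are assumptions to adapt locally.

```shell
# Sketch only: stage the integration files under a scratch root.
# DESTDIR is a hypothetical variable; modes, owners and rotation policy are assumptions.
DESTDIR="${DESTDIR:-$(mktemp -d)}"

# ...configuration, state, log and spool directories
mkdir -p "$DESTDIR/etc/slurm" \
         "$DESTDIR/etc/tmpfiles.d" \
         "$DESTDIR/etc/logrotate.d" \
         "$DESTDIR/var/lib/slurm" \
         "$DESTDIR/var/log/slurm" \
         "$DESTDIR/var/spool/slurm"

# ...systemd-tmpfiles entry which recreates /run/slurm at boot
printf 'd /run/slurm 0755 slurm slurm -\n' > "$DESTDIR/etc/tmpfiles.d/slurm.conf"

# ...minimal log-rotation policy
cat > "$DESTDIR/etc/logrotate.d/slurm" <<'EOF'
/var/log/slurm/*.log {
    weekly
    rotate 4
    compress
    missingok
    notifempty
}
EOF

# On a real host the account would be created first, for example:
#   groupadd --system slurm
#   useradd --system --gid slurm --home-dir /var/lib/slurm --shell /sbin/nologin slurm
echo "staged integration files in $DESTDIR"
```

Staging into a scratch directory makes it easy to review the files before copying them into a configuration management system.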



SchedMD recommends not using RPMs from EPEL! 4

Fedora Slurm packages are built from a custom slurm.spec 5 to follow Fedora policies:

  • …build limited to Slurm dependencies in versions supported in Fedora
  • …does not support Slurm dependencies like GPU libraries
  • System integration in the Fedora RPM packages 6
    • tmpfiles.d config to create /run/slurm
    • /var/log/slurm and .../logrotate.d/slurm
    • /var/run/slurm, /var/spool/slurm….PID files…
    • slurm{dbd,ctld,d,restd}.service systemd units

Security patches and major version upgrades:

  • …security patches are supplied to recent Fedora releases
  • …the state of security patches for EPEL packages is unclear
  • …EPEL does not get major version updates after release
  • …new major versions are typically only built for Fedora Rawhide


The openSUSE Slurm package is built from a custom slurm.spec 7:

  • …openSUSE Tumbleweed uses a normal naming scheme like slurm-23.02.7
  • …openSUSE Leap uses package names like slurm_23_02-23.02.7
    • …encodes the version in the name
    • …likely to distinguish multiple versions on a platform
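To illustrate the Leap naming scheme described above, the following sketch derives such a package name from a Slurm version string. The function name `leap_package_name` is hypothetical and the logic is inferred only from the single example shown here, not from the openSUSE spec file.

```shell
# Hypothetical helper: derive a Leap-style package name from a Slurm version.
leap_package_name() {
    version="$1"                 # e.g. 23.02.7
    major="${version%.*}"        # drop the minor counter -> 23.02
    # replace the dot in the major release with an underscore -> 23_02
    printf 'slurm_%s-%s\n' "$(printf '%s' "$major" | tr . _)" "$version"
}

leap_package_name 23.02.7    # prints slurm_23_02-23.02.7
```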


For reference… Debian/Ubuntu packages (since 23.11) 8

  • …see slurm-23.11/debian …packages will be under a common slurm-smd-* prefix
  • …avoids conflicts with the existing mix of slurm-wlm/slurm-llnl packages
  • …package layout aligned with the RPM layout from slurm.spec …not the existing unofficial Slurm Debian packages


Install the build environment locally:

# ...install DNF plugins package and the EPEL repository
dnf install -y dnf-plugins-core epel-release @development rpm-build
# ...enable the PowerTools Repository
dnf config-manager --set-enabled powertools

Create an Apptainer container with the build environment…

# Store build artifacts to a temporary directory...
cd $(mktemp -d /var/tmp/$USER-apptainer-XXXXXX)

# Create an Apptainer definition file...
cat > apptainer-el8.def <<EOF
Bootstrap: docker

%post
    export LC_ALL=C

    dnf install -y dnf-plugins-core epel-release @development rpm-build
    dnf config-manager --set-enabled powertools
EOF

# ...and build the base container
apptainer build el8.sif apptainer-el8.def


Read the list of dependencies 9 collected by SchedMD.

List of RPM packages for Slurm dependencies…

  • hwloc-devel …task/cgroup plugin
  • hdf5-devel …HDF5 Job Profiling
  • man2html …HTML Man Pages
  • freeipmi-devel …acct_gather_energy/ipmi accounting plugin
  • rdma-core-devel …acct_gather_interconnect/ofed
  • libjwt-devel …for JWT authentication
  • lua-devel …Lua API support
  • munge-devel …auth/munge plugin
  • mariadb-devel …MySQL support for accounting
  • pam-devel …PAM support
  • numactl-devel …task/affinity plugin
  • readline-devel …Readline support in scontrol and sacctmgr
  • rrdtool-devel …ext_sensors/rrd plugin
  • http-parser-devel & json-c-devel …slurmrestd REST API

# ...install Slurm dependencies
dnf install -y \
      bzip2-devel \
      freeipmi-devel \
      glib2-devel gtk2-devel \
      hdf5-devel http-parser-devel hwloc hwloc-devel json-c-devel \
      libcurl-devel libibmad libibumad libssh2-devel libjwt-devel \
      lua lua-devel lz4-devel \
      ncurses-devel numactl numactl-devel \
      man2html mariadb-server mariadb-devel munge munge-libs munge-devel \
      openmpi openssl openssl-devel \
      pam-devel pmix-devel perl-Switch perl-ExtUtils-MakeMaker python3 \
      readline-devel rdma-core-devel rrdtool-devel \
      ucx ucx-devel ucx-ib
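As a rough cross-reference between the dependency list above and the `--with` build options of slurm.spec, the following sketch maps an option name to the development packages it needs. The function `deps_for_feature` is hypothetical, and the pairing of option names with packages (for example `numa` with numactl-devel) is an assumption based on this page rather than on the spec file itself.

```shell
# Hypothetical helper: map a slurm.spec build option to the devel packages
# named in the dependency list (the pairing is an assumption).
deps_for_feature() {
    case "$1" in
        hdf5)       echo "hdf5-devel" ;;
        hwloc)      echo "hwloc-devel" ;;
        lua)        echo "lua-devel" ;;
        mysql)      echo "mariadb-devel" ;;
        numa)       echo "numactl-devel" ;;
        pmix)       echo "pmix-devel" ;;
        slurmrestd) echo "http-parser-devel json-c-devel" ;;
        ucx)        echo "ucx-devel" ;;
        *)          echo "unknown feature: $1" >&2; return 1 ;;
    esac
}

# ...list the packages needed for a selection of build options
for feature in hwloc lua slurmrestd; do
    deps_for_feature "$feature"
done
```

A mapping like this helps to install only what a given build actually enables, instead of the full list.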


If possible use the latest versions of both OpenPMIx and UCX…

  • PMIx compatibility should not be an issue anymore 10
  • Open MPI (and UCX) is available from NVIDIA in the MLNX_OFED distribution 11

Install the following packages for support…

  • ucx, ucx-devel and ucx-ib for the UCX communication layer
  • pmix-devel for PMIx support in MPI launches …enable with build configuration option --with-pmix

Hardware Support

Support for specific hardware features (GPUs, network topology, etc.) and their corresponding interface libraries…

NVIDIA GPUs require the libnvidia-ml development library

# ...CUDA on EL 8
dnf config-manager --add-repo \
dnf -y install cuda-nvml-devel-12-2 cuda-12-2

AMD Instinct GPUs 12 require the ROCm development library…

  • …build configuration option to enable is --with-rsmi
  • …configuration script searches for rocm_smi.h
  • …since Fedora EPEL 9 13 a rocm-smi-devel package is available

Package Build


Build default RPM packages for a target platform from the SchedMD source code archive 14

cd $(mktemp -d /tmp/slurm-XXXXX)

tar -xf slurm-23.11.1.tar.bz2

# ...create the SRPM package first
mock --root rocky+epel-8-x86_64 --install perl
mock --root rocky+epel-8-x86_64 --no-clean \
     --sources slurm-23.11.1.tar.bz2 \
     --spec slurm-23.11.1/slurm.spec \
     --buildsrpm --resultdir=$PWD

# ...build the RPM packages from the source package
mock --root rocky+epel-8-x86_64 \
     --rebuild slurm-23.11.1-1.el8.src.rpm

Configure build options …in case of issues check the build.log

# ...install dependencies
mock --root rocky+epel-8-x86_64 --install \
      bzip2-devel \
      freeipmi-devel \
      glib2-devel gtk2-devel \
      hdf5-devel http-parser-devel hwloc hwloc-devel json-c-devel \
      libcurl-devel libibmad libibumad libssh2-devel libjwt-devel \
      lua lua-devel lz4-devel \
      ncurses-devel numactl numactl-devel \
      man2html mariadb-server mariadb-devel munge munge-libs munge-devel \
      openmpi openssl openssl-devel \
      pam-devel pmix-devel perl-Switch perl-ExtUtils-MakeMaker python3 \
      readline-devel rdma-core-devel rrdtool-devel \
      ucx ucx-devel ucx-ib

# ...build the Slurm packages
mock --root rocky+epel-8-x86_64 --no-clean \
     --with hdf5 --with hwloc --with lua \
     --with mysql --with numa --with pmix \
     --with slurmrestd --with ucx \
     --without debug --without x11 \
     --rebuild slurm-23.11.1-1.el8.src.rpm
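The option list above tends to be repeated in several places. One possible way to keep it in a single spot is a small helper that expands two feature lists into `--with` and `--without` options. The function name `build_options` is hypothetical, and the command is only printed here (dry run) rather than executed.

```shell
# Hypothetical helper: turn feature lists into --with/--without options.
build_options() {
    enabled="$1"     # space-separated features to enable
    disabled="$2"    # space-separated features to disable
    opts=""
    for f in $enabled;  do opts="$opts --with $f"; done
    for f in $disabled; do opts="$opts --without $f"; done
    echo "${opts# }" # trim the leading space
}

OPTS=$(build_options "hdf5 hwloc lua mysql numa pmix slurmrestd ucx" "debug x11")

# Dry run: print the mock command instead of executing it
# ($OPTS is intentionally unquoted so it splits into separate arguments)
echo mock --root rocky+epel-8-x86_64 --no-clean $OPTS \
     --rebuild slurm-23.11.1-1.el8.src.rpm
```

Removing the `echo` would run the build with the same options as the explicit command above.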


Create an Apptainer container with all Slurm build dependencies using the following definition file apptainer.def:

Bootstrap: localimage
From: el8.sif

%post
   dnf install -y \
      bzip2-devel \
      freeipmi-devel \
      glib2-devel gtk2-devel \
      hdf5-devel http-parser-devel hwloc hwloc-devel json-c-devel \
      libcurl-devel libibmad libibumad libssh2-devel libjwt-devel \
      lua lua-devel lz4-devel \
      ncurses-devel numactl numactl-devel \
      man2html mariadb-server mariadb-devel munge munge-libs munge-devel \
      openmpi openssl openssl-devel \
      pam-devel pmix-devel perl-Switch perl-ExtUtils-MakeMaker python3 \
      readline-devel rdma-core-devel rrdtool-devel \
      ucx ucx-devel ucx-ib

Build the container from the definition file above…

apptainer build build-container.sif apptainer.def

…and build Slurm packages within this container:

# Download the required version of Slurm...
export VERSION=23.11.1

# ...and build the package in the Apptainer container
apptainer exec build-container.sif \
  rpmbuild -ta slurm-$VERSION.tar.bz2 \
    --with hdf5 \
    --with hwloc \
    --with lua \
    --with mysql \
    --with numa \
    --with pmix \
    --with slurmrestd \
    --with ucx \
    --without debug \
    --without x11 \
  | tee build.log

# ...packages should be available in
ls -1 ~/rpmbuild/{RPMS,SRPMS}/**/slurm*.rpm

Extend the apptainer.def file above to include RPM files from the ~/rpmbuild/ path in your home directory, which may store previously built dependency packages:

%files
    ${HOME}/rpmbuild/RPMS/x86_64/* /localrepo/

%post
    dnf install -y createrepo_c
    cd /localrepo && createrepo .
    echo -e "[local-repo]\nname=local-repo\nbaseurl=/localrepo\nenabled=1\nmetadata_expire=1d\ngpgcheck=0" > /etc/yum.repos.d/local_repo.repo


Export environment variables to customize the build process…

  • VERSION …version of Slurm to build
  • DOMAIN …unique identifier for your environment …to distinguish packages from Fedora EPEL 15

export VERSION=23.02.0
# ...derive domain name from the host
export DOMAIN=$(hostname -d | cut -d. -f1)
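On hosts without a configured DNS domain, `hostname -d` prints nothing and `DOMAIN` ends up empty. The following sketch mirrors the pipeline above for a given fully qualified name and adds a fallback. The function name `domain_label` and the fallback value `local` are assumptions for illustration.

```shell
# Hypothetical helper mirroring `hostname -d | cut -d. -f1` for a given FQDN.
domain_label() {
    fqdn="$1"
    case "$fqdn" in
        *.*) domain="${fqdn#*.}" ;;   # strip the host part
        *)   domain="" ;;             # no DNS domain configured
    esac
    label="${domain%%.*}"             # first label of the domain
    echo "${label:-local}"            # "local" is an assumed fallback
}

domain_label node1.cluster.example.org   # prints cluster
domain_label node1                       # prints local
```

On a real host this could be used as `export DOMAIN=$(domain_label "$(hostname -f)")`.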

Modify the RPM slurm.spec configuration in the source archive…

# ...download the source archive from SchedMD

# ...extract the source archive
tar -xf slurm-$VERSION.tar.bz2
mv slurm-$VERSION $DOMAIN-slurm-$VERSION

# ...modify the Slurm RPM Spec configuration according to your needs
$EDITOR $DOMAIN-slurm-$VERSION/slurm.spec

Prefix the package-name to distinguish it from other Fedora EPEL packages:

Name: $DOMAIN-slurm
Conflicts: slurm-contribs,slurm-devel,slurm-doc,slurm-gui,slurm-libs,slurm-nss_slurm,slurm-openlava,slurm-pam_slurm,slurm-perlapi,slurm-pmi,slurm-pmi-devel,slurm-rrdtool,slurm-slurmctld,slurm-slurmd,slurm-slurmdbd,slurm-slurmrestd,slurm-torque
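As a non-interactive alternative to editing the file by hand, the two fields above could be rewritten with awk. This is only a sketch: it operates on a two-line stand-in for slurm.spec in a scratch directory, `DOMAIN=example` is a placeholder, and the Conflicts list is shortened for readability.

```shell
DOMAIN=example                      # placeholder value for illustration
workdir=$(mktemp -d)

# Stand-in for the real slurm.spec (only the lines relevant here)
cat > "$workdir/slurm.spec" <<'EOF'
Name:    slurm
Version: 23.11.1
EOF

# Replace the Name field and add a Conflicts field right after it
awk -v name="$DOMAIN-slurm" -v conflicts="slurm-slurmctld,slurm-slurmd" '
    /^Name:/ { print "Name: " name; print "Conflicts: " conflicts; next }
    { print }
' "$workdir/slurm.spec" > "$workdir/slurm.spec.new"
mv "$workdir/slurm.spec.new" "$workdir/slurm.spec"

cat "$workdir/slurm.spec"
```

Writing to a new file and renaming it avoids depending on the non-portable `sed -i`.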

Create a new source archive…

tar -cjf $DOMAIN-slurm-$VERSION.tar.bz2 $DOMAIN-slurm-$VERSION
rm -rf $DOMAIN-slurm-$VERSION

Build new Slurm packages …build options described in slurm.spec:

# the RPM build command
rm -rf ~/rpmbuild \
  && rpmbuild -ta $DOMAIN-slurm-$VERSION.tar.bz2 \
    --define "%domain $DOMAIN" \
    --with hdf5 \
    --with hwloc \
    --with lua \
    --with mysql \
    --with numa \
    --with pmix \
    --with slurmrestd \
    --with ucx \
    --without debug \
    --without x11 \
  | tee rpmbuild.log

# ...copy RPM packages from the build directory
cp ~/rpmbuild/{RPMS,SRPMS}/**/*.rpm .

# ...list all files in the packages
rpm -qlp $DOMAIN-slurm*.rpm

Modify the installation prefix with an RPM macro file 16:

# ...unconventional file locations
cat >> ~/.rpmmacros <<EOF
%_prefix /opt/slurm/$VERSION
%_slurm_sysconfdir %{_prefix}/etc/slurm
%_defaultdocdir %{_prefix}/doc
EOF


First tests are usually performed in a virtual environment…

  • …one option is to use Vagrant to set up a test environment 17
  • …otherwise create an accessible RPM package repository and use it from a dedicated test infrastructure


At the Slurm User Group (SUG) meetings the following details were presented on the topic of upgrades 18:

  • Upgrade Slurm to take advantage of…
    • …security patches
    • …performance improvements
    • …new features (e.g. support for recent hardware)
  • SchedMD developers provide bug fixes only for the recent releases
  • Support contracts are required to stay on deprecated releases

Release Cycle

Major releases follow a six-month cycle (since 2024/Q2) 19

  • Supports upgrades from the prior three releases (beginning with 24.11)
  • 18-month support cycle by SchedMD 20
  • Slurm release 21 naming convention…
    • …major releases <year>.<month>, for example 24.05 (May 2024)
    • …minor releases suffixed by a numeric counter, for example 23.11.5
  • Minor maintenance releases with a six-week cadence
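The naming convention above can be decoded mechanically. The following sketch splits a version string into its major release, release date, and minor counter. The function name `describe_release` is hypothetical, and it assumes two-digit years in the 2000s.

```shell
# Hypothetical helper: decode a Slurm version following <year>.<month>[.<minor>]
describe_release() {
    version="$1"
    year="${version%%.*}"            # 23
    rest="${version#*.}"             # 11.5 or 11
    month="${rest%%.*}"              # 11
    case "$rest" in
        *.*) minor="${rest#*.}" ;;   # 5
        *)   minor="" ;;             # no minor counter given
    esac
    echo "major=$year.$month released=20$year-$month minor=${minor:-none}"
}

describe_release 23.11.5   # prints major=23.11 released=2023-11 minor=5
describe_release 24.05     # prints major=24.05 released=2024-05 minor=none
```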


Changes in the RPCs (remote procedure calls) and state files…

  • …are only made if the major release number changes
  • …may require rebuilding applications…
    • …using Slurm MPI libraries
    • …locally developed Slurm plugins

Slurm daemons will support RPCs and state files…

  • …from the two previous major releases
  • …means that upgrading at least once each year is recommended
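Whether a direct upgrade stays inside the supported window can be checked with a small lookup. The sketch below is an illustration under stated assumptions: the `RELEASES` list is maintained by hand and covers only some known major releases, the function names are hypothetical, and the window is two previous major releases, or three when the target is 24.11 or newer, as described on this page.

```shell
# Hand-maintained list of major releases in order (assumption: extend as needed)
RELEASES="20.02 20.11 21.08 22.05 23.02 23.11 24.05 24.11"

# Print the position of a major release in the list, fail if unknown
release_index() {
    i=0
    for r in $RELEASES; do
        i=$((i + 1))
        if [ "$r" = "$1" ]; then echo "$i"; return 0; fi
    done
    return 1
}

# Succeed if upgrading directly from $1 to $2 is within the supported window
can_upgrade() {
    from=$(release_index "$1") || return 2
    to=$(release_index "$2") || return 2
    window=2
    # three prior releases are supported beginning with 24.11
    if [ "$to" -ge "$(release_index 24.11)" ]; then window=3; fi
    distance=$((to - from))
    [ "$distance" -ge 0 ] && [ "$distance" -le "$window" ]
}

if can_upgrade 22.05 23.11; then echo "22.05 -> 23.11: direct upgrade"; fi
if ! can_upgrade 21.08 23.11; then echo "21.08 -> 23.11: intermediate upgrade needed"; fi
```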


SchedMD discourages the use of packages for deployment …this makes upgrades of the software more difficult 22 23:

“RPMs do make this process difficult to do with the system live.” …While we ship and support the slurm.spec file, we do not actually recommend using RPMs to install Slurm.

We suggest structuring installs in version-specific directories, and using sym-links and/or module files to manage versions …this makes rolling upgrades much simpler.

Usually a slide is included with the following proposed Slurm deployment structure, hosted on a shared network file-system (typically NFS):

./configure --prefix=/apps/slurm/21.08.0/ --sysconfdir=/apps/slurm/etc/
ln -s /apps/slurm/21.08.0 /apps/slurm/dbd
ln -s /apps/slurm/21.08.0 /apps/slurm/ctld
ln -s /apps/slurm/21.08.0 /apps/slurm/d
ln -s /apps/slurm/21.08.0 /apps/slurm/current

  • Use the appropriate symlink in each service file, and add the /apps/slurm/current symlink to $PATH (through /etc/profile.d/ or a module file).
  • This makes a rolling upgrade much simpler: just move the symlink when ready to move that component forward onto the newer release.
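Moving a symlink forward can be tried safely in a scratch directory. The sketch below recreates the layout above under a temporary root and switches only the `dbd` link to a newer release. The second version number 22.05.0 is just an example, and `ln -sfn` is assumed to be available (GNU and BSD `ln` both provide `-n`).

```shell
root=$(mktemp -d)            # scratch stand-in for /apps/slurm
mkdir -p "$root/21.08.0" "$root/22.05.0"

# ...initial state: every component points at the old release
for link in dbd ctld d current; do
    ln -s "$root/21.08.0" "$root/$link"
done

# ...move one component forward; -n replaces the link itself instead of
# creating a new link inside the directory it currently points to
ln -sfn "$root/22.05.0" "$root/dbd"

readlink "$root/dbd"         # now ends in /22.05.0
readlink "$root/ctld"        # still ends in /21.08.0
```

Rolling back is the same command with the old version directory as the target.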


“Live-Upgrade” in this context means to restart slurmdbd (including database migration) and slurmctld within a small enough time frame to not interrupt service on the cluster. The tolerances for service interrupts are defined by SlurmctldTimeout and SlurmdTimeout. If the Slurm daemons are down for longer than the specified timeout during an upgrade, nodes may be marked DOWN and their jobs killed. To further clarify: If slurmd daemons are not able to contact slurmctld within the specified tolerance it is unavoidable that the payload of the entire cluster is killed.
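Before planning a maintenance window it helps to know the two tolerances. On a running cluster `scontrol show config` reports them; the sketch below instead reads them from a slurm.conf-style file so that it works offline. The function name `conf_value` is hypothetical, the parser is simplified (exact key spelling, no whitespace around `=`), and the fallback values 120 and 300 seconds are believed to be the Slurm defaults but should be verified against the installed version.

```shell
# Hypothetical helper: print a value from a slurm.conf-style file,
# or a fallback if the key is not set (simplified parsing).
conf_value() {
    file="$1"; key="$2"; fallback="$3"
    value=$(awk -F= -v key="$key" '$1 == key { print $2; exit }' "$file")
    echo "${value:-$fallback}"
}

# ...sample configuration for illustration only
conf=$(mktemp)
printf 'ClusterName=test\nSlurmctldTimeout=300\n' > "$conf"

conf_value "$conf" SlurmctldTimeout 120   # prints 300 (set in the file)
conf_value "$conf" SlurmdTimeout 300      # prints 300 (fallback, key not set)
```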

The first time slurmdbd is started after an upgrade it will take some time to update existing records in the database. If slurmdbd is started with systemd, it may think that slurmdbd is not responding and kill the process when it reaches its timeout value, which causes problems with the upgrade. We recommend starting slurmdbd by calling the command directly rather than using systemd when performing an upgrade.

The run-time of the database migration depends on the size of the accounting database. It is therefore important to have a reasonable estimate of the run-time before the upgrade. Unfortunately, it is difficult to find relevant references on this particular issue. “Make a dry run database upgrade” (Niflheim Supercomputing Center, Denmark) may be a useful reference for gaining experience with the production environment on site.

A common approach when performing upgrades is to install the new version of Slurm to a unique directory and use a symbolic link to point the directory in your PATH to the version of Slurm you would like to use. This allows you to install the new version before you are in a maintenance period as well as easily switch between versions should you need to roll back for any reason. It also avoids potential problems with library conflicts that might arise from installing different versions to the same directory.

Most sites do the upgrade only after draining the cluster. If you want to perform a live upgrade, please open a support request with SchedMD (note that this is only possible for paying customers).

Typically, the upgrades of (mariadb,) slurmdbd, and slurmctld (as well as the underlying service nodes) are independent operations performed in sequential order. Once these services are back in production, the slurmd services on all compute nodes are upgraded incrementally in groups, rolling over all nodes.


  1. Slurm Quick Start Administrator Guide, SchedMD

  2. Slurm Installation and Upgrading, Niflheim Cluster

  3. Slurm RPM Spec, SchedMD, GitHub

  4. Slurm Community BoF, SchedMD SC23, 2023/11

  5. slurm.spec, Fedora Project

  6. slurm.spec line 254, Fedora Project

  7. slurm.spec, OpenSUSE Build Service

  8. Slurm Community BoF, SchedMD SC23, 2023/11

  9. Slurm Dependencies, SchedMD

  10. PMIx Slurm Compatibility Matrix, OpenPMIx Project

  11. MLNX_OFED Linux drivers, NVIDIA

  12. AMD Instinct MI Series, AMD

  13. AMD ROCm Packages, Fedora Project

  14. Slurm Download, SchedMD

  15. Fedora EPEL Slurm Packages, Fedora Project

  16. RPM Macros, Fedora Project

  17. Vagrant Test Environment, GitHub

  18. Field Notes From the Frontlines of Support, SUG 2021

  19. Slurm releases move to a six-month cycle, SchedMD Blog

  20. Slurm Support, SchedMD

  21. Slurm Releases, SchedMD

  22. Field Notes From the Frontlines of Support, SUG 2020

  23. Field Notes From the Frontlines of Support, SUG 2021