Warewulf 4.x HPC Provisioning

HPC
Published

March 23, 2023

Modified

March 28, 2023

Warewulf – https://warewulf.org …bare metal, stateless, container-based HPC cluster provisioning solution:

Benefits…

Drawbacks…

Virtual Test Environment

The following example uses a virtual test environment …configuration and setup with LibVirt and Vagrant…

# work in a disposable path
pushd $(mktemp -d /tmp/$USER-vagrant-XXXXXX)

Network

Provisioning based on the assumption to operate in a private network…

  • …default configuration uses the 192.168.200.0/24 network
  • …requires UEFI PXE LAN boot to load an EFI expansion ROM for the boot of nodes
  • Libvirt dnsmasq used with a custom network configuration …only for service nodes
    • host elements configure MAC- and IP-addresses for the nodes
    • dnsmasq element passes options to the underlying dnsmasq
    • dhcp-ignore=tag:!known prevents dnsmasq from responding to unknown MAC-addresses

Configure a custom private network with custom.xml:

<network xmlns:dnsmasq='http://libvirt.org/schemas/network/dnsmasq/1.0'>
  <name>custom</name>
  <uuid>356e5daf-3b06-49d5-98fd-178e847cf559</uuid>
  <forward mode='nat'/>
  <bridge name='virbr9' stp='on' delay='0'/>
  <mac address='52:54:00:aa:da:73'/>
  <ip address='192.168.200.1' netmask='255.255.255.0'>
    <dhcp>
      <range start='192.168.200.2' end='192.168.200.254'/>
      <host mac='52:54:00:00:00:02' name='wwctl' ip='192.168.200.2' />
    </dhcp>
  </ip>
  <dnsmasq:options>
    <dnsmasq:option value='dhcp-ignore=tag:!known'/>
  </dnsmasq:options>
</network>

# load the configuration file and enable the custom network
virsh net-define custom.xml && virsh net-start custom

The generated DHCP configuration file is stored in /var/lib/libvirt/dnsmasq/custom.conf.

# list all DHCP leases
virsh net-dhcp-leases custom

# modify the network configuration
virsh net-edit custom && virsh net-destroy custom && virsh net-start custom

# add additional nodes to the DHCP configuration
virsh net-update custom add --live --config \
      ip-dhcp-host "<host mac='52:54:00:00:00:03' name='service1' ip='192.168.200.3' />"
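
Several nodes can be registered the same way in a loop. The following is a minimal sketch, assuming the naming and addressing scheme of the example above (service1 with MAC ending in 03 and IP 192.168.200.3). The host_entry helper is hypothetical, and the loop only prints the virsh commands until the echo is removed.

```shell
#!/usr/bin/env bash
# hypothetical helper: print the <host> element for service node number N
# assumes node N uses MAC 52:54:00:00:00:<N+2> and IP 192.168.200.<N+2>
host_entry() {
    local n=$1
    printf "<host mac='52:54:00:00:00:%02x' name='service%d' ip='192.168.200.%d' />" \
        "$((n + 2))" "$n" "$((n + 2))"
}

# dry run: print one net-update command per node (drop "echo" to apply)
for n in 1 2 3; do
    echo virsh net-update custom add --live --config ip-dhcp-host "\"$(host_entry "$n")\""
done
```

The scheme only holds while N+2 stays within the DHCP range of the network (up to 254).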

Head Node

Build a Warewulf server instance from the following Vagrantfile:

  • …connect the instance to the custom network with IP 192.168.200.2 (line 10)
  • …install Warewulf from the RPM package hosted on GitHub (line 14)

# vi: set ft=ruby :
Vagrant.configure("2") do |config|
  config.vm.box = "almalinux/8"
  config.vm.hostname = "wwctl"
  config.vm.provider :libvirt do |libvirt|
    libvirt.memory = 1024
    libvirt.cpus = 1
    libvirt.qemu_use_session = false
  end
  config.vm.network :private_network, :ip => "192.168.200.2", :libvirt__network_name => "custom"
  config.vm.provision "shell", privileged: true, inline: <<-SHELL
     dnf install -y epel-release
     dnf config-manager --set-enabled powertools
     dnf install -y https://github.com/hpcng/warewulf/releases/download/v4.4.0/warewulf-4.4.0-1.git_afcdb21.el8.x86_64.rpm
     systemctl disable --now firewalld     
  SHELL
end

# start the instance
vagrant up

# login after provisioning
vagrant ssh

Configuration

Service configuration in /etc/warewulf/warewulf.conf …modify the ipaddr of the network interface:

# /etc/warewulf/warewulf.conf
#...
ipaddr: 192.168.200.2
#...
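
The same change can be scripted. This is a minimal sketch, assuming GNU sed and that ipaddr is a top-level key as shown above. It works on a throwaway sample file rather than the real /etc/warewulf/warewulf.conf; the set_ipaddr helper and the sample values are hypothetical.

```shell
#!/usr/bin/env bash
# hypothetical helper: replace the value of the top-level "ipaddr:" key in a file
set_ipaddr() {
    local file=$1 addr=$2
    sed -i -E "s/^ipaddr:.*/ipaddr: ${addr}/" "$file"
}

# sample file with placeholder values (not the real warewulf.conf)
conf=$(mktemp)
printf 'ipaddr: 10.0.0.1\nnetmask: 255.255.255.0\n' > "$conf"

# set the address used in this test environment and show the result
set_ipaddr "$conf" 192.168.200.2
grep '^ipaddr:' "$conf"
```

On the head node the helper would be pointed at /etc/warewulf/warewulf.conf, ideally after making a backup copy.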

Start all required services …log information written to /var/log/warewulfd.log by default:

# start system services Warewulf depends on
wwctl configure --all

# start the Warewulf service
systemctl enable --now warewulfd

# check the server status
wwctl server status

Pull a base OS container (VNFS) from an OCI registry (here the GitHub Container Registry)…

# import a RockyLinux 8 OCI container image 
wwctl container import docker://ghcr.io/hpcng/warewulf-rockylinux:8 rocky-8

Adjust the network configuration and the container image for the default profile:

# set default networking configurations
wwctl profile set --yes --netdev eth0 --netmask 255.255.255.0 --gateway 192.168.200.1 "default"

# associate the container image with the default profile
wwctl profile set --yes --container rocky-8 "default"

# list available profiles
wwctl profile list -a

Adding a node configuration

  • associates the node with the default profile (unless adjusted by option)
  • …identifies the node by its IP-address provided with option --ipaddr
  • --discoverable yes enables the registration of the MAC-address during boot

# add a new node to the configuration
wwctl node add node1 --ipaddr 192.168.200.3 --discoverable yes
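
More nodes follow the same pattern. This is a minimal sketch, assuming node N gets the address 192.168.200.<N+2> as node1 does above. The node_add_cmd helper is hypothetical, and the loop only prints the commands.

```shell
#!/usr/bin/env bash
# hypothetical helper: print the wwctl command that registers node number N
# assumes node N is assigned the address 192.168.200.<N+2>
node_add_cmd() {
    local n=$1
    echo "wwctl node add node${n} --ipaddr 192.168.200.$((n + 2)) --discoverable yes"
}

# dry run for three nodes: review the output before running it on the head node
for n in 1 2 3; do
    node_add_cmd "$n"
done
```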

Use virt-install to start a virtual machine instance…

  • --pxe and --boot uefi enable UEFI PXE LAN boot
  • --network connects the instance to the custom network
  • make sure to provide enough memory for the container image size

virt-install \
      --osinfo detect=on,require=off \
      --name node1 --memory 4096 --disk size=10 \
      --boot uefi,menu=on,useserial=on \
      --pxe --network network=custom,mac=52:54:00:00:01:01 &

After successful PXE boot the node should register with the server:

# show provisioning status of nodes 
>>> wwctl node status
NODENAME             STAGE                SENT                      LASTSEEN (s)
================================================================================
node1                RUNTIME_OVERLAY      generic.img.gz            47        

# print the configuration for a node
>>> wwctl node list -a node1 | grep hwaddr
node1                default:hwaddr     --           52:54:00:00:01:01
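
The status listing is easy to filter, for example to spot nodes that have not contacted the server recently. This is a minimal sketch, assuming the column layout shown above (node name first, LASTSEEN in seconds last), which may differ between Warewulf versions. The stale_nodes helper is hypothetical.

```shell
#!/usr/bin/env bash
# hypothetical helper: print nodes whose LASTSEEN value exceeds a limit (seconds)
# reads "wwctl node status" output on stdin and skips the two header lines
stale_nodes() {
    local limit=$1
    awk -v limit="$limit" 'NR > 2 && NF >= 4 && $NF + 0 > limit { print $1 }'
}

# sample input copied from the listing above
status='NODENAME             STAGE                SENT                      LASTSEEN (s)
================================================================================
node1                RUNTIME_OVERLAY      generic.img.gz            47'

# list nodes not seen for more than 30 seconds
echo "$status" | stale_nodes 30
```

On the head node the helper would read live data, for example wwctl node status | stale_nodes 120.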

…log in to the node with ssh root@192.168.200.3

DHCP & TFTP

DHCP and TFTP services can be configured individually…

# configure the DHCP server
wwctl configure dhcp

  • …generates the configuration in /etc/dhcp/dhcpd.conf
  • …starts the dhcpd.service systemd unit

# list DHCP requests
journalctl -g dhcpd | grep -i discover

# configure the TFTP server
wwctl configure tftp

  • …serves files from /var/lib/tftpboot
  • journalctl -t in.tftpd prints logs

Container Image

Read documentation about container management

  • template image for the compute nodes …“Virtual Node File System” (VNFS)
    • …supports image import from any OCI-compliant registry
    • …note that most Docker containers are not “bootable” …have a limited version of Systemd
  • does not require a container runtime on the nodes
    • …provisions a container image to bare metal before boot
    • …reboot does not persist any information on the node
    • …Linux kernel included in the container image
  • …bootable container provided on DockerHub
  • creating a container from scratch is described in the documentation

# show available containers
wwctl container list

Use container exec to start an interactive shell in a container image:

# ...add a password to the root account in the container
>>> wwctl container exec rocky-8 /bin/bash
[rocky-8] Warewulf> dnf install -y passwd
[rocky-8] Warewulf> passwd root
[rocky-8] Warewulf> exit
+ LANG=C
+ LC_CTYPE=C
+ export LANG LC_CTYPE
+ dnf clean all
25 files removed
Rebuilding container...
Created image for VNFS container rocky-8: /var/lib/warewulf/container/rocky-8.img
Compressed image for VNFS container rocky-8: /var/lib/warewulf/container/rocky-8.img.gz

The container image is rebuilt automatically after exiting the shell.

Overlays

Custom configuration applied with overlays

  • …at different stages through the provisioning process
  • …overlay images stored in /var/lib/warewulf/overlays

The node list sub-command shows which overlays are applied to a node:

>>> wwctl node list --long
NODE NAME              KERNEL OVERRIDE  CONTAINER        OVERLAYS (S/R)
=====================================================================================
node1                  --               rocky-8          (wwinit)/(generic)

System Overlay

Available before Systemd init is called …not updated during run-time

  • …contains udev rules to configure network device names
  • …loops over the scripts in /wwinit/warewulf/init.d/*

# print files in the system overlay
wwctl overlay list -l wwinit

Runtime Overlay

…aka generic overlay…

>>> wwctl overlay list generic -l
PERM MODE    UID GID   SYSTEM-OVERLAY     FILE PATH
-rwxr-xr-x     0 991   generic            /etc/
-rw-r--r--     0 991   generic            /etc/group.ww
-rw-r--r--     0 991   generic            /etc/hosts.ww
-rw-r--r--     0 991   generic            /etc/passwd.ww
-rwxr-xr-x     0 991   generic            /root/
-rwxr-xr-x     0 991   generic            /root/.ssh/
-rw-r--r--     0 991   generic            /root/.ssh/authorized_keys.ww

  • …includes password, groups and SSH authorized keys configuration by default
  • …use this overlay for dynamic configuration files like slurm.conf

wwclient

Updates the run-time overlay on a regular basis (defaults to once per minute)…

  • wwclient agent runs as a Systemd unit
    • …provisioned as part of the wwinit system overlay
    • …assists with the init process during boot
    • deploys node overlays during boot and runtime
  • …reads its own initialization scripts from /warewulf/init.d/

>>> systemctl cat wwclient
# /etc/systemd/system/wwclient.service
# ...
[Service]
Type=notify
ExecStart=/warewulf/wwclient
ExecReload=/bin/kill -s SIGHUP "$MAINPID"
PIDFile=/var/run/wwclient.pid
TimeoutSec=60
#...

journalctl -fu wwclient follows the overlay update process.

Templates

Templates allow dynamic content

  • …template files end with the suffix .ww
  • …uses the Go text/template engine

# print a specific file from an overlay
>>> wwctl overlay show generic /root/.ssh/authorized_keys.ww
{{Include "/root/.ssh/authorized_keys"}}

Include reads a file from the head node root file-system…

  • an SSH key-pair has been generated at /root/.ssh/cluster{.pub}
  • the public key is appended to /root/.ssh/authorized_keys
  • …this enables ssh login to all nodes from the head node …as well as wwctl ssh sub-command
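
Templates can also reference node attributes. This is a minimal sketch, assuming that the template variable {{.Id}} expands to the node name; check the Warewulf template documentation for the variables your version provides. The file name motd.ww is a hypothetical example, and the import command is only printed.

```shell
#!/usr/bin/env bash
# write a template to a temporary directory; the .ww suffix marks it as a template
tmpl="$(mktemp -d)/motd.ww"

# assumption: {{.Id}} is replaced with the node name when the overlay is built
cat > "$tmpl" <<'EOF'
Welcome to {{.Id}}
EOF

# dry run: print the import command (drop "echo" to run it on the head node)
echo wwctl overlay import generic "$tmpl" /etc/motd.ww
```

After an import, wwctl overlay show generic /etc/motd.ww prints the stored template, as with authorized_keys.ww above.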

Using Overlays

Import a new file into the runtime overlay…

>>> echo "My message of the day..." > /tmp/motd
>>> wwctl overlay import generic /tmp/motd /etc/motd
Building overlay for node1: [generic]
Created image for overlay node1/[generic]: /var/lib/warewulf/overlays/node1/generic.img
Compressed image for overlay node1/[generic]: /var/lib/warewulf/overlays/node1/generic.img.gz

# edit a file within an overlay (in-place)
wwctl overlay edit generic /etc/motd

…automatically rebuilds the overlay image …propagates the configuration to the associated nodes via wwclient

# ...delete a file from an overlay
wwctl overlay remove generic /etc/motd

# empty /etc/motd on a node
wwctl ssh $node "echo '' > /etc/motd"
