Warewulf 4.x HPC Provisioning
Warewulf – https://warewulf.org …bare metal, stateless, container-based HPC cluster provisioning solution:
- …the most important feature is stateless provisioning
- …does not require installation of an OS on local storage …“re-installs the operating system on every boot”
- …retrieves the base OS image from the network …live OS executes directly from RAM
- …stateless provisioning does not mean disk-less nodes …node storage devices can serve /tmp (scratch)
Benefits…
- …guarantees identical nodes (no configuration drift) …consequently a single faulty node typically points to a hardware issue
- …replaces network installation like Anaconda & Kickstart
- …replaces configuration management systems like Ansible, Puppet, Chef
- …non-persistence reduces attack surface from IT security perspective
- …updates & security patches applied to container image …decouples nodes from package repositories
- …facilitates use of CI/CD workflows with container images
Drawbacks…
- …cluster mass (re-)boot depends on the network performance and stability
- …reboot depends on network services like DHCP/PXE
- …the root file-system of the base OS allocates portions of the main memory (RAM)
- …size depends on the content of the base container image …specifically when application software included
- …SWAP may be used to alleviate this effect with huge container images
- …logs and debugging information have to be stored on a central server for troubleshooting
Virtual Test Environment
The following example uses a virtual test environment …configuration and setup with Libvirt and Vagrant…
# work in a disposable path
pushd $(mktemp -d /tmp/$USER-vagrant-XXXXXX)
Network
Provisioning assumes operation in a private network…
- …default configuration uses the 192.168.200.0/24 network
- …requires UEFI PXE LAN boot to load an EFI expansion ROM for the boot of nodes
- Libvirt dnsmasq used with a custom network configuration …only for service nodes
  - …host elements configure MAC- and IP-addresses for the nodes
  - …dnsmasq element passes options to the underlying dnsmasq
  - …dhcp-ignore=tag:!known prevents dnsmasq from responding to unknown MAC-addresses
Configure a custom private network with custom.xml:
<network xmlns:dnsmasq='http://libvirt.org/schemas/network/dnsmasq/1.0'>
  <name>custom</name>
  <uuid>356e5daf-3b06-49d5-98fd-178e847cf559</uuid>
  <forward mode='nat'/>
  <bridge name='virbr9' stp='on' delay='0'/>
  <mac address='52:54:00:aa:da:73'/>
  <ip address='192.168.200.1' netmask='255.255.255.0'>
    <dhcp>
      <range start='192.168.200.2' end='192.168.200.254'/>
      <host mac='52:54:00:00:00:02' name='wwctl' ip='192.168.200.2' />
    </dhcp>
  </ip>
  <dnsmasq:options>
    <dnsmasq:option value='dhcp-ignore=tag:!known'/>
  </dnsmasq:options>
</network>
# load the configuration file and enable the custom network
virsh net-define custom.xml && virsh net-start custom
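The host elements in the dhcp section follow a regular pattern, so they can be generated instead of typed by hand. The following is a minimal sketch, not part of Libvirt: dhcp_host_xml is a hypothetical helper, and it assumes that the last MAC octet (hex) and the last IP octet (decimal) both derive from the same node index, as in the wwctl entry above.

```shell
# Hypothetical helper: print a <host> element for one node.
# Assumption: MAC 52:54:00:00:00:NN (NN = index in hex) maps to 192.168.200.<index>.
dhcp_host_xml() {
    local index=$1 name=$2
    printf "<host mac='52:54:00:00:00:%02x' name='%s' ip='192.168.200.%d' />\n" \
        "$index" "$name" "$index"
}

# Print entries for three illustrative service nodes (names are examples only)
for i in 3 4 5; do
    dhcp_host_xml "$i" "service$((i - 2))"
done
```

The printed lines can be pasted into the dhcp element of custom.xml before the network is defined.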
DHCP configuration file in /var/lib/libvirt/dnsmasq/custom.conf.
# list all DHCP leases
virsh net-dhcp-leases custom
# modify the network configuration
virsh net-edit custom && virsh net-destroy custom && virsh net-start custom
# add additional nodes to the DHCP configuration
virsh net-update custom add ip-dhcp-host \
"<host mac='52:54:00:00:00:03' name='service1' ip='192.168.200.3' />" --live --config
Head Node
Build a Warewulf server instance from the following Vagrantfile…
- …connect the instance to the custom network with IP 192.168.200.2 (line 10)
- …install Warewulf from the RPM package hosted on GitHub (line 14)
# vi: set ft=ruby :
Vagrant.configure("2") do |config|
  config.vm.box = "almalinux/8"
  config.vm.hostname = "wwctl"
  config.vm.provider :libvirt do |libvirt|
    libvirt.memory = 1024
    libvirt.cpus = 1
    libvirt.qemu_use_session = false
  end
  config.vm.network :private_network, :ip => "192.168.200.2", :libvirt__network_name => "custom"
  config.vm.provision "shell", privileged: true, inline: <<-SHELL
    dnf install -y epel-release
    dnf config-manager --set-enabled powertools
    dnf install -y https://github.com/hpcng/warewulf/releases/download/v4.4.0/warewulf-4.4.0-1.git_afcdb21.el8.x86_64.rpm
    systemctl disable --now firewalld
  SHELL
end
# start the instance
vagrant up
# login after provisioning
vagrant ssh
Configuration
Service configuration in /etc/warewulf/warewulf.conf …modify the ipaddr of the network interface:
# /etc/warewulf/warewulf.conf
#...
ipaddr: 192.168.200.2
#...
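A quick sanity check of the edited value can save a failed PXE boot later. The following is a minimal sketch under stated assumptions: conf_ipaddr is a hypothetical helper that reads the first top-level ipaddr key with awk, and the temporary file stands in for /etc/warewulf/warewulf.conf so the snippet runs anywhere.

```shell
# Hypothetical helper: print the value of the first top-level "ipaddr:" key
conf_ipaddr() {
    awk '/^ipaddr:/ { print $2; exit }' "$1"
}

# Stand-in for /etc/warewulf/warewulf.conf (only the key shown above)
sample=$(mktemp)
cat > "$sample" <<'EOF'
# /etc/warewulf/warewulf.conf
ipaddr: 192.168.200.2
EOF

expected=192.168.200.2
if [ "$(conf_ipaddr "$sample")" = "$expected" ]; then
    echo "ipaddr matches $expected"
else
    echo "ipaddr differs from $expected" >&2
fi
```

On the head node, pass /etc/warewulf/warewulf.conf instead of the sample file.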
Start all required services …log information written to /var/log/warewulfd.log by default
# start system services Warewulf depends on
wwctl configure --all
# start the Warewulf service
systemctl enable --now warewulfd
# check server status with
wwctl server status
Pull a base OS container (VNFS) from a container registry (here the GitHub Container Registry)…
# import a RockyLinux 8 OCI container image
wwctl container import docker://ghcr.io/hpcng/warewulf-rockylinux:8 rocky-8
…adjust the network configuration and the container image for the default profile…
# set default networking configurations
wwctl profile set --yes --netdev eth0 --netmask 255.255.255.0 --gateway 192.168.200.1 "default"
# associate the container image with the default profile
wwctl profile set --yes --container rocky-8 "default"
# list available profiles
wwctl profile list -a
Adding a node configuration…
- …associates a node to the default profile (unless adjusted by option)
- …identifies the node by its IP-address provided with option --ipaddr
- …with --discoverable yes enables the registration of the MAC-address during boot
# add a new node to the configuration
wwctl node add node1 --ipaddr 192.168.200.3 --discoverable yes
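Registering more than one node follows the same pattern. The following is a dry-run sketch, not an official workflow: node_add_cmd is a hypothetical helper that only prints the command, and the mapping of node N to 192.168.200.(N+2) is an assumption extrapolated from node1 above.

```shell
# Hypothetical helper: print the command that registers node<N>.
# Assumption: node<N> uses 192.168.200.<N+2>, following node1 -> .3
node_add_cmd() {
    local n=$1
    echo "wwctl node add node${n} --ipaddr 192.168.200.$((n + 2)) --discoverable yes"
}

# Dry run for four nodes, nothing is executed here
for n in 1 2 3 4; do
    node_add_cmd "$n"
done
```

After reviewing the printed commands they can be run on the head node, for example by piping the output to sh.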
Use virt-install to start a virtual machine instance…
- …--pxe and --boot uefi to enable UEFI PXE LAN boot
- …--network option connects to the custom network
- …make sure to provide enough memory for the container image size
virt-install \
--osinfo detect=on,require=off \
--name node1 --memory 4096 --disk size=10 \
--boot uefi,menu=on,useserial=on \
--pxe --network network=custom,mac=52:54:00:00:01:01 &
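Since the root file-system lives in RAM, it can help to compare the image size with the memory given to the node before booting. The following is a rough sketch only: image_fits_memory is a hypothetical helper, and the default of keeping 50% of the memory free for applications is an arbitrary assumption, not a Warewulf recommendation.

```shell
# Hypothetical check: does an uncompressed image fit into node memory
# while a share of the RAM stays reserved for applications?
# Arguments: image size (bytes), node memory (MiB), reserved percentage (default 50)
image_fits_memory() {
    local image_bytes=$1 memory_mib=$2 reserve_pct=${3:-50}
    local image_mib=$(( (image_bytes + 1048575) / 1048576 ))      # round up to MiB
    local usable_mib=$(( memory_mib * (100 - reserve_pct) / 100 ))
    [ "$image_mib" -le "$usable_mib" ]
}

# Example: a 1.5 GiB image on a node started with --memory 4096
if image_fits_memory $((1536 * 1048576)) 4096; then
    echo "image fits"
else
    echo "image too large, increase --memory or shrink the image"
fi
```

On the head node the image size could be read with GNU stat, for example stat -c %s on the .img file below /var/lib/warewulf/container.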
After successful PXE boot the node should register with the server:
# show provisioning status of nodes
>>> wwctl node status
NODENAME STAGE SENT LASTSEEN (s)
================================================================================
node1 RUNTIME_OVERLAY generic.img.gz 47
# print the configuration for a node
>>> wwctl node list -a node1 | grep hwaddr
node1 default:hwaddr -- 52:54:00:00:01:01
…login to the node with ssh root@192.168.200.3
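The LASTSEEN column of the node status listing can be used to spot nodes that stopped reporting. The following is a minimal sketch assuming the four-column layout shown above: stale_nodes is a hypothetical filter, and the node2 row in the sample is invented for illustration.

```shell
# Hypothetical filter: print names of nodes whose LASTSEEN value (seconds)
# exceeds the given threshold. Header and separator lines are skipped
# because their last field is not a plain number.
stale_nodes() {
    local max_age=$1
    awk -v max="$max_age" '
        NF >= 4 && $NF ~ /^[0-9]+$/ && $NF + 0 > max + 0 { print $1 }
    '
}

# Sample input in the layout of `wwctl node status` (node2 is made up)
sample='NODENAME STAGE SENT LASTSEEN (s)
================================================================================
node1 RUNTIME_OVERLAY generic.img.gz 47
node2 RUNTIME_OVERLAY generic.img.gz 312'

printf '%s\n' "$sample" | stale_nodes 120
```

On the head node the filter would read the live listing: wwctl node status | stale_nodes 120.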
DHCP & TFTP
DHCP and TFTP services can be configured individually…
# configure the DHCP server
wwctl configure dhcp
# list DHCP requests
journalctl -g dhcpd | grep -i discover
- …generates the configuration in /etc/dhcp/dhcpd.conf
- …starts the dhcpd.service systemd unit
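When a node does not boot, a list of the MAC-addresses that actually sent a DHCP discover helps to tell a network problem from a configuration problem. The following is a small sketch: discover_macs is a hypothetical filter, and the sample lines only imitate the usual dhcpd log wording, so the exact format on a given system is an assumption.

```shell
# Hypothetical filter: print the unique MAC-addresses found in DHCPDISCOVER lines
discover_macs() {
    grep -i 'DHCPDISCOVER from' \
        | grep -oiE '([0-9a-f]{2}:){5}[0-9a-f]{2}' \
        | tr 'A-F' 'a-f' \
        | sort -u
}

# Illustrative log lines (format assumed, values taken from the node above)
sample_log='dhcpd[912]: DHCPDISCOVER from 52:54:00:00:01:01 via eth1
dhcpd[912]: DHCPOFFER on 192.168.200.3 to 52:54:00:00:01:01 via eth1
dhcpd[912]: DHCPDISCOVER from 52:54:00:00:01:01 via eth1'

printf '%s\n' "$sample_log" | discover_macs
```

On the head node the filter can read the journal directly: journalctl -g dhcpd | discover_macs.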
# configure the TFTP server
wwctl configure tftp
- …serves files from /var/lib/tftpboot
- …journalctl -t in.tftpd prints logs
Container Image
Read documentation about container management…
- …template image for the compute nodes …“Virtual Node File System” (VNFS)
- …supports image import from any OCI-compliant registry
- …note that most Docker containers are not “bootable” …have a limited version of Systemd
- …does not require a container runtime on the nodes
- …provisions a container image to bare metal before boot
- …reboot does not persist any information on the node
- …Linux kernel included in the container image
- …bootable container provided on DockerHub
- …images – https://hub.docker.com/u/warewulf
- …definition files – https://github.com/hpcng/warewulf-node-images
- …creating a container from scratch is described in the documentation
# show available containers
wwctl container list
Use the container exec sub-command to start an interactive shell in a container image:
# ...add a password to the root account in the container
>>> wwctl container exec rocky-8 /bin/bash
[rocky-8] Warewulf> dnf install -y passwd
[rocky-8] Warewulf> passwd root
exit
+ LANG=C
+ LC_CTYPE=C
+ export LANG LC_CTYPE
+ dnf clean all
25 files removed
Rebuilding container...
Created image for VNFS container rocky-8: /var/lib/warewulf/container/rocky-8.img
Compressed image for VNFS container rocky-8: /var/lib/warewulf/container/rocky-8.img.gz
The container image will be rebuilt automatically after exiting the shell.
Overlays
Custom configuration applied with overlays…
- …at different stages through the provisioning process
- …overlay images stored in /var/lib/warewulf/overlays
The node list sub-command will show which overlays are applied to a node
>>> wwctl node list --long
NODE NAME KERNEL OVERRIDE CONTAINER OVERLAYS (S/R)
=====================================================================================
node1 -- rocky-8 (wwinit)/(generic)
System Overlay
Available before Systemd init is called …not updated during run-time…
- …contains udev rules to configure network device names
- …loops over scripts in /wwinit/warewulf/init.d/*
# print files in the system overlay
wwctl overlay list -l wwinit
Runtime Overlay
…aka generic overlay…
>>> wwctl overlay list generic -l
PERM MODE UID GID SYSTEM-OVERLAY FILE PATH
-rwxr-xr-x 0 991 generic /etc/
-rw-r--r-- 0 991 generic /etc/group.ww
-rw-r--r-- 0 991 generic /etc/hosts.ww
-rw-r--r-- 0 991 generic /etc/passwd.ww
-rwxr-xr-x 0 991 generic /root/
-rwxr-xr-x 0 991 generic /root/.ssh/
-rw-r--r-- 0 991 generic /root/.ssh/authorized_keys.ww
- …includes passwd, group and SSH authorized keys configuration by default
- …use this overlay for dynamic configuration files like slurm.conf
wwclient
Updates the run-time overlay on a regular basis (defaults to once per minute)…
- …wwclient agent runs as Systemd unit…
- …provisioned as part of the wwinit system overlay
- …assists with the init process during boot
- …deploys node overlays during boot and runtime
- …reads its own initialization scripts from /warewulf/init.d/
>>> systemctl cat wwclient
# /etc/systemd/system/wwclient.service
# ...
[Service]
Type=notify
ExecStart=/warewulf/wwclient
ExecReload=/bin/kill -s SIGHUP "$MAINPID"
PIDFile=/var/run/wwclient.pid
TimeoutSec=60
#...
…journalctl -fu wwclient follows the overlay update process.
Templates
Templates allow dynamic content
- …template files end with the suffix .ww
- …uses the Go text/template engine
# print a specific file from an overlay
>>> wwctl overlay show generic /root/.ssh/authorized_keys.ww
{{Include "/root/.ssh/authorized_keys"}}
Include reads a file from the head node root file-system…
- …an SSH key-pair has been generated at /root/.ssh/cluster{.pub}
- …the public key is appended to /root/.ssh/authorized_keys
- …this enables ssh login to all nodes from the head node …as well as the wwctl ssh sub-command
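Besides Include, templates can reference per-node values. The following is a minimal sketch of writing such a template file: the variables .Id and .BuildHost are assumptions about the names Warewulf exposes, so verify them against the installed version before relying on them.

```shell
# Write a small template; the quoted heredoc keeps the {{...}} markers literal.
# Assumption: .Id (node name) and .BuildHost (head node) exist as template variables.
workdir=$(mktemp -d)
cat > "$workdir/motd.ww" <<'EOF'
Node {{.Id}} provisioned by {{.BuildHost}}
EOF

# Show the template as it would be imported into an overlay
cat "$workdir/motd.ww"

# On the head node the file could then be added to the runtime overlay with:
#   wwctl overlay import generic "$workdir/motd.ww" /etc/motd.ww
```

The .ww suffix marks the file as a template, so the rendered result would appear on the node without the suffix.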
Using Overlays
Import a new file into the runtime overlay…
>>> echo "My message of the day..." > /tmp/motd
>>> wwctl overlay import generic /tmp/motd /etc/motd
Building overlay for node1: [generic]
Created image for overlay node1/[generic]: /var/lib/warewulf/overlays/node1/generic.img
Compressed image for overlay node1/[generic]: /var/lib/warewulf/overlays/node1/generic.img.gz
# edit a file within an overlay (in-place)
wwctl overlay edit generic /etc/motd
…automatically rebuilds the overlay image …propagates the configuration to the associated nodes via wwclient
# ...delete a file from an overlay
wwctl overlay remove generic /etc/motd
# empty /etc/motd on a node
wwctl ssh $node "echo '' > /etc/motd"
References
Community…
- Mailing list – https://groups.google.com/a/lbl.gov/g/warewulf
- Source code – https://github.com/hpcng/warewulf
Media…
- Stateless Provisioning of Stateful Nodes: Examples with Warewulf 4
- Splitting Warewulf Images Between PXE and NFS
- Provisioning: Stateless Vs. Stateful, Research Computing Roundtable, CIQ
- Turnkey HPC: Warewulf & OpenHPC, Research Computing Roundtable, CIQ
- Warewulf 4, Admin Magazin, Jeff Layton