Apptainer Containers
An Introduction to Linux Containers for HPC
Questions
- Why use containers?
- What are container images?
Objectives
- Learn how to use existing containers
- Build container images for your application
Apptainer [1] is a container run-time platform for HPC compute clusters that favors integration over isolation. This integrative approach makes it possible to support resources specific to HPC infrastructure, such as InfiniBand network fabrics, efficient parallel computing on many-core CPUs, and dedicated accelerators like GPUs. It is intentionally designed to facilitate research computing and scientific applications.
Using Apptainer
Apptainer enables users to interact with containers transparently: execute programs inside a container as if they were running on the host, redirect I/O, use command pipes, pass arguments, and access files, sockets and ports. Apptainer is updated regularly [2], and most Linux distributions have packages available in their repositories. On Fedora, for example, install Apptainer using the official `apptainer` package [3]:
# install apptainer
sudo dnf install -y apptainer
# verify functionality
apptainer run docker://alpine
Users of macOS and Windows can execute Linux containers in a virtual environment on their computer. It is not strictly required to have Apptainer installed on your own machine. Many HPC systems have the `apptainer` command pre-installed, including the capability to build container images on the cluster nodes. However, it is often preferable to use container images in the local environment as well, and to copy them on demand to a target HPC system. This approach ensures that you work with the exact same environment under all circumstances.
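Before following the examples in this article, it can help to confirm that the `apptainer` command is actually available on the machine you are logged in to. The snippet below is a minimal sketch of such a check; the variable name `have_apptainer` is only illustrative and not part of Apptainer.

```shell
# Check whether the apptainer command can be found in PATH.
# The variable name have_apptainer is illustrative, not part of Apptainer.
if command -v apptainer >/dev/null 2>&1; then
    have_apptainer=yes
    # Print the installed version
    apptainer --version
else
    have_apptainer=no
    echo "apptainer not found in PATH" >&2
fi
```

On a cluster, the command may only become available after loading an environment module or logging in to a specific node, depending on how the site is configured.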
Why use containers?
- Many applications in the HPC environment have become very complex with regard to software dependencies and configuration details. For many years HPC infrastructure providers have struggled to build a common environment that accommodates the requirements of all user communities on a single host platform. Containers enable complete decoupling of application environments from the host platform and from other users. In addition, containerization prevents interference between user environments, an issue that created a lot of friction for cluster operations in the past.
- Containers turn the user environment into a swappable component, allowing users to have a custom application environment “packaged” into a container image. This container image can be executed on any host that provides container support, which includes their own computer and a growing number of HPC systems.
- The most important benefit for users is the complete independence from the HPC host environment. This gives them the freedom to select any Linux distribution as a foundation for the container image. Furthermore, users can install any combination of compilers, libraries and other software components in the versions they require.
- Most container images are built programmatically using definition files. This ties into the concepts and goals of reproducible science. Images can not only be preserved without any ties to the host infrastructure used for execution, but are also based on a recipe detailing how the image was built in the first place.
- Updates to both containerized applications and the host infrastructure are not interrelated. This benefits HPC administrators by enabling them to select the most suitable host environment to support the growing complexity of hardware. At the same time users can make changes to container images aligned with their scientific schedule and other boundary conditions.
Container Images
Sub-Command | Description |
---|---|
`pull` | Download a container image from a remote resource |
What are container images?
- A container image stores a file-system tree containing a software environment (compilers, frameworks, libraries, etc.) for a given application.
- Additionally container images store configuration metadata used during container launch to create a desired application run-time environment.
- Container images are typically read-only (immutable) snapshots, which makes them host independent and portable. They can be moved freely between different infrastructures, and are published in container registries.
- New images are often derived from an existing image. Usually the container definition is maintained with a version control system.
- It is easy to switch between multiple versions of container images simplifying the roll-out of updates and roll-back in case of problems.
The default container image format for Apptainer is called SIF (Singularity Image File), a compressed read-only container file, by convention suffixed with `.sif`. Download container images [4] from a remote location, for example a container registry like DockerHub, with `apptainer pull`. This downloads all required container layers and combines those layers into an image file stored in the container cache, located in `~/.apptainer/cache` by default.
Many use cases can build on existing container images. Pre-built images are published by many organizations in public container registries. A registry is centralized storage for container images. All container runtime systems (like Apptainer) have a standard way to download images from a registry. The most popular image registry is DockerHub [5], the origin of a common protocol used to communicate with a container registry.
Environment Variable | Description |
---|---|
`APPTAINER_CONTAINER` | Absolute path to storage location for container images |
# Set the path to a directory storing container images...
export APPTAINER_CONTAINER=$LUSTRE_HOME/containers
# ...download a container image from a container registry
apptainer pull $APPTAINER_CONTAINER/jupyter.sif docker://quay.io/jupyter/datascience-notebook:latest
The example above downloads a container image provided by Project Jupyter [6] from a container registry called Quay (by Red Hat) using the standard `docker://` protocol. It is up to the users to select an appropriate image for their application. The recommendation is to use container images provided by the developer community of a given software ecosystem.
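Since downloading and converting image layers can take a while, it may be convenient to skip the download when the image file already exists. The following is a minimal sketch of that idea; the function name `pull_once` is hypothetical and not part of Apptainer, and only the `apptainer pull` call shown earlier is used.

```shell
# Hypothetical helper: download an image only if the target file is missing.
# Usage: pull_once <image-file> <uri>
pull_once() {
    image=$1
    uri=$2
    if [ -f "$image" ]; then
        echo "already present: $image"
        return 0
    fi
    # Create the target directory if needed, then download the image
    mkdir -p "$(dirname "$image")"
    apptainer pull "$image" "$uri"
}

# Example call (commented out because it needs network access):
# pull_once "$APPTAINER_CONTAINER/alpine.sif" docker://alpine
```

Note that this check only looks at the file name. It does not detect that a tag such as `latest` has moved on in the registry, so delete the file to force a fresh download.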
Interact with Containers
Sub-Command | Description |
---|---|
`exec` | Execute a command within a containerized environment |
`shell` | Start an interactive shell session in a containerized environment |
The Jupyter `datascience-notebook` container image downloaded in the previous section includes a Python environment supporting the NumPy package [7] for scientific computing. Users interact with container images [8] using a selection of Apptainer sub-commands. The sub-command `exec` executes a specified command within a container. Below is a very simple Python program that uses the NumPy package:
#!/usr/bin/env python
import numpy as np
array = np.array([
    [3, 7, 1],
    [10, 3, 2],
    [5, 6, 7]
])
print(np.sort(array, axis=1))
# Implement the NumPy program above in an example file...
$EDITOR example.py
# ...make the example file an executable program
chmod +x example.py
# Execute the Python program in the downloaded container image
apptainer exec $APPTAINER_CONTAINER/jupyter.sif ./example.py
The `shell` sub-command starts an interactive shell within a container:
>>> apptainer shell $APPTAINER_CONTAINER/jupyter.sif
Apptainer> ipython
Python 3.11.6 | packaged by conda-forge | (main, Oct 3 2023, 10:40:35) [GCC 12.3.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.16.1 -- An enhanced Interactive Python. Type '?' for help.
In [1]:
Build Containers
Sub-Command | Description |
---|---|
`build` | Build a new container image from a definition file |
Create a new container image [9] with the help of a container definition file (or “def file” for short). The following example builds a container image for the GNU Hello [10] software, based on a definition file called `hello.def`:
BootStrap: docker
From: quay.io/fedora/fedora:latest
%labels
Maintainer John Snow <j.snow@example.com>
%post
version=2.12.1
archive=hello-$version.tar.gz
dnf install -y wget tar gzip gcc make
dnf clean all
wget https://ftp.gnu.org/gnu/hello/$archive
tar xvzf $archive -C /opt
rm $archive
cd /opt/hello-$version
./configure
make
make install
%runscript
/usr/local/bin/hello
Run the following command providing the container image name and the definition file as arguments to build a container image:
apptainer build $APPTAINER_CONTAINER/hello.sif hello.def
Test the functionality of the container image by executing the `hello` application in the container:
>>> apptainer run $APPTAINER_CONTAINER/hello.sif
Hello, world!
>>> apptainer exec $APPTAINER_CONTAINER/hello.sif hello --version | head -n1
hello (GNU Hello) 2.12.1
Definition Files
Container images are built from a definition file [11], basically a blueprint listing each step to install software components within the container. The files are divided into two parts:
- The header defines the Linux distribution used as the foundation to build the container image (root) file-system. We recommend selecting the distribution best supported by the software you want to use.
- The rest of the definition consists of sections that install additional software and configure the container run-time environment.
Header
The header at the top defines the base operating system used to build the container. The `Bootstrap` keyword is mandatory in the first line. This keyword supports multiple bootstrap agents [12]; for example, the `docker` agent enables access to compatible container registries like DockerHub:
# ...from DockerHub
BootStrap: docker
From: fedora:latest
# ...from Quay (RedHat)
BootStrap: docker
From: quay.io/rockylinux/rockylinux:8
Sections
The second part of the definition file is broken into sections [13]. Each section adds different content or executes commands at different times during the container image build (note that multiple sections of the same name can be included). The build option `--section` limits execution to a specific section or sections. The table below presents a brief overview of all possible sections:
Section | Description |
---|---|
`%setup` | …executed on the host…before container build |
`%files` | …copy files into the container before `%post` |
`%post` | …install software…create configurations |
`%test` | …validate the container |
`%environment` | …define environment variables [14] (not made available at build time) |
`%startscript` | …executed at instance start command |
`%runscript` | …executed at run command |
`%labels` | …add metadata to the file `/.singularity.d/labels.json` |
`%help` | …help text… |
The definition file below illustrates many sections as examples:
%setup
touch /file1
touch ${APPTAINER_ROOTFS}/file2
%files
/file1
/file1 /opt
%environment
export LISTEN_PORT=12345
export LC_ALL=C
%post
apt-get update && apt-get install -y netcat
NOW=`date`
echo "export NOW=\"${NOW}\"" >> $APPTAINER_ENVIRONMENT
%runscript
echo "Container was created $NOW"
echo "Arguments received: $*"
exec echo "$@"
%startscript
nc -lp $LISTEN_PORT
%test
grep -q NAME=\"Ubuntu\" /etc/os-release
if [ $? -eq 0 ]; then
echo "Container base is Ubuntu as expected."
else
echo "Container base is not Ubuntu."
exit 1
fi
%labels
Author alice
Version v0.0.1
%help
This is a demo container used to illustrate a def file that uses all
supported sections.
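Several of these sections only become visible through a matching sub-command. The sketch below assumes an image file called `demo.sif` built from a definition like the one above; the file name and the variable names are assumptions made for illustration. The commands are skipped when either `apptainer` or the image is missing.

```shell
# Assumed image name; adjust to the file produced by your own build.
image=${image:-demo.sif}

if command -v apptainer >/dev/null 2>&1 && [ -f "$image" ]; then
    apptainer inspect --labels "$image"   # metadata from %labels
    apptainer run-help "$image"           # text from %help
    apptainer test "$image"               # commands from %test
    apptainer run "$image" one two        # %runscript with two arguments
    checked=yes
else
    echo "skipping: apptainer or $image not available" >&2
    checked=no
fi
```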
Local Images
Repeatedly building containers can become a time-consuming process. Apptainer supports building containers from local images, which makes it possible to maintain a set of base containers as a foundation for later reuse. The following definition builds a base container from the latest Fedora release, including developer tools, compilers and a selection of additional useful packages.
BootStrap: docker
From: quay.io/fedora/fedora:latest
%post
dnf install -y @development-tools \
    gcc-c++ gcc gcc-gfortran git git-delta gnupg2 \
    python3 python3-pip python3-setuptools python3-boto3 \
    psmisc rsync tmux tree wget unzip \
    bat curl fd-find fzf findutils \
    hostname iproute netcat neovim make patch
dnf clean all
# ...build the container image
apptainer build fedora-base.sif fedora-base.def
Another container definition file can then reuse a local image with the following bootstrap configuration [15]:
Bootstrap: localimage
From: fedora-base.sif
Note that Apptainer supports another mechanism called multi-stage build [16] in a single container definition file.
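As a rough illustration of a multi-stage build, the snippet below writes a two-stage definition file for the GNU Hello example from earlier. Treat it as an untested sketch: the file name `hello-multistage.def` and the stage names `build` and `final` are choices made here, and it assumes the compiled binary needs nothing beyond the libraries already present in the base Fedora image.

```shell
# Write a hypothetical two-stage definition file for GNU Hello.
# The file name and the stage names "build" and "final" are illustrative.
def_file=hello-multistage.def

cat > "$def_file" <<'EOF'
Bootstrap: docker
From: quay.io/fedora/fedora:latest
Stage: build

%post
    # Compile GNU Hello with the full toolchain
    version=2.12.1
    archive=hello-$version.tar.gz
    dnf install -y wget tar gzip gcc make
    wget https://ftp.gnu.org/gnu/hello/$archive
    tar xzf $archive -C /opt
    cd /opt/hello-$version
    ./configure
    make
    make install

Bootstrap: docker
From: quay.io/fedora/fedora:latest
Stage: final

%files from build
    /usr/local/bin/hello /usr/local/bin/hello

%runscript
    /usr/local/bin/hello
EOF

echo "wrote $def_file"
# Build it like any other definition file, for example:
# apptainer build hello-multistage.sif hello-multistage.def
```

The point of the second stage is that the compiler, the build tools and the source tree stay behind in the `build` stage, so the final image only carries the installed program.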
Container Cache
Environment Variable | Description |
---|---|
`APPTAINER_CACHEDIR` | Cache folder for images from a container registry. |
`APPTAINER_TMPDIR` | Temporary directory to build container file-systems. |
By default, the container build cache is located in the directory `$HOME/.apptainer/cache`. Details about the container build environment are available in the Apptainer User Guide [17]. Pay attention to the storage consumed by the container cache, since the amount of available space may be limited depending on your environment.
# show storage capacity used by the cache
apptainer cache list
# ..detailed view
apptainer cache list -v
# clean up everything
apptainer cache clean
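The space used on disk can also be checked with standard tools, without involving Apptainer. The sketch below measures the directory named by `APPTAINER_CACHEDIR` if that variable is set (the cache lives in or below it), and otherwise falls back to the default path `~/.apptainer/cache` mentioned earlier. The variable name `cache_dir` is illustrative.

```shell
# Directory to measure: the configured location if set, else the default path
cache_dir=${APPTAINER_CACHEDIR:-$HOME/.apptainer/cache}

if [ -d "$cache_dir" ]; then
    # Summarize the disk space used below this directory
    du -sh "$cache_dir"
else
    echo "no cache directory at $cache_dir"
fi
```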
The environment variables listed in the table above configure the paths to the container caches. The following example stores all container artifacts on volatile storage in the `/var/tmp` directory by setting both variables:
# create a user working directory
pushd $(mktemp -d /var/tmp/$USER-apptainer-XXXXXX)
# locate all artifacts within this directory
export APPTAINER_TMPDIR=$PWD
export APPTAINER_CACHEDIR=$PWD
This helps to make sure that build artifacts do not linger in your environment.
Test Sandboxes
Option | Description |
---|---|
`--sandbox` | Work with a container image (root)-filesystem |
A sandbox provides a writable (ch)root directory to work interactively with a container image. This is obviously not a reproducible method to build an image, but it helps with testing while implementing definition files:
# ...create a container within a writable directory
apptainer build --fix-perms --sandbox rootfs/ docker://quay.io/rockylinux/rockylinux:8
# ...make changes within the container
apptainer shell --writable --fakeroot --home $PWD rootfs/
# ...build a new container from the sandbox
apptainer build apptainer.sif rootfs/
Do not forget to delete the `rootfs` directory afterwards, since sandbox directories can consume multiple gigabytes of storage.
Docker Compatibility
Container build and runtime tools have a complex interrelationship and overlapping functionality. The most widely used container tools outside of HPC are Docker [18] and Podman [19], which are mostly compatible with each other. Chances are high that you will encounter projects providing a `Dockerfile` [20] as the container definition. Consider the example below for GNU Hello, introduced previously:
FROM quay.io/fedora/fedora:latest
ARG version=2.12.1
ARG archive=hello-$version.tar.gz
RUN dnf install -y wget tar gzip gcc make
RUN dnf clean all
RUN wget https://ftp.gnu.org/gnu/hello/$archive
RUN tar xvzf $archive -C /opt
RUN rm $archive
WORKDIR /opt/hello-$version
RUN ./configure
RUN make
RUN make install
ENTRYPOINT /usr/local/bin/hello
The notation above differs significantly from an Apptainer definition file. To build a container image from a `Dockerfile`, use the `build` sub-command [21] of Podman. Please consult the corresponding documentation for more details.
# Build a container image using a `Dockerfile` in the working directory...
podman build -t hello .
# ...and execute the container image to test its functionality
podman run --rm -it localhost/hello:latest
For the purpose of this article, it is interesting to understand the compatibility of Apptainer and Docker container images. Apptainer uses a different container format, SIF, as discussed above. Fortunately, it is easy to convert between the two formats:
# Create an OCI container archive...
podman push localhost/hello:latest oci-archive:/tmp/hello-latest.tar
# ...and use this archive to create a SIF container image
apptainer build hello.sif oci-archive:/tmp/hello-latest.tar
The two commands above use a standardized container image format called OCI Image Layout [22] as an intermediate step during conversion. There are many more details to the conversion than described here; we leave it to the reader to investigate further.
Footnotes
[1] Apptainer, https://apptainer.org
[2] List of Apptainer Releases, GitHub, https://github.com/apptainer/apptainer/releases
[3] Fedora Apptainer Package, https://src.fedoraproject.org/rpms/apptainer
[4] Downloading Images, Apptainer User Guide, https://apptainer.org/docs/user/latest/quick_start.html#downloading-images
[5] Container Registry, Docker Manual, https://docs.docker.com/registry/
[6] Project Jupyter, Quay.io, https://quay.io/organization/jupyter
[7] NumPy Package, https://numpy.org/
[8] Interacting with Images, Apptainer User Guide, https://apptainer.org/docs/user/main/quick_start.html#interacting-with-images
[9] Build a Container, Apptainer User Guide, https://apptainer.org/docs/user/main/build_a_container.html
[10] GNU Hello, https://www.gnu.org/software/hello
[11] Definition Files, Apptainer User Guide, https://apptainer.org/docs/user/main/definition_files.html
[12] Bootstrap Agents, Apptainer User Guide, https://apptainer.org/docs/user/main/definition_files.html#other-bootstrap-agents
[13] Sections, Apptainer User Guide, https://apptainer.org/docs/user/main/definition_files.html#sections
[14] Environment and Metadata, Apptainer User Guide, https://apptainer.org/docs/user/main/environment_and_metadata.html#environment-and-metadata
[15] Build Modules, Apptainer User Guide, https://apptainer.org/docs/user/main/appendix.html#build-localimage
[16] Multi-Stage Builds, Apptainer User Guide, https://apptainer.org/docs/user/main/definition_files.html#multi-stage-builds
[17] Build Environment, Apptainer User Guide, http://apptainer.org/docs/user/main/build_env.html#cache-folders
[18] Docker Documentation, https://docs.docker.com/
[19] Podman Documentation, https://podman.io/docs
[20] Dockerfile Reference, Docker Documentation, https://docs.docker.com/engine/reference/builder/
[21] Podman Build Manual Page, https://docs.podman.io/en/latest/markdown/podman-build.1.html
[22] OCI Image Layout Specification, Open Containers Initiative, https://github.com/opencontainers/image-spec/blob/main/image-layout.md