Flux Resource Management Framework
Integration with HPC & Cloud Infrastructure
What does Flux do?
Flux [1] extends the traditional model of HPC resource management…
- …developed for extreme-scale science and exascale computing
- …convergence of HPC with machine learning (ML) and cloud-computing
- …built to facilitate new hardware resource types…
- …hybrid (or heterogeneous) combinations of processors
- …GPUs and other accelerators
- …multi-tiered disk storage
- …including methods for power efficiency
Combines fully hierarchical resource management with graph-based scheduling…
- …solves three primary deficiencies of existing workload managers
- …manages all types of resources …bare-metal, VMs, cloud and HPC
- …scales from a local workstation (laptop) to large-scale HPC infrastructure
- …uses recursively created nested instances…
- …one scheduler instance per user …improves robustness and execution performance
- …user workflows can easily and automatically sub-divide their jobs into arbitrarily small tasks
- …jobs/tasks can connect to one another through messaging overlays…
- …data stores built directly into Flux
- …breaking down the job coordination barrier
Graph-Based Scheduling
Manage complex combinations of resources …heterogeneous, dynamic systems …local and cloud
- …resource representation (a model for characterizing resources) on a directed graph
- …capable of dynamically defining arbitrary resource types
- …mathematical structure that associates…
- …objects …vertices …(e.g., hardware, software, power distribution units)
- …via directed relationships …edges …indicate containment (e.g., a server contains a CPU)
- …resource request consists of descending into the graph and checking vertices for suitability
- …allocate resources in different ways based on paths …permits priority based on proximity
- …scheduling operations are basic procedures in the context of directed graphs
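The graph model described above can be sketched in a few lines of Python. This is only an illustration of the idea (vertices with free-form types, containment edges, and a depth-first descent that checks each vertex for suitability); the names `Vertex`, `match` and `allocate` are hypothetical and are not part of Flux or Fluxion.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    """A resource vertex. `kind` is a free-form string, so new types need no code change."""
    kind: str
    name: str
    free: bool = True
    children: list["Vertex"] = field(default_factory=list)  # outgoing "contains" edges

    def contains(self, child: "Vertex") -> "Vertex":
        """Add a containment edge from this vertex to `child`."""
        self.children.append(child)
        return child


def match(root: Vertex, kind: str, count: int) -> list[Vertex]:
    """Descend depth-first and collect up to `count` free vertices of `kind`."""
    found: list[Vertex] = []
    stack = [root]
    while stack and len(found) < count:
        vertex = stack.pop()
        if vertex.kind == kind and vertex.free:
            found.append(vertex)
        # Reverse so children are visited in the order they were added.
        stack.extend(reversed(vertex.children))
    return found


def allocate(root: Vertex, kind: str, count: int) -> list[Vertex]:
    """Mark `count` vertices as used, or return [] and change nothing."""
    picked = match(root, kind, count)
    if len(picked) < count:
        return []
    for vertex in picked:
        vertex.free = False
    return picked


# Hypothetical two-node cluster: each node contains two cores and one GPU.
cluster = Vertex("cluster", "demo")
for n in range(2):
    node = cluster.contains(Vertex("node", f"node{n}"))
    for c in range(2):
        node.contains(Vertex("core", f"node{n}-core{c}"))
    node.contains(Vertex("gpu", f"node{n}-gpu0"))

first = allocate(cluster, "core", 3)
print([v.name for v in first])  # ['node0-core0', 'node0-core1', 'node1-core0']
```

Because the descent exhausts one subtree before moving to the next, cores under the same node are picked together, which is a crude form of the proximity-based priority mentioned above.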
Hierarchical Management
Divide-and-conquer approach…
- …resources divided among schedulers in the hierarchy …increases scalability
- Three distinct principles…
- …parent Flux instance grants resource allocations to its children
- …each instance configured independently …responsible for effective use of resources
- …first two principles apply recursively from the top of the resource hierarchy
- …instances to delegate work to child instances …spreading the load
- …creation of the appropriate number of Flux instances for each workflow
Fluxion …scheduler component …scalable graph-based scheduling techniques
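The divide-and-conquer principles in this section can be illustrated with a toy model. The sketch below assumes resources are just a core count and uses a hypothetical `Instance` class; it does not use or imitate the real Flux API. It only shows that a parent can grant no more than it holds and that the same rule applies at every level.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """Hypothetical stand-in for a Flux instance holding a number of cores."""
    name: str
    cores: int  # resources granted by the parent (or owned, at the root)
    children: list["Instance"] = field(default_factory=list)

    def free_cores(self) -> int:
        """Cores not yet handed down to child instances."""
        return self.cores - sum(child.cores for child in self.children)

    def spawn(self, name: str, cores: int) -> "Instance":
        """Grant `cores` to a new child; a parent cannot grant more than it has free."""
        if cores > self.free_cores():
            raise ValueError(f"{self.name} cannot grant {cores} cores")
        child = Instance(name, cores)
        self.children.append(child)
        return child

    def report(self, depth: int = 0) -> list[str]:
        """Render the hierarchy recursively, one line per instance."""
        lines = [f"{'  ' * depth}{self.name}: {self.cores} cores"]
        for child in self.children:
            lines.extend(child.report(depth + 1))
        return lines


# Hypothetical hierarchy: system instance -> user instance -> per-workflow instances.
system = Instance("system", 64)
user = system.spawn("user-a", 32)
user.spawn("workflow-1", 16)
user.spawn("workflow-2", 16)
print("\n".join(system.report()))
```

Each level only sees its own allocation, so scheduling decisions inside `workflow-1` never involve the system-level instance; this is the load-spreading effect the notes describe.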
Converged Computing
…coexistence of (traditional) HPC and Cloud resources
- Fluence [2] …the Flux scheduler swapped in for kube-scheduler…
- …HPC-oriented technology swapped into Kubernetes for cloud-native orchestration
- Flux Operator [3] …for Kubernetes
- …create & control an HPC workload manager inside Kubernetes
- …“mini”-cluster scheduled in Kubernetes with Flux's fine-grained resource mapping
- …see the video talks [4] [5]
Single-User Mode
Flux allows for both single-user and multi-user modes…
- …take a look at “Introduction to Flux” [6]
- …single-user mode → overlay workload manager
- …on top of the native system-level workload manager (like Slurm)
Provide users with the comprehensive ability to manage resources within their own allocation
- …streamline application coupling, coordination, and dependency management
- …set up customized hierarchies
- …policies based on the graph-based resource model
- …scheduling options such as queue depths and throttling of jobs
- …ensemble-based workflow …short-duration, single-core jobs spin up a network of nested Flux instances
First Steps
In a container…
# ...Flux instance from a container
>>> podman run -ti fluxrm/flux-sched:latest
ƒ(s=1,d=0) fluxuser@afac2f6d30de:~$ flux --help
# ...emulate a multi-node deployment
>>> podman run -ti fluxrm/flux-sched flux start --test-size 4
ƒ(s=4,d=0) fluxuser@e67d73ebe096:~$ flux resource list
STATE NNODES NCORES NGPUS NODELIST
free 4 16 0 e67d73ebe[096,096,096,096]
allocated 0 0 0
down 0 0 0
First job…
# ...submits a job which will be scheduled and run in the background
>>> flux submit hostname
ƒCUHCibq # ...returns job ID
# ...run a job interactively
>>> flux run hostname
7f995365e9a2
# ...submit and list jobs
>>> flux submit sleep 360
ƒ2MU1GM7V
>>> flux submit sleep 360
ƒ2MtVYV5q
>>> flux jobs
JOBID USER NAME ST NTASKS NNODES TIME INFO
ƒ2MtVYV5q fluxuser sleep R 1 1 3.935s 7f995365e9a2
ƒ2MU1GM7V fluxuser sleep R 1 1 4.893s 7f995365e9a2
# ...inspect a job
flux job info ƒ2MtVYV5q jobspec | jq
# ...summary of all jobs
flux top
Job Management
flux submit queues jobs… --cc (i.e. carbon copy) duplicates jobs …--wait waits for job completion
# ...submit different jobs for demonstration
flux submit --cc=0-1 --wait /bin/true
flux submit --cc=0-1 --wait /bin/false
flux submit --cc=0-7 sleep inf
flux jobs
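To make the `--cc` values above easier to read: the sketch below assumes the argument is an ID set made of comma-separated numbers and inclusive ranges, with one job copy submitted per ID. The helper `expand_ids` is hypothetical and only mirrors that reading of the notation; it is not Flux code.

```python
def expand_ids(idset: str) -> list[int]:
    """Expand a string such as '0-7' or '0,2-4' into a list of integer IDs.

    Assumption: ranges are inclusive on both ends.
    """
    ids: list[int] = []
    for part in idset.split(","):
        if "-" in part:
            low, high = part.split("-", 1)
            ids.extend(range(int(low), int(high) + 1))
        else:
            ids.append(int(part))
    return ids


print(len(expand_ids("0-1")))  # 2
print(len(expand_ids("0-7")))  # 8
```

Under this reading, `--cc=0-1` yields two copies of a job and `--cc=0-7` yields eight.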
…states …R (running), CD (completed), F (failed), and CA (canceled)
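The state abbreviations above can be used to post-process a job listing. The sketch below is a minimal example that assumes the whitespace-separated column layout shown in these notes (a header row containing `JOBID` and `ST`, and no spaces inside column values); the sample rows are copied from the listings in this section, and `group_by_state` is a hypothetical helper.

```python
# Abbreviations as listed in the notes above.
STATES = {"R": "running", "CD": "completed", "F": "failed", "CA": "canceled"}

SAMPLE = """\
JOBID USER NAME ST NTASKS NNODES TIME INFO
ƒCV3ko224 fluxuser sleep R 1 1 3.961m 7f995365e9a2
ƒBqMEQEw9 fluxuser false F 1 1 0.034s 7f995365e9a2
ƒBFFuA8AL fluxuser true CD 1 1 0.036s 7f995365e9a2
"""


def group_by_state(listing: str) -> dict[str, list[str]]:
    """Map each state name to the job IDs that are in that state."""
    lines = listing.strip().splitlines()
    header = lines[0].split()
    id_col, state_col = header.index("JOBID"), header.index("ST")
    groups: dict[str, list[str]] = {}
    for line in lines[1:]:
        columns = line.split()
        # Fall back to the raw abbreviation for states not listed in STATES.
        label = STATES.get(columns[state_col], columns[state_col])
        groups.setdefault(label, []).append(columns[id_col])
    return groups


print(group_by_state(SAMPLE))
```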
# ...list running (& pending) jobs
>>> flux jobs
JOBID USER NAME ST NTASKS NNODES TIME INFO
ƒCV3ko224 fluxuser sleep R 1 1 3.961m 7f995365e9a2
ƒCV3ko223 fluxuser sleep R 1 1 3.961m 7f995365e9a2
# [...]
# ...list all other jobs
>>> flux jobs --filter=inactive
JOBID USER NAME ST NTASKS NNODES TIME INFO
ƒBqMEQEw9 fluxuser false F 1 1 0.034s 7f995365e9a2
ƒBqMEQEwA fluxuser false F 1 1 0.029s 7f995365e9a2
ƒBFFuA8AL fluxuser true CD 1 1 0.036s 7f995365e9a2
# [...]
# ...many filter and format options available
>>> flux jobs --format=long ƒCV3ko224 ƒCV3gM4B2
With Slurm
No requirement on the cluster resource provider (underlying Slurm cluster) [7]:
- …changes notion of compute jobs …language to describe these
- Flux keeps track of hardware …resources within a user allocation
- …the notion of jobs is then independent from the parent system
- …user interaction completely isolated within Flux
- …enables quick error recovery …optimization & increased job throughput
Footnotes
[1] Flux Project
https://flux-framework.org
https://flux-framework.readthedocs.io
https://github.com/flux-framework
[2] Fluence Source Code, GitHub
https://github.com/flux-framework/flux-k8s
[3] Flux Operator Source Code, GitHub
https://flux-framework.org/flux-operator
https://github.com/flux-framework/flux-operator
[4] Flux for Job Management on Kubernetes, HPCKP 2023
https://www.youtube.com/watch?v=eg-oaNidSBI
[5] Kubernetes and HPC: Bare-metal bros, FOSDEM’24
https://fosdem.org/2024/schedule/event/fosdem-2024-2590-kubernetes-and-hpc-bare-metal-bros
[6] Introduction to Flux, LLNL
https://hpc-tutorials.llnl.gov/flux
[7] Running Flux in Slurm, Ryan Day, SLUG’23
https://slurm.schedmd.com/SLUG23/flux_in_slurm.pdf