Flux — Resource Management Framework

Integration with HPC & Cloud Infrastructure

HPC

Slurm

Published

June 28, 2023

Modified

June 18, 2025

What does Flux do?

Flux ¹ extends the traditional model of HPC resource management…

…developed for extreme-scale science and exascale computing
…convergence of HPC with machine learning (ML) and cloud computing
…built to facilitate new hardware resource types…
- …hybrid (or heterogeneous) combinations of processors
- …GPUs and other accelerators
- …multi-tiered disk storage
- …including methods for power efficiency

Combines fully hierarchical resource management with graph-based scheduling…

…solves three primary deficiencies of existing workload managers
- …manages all types of resources …bare-metal, VMs, cloud and HPC
- …scales from a local workstation (laptop) to large-scale HPC infrastructure
…recursively creates nested instances…
- …one scheduler instance per user …improves robustness and execution performance
- …user workflows can easily and automatically sub-divide their jobs into arbitrarily small tasks
…jobs/tasks can connect to one another through messaging overlays…
- …data-stores built-in directly into Flux
- …breaking down the job coordination barrier

Graph-Based Scheduling

Manage complex combinations of resources …heterogeneous, dynamic systems …local and cloud

…resource representation (a model for characterizing resources) on a directed graph
…capable of dynamically defining arbitrary resource types
…mathematical structure that associates…
- …objects …vertices …(e.g., hardware, software, power distribution units)
- …via directed relationships …edges …indicate containment (e.g., a server contains a CPU)
…resource request consists of descending into the graph and checking vertices for suitability
…allocate resources in different ways based on paths …permits priority based on proximity
…scheduling operations are basic procedures in the context of directed graphs
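
To make the descent idea concrete, here is a minimal Python sketch of a containment graph and a depth-first resource match. It is a toy model, not Fluxion's implementation: the names `Vertex`, `match`, and `allocate` are hypothetical, and the all-or-nothing policy is an assumption chosen for brevity.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Vertex:
    """A resource vertex; `children` are outgoing containment edges."""
    kind: str                                   # arbitrary type, e.g. "node", "gpu"
    name: str
    children: list[Vertex] = field(default_factory=list)
    allocated: bool = False

    def add(self, child: Vertex) -> Vertex:
        self.children.append(child)
        return child


def match(root: Vertex, kind: str, count: int) -> list[Vertex]:
    """Descend depth-first and collect up to `count` free vertices of `kind`."""
    found: list[Vertex] = []
    stack = [root]
    while stack and len(found) < count:
        vertex = stack.pop()
        if vertex.kind == kind and not vertex.allocated:
            found.append(vertex)
        # Reverse so that the first child is visited first.
        stack.extend(reversed(vertex.children))
    return found


def allocate(root: Vertex, kind: str, count: int) -> list[str]:
    """All-or-nothing allocation (an assumption); returns the allocated names."""
    picked = match(root, kind, count)
    if len(picked) < count:
        return []
    for vertex in picked:
        vertex.allocated = True
    return [vertex.name for vertex in picked]


# Toy system: one cluster containing two nodes, each with two cores and a GPU.
cluster = Vertex("cluster", "cluster0")
for n in range(2):
    node = cluster.add(Vertex("node", f"node{n}"))
    for c in range(2):
        node.add(Vertex("core", f"node{n}-core{c}"))
    node.add(Vertex("gpu", f"node{n}-gpu0"))

first = allocate(cluster, "core", 3)    # fills node0 before touching node1
second = allocate(cluster, "core", 2)   # only one core is left, so nothing is granted
print(first, second)
```

Because the descent follows containment edges, cores under the same node are picked together, which is a crude form of the proximity-based priority mentioned above.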

Hierarchical Management

Divide-and-conquer approach…

…resources divided among schedulers in the hierarchy …increases scalability
Three distinct principles…
- …parent Flux instance grants resource allocations to its children
- …each instance configured independently …responsible for effective use of resources
- …first two principles apply recursively from the top of the resource hierarchy
…instances to delegate work to child instances …spreading the load
…creation of the appropriate number of Flux instances for each workflow
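
The three principles can be illustrated with a small Python model. This is only a sketch under stated assumptions: `Instance`, `spawn`, and the `policy` field are hypothetical names, and resources are reduced to a plain list of core ids.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Instance:
    """Hypothetical stand-in for a Flux instance that owns a set of cores."""
    name: str
    cores: list[int]
    policy: str = "fcfs"                        # each instance is configured independently
    children: list[Instance] = field(default_factory=list)

    def spawn(self, name: str, ncores: int, policy: str = "fcfs") -> Instance:
        """The parent grants part of its own allocation to a new child."""
        if ncores > len(self.cores):
            raise ValueError(f"{self.name} cannot grant {ncores} cores")
        granted, self.cores = self.cores[:ncores], self.cores[ncores:]
        child = Instance(name, granted, policy)
        self.children.append(child)
        return child

    def total(self) -> int:
        """Cores held here plus everything delegated further down."""
        return len(self.cores) + sum(child.total() for child in self.children)


# The same rule applies recursively from the top of the hierarchy.
system = Instance("system", list(range(16)))
user = system.spawn("user", 8, policy="backfill")
workflow = user.spawn("workflow", 4)

print(len(system.cores), len(user.cores), workflow.cores, system.total())
```

Each level only sees the cores it was granted, so a child can never schedule work outside its parent's allocation, and the total across the hierarchy stays constant.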

Fluxion …scheduler component …scalable graph-based scheduling techniques

Converged Computing

…coexistence of (traditional) HPC and Cloud resources

Fluence ² …the Flux scheduler swapped in for kube-scheduler…
- …HPC-oriented scheduling technology swapped into Kubernetes, the cloud-native orchestrator
Flux Operator ³ …for Kubernetes
- …create & control an HPC workload manager inside Kubernetes
- …“mini”-cluster scheduled in Kubernetes with Flux fine-grained resource mapping
…reference to video talks ⁴ ⁵ ⁶

Single-User Mode

Flux allows for both single-user and multi-user modes…

…take a look at “Introduction to Flux” ⁷
…single-user mode → overlay workload manager
…on top of the native system-level workload manager (like Slurm)

Provide users with the comprehensive ability to manage resources within their own allocation

…streamline application coupling, coordination, and dependency management
…set up customized hierarchies
- …policies based on the graph-based resource model
- …scheduling options such as queue depths and throttling of jobs
- …ensemble-based workflow …short-duration, single-core jobs spin up a network of nested Flux instances

Launch Flux

In a container (the commands below use Podman)…

# ...Flux instance from a container
>>> podman run -ti fluxrm/flux-sched:latest
ƒ(s=1,d=0) fluxuser@afac2f6d30de:~$ flux --help

# ...emulate a multi-node deployment
>>> podman run -ti fluxrm/flux-sched flux start --test-size 4
ƒ(s=4,d=0) fluxuser@e67d73ebe096:~$ flux resource list
     STATE NNODES   NCORES    NGPUS NODELIST
      free      4       16        0 e67d73ebe[096,096,096,096]
 allocated      0        0        0 
      down      0        0        0

First Job

# ...submits a job which will be scheduled and run in the background
>>> flux submit hostname
ƒCUHCibq     # ...returns job ID

# ...run a job interactively
>>> flux run hostname
7f995365e9a2

# ...submit and list jobs
>>> flux submit sleep 360
ƒ2MU1GM7V
>>> flux submit sleep 360
ƒ2MtVYV5q
>>> flux jobs
       JOBID USER     NAME       ST NTASKS NNODES     TIME INFO
   ƒ2MtVYV5q fluxuser sleep       R      1      1   3.935s 7f995365e9a2
   ƒ2MU1GM7V fluxuser sleep       R      1      1   4.893s 7f995365e9a2

# ...inspect a job
flux job info ƒ2MtVYV5q jobspec | jq

# ...summary of all jobs
flux top

Job Management

flux submit queues jobs… --cc (i.e. carbon copy) duplicates jobs …--wait blocks until the jobs complete

# ...submit different jobs for demonstration
flux submit --cc=0-1 --wait /bin/true
flux submit --cc=0-1 --wait /bin/false
flux submit --cc=0-7 sleep inf
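
As a rough illustration of how a range such as `0-7` maps to eight job copies, the following Python sketch expands a simple id set. The helper `expand_idset` is hypothetical and handles only comma-separated ids and hyphenated ranges; it is not Flux's own parser, which may accept more forms.

```python
def expand_idset(idset: str) -> list[int]:
    """Expand a simple id set such as "0-7" or "0,2-4" into a list of ids."""
    ids: list[int] = []
    for part in idset.split(","):
        if "-" in part:
            low, high = part.split("-", 1)
            ids.extend(range(int(low), int(high) + 1))   # ranges are inclusive
        else:
            ids.append(int(part))
    return ids


copies = expand_idset("0-7")
print(len(copies), copies)
```

Under this reading, `--cc=0-1` submits two copies and `--cc=0-7` submits eight.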

flux jobs …states …R (running), CD (completed), F (failed), and CA (canceled)

# ...list running (& pending) jobs
>>> flux jobs
       JOBID USER     NAME       ST NTASKS NNODES     TIME INFO
   ƒCV3ko224 fluxuser sleep       R      1      1   3.961m 7f995365e9a2
   ƒCV3ko223 fluxuser sleep       R      1      1   3.961m 7f995365e9a2
# [...]

# ...list all other jobs
>>> flux jobs --filter=inactive
       JOBID USER     NAME       ST NTASKS NNODES     TIME INFO
   ƒBqMEQEw9 fluxuser false       F      1      1   0.034s 7f995365e9a2
   ƒBqMEQEwA fluxuser false       F      1      1   0.029s 7f995365e9a2
   ƒBFFuA8AL fluxuser true       CD      1      1   0.036s 7f995365e9a2
# [...]

# ...many filter and format options available
>>> flux jobs --format=long ƒCV3ko224 ƒCV3gM4B2
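
For quick post-processing, a listing like the ones above can be tallied by state with a few lines of Python. This is a sketch under assumptions: `tally_states` is a hypothetical helper, the sample text is copied from the inactive-job listing above, and splitting on whitespace only works while job names contain no spaces.

```python
from collections import Counter

# Sample copied from the `flux jobs --filter=inactive` listing above.
SAMPLE = """\
       JOBID USER     NAME       ST NTASKS NNODES     TIME INFO
   ƒBqMEQEw9 fluxuser false       F      1      1   0.034s 7f995365e9a2
   ƒBqMEQEwA fluxuser false       F      1      1   0.029s 7f995365e9a2
   ƒBFFuA8AL fluxuser true       CD      1      1   0.036s 7f995365e9a2
"""

# State abbreviations as listed in the text above.
STATE_NAMES = {"R": "running", "CD": "completed", "F": "failed", "CA": "canceled"}


def tally_states(listing: str) -> Counter:
    """Count jobs per state, locating the ST column from the header row."""
    lines = listing.strip().splitlines()
    state_column = lines[0].split().index("ST")
    counts: Counter = Counter()
    for line in lines[1:]:
        fields = line.split()
        if line.lstrip().startswith("#") or len(fields) <= state_column:
            continue                            # skip comments and short lines
        state = fields[state_column]
        counts[STATE_NAMES.get(state, state)] += 1
    return counts


summary = tally_states(SAMPLE)
print(dict(summary))
```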

With Slurm

No requirement on the cluster resource provider (underlying Slurm cluster)⁸:

…changes notion of compute jobs …language to describe these
Flux keeps track of hardware …resources within a user-allocation
- …the notion of jobs is then independent from the parent system
- …user interaction completely isolated within Flux
…enables quick error recovery …optimization & increased job throughput

Start Flux on a Slurm cluster⁹…

…uses PMI client to determine its place in a parallel program
…use Slurm option --mpi=pmi2 …unless it is the default

# obtain an allocation from Slurm
salloc -N2 -n2 -p debug

# start a Flux instance …requires an interactive shell
srun -N2 -n2 --mpi=pmi2 --pty flux start

# list the allocated resources
>>> flux resource info
2 Nodes, 2 Cores, 0 GPUs

# run a dummy job
>>> flux run -N2 hostname
node1130
node1131

Footnotes

¹ Flux Project
https://flux-framework.org
https://flux-framework.readthedocs.io
https://github.com/flux-framework
² Fluence Source Code, GitHub
https://github.com/flux-framework/flux-k8s
³ Flux Operator Source Code, GitHub
https://flux-framework.org/flux-operator
https://github.com/flux-framework/flux-operator
⁴ Flux for Job Management on Kubernetes, HPCKP 2023
https://www.youtube.com/watch?v=eg-oaNidSBI
⁵ Kubernetes and HPC: Bare-metal bros, FOSDEM’24
https://fosdem.org/2024/schedule/event/fosdem-2024-2590-kubernetes-and-hpc-bare-metal-bros
⁶ Flux Tutorial, Inside Livermore Lab, YouTube (2024)
https://www.youtube.com/watch?v=Dt4CSZWSEJE
⁷ Introduction to Flux, LLNL
https://hpc-tutorials.llnl.gov/flux
⁸ Running Flux in Slurm, Ryan Day, SLUG’23
https://slurm.schedmd.com/SLUG23/flux_in_slurm.pdf
⁹ Starting with Slurm, Flux Documentation
https://flux-framework.readthedocs.io/projects/flux-core/en/latest/guide/start.html#starting-with-slurm
