OCI Linux Containers

Published: July 29, 2020

Modified: February 24, 2023

Containers

Linux components and sub-systems used with containers:

  • Kernel Namespaces
    • Originally the kernel was not designed with namespaces in mind…
    • …ongoing development to make more sub-systems namespace aware
    • Isolation layers to implement different user-space views for processes
    • Partition processes, users, network stacks, etc. into separate spaces
    • Provide groups of processes a unique view onto the system
  • Control Groups (cgroups), a more powerful alternative to ulimit/setrlimit()
    • Mechanism to apply hardware resource limits and access control to processes
    • Tree-based hierarchical, inheritable and optionally nested
    • Configured via a special cgroup virtual file-system
  • Root Capabilities
    • Enforce namespaces (in privileged containers) by reducing the power of root
    • Capability privilege bitmap per process used by the kernel to partition root access
    • Restricts root-level operations to follow the principle of least privilege
  • System call pivot_root - Change the root file-system for a new container
  • Tools to enforce security for containers
    • seccomp-bpf - Berkeley Packet Filter system call filtering
    • prctl(PR_SET_NO_NEW_PRIVS) - Kernel-level flag to prevent privilege escalation
    • SELinux - Labeling system for file-systems and applications
    • AppArmor - Profile-based MAC (Mandatory Access Control) system for limiting application abilities
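
The per-process capability bitmap mentioned above is visible as hexadecimal fields (for example CapEff) in /proc/<pid>/status. The following is a minimal sketch of how such a bitmap could be decoded. The function names are made up for illustration, and only a subset of the bit positions defined in linux/capability.h is named; all other bits are reported by number.

```python
# Names for a few capability bit positions (subset of linux/capability.h).
# Bits not listed here are reported by their number only.
CAP_NAMES = {
    0: "CAP_CHOWN",
    1: "CAP_DAC_OVERRIDE",
    5: "CAP_KILL",
    6: "CAP_SETGID",
    7: "CAP_SETUID",
    10: "CAP_NET_BIND_SERVICE",
    12: "CAP_NET_ADMIN",
    13: "CAP_NET_RAW",
    18: "CAP_SYS_CHROOT",
    21: "CAP_SYS_ADMIN",
}


def decode_capabilities(hex_mask):
    """Turn a hex bitmap (such as a CapEff value) into capability names."""
    mask = int(hex_mask, 16)
    names = []
    bit = 0
    while mask >> bit:
        if mask & (1 << bit):
            names.append(CAP_NAMES.get(bit, f"bit {bit}"))
        bit += 1
    return names


def read_effective_mask(status_path="/proc/self/status"):
    """Return the CapEff field of a process status file, or None if absent."""
    with open(status_path) as status:
        for line in status:
            if line.startswith("CapEff:"):
                return line.split()[1]
    return None
```

On a Linux host, decode_capabilities(read_effective_mask()) lists the effective capabilities of the calling process; a process in a container with a reduced capability set would typically show a much shorter list than root on the host.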

Run-Time Engines

Container build and runtime tools have a complex interrelationship and overlap of functionality.

In general, starting a container involves the following steps:

  1. Build a container image and push (upload) it to a container registry
  2. Pull (download) the container image from a container registry
  3. Prepare the container image file-system and a mount-point
  4. Read the container metadata and configuration to…
  5. …assign container namespaces for isolation (processes, network, etc.)
  6. …set resource constraints (CPU, memory, network bandwidth, etc.)
  7. Run a system call to the kernel to start the container
  8. Configure security constraints (SELinux, AppArmor, seccomp, etc.)
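
As an illustration of the resource-constraint step, the sketch below writes limits into a control group directory. It assumes the cgroup v2 unified hierarchy, where memory.max, cpu.max and pids.max are the control files; the function name and all values are hypothetical. On a real system the directory would be a cgroup created below /sys/fs/cgroup, writing requires sufficient privileges, and the controllers must be enabled in the parent group.

```python
import os


def write_limits(cgroup_dir, memory_bytes, cpu_quota_us, cpu_period_us, max_pids):
    """Write cgroup v2 style limit files into cgroup_dir and return them."""
    limits = {
        # upper bound for memory usage in bytes
        "memory.max": str(memory_bytes),
        # CPU time quota per period, both in microseconds
        "cpu.max": f"{cpu_quota_us} {cpu_period_us}",
        # maximum number of processes in the group
        "pids.max": str(max_pids),
    }
    for name, value in limits.items():
        with open(os.path.join(cgroup_dir, name), "w") as control_file:
            control_file.write(value + "\n")
    return limits
```

A call such as write_limits("/sys/fs/cgroup/demo", 256 * 1024 * 1024, 50000, 100000, 64) would request 256 MiB of memory, half a CPU and at most 64 processes for a (hypothetical) group named demo. Container run-times perform the equivalent configuration based on the container metadata.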

OCI Runtime

Open Container Initiative (OCI)…

…provides a specification defining what a container is…

…majority of container tools are compatible with this specification

Container Images

Run-time environment to execute a service or application…

  • …store all necessary configuration metadata
  • …and a (root) file-system with the executables and dependencies

Read-only (immutable) snapshots of a container run-time environment…

  • …launch a container from a container image
  • …create a writable layer…
  • …above a copy of the associated container image
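
The relation between a read-only image and the writable layer of each container can be modelled in a few lines. This is a simplified in-memory sketch and not how a storage driver works: files are represented as dictionary entries, and all paths and contents are invented for illustration.

```python
from collections import ChainMap

# Read-only image content, modelled as path -> file content (illustrative data).
image_layer = {"/etc/os-release": "ID=rocky", "/bin/app": "v1"}


def start_container(image):
    """Return a container view: an empty writable layer above the image."""
    writable_layer = {}
    # ChainMap looks keys up top-down and applies writes to the first mapping only.
    return ChainMap(writable_layer, image)


first = start_container(image_layer)
second = start_container(image_layer)

# Writes land in the container's own top layer and shadow the image content.
first["/bin/app"] = "patched"
first["/tmp/scratch"] = "data"
```

After these writes, first sees the patched file, while second and the image itself remain unchanged. This is why several containers can share one immutable image.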

Image-based containers have the following advantages…

  • …root file-system completely host independent and portable…
    • …simple deployment and distribution over container registries
    • …easy integration into continuous build, test and integration systems
  • Several container instances can be started from a single container image
  • Versioning and roll-back for a complete service/application runtime environment

Typically container images are derived from an existing image

  • …after applying modifications…
  • …either another independent image is created (decoupled from the original)
  • …or an additional layer is created, associated with the original

OCI Image Layout

OCI Image Layout Specification

  • …directory structure for
    • …OCI content-addressable blobs
    • …location-addressable references
  • …used in a variety of different transport mechanisms
    • …archive formats (e.g. tar, zip)
    • …filesystem environments
    • …networked file fetching (e.g. http, ftp, rsync)

Content…

  • blobs…directory of content-addressable blobs
  • oci-layout…provides the version of the image-layout
  • index.json…references and descriptors of the image-layout
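
To make the three entries more concrete, the sketch below writes a minimal layout directory by hand. It assumes the sha256 algorithm for blob digests and image-layout version 1.0.0; the function names are hypothetical, and the manifest bytes passed in are treated as opaque data, so the result is only as valid as the manifest supplied by the caller.

```python
import hashlib
import json
import os


def put_blob(layout_dir, data):
    """Store bytes under blobs/sha256/<digest> and return (digest, size)."""
    hex_digest = hashlib.sha256(data).hexdigest()
    blob_dir = os.path.join(layout_dir, "blobs", "sha256")
    os.makedirs(blob_dir, exist_ok=True)
    # The file name is the digest of the content (content-addressable).
    with open(os.path.join(blob_dir, hex_digest), "wb") as blob:
        blob.write(data)
    return "sha256:" + hex_digest, len(data)


def init_layout(layout_dir, manifest_bytes, ref_name):
    """Create oci-layout and index.json referencing one manifest blob."""
    digest, size = put_blob(layout_dir, manifest_bytes)
    with open(os.path.join(layout_dir, "oci-layout"), "w") as marker:
        json.dump({"imageLayoutVersion": "1.0.0"}, marker)
    index = {
        "schemaVersion": 2,
        "manifests": [{
            "mediaType": "application/vnd.oci.image.manifest.v1+json",
            "digest": digest,
            "size": size,
            # location-addressable reference name for this manifest
            "annotations": {"org.opencontainers.image.ref.name": ref_name},
        }],
    }
    with open(os.path.join(layout_dir, "index.json"), "w") as index_file:
        json.dump(index, index_file, indent=2)
    return digest
```

Tools that read or write this layout follow the same idea: blobs are found by digest, while index.json maps human-readable reference names to descriptors.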

OCI Container Bundle

…container format…encoding a container as a file-system bundle…

  • …consumable by any OCI compliant container run-time
  • …data & metadata needed to load and run a container

…mandatory artifacts residing locally in a single directory…

  1. config.json configuration data in the root directory
  2. …container’s root file-system…typically rootfs/
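
The two mandatory artifacts can be checked with a short script. This is a sketch under simple assumptions, not a full validation against the runtime specification: it only tests that config.json exists and that the directory named in root.path is present. The function name is hypothetical.

```python
import json
import os


def check_bundle(bundle_dir):
    """Return a list of problems found in an OCI bundle directory."""
    problems = []
    config_path = os.path.join(bundle_dir, "config.json")
    if not os.path.isfile(config_path):
        return ["config.json is missing"]
    with open(config_path) as config_file:
        config = json.load(config_file)
    # root.path may be relative to the bundle directory or absolute.
    root_path = config.get("root", {}).get("path")
    if not root_path:
        problems.append("config.json has no root.path")
    elif not os.path.isdir(os.path.join(bundle_dir, root_path)):
        problems.append(f"root file-system directory {root_path!r} is missing")
    return problems
```

An empty list means that both artifacts were found; whether the root file-system actually contains a runnable program is not checked.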

Use runc to build a file-system bundle…

>>> cd "$(mktemp -d)"
# generate a skeleton config.json
>>> runc spec
# default path to the root file-system
>>> jq '[.root.path]' config.json
[
  "rootfs"
]
>>> mkdir rootfs/
# ...create the content of the root file-system
# start a container named "test" from the bundle in the current directory
>>> runc run test

…multiple ways to generate an OCI Container bundle

Container Registry

…centralized storage for container images

  • …images identified by…
    • …registry name
    • …project name
    • …image name
  • …standard way to…
    • …find & introspect images
    • …share images (push/pull)

# download a container image from a registry

podman pull docker://quay.io/rockylinux/rockylinux:8
            │        │       │          │          └── container image suffix (tag)
            │        │       │          └───────────── container image
            │        │       └──────────────────────── project repository
            │        └──────────────────────────────── registry server
            └───────────────────────────────────────── protocol
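
The parts of such a reference can also be separated programmatically. The following sketch assumes a fully qualified reference that includes the registry server; it does not handle digests (@sha256:…) or short names without a registry. The function name and the keys of the result are chosen for illustration only.

```python
def parse_reference(reference):
    """Split e.g. docker://quay.io/rockylinux/rockylinux:8 into its parts."""
    transport, sep, rest = reference.partition("://")
    if not sep:
        # no protocol prefix given
        transport, rest = "", reference
    registry, _, path = rest.partition("/")
    # A colon after the last slash separates the tag from the image name.
    last_segment = path.rsplit("/", 1)[-1]
    tag = ""
    if ":" in last_segment:
        path, _, tag = path.rpartition(":")
    project, _, image = path.rpartition("/")
    return {
        "transport": transport,
        "registry": registry,
        "project": project,
        "image": image,
        "tag": tag,
    }
```

For the reference used above, this yields the transport docker, the registry quay.io, the project rockylinux, the image rockylinux and the tag 8.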

Registry vs. repository…

  • Registry…service (local or external)
    • …hosting and distributing of container images
    • …collection of repositories
  • Repository…collection of related images…
    • …provided with different versions
    • …multiple variants of a service/application
  • Tag…alphanumeric identifier attached to images…
    • …within a repository
    • …means to differentiate versions of images

Public vs. private registries…

  • Public…
    • …commonly used by individuals or small teams
    • …limited privacy and access control
  • Private…
    • …way to incorporate security and privacy
    • …hosted remotely or on-premises

OCI compliant container registries…