Kubernetes — Architecture & Applications
Overview
Kubernetes aka K8s …manage containerized workloads and services
- …supports declarative configuration …automates deployment
- …runs on virtual and physical environments …on-premise & public clouds
- …reference infrastructure for cloud-native [1][2] applications
- Benefits of the Kubernetes architecture…
  - …service discovery …networks, devices, service, metadata, etc
  - …storage orchestration …decouple storage allocation from back-ends
  - …container run-time agnostic …use any OCI compliant container engine
  - …provider agnostic …no vendor lock-in to a specific IaaS
  - …productivity …enables fast deployment CI/CD & DevOps
What can be difficult with Kubernetes?
- Hard multi-tenancy…
  - …options for cluster-level sharing are available
  - …however difficult where the tenants do not trust each other
  - …typically each tenant is hosted on a dedicated cluster instance
- Most deployments are virtual …and run on a cloud infrastructure
  - …many use-cases & examples specific to cloud providers
  - …bare-metal deployment not widely used …hence more challenging to build
Architecture
# display cluster information
kubectl cluster-info
# list cluster nodes
kubectl get nodes
# additional details about a node
kubectl describe node $node_name
Control plane — Manages worker nodes & pods in the cluster
- …usually runs distributed over multiple nodes (fault-tolerance, highly-available)
- Control plane components …global decisions about the cluster:
  - API server
    - …core component server …exposes the Kubernetes API
    - …runs several instances of kube-apiserver to scale horizontally
  - Cluster Storage
    - …consistent and highly-available key value store
    - …etcd [3] …uses Raft consensus algorithm
    - …data is consistently replicated across multiple nodes
    - …supports built-in snapshots for backup
  - Scheduler
    - …kube-scheduler …basically assigns pods to nodes
    - …monitors cluster resources …uses various policies for placement
  - Controllers …manage the cluster state
    - …kube-controller-manager …runs individual controller processes
    - …different controllers for nodes, jobs, services, etc.
Worker nodes — Host pods that house containerized applications
Objects
Objects — Specific configurations that define the desired state
- “Configuration that describes what you want to run”
- After an object is created…
  - …it represents a specific entity with its own configuration and purpose
  - …system will constantly work to ensure that the object exists
- An object includes two nested object fields:
  - …object spec describes the desired state for the object
  - …object status describes the actual state of the object
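A minimal sketch of how spec and status sit side by side, assuming a Deployment named web (hypothetical) read back with kubectl get deployment web -o yaml. The values are illustrative and not output from a real cluster; the status block is written by the system, not by the user:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                  # hypothetical name
spec:                        # desired state, set by the user
  replicas: 3
  # selector and pod template omitted for brevity
status:                      # actual state, reported by the control plane
  replicas: 3
  readyReplicas: 2           # illustrative: one pod is not ready yet
  availableReplicas: 2
```

As long as status differs from spec, the responsible controller keeps working towards the desired state.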
# list supported resources
kubectl api-resources
Resources — Components that can be managed within the cluster
- Generally refers to quantifiable units of compute, storage, and networking
- Resources include the objects themselves and the underlying hardware used
- “Not all resources are necessarily objects in the same sense”
Manifests
Manifests — Files that describe what to deploy
- Deployments created by posting a manifest to the Kubernetes API
- Why use Kubernetes manifest files?
- Declarative configuration …predictable correct state
- Prevents configuration drift by re-applying manifest
- Configuration as code …maintained with version control system
- Allows structured collaboration with co-workers
- Format described in the Kubernetes API [4]
# discover possible API object fields
kubectl explain $object # describe associated fields
kubectl explain $object --recursive # list all possible fields and subfields
kubectl explain $object.metadata # get details on fields
Main parts of a manifest:
- kind: Object type
- metadata: Name, namespace & labels
- spec: Containers, volumes, etc.
# example for a pod object
apiVersion: v1               # Kubernetes API version (Pods belong to the core group)
kind: Pod                    # Type of resource
metadata:
  name: nginx                # Name of the resource
spec:
  containers:
  - name: nginx              # Name of the container
    image: nginx:latest      # Image to create the container from
Nodes
Nodes [5] run workloads as pods…
- …contain all services required to run pods
- …provide the Kubernetes run-time environment
# list all nodes
kubectl get nodes
# display the node status
kubectl describe node $node_name
Components
Components that run on every node…
- Kubelet [6] …primary “node agent”
  - …works in terms of a PodSpec …YAML/JSON object describing a pod
  - …ensures that pods are running and healthy
- kube-proxy (optional) …connects pods with a network
  - …uses the OS packet filter …otherwise forwards the traffic itself
  - …other network plugins implement CNI (Container Network Interface)
- Container run-time …Kubernetes CRI (Container Runtime Interface)
Debugging
Start a debugging pod on a target node…
# start an interactive debugging shell on a node
kubectl debug node/$node_name -it --image busybox
# clean up the debugging Pod
kubectl delete pod node-debugger-$name
- …name of the new pod based on the name of the node
- …root file-system mounted to /host
- …container runs in the host IPC, Network, and PID namespace
- Use --profile= option to…
  - general …for common debugging
  - netadmin …network admin privileges
  - sysadmin …root privileges
Pods
# list pods
kubectl get pods -o wide
# detailed information
kubectl describe pod $pod_name
Pod — Smallest deployable compute object in Kubernetes…
- Group of one or more tightly related containers
- Containers in a pod always run together…
  - …on the same worker node …same Linux namespaces
  - …“running on the same logical machine”
Pod life-cycle
- Pending: ready for scheduling …not started yet
- Running: Pod bound to a node …containers created
- Succeeded: Pod containers terminated in success (no restart)
- Failed: Container terminated in failure (non-zero status)
- Unknown: Pod state could not be obtained
- Container states inside a pod…
  - Waiting: Running operations to complete startup
  - Running: Container is executing without issues
  - Terminated: Container ran to completion or failed for some reason
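The pod phase and the container states can be read from the status of a pod. A sketch of what the relevant part of kubectl get pod $pod_name -o yaml might look like for a healthy single-container pod; the container name and the timestamp are placeholders:

```yaml
status:
  phase: Running             # pod phase
  containerStatuses:
  - name: nginx              # placeholder container name
    ready: true
    restartCount: 0
    state:
      running:               # exactly one of: waiting, running, terminated
        startedAt: "2024-01-01T00:00:00Z"   # placeholder timestamp
```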
Command & Arguments
# single-line shell command
spec:
  containers:
  - name: #…
    command: ["/bin/sh", "-c"]
    args: ["echo 'Hello, World!' && sleep 3600"]

# multi-line script using a YAML block scalar
spec:
  containers:
  - name: #…
    command: ["/bin/sh", "-c"]
    args:
    - |
      echo "Hello World!"
      sleep 3600
Define commands & arguments for a container in a pod:
- …override defaults from the container image
- …if only arguments are defined the default command is used
- command: Command to run when the container starts
  - …corresponds to an entrypoint in a Dockerfile
- args: Arguments to pass to the command
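A sketch of the arguments-only case, assuming the alpine/git image defines git as its entrypoint (an assumption about the image, to be checked against the image documentation). Only args is set, so the default command of the image is kept and receives the arguments:

```yaml
spec:
  containers:
  - name: git-version        # hypothetical container name
    image: alpine/git        # assumed to define git as entrypoint
    args: ["--version"]      # passed to the default command of the image
```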
# define an argument by environment variable
spec:
  containers:
  - name: #…
    env:
    - name: MESSAGE
      value: "hello world"
    command: ["/bin/echo"]
    args: ["$(MESSAGE)"] # variable expansion
Namespaces
Mechanism for isolating groups of resources within a single cluster
- …scoping is applicable only for namespaced objects
- …divide cluster resources between multiple users (via resource quota)
- Namespace default …usable in new clusters without first creating a namespace
# list current namespaces
kubectl get namespace
# set the namespace for a current request
kubectl <...> --namespace=$name
# set namespace preference
kubectl config set-context --current --namespace=$name
# ...to unset
kubectl config set-context --current --namespace=''
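Dividing cluster resources between users via a resource quota, as mentioned above, could look like the following sketch. The namespace name team-a and all limits are hypothetical values chosen for illustration:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: team-a               # hypothetical namespace
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a          # the quota applies to this namespace only
spec:
  hard:
    pods: "10"               # maximum number of pods
    requests.cpu: "4"        # sum of CPU requests
    requests.memory: 8Gi     # sum of memory requests
    limits.cpu: "8"
    limits.memory: 16Gi
```

With a quota on CPU or memory in place, pods in the namespace need to declare matching requests and limits to be admitted.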
Labels & Annotations
Labels are key/value pairs to specify identifying attributes of objects…
- …used to organize and to select subsets of objects
- …each key must be unique for a given object
- …common set of labels allows tools to work interoperably
- …labels without a prefix are private to users
Labels are defined in the metadata.labels object:
apiVersion: v1
kind: Pod
metadata:
  # ...
  labels:
    environment: prod
    name: mariadb
    component: database
    part-of: slurm
#...
# list objects with a given label
kubectl get <...> -l 'component=database,part-of=slurm'
kubectl get <...> -l 'environment in (prod,dev)'
# list all labels of an object
kubectl get <...> -o json | jq .metadata.labels
Shared labels/annotations have a common prefix <prefix>/<name>
- <prefix> (optional) needs to be a valid DNS subdomain
- <name> arbitrary property name of the label
Annotations attach arbitrary non-identifying metadata to objects
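A sketch of prefixed labels and annotations on the pod from the label example. The app.kubernetes.io/ keys follow the recommended labels from the Kubernetes documentation; the example.org prefix and the annotation values are hypothetical:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: mariadb
  labels:
    app.kubernetes.io/name: mariadb          # shared label with prefix
    app.kubernetes.io/component: database
    app.kubernetes.io/part-of: slurm
    environment: prod                        # no prefix, private to the user
  annotations:
    example.org/owner: "team-database"       # hypothetical, non-identifying
    example.org/documentation: "https://example.org/docs/mariadb"
# spec omitted
```

Labels can be used in selectors, annotations cannot; annotations are meant for tools and people reading the object.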
Secrets
# create a secret
kubectl create secret generic project-secret \
--from-literal=username=alice --from-literal=password=abc123
# reference the secret in the container environment
spec:
  containers:
  - name: #…
    env:
    - name: USERNAME
      valueFrom:
        secretKeyRef:
          name: project-secret
          key: username
    - name: PASSWORD
      valueFrom:
        secretKeyRef:
          name: project-secret
          key: password
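For reference, a sketch of a manifest equivalent to the kubectl create secret command above. Values below data have to be Base64-encoded, which is an encoding and not encryption; the stringData field accepts plain text instead:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: project-secret
type: Opaque
data:
  username: YWxpY2U=         # Base64 of "alice"
  password: YWJjMTIz         # Base64 of "abc123"
```

A manifest like this contains the credentials in recoverable form, so it should not be committed to version control as-is.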
Store & manage sensitive information like passwords, SSH keys, OAuth tokens
- Ensure secret information is not exposed in configuration files
- Enables processes to regularly rotate secrets
Secrets are stored in the etcd database…
- …Base64-encoded, not encrypted …readable by anyone with access to the database
- Encryption at Rest: encrypt data before it is stored in the database
TODO…
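A sketch of an encryption-at-rest configuration for secrets, following the EncryptionConfiguration format of the API server. The file is passed to kube-apiserver with --encryption-provider-config; the key name and the key placeholder are assumptions, and aescbc is only one of the available providers:

```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets                # encrypt Secret objects only
    providers:
      - aescbc:                # first provider is used for writing
          keys:
            - name: key1       # hypothetical key name
              secret: <BASE64-ENCODED 32-BYTE KEY>   # placeholder, generate your own
      - identity: {}           # fallback to read data stored unencrypted
```

Secrets written before the configuration was enabled stay unencrypted until they are written again.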
Workloads
Kubernetes supports various container patterns
Container Pattern | Description |
---|---|
Main | Core functionality of a pod |
InitContainer | Run before the main application container |
SideCar | Run alongside the main application container |
Job | One-off tasks …runs until completion |
CronJob | Jobs created on a repeating schedule |
ReplicaSet | Maintain a stable set of pods |
Deployment | Stateless application workload |
StatefulSet | Persistent storage & unique network identity |
DaemonSet | One pod copy on every (or selected) node |
Ambassador | Communication proxy for external services |
Adapter | Transform formats/protocols to ensure compatibility |
Jobs
job_name=hello
cat > $job_name.yml <<EOF
apiVersion: batch/v1
kind: Job
metadata:
  name: $job_name
spec:
  template:
    spec:
      containers:
      - name: hello
        image: busybox
        command: ["echo", "Hello, World!"]
      restartPolicy: OnFailure
EOF
# create job (if not existing)
kubectl create -f $job_name.yml
# inspect job state
kubectl get jobs $job_name
kubectl describe jobs $job_name
# print the logs
kubectl logs job/$job_name
# clean up
kubectl delete job $job_name
Run the parallel job example:
job_name=sleep-parallel
cat > $job_name.yml <<EOF
apiVersion: batch/v1
kind: Job
metadata:
  name: $job_name
spec:
  completions: 6
  parallelism: 2
  template:
    spec:
      containers:
      - name: sleep
        image: busybox
        command: ["sleep", "60"]
      restartPolicy: Never
EOF
kubectl create -f $job_name.yml
# watch the status of pods created
kubectl get -w pods -l job-name=$job_name
# clean up
kubectl delete -f $job_name.yml
- completions: number of pods that must complete successfully
- parallelism: number of pods to run in parallel
Cronjob
Run jobs on a scheduled interval
apiVersion: batch/v1
kind: CronJob
metadata:
  name: project-cronjob
spec:
  schedule: "*/1 * * * *" # Cron format for the schedule
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            command: ["echo", "Hello, World!"]
          restartPolicy: OnFailure
Each CronJob creates a Job …each Job runs a single Pod
# list cronjobs
kubectl get cronjobs $name
# list the jobs
kubectl get jobs --watch
# remove a cronjob
kubectl delete cronjob $name
The job name is different from the pod name
# list pods for a given job
kubectl get pods --selector=job-name=$name --output=jsonpath='{.items[*].metadata.name}'
Manually start a cron job… use --from=cronjob/ to specify the CronJob to use as template:
kubectl create job --from=cronjob/$cronjob_name $job_name
ReplicaSet
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: my-replicaset
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-container
        image: my-image:latest
Manage a number of Pod replicas and their life-cycle
- Monitor health & automatically re-create pods on failure
- Features: scaling, self-healing, label selector (rolling updates are handled by Deployments)
- Important fields:
  - replicas: Number of pods
  - selector: How to identify pods belonging to the ReplicaSet
  - template: Describe the pods
Add additional expressions to the selector:
# more expressive label selector
selector:
  matchExpressions:
  - key: app
    operator: In
    values:
    - my-app
- Valid operators…
  - In …label's value must match one of the listed values
  - NotIn …label's value must not match any of the listed values
  - Exists …pod must include label with key
  - DoesNotExist …pod must not include label with key
- Multiple expressions must all evaluate to true
- Possible to combine matchLabels with matchExpressions
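A sketch of a selector that combines both forms; all conditions have to hold for a pod to be selected. The label keys environment and deprecated are hypothetical:

```yaml
selector:
  matchLabels:
    app: my-app
  matchExpressions:
  - key: environment         # hypothetical label key
    operator: In
    values:
    - prod
    - dev
  - key: deprecated          # hypothetical label key
    operator: DoesNotExist   # no values list for Exists/DoesNotExist
```

In a ReplicaSet the labels of the pod template have to satisfy this selector, otherwise the API rejects the object.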
Deployment
Deployments [7] provide rollback functionality and update control
Apply the nginx-deployment.yaml example from the Kubernetes documentation:
kubectl apply -f https://k8s.io/examples/controllers/nginx-deployment.yaml
# inspect the deployment
kubectl get deployments
# see the ReplicaSet created by the Deployment
kubectl get replicaset
# list the pods (using a label)
kubectl get pods -l app=nginx
Rolling update: incremental replacement of multiple pods
- …no downtime …network traffic load-balanced to available pods
- …facilitates CI/CD deployments …support rollback to previous version
# update pods to a new container version
kubectl set image deployment/nginx-deployment nginx=nginx:1.16.1
# check the revisions of this Deployment
kubectl rollout history deployment/nginx-deployment
# undo the current rollout and rollback to the previous revision
kubectl rollout undo deployment/nginx-deployment
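How aggressively pods are replaced during a rolling update is controlled in the Deployment spec. A sketch with illustrative values (if the fields are omitted, both default to 25%):

```yaml
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1            # at most one pod above the desired replica count
      maxUnavailable: 0      # never drop below the desired replica count
```

With these values a new pod has to become ready before an old one is removed, which trades update speed for capacity.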
Scaling an application on demand…
- …increase the number of pods to a desired state
- …supports (horizontal) autoscaling depending on load
# scale a deployment
kubectl scale deployment/nginx-deployment --replicas=5
# remove a deployment
kubectl delete deployment nginx-deployment
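The horizontal autoscaling mentioned above can be sketched with a HorizontalPodAutoscaler that targets the nginx-deployment example. The replica bounds and the CPU target are illustrative; this assumes a metrics source (such as metrics-server) is installed and the containers declare CPU requests:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-deployment
spec:
  scaleTargetRef:            # workload to scale
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deployment
  minReplicas: 2             # illustrative bounds
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # percent of the requested CPU
```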
Init Container
Example: Use an initContainer to clone a Git repository:
spec:
  volumes:
  - name: repo
    emptyDir: {}
  initContainers:
  - name: git-clone
    image: alpine/git
    env:
    - name: GITLAB_TOKEN
      valueFrom:
        secretKeyRef:
          key: gitlab-token
          name: project-secrets
    args:
    - 'clone'
    - 'https://none:$(GITLAB_TOKEN)@example.org/path/to/repo.git'
    - '/srv/repo'
    volumeMounts:
    - name: repo
      mountPath: /srv
Run before the main application container
- Why — Perform initialization tasks
- Sequential execution…
  - …multiple init containers run in sequence
  - …each container must complete successfully before the next starts
  - …failing init containers are restarted until they succeed (unless the pod restartPolicy is Never)
- Main application container starts after success of all init containers
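The sequential execution can be sketched with two init containers that write to a shared volume, followed by a main container that reads the result. Container names, the volume name and the commands are hypothetical and only serve to make the ordering visible:

```yaml
spec:
  volumes:
  - name: workdir
    emptyDir: {}
  initContainers:
  - name: step-1             # runs first
    image: busybox
    command: ["sh", "-c", "echo step-1 > /work/log"]
    volumeMounts:
    - name: workdir
      mountPath: /work
  - name: step-2             # starts only after step-1 exited successfully
    image: busybox
    command: ["sh", "-c", "echo step-2 >> /work/log"]
    volumeMounts:
    - name: workdir
      mountPath: /work
  containers:
  - name: main               # starts after all init containers succeeded
    image: busybox
    command: ["sh", "-c", "cat /work/log && sleep 3600"]
    volumeMounts:
    - name: workdir
      mountPath: /work
```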
Sidecar
Run alongside the main application container
- Why use a sidecar container?
- Collect monitoring metrics & health information
- Proxy and network communication
- Log forwarding …update of service components
- Co-location — Share resources with the main container
- Stop/start alongside the main container
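A sketch of the log-forwarding case in its classic form, with the sidecar as a second entry in containers sharing an emptyDir volume with the main container. All names and commands are hypothetical; newer Kubernetes versions can alternatively declare a sidecar as an init container with restartPolicy: Always:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-sidecar     # hypothetical name
spec:
  volumes:
  - name: logs
    emptyDir: {}
  containers:
  - name: app                # main container writes a log file
    image: busybox
    command: ["sh", "-c", "while true; do date >> /var/log/app/app.log; sleep 5; done"]
    volumeMounts:
    - name: logs
      mountPath: /var/log/app
  - name: log-forwarder      # sidecar follows the same file
    image: busybox
    command: ["sh", "-c", "touch /var/log/app/app.log; tail -f /var/log/app/app.log"]
    volumeMounts:
    - name: logs
      mountPath: /var/log/app
```

The output of the sidecar can be followed with kubectl logs app-with-sidecar -c log-forwarder.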
Footnotes
[1] CNCF (Cloud Native Computing Foundation)
    https://landscape.cncf.io/
[2] Open Service Broker API Specification
    https://www.openservicebrokerapi.org
    https://github.com/openservicebrokerapi/servicebroker
[3] etcd Documentation
    https://etcd.io/docs
    https://raft.github.io
[4] Kubernetes API Documentation
    https://kubernetes.io/docs/reference/using-api
    https://kubernetes.io/docs/reference/#api-reference
[5] Nodes, Kubernetes Documentation
    https://kubernetes.io/docs/concepts/architecture/nodes
[6] kubelet, Documentation
    https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/
[7] Kubernetes Deployments
    https://kubernetes.io/docs/concepts/workloads/controllers/deployment