Kubernetes — Networking & Load-Balancing

Kubernetes
Published

December 4, 2024

Modified

May 27, 2025

Overview

Kubernetes networking… …software-defined networking (SDN)

  • …network plane spread across a cluster of machines
  • …flat network structure
  • …components connect without relying on hardware

Bird's-eye view of Kubernetes networking…

  • Primary goal…
    • pod-to-pod communication between all cluster nodes
    • …each pod has an IP address …routed within the cluster
    • …eliminates the need to map ports to pods
    • …no need to NAT cluster internal communication
  • Secondary goals:
    • IP address management (IPAM) …allocate IPs to pods
    • Port mapping …expose Pods to the outside world
    • Bandwidth control …egress/ingress traffic rates
    • Source NAT …for traffic leaving the cluster

Basic Communication

Four basic types of network communication:

  1. Container-to-container
    • …smallest unit in a Kubernetes network
    • …containers in the same pod share the same network namespace
    • …containers communicate within a single pod through localhost
    • …containers share IP address and port space
  2. Pod-to-pod
    • each pod in a cluster has its own unique IP address
    • …direct communication between pods on all cluster nodes
    • …pod IP addresses are ephemeral …recreation of a pod changes the IP
  3. Pod-to-service …service abstraction
    • …facilitate both pod-to-service & external-to-service communication
    • …enables external traffic exposure to cluster internal applications
    • …provides load balancing & service discovery for logical sets of pods
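
A minimal sketch of container-to-container communication, assuming a hypothetical pod named shared-netns with an httpd container and a busybox sidecar. The sidecar reaches the web server through localhost because both containers share the pod's network namespace.

```shell
# Write a two-container pod manifest (pod and container names are hypothetical)
cat > shared-netns-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: shared-netns
spec:
  containers:
  - name: web
    image: httpd            # listens on port 80
  - name: probe
    image: busybox
    # Same network namespace as "web", so localhost reaches httpd
    command: ["sh", "-c", "sleep 5; wget -qO - http://localhost:80; sleep 3600"]
EOF
```

After kubectl apply -f shared-netns-pod.yaml, the output of kubectl logs shared-netns -c probe should show the page served by the web container.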

Traffic Patterns

Jargon for network traffic patterns…

  • Ingress …defines rules for incoming traffic to pods
  • Egress …defines rules for outgoing traffic from pods
  • North-South
    • North traffic (incoming traffic)
      • …typically handled by a load balancer
      • …external IP address to forward traffic
    • South traffic (outgoing traffic)
      • …return a response …call external services
      • …can be restricted by egress rules (e.g. in a NetworkPolicy)
  • East-West traffic (internal traffic) …service-to-service communication

Container Network Interface

Container Network Interface (CNI) …under the governance of the CNCF

What do CNI plugins do?

  • Connectivity
    • …create network namespaces
    • …assign IP addresses
    • …set up network routes
  • Reachability
    • …enable pod-to-pod communication
    • …within the same node and across nodes

Specification & libraries to write plugins for container networking…

  • CNI providers implement plugins as binary executables
    • …invoked by the container engine via the Kubelet process
    • …container runtime creates a network namespace before invoking the CNI plugin
    • …plugin responsible for connecting the network interface
  • Kubernetes provides a default CNI…
    • …third-party plugins include Cilium, Calico, Flannel, Istio…
    • …differ in their approach to overlay networks, direct routing, etc.
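
The invocation contract can be illustrated with a toy stand-in for a plugin. This is a sketch only: toy_cni_plugin is a hypothetical shell function, not a real plugin. It relies on the CNI specification passing parameters as environment variables (CNI_COMMAND, CNI_CONTAINERID, CNI_NETNS, CNI_IFNAME) and the network configuration as JSON on stdin. A real plugin prints a JSON result instead of the plain text used here.

```shell
# Toy stand-in for a CNI plugin: it only reports what it was asked to do.
toy_cni_plugin() {
  config=$(cat)   # network configuration arrives on stdin
  case $CNI_COMMAND in
    ADD) echo "would attach $CNI_IFNAME in netns $CNI_NETNS for container $CNI_CONTAINERID using $config" ;;
    DEL) echo "would detach $CNI_IFNAME from container $CNI_CONTAINERID" ;;
    VERSION) echo '{"cniVersion": "1.0.0", "supportedVersions": ["1.0.0"]}' ;;
    *) echo "unsupported command: $CNI_COMMAND" >&2; return 1 ;;
  esac
}

# The runtime creates the namespace first, then calls the plugin roughly like this
# (container ID and namespace path are made-up values)
echo '{"cniVersion": "1.0.0", "name": "demo", "type": "toy"}' |
  CNI_COMMAND=ADD CNI_CONTAINERID=abc123 CNI_NETNS=/var/run/netns/demo CNI_IFNAME=eth0 \
  toy_cni_plugin
```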

Reachability

Basic Terminology for networks:

  • Underlay network — Physical infrastructure
    • Enables IP packet forwarding…
    • …cables, switches, routers
  • OSI transport layer works as transition layer
  • Overlay network — Software-driven transportation
    • …abstracts low-level details for traffic forwarding
    • …overlay implements virtual networks
    • …create multiple logical networks over the underlay

Connection between nodes depends on underlying layer 2 network…

  • Shared — Nodes share a layer 2 network
    • …connectivity by static routes or full mesh
  • Decoupled — Nodes connected to different layer 2 networks
    • …encapsulation in the overlay (e.g. VXLAN)
    • …orchestrating the underlay (e.g. BGP)
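
For the shared layer 2 case, a full mesh of static routes means every node holds one route per peer, pointing the peer's PodCIDR at the peer's node IP. The sketch below only prints the commands. peer_routes is a hypothetical helper, and the node addresses other than 172.18.0.4 are invented for illustration.

```shell
# Print one "ip route" command per peer node.
# stdin: lines of "<node-ip> <pod-cidr>"; $1: IP of the local node (skipped)
peer_routes() {
  local_ip=$1
  while read -r node_ip pod_cidr; do
    [ -z "$node_ip" ] && continue             # skip blank lines
    [ "$node_ip" = "$local_ip" ] && continue  # no route to ourselves
    echo "ip route replace $pod_cidr via $node_ip"
  done
}

peer_routes 172.18.0.4 <<'EOF'
172.18.0.2 10.244.0.0/24
172.18.0.3 10.244.2.0/24
172.18.0.4 10.244.1.0/24
EOF
```

Running the printed commands needs root privileges on the node. When nodes sit on different layer 2 networks, the next hop is not directly reachable, which is why an overlay or BGP is used instead.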

Implementation

Reference implementations of CNI plugins1 include…

  • bridge creates a bridge network and attaches pods
  • vlan allocates a VLAN device
  • host-device attaches an existing host device to the pod
  • ptp creates a veth pair
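
A sketch of how reference plugins are chained in a network configuration list, assuming a hypothetical network named demo and the subnet 10.244.1.0/24. The bridge plugin handles connectivity, host-local handles IP allocation, and portmap handles port mapping.

```shell
# Write a CNI configuration list (file name and network name are hypothetical)
cat > 10-demo.conflist <<'EOF'
{
  "cniVersion": "1.0.0",
  "name": "demo",
  "plugins": [
    {
      "type": "bridge",
      "bridge": "cni0",
      "isGateway": true,
      "ipMasq": true,
      "ipam": {
        "type": "host-local",
        "ranges": [[{ "subnet": "10.244.1.0/24" }]],
        "routes": [{ "dst": "0.0.0.0/0" }]
      }
    },
    {
      "type": "portmap",
      "capabilities": { "portMappings": true }
    }
  ]
}
EOF
```

Container runtimes typically read such files from /etc/cni/net.d, although the exact directory depends on the runtime configuration.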

CNI plugin implementations…

  • Kindnet2
    • Reachability …one static route per peer node
    • Connectivity …mix of reference CNI plugins
      • ptp to create veth links
      • host-local to allocate IPs
      • portmap for port mapping
      • kindnetd daemon generates configuration files
  • Flannel
    • Reachability …managed by flanneld daemon
      • …generates a host-local IPAM configuration
      • …creates a VXLAN interface flannel.1
      • …discovers VXLAN information from other nodes
      • …builds a local unicast head-end replication (HER) table
    • Connectivity …generates a bridge
  • Calico3
    • Connectivity…
      • …creates veth link
      • …sets up a host route pointing to the veth link
      • …egress link set up with proxy_arp
    • Reachability…
      • Static route & overlay mode …supports VXLAN
      • BGP mode …BGP speaker on every node
  • Cilium
    • Connectivity…
      • …creates a veth link
      • …eBPF program performs traffic forwarding
    • Reachability…
      • tunnel mode …VXLAN interfaces to forward traffic
      • native-routing mode …provided by underlay …static routes or BGP

Network Policies

Network policies …filter traffic from/to pods

  • …labels & selectors specify which policy applies to a pod
  • …define and manage security policies for network communication
  • Default pod communication within cluster not secured…
    • …if a cluster is not using network policies
    • …pods by default do not filter incoming traffic
    • …no firewall rules for inter-pod communication
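
A minimal sketch of two policies, assuming the website deployment (label app=website) and the client pod (label run=client, as set by kubectl run client) used elsewhere in these notes. The first policy denies all incoming traffic in the namespace, the second allows the client to reach the website on port 80. Policies are only enforced if the installed CNI plugin implements them.

```shell
# Write a default-deny policy plus one allow rule (policy names are hypothetical)
cat > network-policy.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
spec:
  podSelector: {}          # empty selector: every pod in the namespace
  policyTypes:
  - Ingress
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-client-to-website
spec:
  podSelector:
    matchLabels:
      app: website         # pods the policy applies to
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          run: client      # allowed source pods
    ports:
    - protocol: TCP
      port: 80
EOF
```

Apply with kubectl apply -f network-policy.yaml and list the result with kubectl get networkpolicy.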

IP Addresses

Pods have a unique IP from a PodCIDR range…

  • …CIDR range assigned to a node during kubelet configuration
  • …nodes are not aware of CIDRs assigned to other nodes

Non-overlapping IP addresses for pods, services & nodes

>>> kubectl get configmaps -n kube-system kubeadm-config -o yaml | grep -i subnet
      podSubnet: 10.244.0.0/16
      serviceSubnet: 10.96.0.0/16

# list per node IP address range
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.podCIDR}{"\n"}{end}'

# show pod with IP addresses
kubectl get pod -o wide

# show only the IP address
kubectl get pod $pod_name --template '{{.status.podIP}}'
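
Because the ranges do not overlap, an address can be attributed to pods or services by a prefix match. The following is a sketch in plain POSIX shell arithmetic. ip_to_int, in_cidr and classify are hypothetical helpers, and the two ranges are the podSubnet and serviceSubnet values from the sample output.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS; IFS=.
  set -- $1                # split the address into four octets
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed when address $1 lies inside CIDR block $2 (e.g. 10.244.0.0/16).
in_cidr() {
  addr=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( addr & mask )) -eq $(( net & mask )) ]
}

# Attribute an address to the pod or service range of the sample cluster.
classify() {
  if in_cidr "$1" 10.244.0.0/16; then echo pod
  elif in_cidr "$1" 10.96.0.0/16; then echo service
  else echo other
  fi
}

classify 10.244.1.37   # address inside the pod range
classify 10.96.0.10    # address inside the service range
```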

DNS Service

Kubernetes cluster automatically provides a DNS service

  • Assigns readable names…
    • …lightweight mechanism for service discovery
    • …in addition to the pod IP address assignment
    • …ephemeral IP addresses are not reliable endpoints for communication
    • service consumers should avoid using IP addresses directly
  • Kubernetes DNS services4 for pods…
    • …have at least one corresponding A/AAAA DNS record
    • …format depends on the type of the service
    • …some services may have SRV and PTR records

CoreDNS

CoreDNS implements the Kubernetes DNS spec5:

  • …compiled to a static binary …deployed into the Kubernetes cluster
  • Service discovery…
    • Server-side
      • …exposed as a ClusterIP service
      • …DNS service inside cluster …based on network forwarding rules
      • …stores Kubernetes service, pods and endpoint objects
      • …acts as DNS proxy for all internal domains
    • Client-side
      • …controlled by spec.dnsPolicy6 (per-pod basis)
      • …by default Kubelet configures cluster DNS IP in resolv.conf
      • …internal DNS has precedence over external DNS
  • Kubernetes nodes can run a local DNS cache7
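
A sketch of the client-side control, assuming a hypothetical pod named custom-dns. With dnsPolicy set to None the kubelet does not inject the cluster defaults, and resolv.conf is built from dnsConfig only. The nameserver 10.96.0.10 is the cluster DNS address of the example cluster in these notes.

```shell
# Write a pod manifest with an explicit resolver configuration
cat > custom-dns-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: custom-dns
spec:
  dnsPolicy: "None"        # ignore the cluster default, use dnsConfig only
  dnsConfig:
    nameservers:
    - 10.96.0.10
    searches:
    - default.svc.cluster.local
    options:
    - name: ndots
      value: "2"
  containers:
  - name: client
    image: busybox
    command: ["sleep", "3600"]
EOF
```

After applying the manifest, kubectl exec custom-dns -- cat /etc/resolv.conf shows the resulting configuration.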

Example

Set up an example in the default namespace…

# Create a website deployment…
kubectl create deployment website --replicas=3 --image=httpd
# …and a service
kubectl expose deployment website --port=80
# Start a client…
kubectl run -it client --image busybox
# …later clean up 
kubectl delete pod/client service/website deploy/website

Find the IP address of the cluster DNS server…

>>> kubectl get service/kube-dns -n kube-system
NAME       TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE
kube-dns   ClusterIP   10.96.0.10   <none>        53/UDP,53/TCP,9153/TCP   178m

DNS resolution in a pod…

  • 10.96.0.10 is the address of the DNS server
  • Default internal domain name for a cluster is cluster.local
    • …subdomain per namespace $namespace.svc.cluster.local
    • …service for example website.default.svc.cluster.local.

# Linux resolver configuration in a pod
>>> cat /etc/resolv.conf
search default.svc.cluster.local svc.cluster.local cluster.local localdomain
nameserver 10.96.0.10
options ndots:5

Send a request by referencing the service name

>>> wget -qO - website
<html><body><h1>It works!</h1></body></html>
>>> wget -qO - website.default
<html><body><h1>It works!</h1></body></html>
>>> wget -qO - website.foo
wget: bad address 'website.foo'
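
Why website and website.default resolve while website.foo does not follows from the search list. The sketch below prints the names a resolver would try, assuming glibc-style handling of search and ndots (details differ between libc implementations). candidates is a hypothetical helper.

```shell
# Print the fully qualified names a resolver would try, in order.
# $1 = name, $2 = ndots, remaining arguments = search domains
candidates() {
  name=$1; ndots=$2; shift 2
  case $name in
    *.) echo "$name"; return ;;   # trailing dot: already fully qualified
  esac
  only_dots=$(printf '%s' "$name" | tr -cd '.')
  dots=${#only_dots}              # number of dots in the name
  if [ "$dots" -ge "$ndots" ]; then
    echo "$name."                 # enough dots: literal name is tried first
  fi
  for domain in "$@"; do
    echo "$name.$domain."
  done
  if [ "$dots" -lt "$ndots" ]; then
    echo "$name."                 # otherwise the literal name is tried last
  fi
}

candidates website.default 5 default.svc.cluster.local svc.cluster.local cluster.local localdomain
```

The second candidate for website.default is website.default.svc.cluster.local., which matches the service. None of the candidates for website.foo names an existing service, so the lookup fails.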

External DNS

Required to discover…

  • …external resources used from the Kubernetes cluster
  • …external load-balancing service
  • …ingress and gateway services

Two options to integrate external DNS resolution…

  • ExternalDNS8 to synchronize with supported third-party DNS providers via API
  • External DNS zone via k8s_gateway9 …NS record for the delegation
    • …queries to sub-domain will be forwarded
    • …plugin responds as authoritative nameserver

Services Abstraction

Multi-pod service abstraction groups similar pods & load-balance traffic to them

# list all services in a cluster
kubectl get service

# create a load-balancer service object 
kubectl expose $object $name --type=LoadBalancer --name=$name

# remove a service object
kubectl delete service $name

Why use a service?

  • Pod groups
    • …all pods with the same label represent a service
    • …incoming traffic is load-balanced to all pods in a service
  • Service exposure …either cluster internal and/or external
  • Route external connections
    • …clients do not need to know individual pods
    • …single constant IP-address for a service

Overview

Kubernetes provides several service types to expose applications inside and outside a cluster…

  • Headless …simplest load-balancing (round-robin) by DNS
  • ExternalName …access to a service by external DNS name
  • ClusterIP …default for internal communications
  • NodePort …exposes a service on a static port on each node’s IP
    • …makes the service accessible outside of the cluster
    • …most basic way to perform external-to-service networking
  • LoadBalancer …standard for external-service networking
    • …assigns the service a public IP address
    • …external load balancer then directs traffic to the backend pods
  • Ingress …collection of routing rules surrounding external access to services

ClusterIP

ClusterIP — Reserve a static virtual IP address

  • …internal IP …reachable only within the cluster
  • …maintains the security boundaries of the cluster
  • Pod-to-service communication
    • …used for internal communications between pods and services
    • …traffic load-balanced within the cluster

Example

Create some pods via a deployment

>>> kubectl create deployment website --replicas=3 --image=httpd

# all pods have the `app=website` label
>>> kubectl get pod --show-labels
NAME                       READY   STATUS    RESTARTS   AGE    LABELS
website-5d755d9996-2h4c2   1/1     Running   0          3m1s   app=website,pod-template-hash=5d755d9996
website-5d755d9996-2vpdm   1/1     Running   0          3m1s   app=website,pod-template-hash=5d755d9996
website-5d755d9996-fck9z   1/1     Running   0          3m1s   app=website,pod-template-hash=5d755d9996

Create a ClusterIP service

>>> cat > service.yaml <<EOF
apiVersion: v1
kind: Service
metadata:
  name: website
spec:
  ports:
  - port: 80
    name: http
  selector:
    app: website
EOF
>>> kubectl apply -f service.yaml
# list the service including IP (only an internal IP)
>>> kubectl get service website       
NAME      TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
website   ClusterIP   10.96.194.111   <none>        80/TCP    2m11s

# IP address allocated from a predefined range
>>> kubectl cluster-info dump | grep -m 1 service-cluster-ip-range
                            "--service-cluster-ip-range=10.96.0.0/16"

>>> kubectl get endpointslice | grep ^website
website-5x9x7   IPv4          80      10.244.2.89,10.244.1.133,10.244.1.139   8m41s

Query the ClusterIP from a client pod:

# Start a client pod…
>>> kubectl run -it client --image busybox

# …send a GET request to the ClusterIP
>>> wget -qO - http://10.96.194.111
<html><body><h1>It works!</h1></body></html>

Check the service logs for the request

kubectl logs -l app=website | grep GET

Observe changes after:

# modify the replication scale
kubectl scale deployment website --replicas=2

# remove a pod
kubectl delete pod website-$name

Clean up…

kubectl delete pod/client service/website deploy/website

Node Port

NodePort — Forward traffic to specific port

  • Accessible from outside the cluster via node IP address
  • Each node forwards traffic to a specific port

Example

NodePort service manifest…

service.yaml
apiVersion: v1
kind: Service
metadata:
  name: website
spec:
  type: NodePort
  ports:
  - port: 80
    nodePort: 30080
    name: http
  selector:
    app: website

# apply the manifest and create the backing pods
kubectl apply -f service.yaml
kubectl create deployment website --replicas=3 --image=httpd

# later clean up
kubectl delete deployment/website service/website

Verify by connecting to the node port:

# identify the workers hosting the pods
>>> kubectl get pods -o wide    
NAME                       READY   STATUS    RESTARTS   AGE    IP             NODE            NOMINATED NODE   READINESS GATES
website-5d755d9996-4dcr2   1/1     Running   0          112s   10.244.2.45    delta-worker2   <none>           <none>
website-5d755d9996-jnd8z   1/1     Running   0          112s   10.244.1.37    delta-worker    <none>           <none>
website-5d755d9996-tcswb   1/1     Running   0          112s   10.244.1.157   delta-worker    <none>           <none>

# select one of the nodes and get the node IP-address
>>> kubectl get nodes delta-worker -o wide
NAME           STATUS   ROLES    AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                         KERNEL-VERSION           CONTAINER-RUNTIME
delta-worker   Ready    <none>   28m   v1.33.1   172.18.0.4    <none>        Debian GNU/Linux 12 (bookworm)   6.14.6-200.fc41.x86_64   containerd://2.1.1

# send a GET request to the node port
>>> curl -s 172.18.0.4:30080
<html><body><h1>It works!</h1></body></html>

Load Balancer

Requires an external load balancer with public IP

  • Accessible from outside via load balancer IP address
  • For production …distributes traffic over nodes

apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  type: LoadBalancer
  selector:
    app: my-app          # identify pods for this service
  ports:
    - port: 8080         # service port
      targetPort: 8080   # forward to container port

Configure session affinity for a service…

  • …clients are directed to the same pod every time
  • …defaults to None if not specified

spec:
  sessionAffinity: ClientIP
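
A complete manifest for context, as a sketch that reuses the website service from the earlier examples. The field is spelled sessionAffinity, and the optional sessionAffinityConfig sets how long a client sticks to a pod (10800 seconds is the default).

```shell
# Write a service with client-IP based session affinity
cat > website-affinity.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: website
spec:
  selector:
    app: website
  ports:
  - port: 80
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800   # stickiness window in seconds
EOF
```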

Pods backing a service cannot see the actual client IP address by default

  • …packet source IP is changed for cluster-internal routing
  • …SNAT (Source Network Address Translation) is performed on each packet
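
Where the client address matters, NodePort and LoadBalancer services can set externalTrafficPolicy to Local. A sketch, reusing the my-service and my-app names from the manifest in this section. Traffic is then delivered only to pods on the node that received it, which avoids the extra hop and the SNAT, at the cost of less even load distribution.

```shell
# Write a LoadBalancer service that preserves the client source IP
cat > my-service-local.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local   # keep client source IP, no second hop
  selector:
    app: my-app
  ports:
  - port: 8080
    targetPort: 8080
EOF
```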

Orchestration

Kubernetes orchestration of external load-balancer depends on environment:

Cloud-based cluster…

  • Network Load Balancer (NLB) for Amazon Elastic Kubernetes Service (EKS)
  • Standard Load Balancer for Azure Kubernetes Service (AKS)
  • Cloud Load Balancer for Google Kubernetes Engine (GKE)
  • LBaaS plugin for OpenStack
  • NSX ALB for VMware
  • …orchestrated by an in-cluster component called cloud-controller-manager

On-prem cluster…

  • …existing load-balancer appliances
  • …direct interaction with the physical network by cluster add-ons
    • ARP (for L2 integration)
    • BGP (for L3 integration)
  • Kubernetes hosted load-balancers…
    • MetalLB
      • ARP and BGP modes …custom user-space implementation
      • Configured with ConfigMap and custom CRD based operator
    • OpenELB (part of KubeSphere)
      • ARP and BGP modes
      • Configuration via CRDs
    • Kube-vip
      • ARP and BGP modes
      • Configured via flags, environment variables and ConfigMap
    • ServiceLB10 (aka Klipper, integrated with K3s)
      • Exposes LoadBalancer as host ports on all cluster nodes
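
As an illustration of the CRD-based configuration, a sketch for MetalLB in ARP (layer 2) mode. It assumes a MetalLB release that is configured through the metallb.io/v1beta1 resources and installed in the metallb-system namespace. The pool name and address range are hypothetical and must be free addresses on the node network.

```shell
# Write an address pool and announce it via layer 2 (ARP)
cat > metallb-l2.yaml <<'EOF'
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: demo-pool
  namespace: metallb-system
spec:
  addresses:
  - 172.18.255.200-172.18.255.250   # handed out to LoadBalancer services
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: demo-l2
  namespace: metallb-system
spec:
  ipAddressPools:
  - demo-pool
EOF
```

Once applied, a service of type LoadBalancer should receive an EXTERNAL-IP from the pool instead of staying in the pending state.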

Ingress

Ingress — Associate a URL with a backend service

  • Operates on the application layer of the network stack (HTTP)
  • Works in conjunction with Kubernetes Services and Endpoints
  • Why use ingress?
    • Path- & Host-based routing …typically a URL path
    • Multiple services can share a single IP-address
    • Manage SSL certificates and terminate SSL connections
    • Manage authentication and authorization
  • Terminology…
    • Ingress Resource — Routing rules directing external traffic to services
    • Ingress Controller — Implements the rules defined in the ingress resource
    • Backend Services — Services receiving the traffic directed by Ingress

Reverse Proxy

Ingress serves as a reverse proxy

  • …intermediary between client and server that forwards requests
  • Ingress Controllers are pods running within the Kubernetes cluster…
    • NGINX Ingress Controller11
    • Traefik12 HTTP reverse proxy and load balancer
    • HAProxy Ingress Controller13
  • …enforcing the rules set in the Ingress resources
  • Control flow of inbound requests and direct them to the appropriate service

Ingress Rules

Ingress resources contain one or more Ingress rules

  • Component of the Ingress resource that specifies the actual routing logic
  • Each Ingress rule specifies a set of conditions (like host and path) and the corresponding backend service to which the traffic should be directed
  • Path-Based Routing
    • Routing directs traffic based on the URL path
    • example.org/hello & example.org/world route to different services
  • Host-Based Routing
    • Routing directs traffic based on the hostname (or domain)
    • hello.example.org & world.example.org route to different services
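
Both routing styles in one Ingress resource, as a sketch. Assumptions: services named hello and world exist and listen on port 80, and an ingress controller registered under the class name nginx is installed (all hypothetical for this example).

```shell
# Write an Ingress with path-based and host-based rules
cat > example-ingress.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example
spec:
  ingressClassName: nginx
  rules:
  - host: example.org            # path-based: one host, two paths
    http:
      paths:
      - path: /hello
        pathType: Prefix
        backend:
          service:
            name: hello
            port:
              number: 80
      - path: /world
        pathType: Prefix
        backend:
          service:
            name: world
            port:
              number: 80
  - host: hello.example.org      # host-based: dedicated hostname
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: hello
            port:
              number: 80
EOF
```

Apply with kubectl apply -f example-ingress.yaml and inspect the rules with kubectl describe ingress example.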

Port Forward

Connect to a specific pod without going through a service…

  • …typically for debugging & testing individual pods
  • …notation is local port, colon followed by port in the pod

# forward a local network port to a port in the pod
kubectl port-forward $pod_name 30080:80   # local:remote

# forward to a port of a specific container (containers in a pod share
# one network namespace, so the port number selects the container)
kubectl port-forward $pod_name 38080:8080

# multiple ports (space-separated)
kubectl port-forward $pod_name 30080:80 30443:443

SSH tunneling to a node with access to a Kubernetes cluster

ssh -L 38080:localhost:38080 $user@$node
kubectl port-forward $pod_name 38080:8080

Use proxy management like kubefwd14