Layer 4: Zero-Trust Networking

Every connection is denied until you explicitly say it is allowed — no implicit trust, no open back doors.

Default-deny networking: only the paths you name exist, and pod traffic is encrypted.

What “zero trust” means in plain English

Most home networks work like a medieval castle: hard wall on the outside, but once you are inside, everyone trusts everyone. Kubernetes clusters shipped the same way — once a container was running, it could reach any other container on any port.

Zero trust flips that: nothing is allowed unless you wrote a rule that says it is. Every pod, every service, every direction of traffic has to be explicitly permitted. An attacker who breaks into one container can reach only the destinations that container was explicitly allowed to reach; every other connection is refused by the network itself before the destination ever sees a packet.

In plain English: Imagine every room in a building has its own locked door, with a security guard checking your ID at each door, not just at the front entrance. Zero-trust networking is that for containers. The rule book starts empty (“deny everything”) and you add entries one at a time.

This layer puts that rule book in place using Cilium — the software that replaces Kubernetes’ default network layer and enforces policy at the Linux kernel level using a technology called eBPF.

Why Cilium replaces what k3s shipped

k3s installs a network plugin called Flannel by default. Flannel connects containers together but knows nothing about security: it has no concept of “allow this, deny that.” The default stack also routes Service traffic through kube-proxy and iptables, an older Linux firewall mechanism that slows down as rules accumulate and is hard to reason about.

Cilium does everything Flannel does, and much more:

Capability	Flannel	Cilium
Connect pods across nodes	Yes	Yes
Enforce network policy (allow/deny)	No	Yes — L3/L4/L7
Inspect HTTP paths, DNS queries	No	Yes
Encrypt pod-to-pod traffic	No	Yes (WireGuard)
Replace kube-proxy (iptables)	No	Yes (eBPF)
Network traffic observability	No	Yes (Hubble)
Serve as the ingress controller	No	Yes (Gateway API)

Note: ingress-nginx, the most common Kubernetes ingress controller, reached official end-of-life in March 2026 and no longer receives security patches. Cilium’s built-in Gateway API support replaces it. Source: CNCF blog, January 2026

In plain English: eBPF is a way to run small, sandboxed programs inside the Linux kernel (the core of the operating system) without recompiling it. Cilium uses eBPF to intercept every network packet as it arrives at or leaves a container. It reads the policy you wrote (e.g. “frontend can talk to backend on port 8080, nothing else”) and enforces it in microseconds, in kernel space, before the packet travels anywhere.

Prerequisites

Layer 2 (k3s install) must have started k3s with:

bash

--flannel-backend=none --disable-network-policy --disable-kube-proxy

These flags tell k3s: “do not touch the network — we will bring our own.” If your k3s was started differently, re-install it with those flags before proceeding. The Layer 2 script handles this.
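One way to confirm the flags before continuing is to check the argument string the k3s server was started with. The helper below is a minimal sketch: the function name `missing_k3s_flags` is hypothetical and not part of the Layer 2 script, and it only inspects the string you pass in, so flags set through a k3s config file or written in a different form (for example `--flannel-backend none`) would need a separate check.

```shell
# Print each required flag that is absent from the given k3s argument string.
# missing_k3s_flags is a hypothetical helper, not part of the repository.
missing_k3s_flags() {
  args=" $1 "
  for flag in --flannel-backend=none --disable-network-policy --disable-kube-proxy; do
    case "$args" in
      *" $flag "*) ;;                 # flag present, nothing to report
      *) printf '%s\n' "$flag" ;;     # flag missing, print it
    esac
  done
}

# Example with a literal argument string (one flag left out on purpose):
missing_k3s_flags "server --flannel-backend=none --disable-kube-proxy"
# prints: --disable-network-policy
```

An empty result means all three flags were found in the string.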

Step 1 — Install Cilium via Helm

Cilium is installed with helm, the Kubernetes package manager.

On the host (root) — or run scripts/cluster/21-network-cilium.sh

# Add the Cilium Helm repository
helm repo add cilium https://helm.cilium.io/
helm repo update

# The node's IP address — Cilium needs this to reach the API server
export API_SERVER_IP=$(hostname -I | awk '{print $1}')
export API_SERVER_PORT=6443

helm install cilium cilium/cilium \
  --namespace kube-system \
  --version 1.17.4 \
  --values /home/user/hardened-k3s/scripts/config/network/cilium-values.yaml \
  --set k8sServiceHost=${API_SERVER_IP} \
  --set k8sServicePort=${API_SERVER_PORT}

Wait for Cilium to be ready:

Verify Cilium is running

kubectl -n kube-system rollout status daemonset/cilium --timeout=120s
kubectl exec -n kube-system ds/cilium -- cilium-dbg status | grep "KubeProxyReplacement"
# Expected: KubeProxyReplacement: True

Important: Until Cilium is fully running, pods cannot communicate. Do not deploy workloads between helm install and the completion of the rollout status check.

What the Helm values do

See scripts/config/network/cilium-values.yaml for the full annotated file. Key settings:

Setting	Value	Why
`kubeProxyReplacement`	`true`	Cilium handles all service routing via eBPF — no iptables
`bpf.masquerade`	`true`	Source-NAT via eBPF instead of iptables MASQUERADE rules
`loadBalancer.algorithm`	`maglev`	Consistent hashing: a given connection maps to the same backend from every node, and few connections move when backends change
`encryption.type`	`wireguard`	Encrypt all pod-to-pod traffic automatically
`encryption.nodeEncryption`	`true`	Also encrypt node-to-node and node-to-pod traffic, not just pod-to-pod
`hubble.relay.enabled`	`true`	Aggregate flow logs across nodes
`gatewayAPI.enabled`	`true`	Enable the built-in ingress controller
`operator.replicas`	`1`	Single-node cluster; one operator replica is correct
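As a rough picture of how the dotted setting names in the table nest inside a values file, the sketch below prints the equivalent YAML. It is not the repository's cilium-values.yaml, which may contain more settings. The function name `render_cilium_values` is hypothetical, and `encryption.enabled: true` is an assumption: it is not in the table, but the chart needs encryption switched on for `encryption.type` to take effect.

```shell
# Print a values file equivalent to the settings table (a sketch only).
render_cilium_values() {
  cat <<'EOF'
kubeProxyReplacement: true
bpf:
  masquerade: true
loadBalancer:
  algorithm: maglev
encryption:
  enabled: true          # assumption: not listed in the table
  type: wireguard
  nodeEncryption: true
hubble:
  relay:
    enabled: true
gatewayAPI:
  enabled: true
operator:
  replicas: 1
EOF
}
```

Running `render_cilium_values > /tmp/values-sketch.yaml` and comparing the result with the annotated file is a quick way to see which keys the real file adds.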

Step 2 — Transparent encryption between pods (WireGuard)

In plain English: Normally, when two containers talk to each other across nodes, the packets travel the physical network unencrypted, so anyone on that network segment can read them. Cilium’s WireGuard mode wraps every pod-to-pod packet that travels between nodes in an encrypted tunnel automatically. You do not change any application code; Cilium handles it invisibly.

When encryption.type=wireguard is set, Cilium automatically:

Generates a WireGuard key pair on each node at startup.
Distributes public keys via CiliumNode annotations — no manual key management.
Creates an encrypted tunnel (UDP/51871) between every pair of nodes. Note: UDP/51871 is Cilium’s node-to-node tunnel port. It is carried over the node network, so a multi-node cluster must allow it between nodes in the host firewall. It is entirely separate from the admin VPN (wg0, UDP/51820); in a single-host topology there is no node-to-node traffic, so only UDP/51820 needs a firewall hole.
Encrypts pod-to-pod packets that cross between nodes. Traffic between pods on the same node never leaves the host and does not pass through the tunnel.

Why WireGuard over IPsec: many Raspberry Pi models and other low-cost ARM64 boards lack hardware AES acceleration (the ARM counterpart of x86 AES-NI). IPsec uses AES-GCM, which is slow without hardware offload. WireGuard uses ChaCha20-Poly1305, which is fast in software. On such hardware, WireGuard typically outperforms IPsec. Source: Berops traffic encryption benchmarks

Verify encryption is active:

Confirm WireGuard encryption

kubectl exec -n kube-system ds/cilium -- cilium-dbg encrypt status
# Expected output contains (capitalisation may differ between Cilium versions):
#   Encryption: Wireguard
#   Number of peers: N (where N = number of other nodes; 0 on a single node)

Step 3 — Default-deny: block everything, then open what you need

In plain English: Imagine a building where every door is locked by default. The policy you are about to apply is like replacing every lock in the entire building simultaneously. Nothing gets through until you hand out a specific key.

Apply the cluster-wide default-deny policy first:

Apply default-deny and baseline DNS allow

kubectl apply -f /home/user/hardened-k3s/scripts/config/network/ccnp-default-deny.yaml
kubectl apply -f /home/user/hardened-k3s/scripts/config/network/allow-dns.yaml

Caution: Apply these two files together in one command, or apply DNS-allow immediately after default-deny. If you apply default-deny alone and wait, existing pods will lose DNS resolution and start failing within seconds.
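To make the "one command" advice harder to get wrong, the two files can be applied through a small wrapper that refuses to run unless both exist. This is a minimal sketch: `apply_baseline_policies` and the `KUBECTL` override variable are hypothetical names, and the file names are the ones used above.

```shell
# Apply default-deny and the DNS exception in a single kubectl invocation.
# Refuses to run if either file is missing, so default-deny never lands alone.
# KUBECTL can be overridden (e.g. KUBECTL=echo) to preview the command.
apply_baseline_policies() {
  dir=$1
  deny="$dir/ccnp-default-deny.yaml"
  dns="$dir/allow-dns.yaml"
  for f in "$deny" "$dns"; do
    if [ ! -r "$f" ]; then
      echo "missing policy file: $f" >&2
      return 1
    fi
  done
  ${KUBECTL:-kubectl} apply -f "$deny" -f "$dns"
}
```

A typical call would be `apply_baseline_policies /home/user/hardened-k3s/scripts/config/network`; prefixing it with `KUBECTL=echo` prints the kubectl arguments instead of running them.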

Why DNS must be the first exception

DNS is how containers look up the address of other services (e.g. my-database.my-namespace.svc.cluster.local → 10.43.x.x). Without DNS, your application code will fail with “name not resolved” errors even when the destination service is running. The allow-dns.yaml permits all pods to reach the cluster’s DNS server (kube-dns in kube-system) on port 53.

Step 4 — How to allow a specific port or IP

Every application you deploy needs its own allow rules. The file scripts/config/network/example-allow-port-ip.yaml is a heavily commented template — copy it, rename it, and adjust the three values shown.

The three things you always specify:

Which pods this rule applies to — identified by their labels (e.g. app: my-api)
What direction — ingress (traffic coming in) or egress (traffic going out)
What is allowed — a port number, a CIDR (IP range), or both

In plain English: A pod label is like a name badge. You write rules like “anyone wearing a badge that says app: my-api is allowed to receive traffic on port 8080 from anyone wearing a badge that says app: my-frontend.” The rule sticks to the badge, not to the server’s IP address.

Example: allow your frontend to call your API on port 8080:

Allow frontend → API on port 8080

apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-frontend-to-api
  namespace: my-app
spec:
  endpointSelector:
    matchLabels:
      app: my-api              # Rule applies to pods labelled app=my-api
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: my-frontend   # Only allow traffic from pods labelled app=my-frontend
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
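Because only the namespace, the two labels, and the port change between applications, the policy above can also be generated from those values. The sketch below is one way to do that; `render_ingress_allow` is a hypothetical helper, it assumes both sides are selected by an `app` label, and it derives the policy name from the two label values.

```shell
# Print a CiliumNetworkPolicy that lets pods labelled app=<source> reach pods
# labelled app=<target> on one TCP port. All four arguments are placeholders.
render_ingress_allow() {
  ns=$1; target=$2; source_app=$3; port=$4
  case "$port" in
    ''|*[!0-9]*) echo "port must be numeric: $port" >&2; return 1 ;;
  esac
  cat <<EOF
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-${source_app}-to-${target}
  namespace: ${ns}
spec:
  endpointSelector:
    matchLabels:
      app: ${target}
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: ${source_app}
      toPorts:
        - ports:
            - port: "${port}"
              protocol: TCP
EOF
}
```

For example, `render_ingress_allow my-app my-api my-frontend 8080 | kubectl apply -f -` would apply a policy with the same selectors and port as the example above.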

Example: allow a pod to reach a specific external IP range (e.g. your monitoring server at 192.168.1.50):

Allow egress to a specific CIDR

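apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-egress-to-monitoring   # hypothetical name
  namespace: my-app                  # hypothetical namespace
spec:
  endpointSelector:
    matchLabels:
      app: my-app
  egress:
    - toCIDR:
        - 192.168.1.50/32      # Single IP. Use /24 for a subnet, /32 for one host.
      toPorts:
        - ports:
            - port: "9090"
              protocol: TCP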

See scripts/config/network/example-allow-port-ip.yaml for the full annotated template with every option explained.
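A mistyped CIDR is an easy way to end up with a policy that is rejected or matches the wrong range, so it can help to sanity-check the value before putting it in a policy. The function below is a minimal sketch in plain shell: `is_ipv4_cidr` is a hypothetical helper, it covers IPv4 only, and it checks the format rather than whether the host bits agree with the prefix length.

```shell
# Succeeds if the argument looks like an IPv4 CIDR (a.b.c.d/len) with each
# octet in 0-255 and a prefix length in 0-32. IPv6 is out of scope here.
is_ipv4_cidr() {
  case "$1" in
    */*) addr=${1%/*}; len=${1#*/} ;;
    *) return 1 ;;                          # no prefix length given
  esac
  case "$len" in ''|*[!0-9]*) return 1 ;; esac
  [ "$len" -le 32 ] || return 1
  case "$addr" in *[!0-9.]*) return 1 ;; esac
  old_ifs=$IFS; IFS=.
  set -- $addr                              # split the address on dots
  IFS=$old_ifs
  [ "$#" -eq 4 ] || return 1
  for octet in "$@"; do
    case "$octet" in ''|*[!0-9]*) return 1 ;; esac
    [ "$octet" -le 255 ] || return 1
  done
}
```

`is_ipv4_cidr 192.168.1.50/32` succeeds, while `is_ipv4_cidr 192.168.1.50` fails because the prefix length is missing.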

Step 5 — How to allow egress to specific hostnames (FQDN egress)

Standard network policies work with IP addresses. But cloud services (GitHub, S3, your upstream API provider) have dynamic IPs that change without warning. Cilium’s DNS/FQDN policy solves this: you name the hostnames you want to allow, and Cilium watches DNS responses to learn their current IPs automatically.

In plain English: Instead of writing a rule that says “allow traffic to 140.82.112.3” (an IP that GitHub uses today but might change tomorrow), you write “allow traffic to api.github.com”. Cilium watches every DNS lookup your pod makes and automatically updates the allowed IP list when the address changes.

How it works (two-rule pattern): You need exactly two rules — always both together:

A rule allowing the pod to ask DNS about the specific hostname.
A rule allowing the pod to actually reach the resolved IP (by FQDN name, not IP).

Allow egress to api.github.com on port 443

apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-egress-github-api   # hypothetical name
  namespace: my-app               # hypothetical namespace
spec:
  endpointSelector:
    matchLabels:
      app: my-app
  egress:
    # Rule 1: allow DNS lookups, but only for the specific hostname(s) you need
    - toEndpoints:
        - matchLabels:
            "k8s:io.kubernetes.pod.namespace": kube-system
            "k8s:k8s-app": kube-dns
      toPorts:
        - ports:
            - port: "53"
              protocol: ANY                      # DNS uses UDP and falls back to TCP
          rules:
            dns:
              - matchName: "api.github.com"      # exact hostname
              # - matchPattern: "*.s3.amazonaws.com"  # wildcard, use sparingly
    # Rule 2: allow the actual outbound connection to the resolved IP(s)
    - toFQDNs:
        - matchName: "api.github.com"
      toPorts:
        - ports:
            - port: "443"
              protocol: TCP

Important: If you skip Rule 1 (the DNS allow), Cilium never sees the DNS response and never learns which IP belongs to the hostname, so Rule 2 will silently match nothing. Both rules are required.

See scripts/config/network/example-fqdn-egress.yaml for the annotated template.
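Since the two rules must always travel together, one option is to generate them from a single hostname so they cannot drift apart. The sketch below does that; `render_fqdn_egress` is a hypothetical helper, it assumes pods are selected by an `app` label, and it uses `protocol: ANY` on port 53 so that DNS over both UDP and TCP is covered.

```shell
# Print a CiliumNetworkPolicy with both halves of the FQDN pattern:
# a DNS rule and a toFQDNs rule for the same hostname.
render_fqdn_egress() {
  ns=$1; app=$2; host=$3; port=$4
  cat <<EOF
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-egress-${app}-fqdn
  namespace: ${ns}
spec:
  endpointSelector:
    matchLabels:
      app: ${app}
  egress:
    # Rule 1: DNS lookups for this hostname only
    - toEndpoints:
        - matchLabels:
            "k8s:io.kubernetes.pod.namespace": kube-system
            "k8s:k8s-app": kube-dns
      toPorts:
        - ports:
            - port: "53"
              protocol: ANY
          rules:
            dns:
              - matchName: "${host}"
    # Rule 2: the connection itself, matched by hostname
    - toFQDNs:
        - matchName: "${host}"
      toPorts:
        - ports:
            - port: "${port}"
              protocol: TCP
EOF
}
```

`render_fqdn_egress my-app my-app api.github.com 443 | kubectl apply -f -` would produce a policy equivalent to the example above.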

Step 6 — Hubble: watch what is actually happening

Hubble is Cilium’s built-in observability layer. It records every network flow: which pod talked to which, on what ports, whether the connection was allowed or dropped, and, for traffic inspected at L7 (HTTP, DNS, gRPC), what method or query was used. Hubble runs inside the Cilium agent on every node; Hubble Relay aggregates flows cluster-wide.

Real-time flow watching

# Install the cilium CLI (if not already present)
CILIUM_CLI_VERSION=$(curl -fsSL https://raw.githubusercontent.com/cilium/cilium-cli/main/stable.txt)
# Map uname -m to the architecture string used in the release filename
case "$(uname -m)" in
  x86_64)  CLI_ARCH=amd64 ;;
  aarch64) CLI_ARCH=arm64 ;;
  *) echo "unsupported architecture: $(uname -m)" >&2 ;;
esac
curl -fL --remote-name-all \
  "https://github.com/cilium/cilium-cli/releases/download/${CILIUM_CLI_VERSION}/cilium-linux-${CLI_ARCH}.tar.gz"
tar -xzf "cilium-linux-${CLI_ARCH}.tar.gz" -C /usr/local/bin
rm "cilium-linux-${CLI_ARCH}.tar.gz"

# The hubble commands below need the separate hubble CLI binary, published on
# the cilium/hubble GitHub releases page; install it the same way.

# Forward the Hubble API to your laptop
cilium hubble port-forward &

# Watch all dropped flows in real time (your most useful debug tool)
hubble observe --verdict DROPPED --follow

# Watch flows for a specific namespace
hubble observe --namespace my-app --follow

# Open the browser UI (needs the Hubble UI enabled in the Helm values)
cilium hubble ui    # opens http://localhost:12000

Tip: When you apply a new allow policy and traffic still does not flow, run hubble observe --verdict DROPPED --follow in one terminal and reproduce the connection in another. Hubble shows each dropped flow with its source, destination, port, and drop reason, which is far faster than reading Cilium logs.
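The two filters shown above can be combined, which keeps the output readable on a busy cluster. The wrapper below is a small sketch that uses only the flags already shown in this step; `watch_drops` and the `HUBBLE` override variable are hypothetical names.

```shell
# Watch dropped flows, optionally limited to one namespace.
# HUBBLE can be overridden (e.g. HUBBLE=echo) to preview the command.
watch_drops() {
  if [ -n "$1" ]; then
    ${HUBBLE:-hubble} observe --verdict DROPPED --namespace "$1" --follow
  else
    ${HUBBLE:-hubble} observe --verdict DROPPED --follow
  fi
}
```

`watch_drops my-app` follows drops in the my-app namespace only; with no argument it follows drops cluster-wide.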

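Enable L7 visibility (HTTP paths, DNS query names) for a pod. The proxy-visibility annotation shown here has been deprecated upstream and may have no effect on recent Cilium releases; in that case, L7 flows become visible once a CiliumNetworkPolicy with L7 rules (such as the dns rule in Step 5) selects the pod: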

Enable L7 inspection for a pod

kubectl annotate pod <pod-name> \
  policy.cilium.io/proxy-visibility="<Egress/53/UDP/DNS>,<Egress/443/TCP/HTTP>"

Step 7 — Admin VPN: the only way to reach kubectl

The Kubernetes API server (port 6443) is the master control plane. Anyone who can reach it with a valid credential owns your cluster. Never expose 6443 to the internet. This step puts a WireGuard VPN in front of it so that even if your firewall is misconfigured, the API server is unreachable without first completing the VPN handshake.

Hardened

WireGuard is a modern VPN built into the Linux kernel. It works differently from OpenVPN or IPsec:

Configuration is minimal — two files (server config, client config) and two key pairs.
The VPN only exists when both sides know each other’s public key.
It is completely silent on the network until a valid handshake arrives — scanners see nothing.

In plain English: Think of WireGuard as a secret door that is invisible unless you knock in exactly the right way (your private key). Port scanners and attackers see nothing at UDP/51820: no banner, no response, no evidence the port is open. Only your laptop, with the matching private key, can get in.

Run the host setup script:

On the host (root)

bash /home/user/hardened-k3s/scripts/host/14-wireguard.sh

The script will:

Install wireguard-tools.
Generate the server key pair under /etc/wireguard/ (mode 600, root-only).
Write /etc/wireguard/wg0.conf from the template.
Enable and start wg-quick@wg0.
Print the client config block you copy to your laptop.

Then configure your laptop. Install WireGuard from wireguard.com/install, add a new tunnel, and paste the client config block printed by the script. The client config will look like:

Client config (printed by the script — copy to your laptop)

[Interface]
Address = 10.100.0.2/32
PrivateKey = <CLIENT_PRIVATE_KEY — generated by you on your laptop>
DNS = 10.100.0.1

[Peer]
PublicKey = <SERVER_PUBLIC_KEY — printed by the script>
Endpoint = <YOUR_PUBLIC_IP>:51820
AllowedIPs = 10.100.0.0/24, 10.42.0.0/16, 10.43.0.0/16
PersistentKeepalive = 25

Generate your client key pair on your laptop, not on the server:

bash

# umask 077 keeps the key files readable only by you
(umask 077; wg genkey | tee client_private.key | wg pubkey > client_public.key)

Note: Give only your public key to the server config. The private key never leaves your laptop.
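If you prefer not to paste the private key by hand, the client config can be assembled on the laptop from the locally generated key file. The sketch below reuses the addresses from the example config above; `render_wg_client_conf` is a hypothetical helper, and the server public key and endpoint address are whatever the host script printed.

```shell
# Print a client config using the values from the example above.
# The private key is read from a local file created with `wg genkey`.
render_wg_client_conf() {
  key_file=$1; server_pubkey=$2; endpoint=$3
  [ -r "$key_file" ] || { echo "cannot read $key_file" >&2; return 1; }
  cat <<EOF
[Interface]
Address = 10.100.0.2/32
PrivateKey = $(cat "$key_file")
DNS = 10.100.0.1

[Peer]
PublicKey = ${server_pubkey}
Endpoint = ${endpoint}:51820
AllowedIPs = 10.100.0.0/24, 10.42.0.0/16, 10.43.0.0/16
PersistentKeepalive = 25
EOF
}
```

Because the output contains the private key, write it with a restrictive umask, for example `(umask 077; render_wg_client_conf client_private.key "<SERVER_PUBLIC_KEY>" "<YOUR_PUBLIC_IP>" > wg0-client.conf)`.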

After the VPN is up, point kubectl at the VPN address:

Configure kubectl to use the VPN address

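# Assumes your kubeconfig already has a cluster and a context named "homelab",
# and that the API server certificate lists 10.100.0.1 as a SAN (k3s --tls-san).
kubectl config set-cluster homelab --server=https://10.100.0.1:6443
kubectl config use-context homelab
kubectl get nodes    # should work only when VPN is connected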

Add additional admin peers (e.g. a second machine) by adding [Peer] blocks to /etc/wireguard/wg0.conf and running wg addconf wg0 <(wg-quick strip wg0) — no restart required.
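A [Peer] block on the server side needs only the new machine's public key and the VPN address assigned to it. The sketch below prints such a block; `render_peer_block` is a hypothetical helper, and its address check is deliberately crude (a prefix match against the 10.100.0.0/24 range used in this guide).

```shell
# Print a [Peer] block for one additional admin machine.
# The address must be unused inside 10.100.0.0/24 (10.100.0.2 is already
# taken by the first client in the example config).
render_peer_block() {
  peer_pubkey=$1; peer_ip=$2
  case "$peer_ip" in
    10.100.0.*) ;;
    *) echo "peer address must be inside 10.100.0.0/24: $peer_ip" >&2; return 1 ;;
  esac
  printf '\n[Peer]\nPublicKey = %s\nAllowedIPs = %s/32\n' "$peer_pubkey" "$peer_ip"
}
```

For a second machine, `render_peer_block "<PEER_PUBLIC_KEY>" 10.100.0.3 >> /etc/wireguard/wg0.conf` appends the block, after which the wg command above loads it.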

Patching the nftables firewall

The Layer 0 firewall (/etc/nftables.conf) has a placeholder WG_PORT_PLACEHOLDER. The Layer 4 script replaces it with 51820. If you run the script you do not need to edit nftables manually. To verify:

bash

nft list ruleset | grep 51820
# Should show:  udp dport 51820 accept

Step 8 — TLS ingress via Cilium Gateway API

In plain English: An ingress controller is the front door of your cluster: it receives web traffic from the internet (ports 80 and 443) and routes it to the right internal service. Cilium’s Gateway API is that front door, and cert-manager is the locksmith that automatically gets and renews your TLS certificates from Let’s Encrypt.

Install Gateway API CRDs

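Gateway API is a Kubernetes standard; its resource definitions must be installed before Cilium can use them. Cilium was installed in Step 1, before these CRDs existed, so after applying them restart the Cilium operator and agents (kubectl -n kube-system rollout restart deployment/cilium-operator daemonset/cilium) so that they pick the CRDs up. Install the CRDs with: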

Install Gateway API CRDs (standard channel)

kubectl apply -f \
  https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.2.0/standard-install.yaml

Install cert-manager

cert-manager watches for Certificate and Issuer resources and automatically requests, renews, and stores TLS certificates:

Install cert-manager

helm repo add jetstack https://charts.jetstack.io
helm repo update
helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --version v1.20.2 \
  --set crds.enabled=true

Apply the Gateway and ClusterIssuer

Apply gateway + TLS config

# Edit the email address in the file first
kubectl apply -f /home/user/hardened-k3s/scripts/config/network/gateway-tls.yaml

Then label any namespace whose services should be reachable through the gateway:

bash

kubectl label namespace my-app gateway-access=true

And create an HTTPRoute pointing at your service:

HTTPRoute — route traffic to my-app

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: my-app
  namespace: my-app
spec:
  parentRefs:
    - name: homelab-gateway
      namespace: ingress
      sectionName: https
  hostnames:
    - "myapp.example.com"      # Replace with your actual domain
  rules:
    - backendRefs:
        - name: my-app-svc
          port: 8080

Tip: Let’s Encrypt HTTP-01 challenge requires that port 80 be reachable from the internet. If your ISP blocks port 80, use DNS-01 challenge instead; cert-manager supports this via DNS provider plugins (Cloudflare, Route53, etc.). See cert-manager DNS01 docs.

CrowdSec brute-force defence

CrowdSec watches log files for attack patterns (SSH brute force, HTTP scanning, credential stuffing) and installs block rules via an nftables bouncer. It also shares threat intelligence with a community blocklist.

Install CrowdSec and nftables bouncer

curl -s https://packagecloud.io/install/repositories/crowdsec/crowdsec/script.deb.sh | bash
apt-get install -y crowdsec crowdsec-firewall-bouncer-nftables

# Install the SSH scenario
cscli collections install crowdsecurity/sshd

# Check active decisions (IPs currently banned)
cscli decisions list

Putting it all together: the network stack from bottom to top

text

┌──────────────────────────────────────────────────────────┐
│  Internet                                                │
│                         │ UDP/51820 WireGuard            │
│                         │ TCP/80,443 web                 │
├─────────────────────────┼────────────────────────────────┤
│  nftables (host)        │ default-drop; explicit         │
│  + CrowdSec bouncer     │ allows only                    │
├─────────────────────────┼────────────────────────────────┤
│  WireGuard wg0          │ admin VPN (kubectl/SSH)        │
│  (host kernel)          │ inside tunnel only             │
├─────────────────────────┼────────────────────────────────┤
│  Cilium eBPF (kernel)   │ all pod traffic intercepted    │
│  • kube-proxy replaced  │ CiliumClusterwideNetworkPolicy │
│  • WireGuard encryption │ default-deny; explicit rules   │
│  • Gateway API (Envoy)  │ TLS termination port 443       │
│  • Hubble relay         │ per-flow observability         │
└─────────────────────────┴────────────────────────────────┘

Checklist

[ ] k3s started with --flannel-backend=none --disable-network-policy --disable-kube-proxy
[ ] Cilium installed via Helm; cilium-dbg status shows KubeProxyReplacement: True
[ ] WireGuard encryption active: cilium-dbg encrypt status shows Encryption: WireGuard
[ ] Default-deny policy applied cluster-wide
[ ] DNS allow applied (pods can resolve internal service names)
[ ] Host WireGuard VPN running: wg show wg0 shows interface and peer
[ ] kubectl configured to use VPN address 10.100.0.1:6443
[ ] Port 6443 reachable only from 10.100.0.0/24 (nftables rule in place)
[ ] Hubble relay running; hubble observe returns flows
[ ] Gateway API CRDs installed; cert-manager running
[ ] TLS certificate issued by Let’s Encrypt for your domain
[ ] CrowdSec installed with nftables bouncer active
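Several checklist items map directly onto commands already used in this layer, so they can be scripted. The helper below is a minimal sketch of that idea: `check` and the `failures` counter are hypothetical names, and the commented example calls reuse commands from earlier steps.

```shell
# Run one checklist item and print PASS or FAIL with its description.
# Usage: check "description" command [args...]
failures=0
check() {
  desc=$1; shift
  if "$@" >/dev/null 2>&1; then
    printf 'PASS  %s\n' "$desc"
  else
    printf 'FAIL  %s\n' "$desc"
    failures=$((failures + 1))
  fi
}

# Example calls (run on the host, as root):
#   check "Cilium DaemonSet rolled out" \
#     kubectl -n kube-system rollout status daemonset/cilium --timeout=10s
#   check "Host WireGuard VPN running" wg show wg0
#   check "nftables allows UDP/51820" sh -c 'nft list ruleset | grep -q 51820'
#   echo "$failures check(s) failed"
```

Each call prints one line, and `failures` holds the number of failed items at the end.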

What this layer bought you

Encrypted transit between nodes. Every pod-to-pod packet that crosses the network between nodes is wrapped in WireGuard encryption. A compromised switch or an attacker on the same network segment sees only ciphertext.

Default-deny networking. Every new workload you deploy is isolated until you write a rule. Lateral movement from a compromised container requires breaking the policy engine, not just finding an open port.

FQDN egress control. Your pods can reach api.github.com but not exfil.attacker.com, even if both resolve to IPs in the same range. DNS-aware policy blocks connections to hosts that are not on the allow list, which makes data exfiltration much harder.

Invisible API server. The Kubernetes API is reachable only through the WireGuard VPN. An attacker scanning your public IP from the internet sees the web ports (80 and 443) and nothing else: UDP port 51820 stays completely silent until a peer presents the right key.

Flow observability before an incident. Hubble records every allowed and dropped flow. When something breaks — or when you are investigating a security event — you have a per-second record of what talked to what, what was blocked, and what DNS name each IP resolved from.

No ingress-nginx. A component with a history of serious CVEs, and now without security patches, is gone; Cilium’s Gateway API implementation is maintained as part of an active CNCF Graduated project.