Layer 2: Hardened K3s
Install Kubernetes so locked-down that it passes the CIS security benchmark out of the box — every secret encrypted, every door closed.
By the end of this chapter you will have a running Kubernetes cluster where: secrets are encrypted in the database before they ever touch disk, every container is forced into a security sandbox, an audit trail records every sensitive action, and the cluster’s control port is invisible to the public internet. You will also know how to prove all of this with an automated scanner.
What is k3s, and why use it here?
Kubernetes is the software that decides which containers run on which machines, restarts them when they crash, and handles networking between them. A full Kubernetes installation has a dozen separate programs. k3s is Rancher’s single-binary packaging of all of them — the same technology, a fraction of the footprint. On a homelab machine it starts in seconds and uses a few hundred megabytes of RAM instead of several gigabytes.
More importantly for this guide: k3s ships with several hardening defaults that vanilla Kubernetes does not. We will build on those defaults rather than work around them.
Kubernetes is the manager of your containers — it knows where everything is running and makes sure it keeps running. K3s is a smaller, pre-configured version of that manager, designed for machines that aren’t enormous cloud servers. Think of it as the same job title, but a leaner employee who already knows the security rules.
What k3s gives you for free
Before adding anything, k3s already does:
- RBAC enabled by default. Every action inside the cluster requires an explicit permission. Nothing works by default except what k3s itself needs to function.
- Node restriction — a Kubernetes node (the machine running containers) can only see and modify its own resources, not those of other nodes.
- Kubelet client-certificate authentication. The kubelet (the per-node agent) proves its identity with a certificate before the API server talks to it.
- Bound service account tokens. When a pod needs to talk to Kubernetes itself, it gets a short-lived, scoped token — not an old-style permanent secret.
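RBAC is a permission system: every user and every program inside the cluster has to show its badge before doing anything. k3s ships with RBAC switched on. The bare kube-apiserver binary does not: left unconfigured, it authorizes every request (AlwaysAllow), so a hand-built cluster has to enable RBAC explicitly, as installers such as kubeadm do.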
What needs manual work
The gaps that remain — each closed in this chapter:
| Gap | Why it matters |
|---|---|
| Secrets encryption at rest | Without it, anyone who copies the database gets every password in plain text |
| Audit logging | Without it you have no record of who did what |
| Pod security enforcement | Without it any container can run as root and escape to the host |
| Read-only port closed | CVE-2025-46599 — left open, exposes every pod’s environment variables with zero auth |
| TLS cipher restriction | Prevents downgrade attacks to weak encryption |
| Unused components disabled | Smaller attack surface |
| Default service account tokens revoked | Prevents containers from talking to Kubernetes when they have no reason to |
The one rule you cannot break: secrets encryption at first boot
secrets-encryption: true must be set before k3s runs for the first time. You cannot enable it on an existing cluster without a full re-encryption procedure that risks data loss. The install script enforces this. Do not run k3s manually before running the script.
When secrets encryption is on, every Kubernetes Secret object is encrypted with secretbox (XSalsa20-Poly1305) before being written to the etcd database — this is the k3s default provider since v1.32.4, and is what this guide’s v1.36 build uses. An attacker who steals a copy of the database gets ciphertext, not your actual secrets. The encryption key is stored separately on the server and can be rotated without restarting the cluster.
Think of the database as a filing cabinet. Without encryption, the filing cabinet stores documents in plain text — anyone who picks the lock reads everything. With secrets encryption, the filing cabinet stores scrambled text, and the decoder ring lives somewhere else. Stealing the cabinet alone gets you nothing.
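The first-boot rule can be checked mechanically before anything is installed. The following is a minimal sketch, not the guide's actual install script: `preflight_first_boot` is a hypothetical helper name, and it assumes that an existing `db` directory under the k3s server data directory means k3s has already booted once.

```shell
# Hypothetical pre-flight check (illustrative only, not the real install script).
# Usage: preflight_first_boot CONFIG_FILE SERVER_DATA_DIR
preflight_first_boot() {
  local config="$1" data_dir="$2"

  # The flag must be present and set to true (a trailing comment is tolerated).
  if ! grep -Eq '^[[:space:]]*secrets-encryption:[[:space:]]*true[[:space:]]*(#.*)?$' "$config" 2>/dev/null; then
    echo "secrets-encryption: true is missing from $config" >&2
    return 1
  fi

  # Assumption: a db directory means k3s has already created its datastore.
  if [ -e "$data_dir/db" ]; then
    echo "existing datastore found under $data_dir/db" >&2
    return 2
  fi

  echo "ok: safe to install"
}
```

On a real host this would be called as `preflight_first_boot /etc/rancher/k3s/config.yaml /var/lib/rancher/k3s/server`. Exit code 1 means the flag is missing, 2 means k3s has already run.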
Layer 0 dependency: kernel parameters
protect-kernel-defaults: true in the k3s config tells the kubelet: “before you do anything, check that the Linux kernel is configured the way you expect. If it isn’t, refuse to start.” This is a safety net — it catches machines where someone changed a kernel setting that Kubernetes relies on.
The required kernel settings are written by Layer 0 (/etc/sysctl.d/90-kubelet.sysctl.conf). The install script checks they are applied before touching k3s. If you run this script on a fresh machine without running Layer 0 first, it will stop and tell you.
It is like a car that checks its own oil level before the engine will start. The kernel parameters are the oil. Layer 0 put the oil in. If the oil is missing, k3s refuses to move.
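To see what such a check might look like, here is a rough sketch that compares each `key=value` line in a sysctl file against the live value under `/proc/sys`. The function name `check_sysctls` and the optional second argument (an alternative root, useful for dry runs) are assumptions for illustration; the real install script may do this differently.

```shell
# Hypothetical helper: compare a sysctl.d file against the running kernel.
# Usage: check_sysctls FILE [PROC_ROOT]   (PROC_ROOT defaults to /proc/sys)
check_sysctls() {
  local file="$1" root="${2:-/proc/sys}" rc=0 key want have
  while IFS='=' read -r key want || [ -n "$key" ]; do
    # Strip whitespace around "key = value".
    key=$(printf '%s' "$key" | tr -d '[:space:]')
    want=$(printf '%s' "$want" | tr -d '[:space:]')
    # Skip blank lines and comments.
    case "$key" in ''|'#'*|';'*) continue ;; esac
    # vm.overcommit_memory lives at <root>/vm/overcommit_memory, and so on.
    have=$(cat "$root/$(printf '%s' "$key" | tr '.' '/')" 2>/dev/null || true)
    if [ "$have" != "$want" ]; then
      echo "MISMATCH $key: want=$want have=${have:-unset}"
      rc=1
    fi
  done < "$file"
  return "$rc"
}
```

Running `check_sysctls /etc/sysctl.d/90-kubelet.sysctl.conf` prints one line per mismatch and returns non-zero if any setting differs. The sketch only handles single-token values, which is enough for integer settings.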
The datastore: embedded etcd
k3s ships two datastore options: SQLite (simple, single-node only) and embedded etcd (production-grade, HA-ready). We use embedded etcd (cluster-init: true), because:
- The CIS hardening guide assumes etcd — some benchmark checks only apply to etcd.
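- k3s encrypts the sensitive bootstrap data held in etcd (including the cluster CA private keys) with an AES-256 key derived from the cluster join token, and that protection carries over into snapshots.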
- If you ever add a second node for high availability, etcd supports it; SQLite does not.
SQLite is a basic notebook. etcd is a bank vault with a backup system and a combination lock. Both store the same data; one is ready for a real security posture.
etcd snapshots contain all cluster data, including the CA private keys. Those keys are protected by encryption derived from the cluster join token, so store the token and the snapshots in separate places. Anyone with both can decrypt everything.
The config file — every setting explained
K3s reads a single YAML file at startup: /etc/rancher/k3s/config.yaml. The hardened version lives at scripts/config/k3s/config.yaml. Every flag is explained below.
cluster-init: true # use embedded etcd, not SQLite
secrets-encryption: true # encrypt secrets in etcd — MUST be set at first boot
protect-kernel-defaults: true # refuse to start if kernel params are wrong
write-kubeconfig-mode: "0600" # kubeconfig readable only by root
tls-san:
- "10.0.0.1" # your WireGuard IP — edit this
- "k3s.internal" # optional hostname
disable:
- traefik # replace with a hardened ingress controller if needed
- servicelb # replace with MetalLB or kube-vip
- local-storage # replace with Longhorn or a CSI driver
- metrics-server # deploy your own if needed
# Layer 4 (Cilium) prep: disable the built-in network stack
# so Cilium can replace it completely.
flannel-backend: none
disable-network-policy: true
disable-kube-proxy: true
The three lines at the bottom (flannel-backend: none, disable-network-policy: true, disable-kube-proxy: true) leave the cluster with no network until Layer 4 installs Cilium. Do not try to run workloads between Layer 2 and Layer 4.
Flannel is k3s’s built-in “how containers talk to each other” system. We turn it off because Layer 4 installs a much more powerful replacement (Cilium). Until that replacement arrives, containers cannot talk to each other — which is fine because we haven’t put any workloads in yet.
API server hardening flags
These go under kube-apiserver-arg: — they tune the gatekeeper that every kubectl command talks to.
| Flag | What it does |
|---|---|
| anonymous-auth=false | Rejects any request with no identity (already the k3s default; explicit for auditability) |
| enable-admission-plugins=... | Turns on four extra security checks at pod admission time (see PSA section) |
| admission-control-config-file=... | Points to the PSA + EventRateLimit config |
| audit-log-path=... and three related flags | Turns on audit logging with 30-day retention |
| tls-cipher-suites=... | Restricts TLS to forward-secret, authenticated encryption only (ECDHE + AES-GCM / ChaCha20) |
| tls-min-version=VersionTLS12 | Rejects TLS 1.0 and 1.1 connections |
| service-account-extend-token-expiration=false | Disables the legacy long-lived token extension |
| request-timeout=300s | Caps how long a single API request can run |
Kubelet hardening flags
These go under kubelet-arg: — they tune the per-node agent.
| Flag | What it does |
|---|---|
| read-only-port=0 | Closes port 10255 (CVE-2025-46599; see below) |
| streaming-connection-idle-timeout=5m | Times out idle exec/attach sessions |
| event-qps=5 / event-burst=10 | Rate-limits events from kubelet |
| tls-cipher-suites=... | Same cipher restriction as the API server |
| seccomp-default=true | Applies a system-call filter to every container by default (Layer 3 depends on this) |
| rotate-server-certificates=true | Auto-renews kubelet TLS certs before they expire |
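Because a flag dropped from config.yaml fails silently, it can help to confirm that every expected flag is actually present. A small sketch follows; `missing_args` is a hypothetical helper, and it assumes each flag is written as its own YAML list item (quoted or unquoted) under kube-apiserver-arg: or kubelet-arg:.

```shell
# Hypothetical helper: report expected flags that are absent from a k3s config.
# Usage: missing_args CONFIG FLAG...
missing_args() {
  local config="$1" arg rc=0
  shift
  for arg in "$@"; do
    # Match a list item such as:   - "read-only-port=0"   # optional comment
    if ! grep -Eq "^[[:space:]]*-[[:space:]]*[\"']?${arg}[\"']?[[:space:]]*(#.*)?\$" "$config"; then
      echo "missing: $arg"
      rc=1
    fi
  done
  return "$rc"
}
```

For example, `missing_args /etc/rancher/k3s/config.yaml read-only-port=0 seccomp-default=true anonymous-auth=false` prints nothing and returns 0 when all three are set. The flag text is used inside a regular expression, so values containing characters like `.` or `*` match loosely.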
CVE-2025-46599 — the read-only port regression
CVSS 7.5 High. k3s 1.32 before 1.32.4 accidentally re-enabled kubelet port 10255. Port 10255 requires zero authentication and exposes pod metadata, environment variables (which often contain secrets), and node information.
The bug: a Go programming quirk (omitempty) silently stripped the explicit 0 value from the kubelet config, reverting to the compiled default — which is 10255. The fix in k3s 1.32.4+ corrects this. We are on 1.36.1 (unaffected), but we also set read-only-port=0 explicitly as belt-and-suspenders — if a future regression repeats the bug, the explicit flag wins.
# Prints "port 10255 closed OK" when the connection is refused; any HTTP response means the port is open
curl -s --max-time 5 -o /dev/null http://127.0.0.1:10255/pods && echo "PORT IS OPEN: INVESTIGATE" || echo "port 10255 closed OK"
Imagine a side entrance to your house that was supposed to be locked but kept unlocking itself due to a software glitch. The fix is to deadbolt it and also add a padlock — two independent mechanisms so a future glitch can’t undo both at once.
Audit logging
The audit log is a chronological record of every sensitive action taken inside the cluster: who ran kubectl exec, what pods were created, who changed a permission. It is the cluster’s CCTV footage.
Kubernetes audit logging requires four flags and a policy file. If any piece is missing, logging silently fails. The install script writes everything before k3s starts.
The policy (scripts/config/k3s/audit-policy.yaml) uses a layered approach:
- Suppressed entirely: health checks, leader-election churn — pure noise with zero security value.
- Metadata only: secrets, configmaps, service accounts — logs who accessed them but never logs the actual value (so your secrets do not appear in the audit log).
- Full request + response: exec/attach/port-forward, RBAC changes, token reviews — high-risk operations where you want every detail.
- Request body only: pod and workload mutations, network policy changes.
- Everything else: metadata level — nothing goes unlogged.
Audit logs land at /var/lib/rancher/k3s/server/logs/audit.log, rotated at 100 MB, kept for 30 days, up to 10 files.
For alerting, forward the audit log to a SIEM with Promtail → Loki or Filebeat → Elasticsearch. The critical alert rules: any pods/exec from a non-operator identity, any clusterrolebindings mutation, any secrets delete verb.
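Before a SIEM is in place, the first alert rule can be approximated directly on the log file. This is a rough triage sketch only: `exec_by_others` is a hypothetical helper, "operator" stands in for whatever identity your operators actually use, and the string matching assumes the compact one-event-per-line JSON that the API server writes. A JSON-aware tool is the better choice for anything permanent.

```shell
# Hypothetical triage helper: list pods/exec audit events not made by OPERATOR.
# Usage: exec_by_others AUDIT_LOG OPERATOR_USERNAME
exec_by_others() {
  local log="$1" operator="$2"
  # Keep events on the pods resource with the exec subresource,
  # then drop those whose username field matches the operator identity.
  grep -F '"resource":"pods"' "$log" \
    | grep -F '"subresource":"exec"' \
    | grep -vF "\"username\":\"$operator\""
}
```

Usage would look like `exec_by_others /var/lib/rancher/k3s/server/logs/audit.log operator`. Any output is an exec session worth investigating, and a non-zero exit status means no such events were found.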
Pod Security Admission — every container in a restricted sandbox
Pod Security Admission (PSA) is the Kubernetes mechanism that inspects every container definition before it runs and rejects anything that violates your security policy. It replaced the older PodSecurityPolicy in Kubernetes 1.25.
We enforce the restricted standard cluster-wide. What restricted requires of every container:
- Run as a non-root user
- Drop all Linux capabilities
- No privilege escalation allowed
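- A seccomp profile must be set explicitly in the pod spec (RuntimeDefault or Localhost). seccomp-default=true on the kubelet is a separate safety net for pods that omit one, such as those in exempted namespaces; it does not satisfy this check by itself
- No access to the host’s network, PID namespace, or IPC namespace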
Think of restricted as a strict rule: every container has to work as a regular user, cannot use any special system tools, and is completely isolated from the machine it runs on. If a container definition breaks any of these rules, Kubernetes refuses to start it.
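As an illustration of what a compliant workload looks like, the sketch below prints a minimal Pod manifest that sets the fields the restricted standard inspects. The helper name `restricted_pod`, the UID 10001, and the image reference in the usage note are arbitrary placeholders, not values from this guide's repository.

```shell
# Hypothetical helper: print a Pod manifest shaped for the restricted standard.
# Usage: restricted_pod NAME IMAGE
restricted_pod() {
  local name="$1" image="$2"
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: ${name}
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 10001          # any non-zero UID; 10001 is an arbitrary choice
    seccompProfile:
      type: RuntimeDefault    # must be explicit for the restricted check
  containers:
    - name: ${name}
      image: ${image}
      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop: ["ALL"]
EOF
}
```

Piping the output through a server-side dry run, for example `restricted_pod demo registry.example/app:1.0 | kubectl apply --dry-run=server -f -`, lets the admission controller evaluate the manifest without creating anything. Removing any of the securityContext fields should produce a rejection that names the violated rule.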
Three namespaces are exempted: kube-system, kube-public, and kube-node-lease. These run k3s’s own infrastructure (CoreDNS, the CNI agent, etcd) which legitimately needs privileged access. Every namespace you create gets restricted enforcement automatically.
The config lives at scripts/config/k3s/psa.yaml. It is passed to the API server via admission-control-config-file.
defaults:
  enforce: "restricted"
  enforce-version: "latest"
  audit: "restricted"   # violations are also recorded as audit annotations
  warn: "restricted"    # kubectl warns when a workload template (e.g. a Deployment) would violate the policy
exemptions:
  namespaces:           # exempted namespaces skip enforce, audit, and warn alike
    - kube-system
    - kube-public
    - kube-node-lease
When you add a monitoring namespace that runs node-exporter (which needs hostPID: true), add that namespace to the exemptions list and tightly scope its RBAC. Do the same for any other infrastructure workload that legitimately needs elevated privileges.
TLS — what encryption actually means here
Every connection to and from the k3s API server uses TLS — the same encryption protocol as HTTPS websites. TLS has had several versions; the older ones (1.0, 1.1) have known weaknesses. We set tls-min-version=VersionTLS12 to reject them.
Even within TLS 1.2, some cipher suites (the specific algorithm combination) are weak. We explicitly list only suites that use:
- ECDHE (Elliptic Curve Diffie-Hellman Ephemeral) — means even if someone records the traffic now and cracks the key later, they still cannot decrypt old sessions
- AES-256-GCM or AES-128-GCM or ChaCha20-Poly1305 — authenticated encryption that detects tampering
“Forward secrecy” means: even if an attacker records all your encrypted traffic today and somehow steals the server’s private key in 5 years, they still cannot decrypt the old traffic. Each session used a different one-time key. ECDHE is what makes that possible.
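A cipher list can be sanity-checked against those two criteria before it goes into the config. The sketch below is one way to do it, assuming the Go-style suite names that Kubernetes flags use; `weak_suites` is a hypothetical helper and the pattern encodes only the ECDHE plus AES-GCM or ChaCha20-Poly1305 rule described above.

```shell
# Hypothetical helper: flag any suite that is not ECDHE + AES-GCM/ChaCha20-Poly1305.
# Usage: weak_suites "SUITE1,SUITE2,..."
weak_suites() {
  local allowed='^TLS_ECDHE_(RSA|ECDSA)_WITH_(AES_(128|256)_GCM_SHA(256|384)|CHACHA20_POLY1305(_SHA256)?)$'
  local list="$1" suite rc=0 old_ifs="$IFS"
  IFS=','
  set -- $list          # split the comma-separated list into positional parameters
  IFS="$old_ifs"
  for suite in "$@"; do
    if ! printf '%s\n' "$suite" | grep -Eq "$allowed"; then
      echo "weak: $suite"
      rc=1
    fi
  done
  return "$rc"
}
```

Calling it with the value of tls-cipher-suites prints each suite that falls outside the rule and returns non-zero if any does. A CBC-mode or static-RSA suite such as TLS_RSA_WITH_AES_128_CBC_SHA would be reported.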
Default service account tokens — revoked
Every Kubernetes namespace has a default service account. Every pod that does not explicitly choose a service account uses this one. By default, k3s mounts its API token into every pod — meaning any container that gets compromised can immediately start talking to the Kubernetes API.
We patch automountServiceAccountToken: false on the default service account in every namespace. Workloads that genuinely need API access create a dedicated service account with the minimum permissions required.
for ns in $(kubectl get ns -o jsonpath='{.items[*].metadata.name}'); do
  kubectl patch serviceaccount default -n "$ns" \
    -p '{"automountServiceAccountToken": false}'
done
The install script runs this loop automatically. Re-run it whenever you add a namespace.
Getting kubectl access over WireGuard
Port 6443 (the Kubernetes API server) is never exposed on a public interface. All kubectl access goes through the WireGuard tunnel established in an earlier layer.
On the k3s server (to retrieve the kubeconfig):
sudo cat /etc/rancher/k3s/k3s.yaml
On your kubectl client machine:
mkdir -p ~/.kube
# Paste the above output into ~/.kube/config
# Then edit the server address:
sed -i 's|https://127.0.0.1:6443|https://10.0.0.1:6443|' ~/.kube/config
# Replace 10.0.0.1 with your actual WireGuard IP
chmod 600 ~/.kube/config
# Test (requires WireGuard to be up):
kubectl get nodes
The WireGuard IP must match a value in tls-san in config.yaml, otherwise TLS will reject the connection with a certificate error. If you get a TLS error, re-check tls-san, then regenerate certs: k3s certificate rotate.
The kubeconfig file is like an access card. It tells kubectl where the server is and includes a certificate proving the server is who it says it is. Without a WireGuard connection to the private IP, the access card points to an address that doesn’t exist from the public internet — which is exactly what we want.
Verifying with kube-bench
kube-bench is an open-source tool from Aqua Security that runs the full CIS Kubernetes Benchmark check list against your cluster and tells you which controls pass, which fail, and exactly what command to run to fix each failure.
kube-bench auto-detects the correct CIS profile for the k3s version it finds on the node — you do not need to pin a profile. If you ever need to override (e.g. to test against a specific baseline), use --benchmark <profile>; otherwise omit the flag and let kube-bench detect.
Option A: binary on the host (most thorough)
Running directly on the host — not inside a container — lets kube-bench read every file with root access.
# Download kube-bench (check https://github.com/aquasecurity/kube-bench/releases for latest)
curl -LO https://github.com/aquasecurity/kube-bench/releases/download/v0.9.0/kube-bench_0.9.0_linux_amd64.tar.gz
tar xzf kube-bench_0.9.0_linux_amd64.tar.gz
# Run (as root) — profile auto-detected from the running k3s version
./kube-bench 2>&1 | tee "kube-bench-$(date +%Y%m%d).txt"
# Review summary at the end:
tail -20 "kube-bench-$(date +%Y%m%d).txt"
Option B: in-cluster Job (convenient)
kubectl apply -f scripts/config/k3s/kube-bench-job.yaml
kubectl wait --for=condition=complete job/kube-bench-k3s -n kube-system --timeout=120s
kubectl logs -n kube-system job/kube-bench-k3s
kubectl delete -f scripts/config/k3s/kube-bench-job.yaml
Reading the results
| Result | Meaning | Action |
|---|---|---|
| PASS | Control is compliant | None |
| WARN | Manual or informational control | Review; document your decision |
| FAIL | Non-compliant; remediation required | Follow the fix command in the output |
| INFO | Context only | None |
After applying this chapter’s configuration, the expected results are:
- Most API server, etcd, and kubelet checks: PASS
- Audit logging: PASS (all four flags set)
- PSA / pod security: PASS
- Network policies: WARN (Layer 4 installs Cilium with network policies)
- A few manual/informational WARNs that require operator documentation
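To compare runs over time, the saved report can be reduced to a count per result type. A minimal sketch, assuming the text output format in which each check line begins with a bracketed status such as [PASS] or [FAIL]; `bench_counts` is a hypothetical helper name.

```shell
# Hypothetical helper: count kube-bench results by status in a saved report.
# Usage: bench_counts REPORT_FILE
bench_counts() {
  local report="$1" status
  for status in PASS FAIL WARN INFO; do
    # grep -c prints 0 but exits non-zero when nothing matches; keep the 0.
    printf '%s=%s\n' "$status" "$(grep -c "^\[$status\]" "$report" || true)"
  done
}
```

Running `bench_counts "kube-bench-$(date +%Y%m%d).txt"` after Option A gives four lines (PASS, FAIL, WARN, INFO) that are easy to diff between dates. A FAIL count that rises after an upgrade is the signal to read the full report.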
CIS Benchmark v1.12 coverage
| CIS Control | Description | Status | Where |
|---|---|---|---|
| 1.1.1 | API server file permissions | AUTO | k3s manages its own binary |
| 1.2.1 | anonymous-auth=false | AUTO | k3s default |
| 1.2.2 | No basic-auth-file | AUTO | Not supported in k3s |
| 1.2.4 | Authorization mode not AlwaysAllow | AUTO | Node,RBAC default |
| 1.2.5 | NodeRestriction admission | AUTO | k3s default |
| 1.2.6 | Audit logging enabled | MANUAL | audit-policy.yaml + 4 apiserver args |
| 1.2.7 | Audit policy file set | MANUAL | audit-policy.yaml |
| 1.2.8 | EventRateLimit | MANUAL | psa.yaml + apiserver arg |
| 1.2.9 | AlwaysPullImages | MANUAL | enable-admission-plugins |
| 1.2.10 | PodSecurity admission | MANUAL | psa.yaml |
| 1.2.12 | TLS cipher suites | MANUAL | tls-cipher-suites in config.yaml |
| 1.2.13 | TLS minimum version | MANUAL | tls-min-version in config.yaml |
| 1.2.15 | Service account key file | AUTO | k3s manages |
| 1.2.16 | etcd CA/cert/key | AUTO | Embedded etcd |
| 1.2.20 | No long-lived SA token extension | MANUAL | service-account-extend-token-expiration=false |
| 1.2.21 | API request timeout | MANUAL | request-timeout=300s |
| 1.3.1 | Terminated pod GC threshold | MANUAL | kube-controller-manager-arg |
| 1.3.2 | Controller-manager profiling | N/A | Not exposed in k3s |
| 1.4.1 | Scheduler profiling | N/A | Not exposed in k3s |
| 2.1 | etcd TLS | AUTO | Embedded etcd always TLS |
| 2.2 | etcd peer TLS | AUTO | Embedded etcd |
| 2.4 | etcd data dir permissions | MANUAL | chmod 700 on /var/lib/rancher/k3s/server/db/etcd |
| 2.7 | Encryption at rest | MANUAL | secrets-encryption: true |
| 3.1.1 | Kubelet client cert auth | AUTO | k3s default |
| 3.2.1 | Audit logs present | MANUAL | audit-policy.yaml + args |
| 4.1.1 | Kubelet service file permissions | AUTO | systemd default |
| 4.2.1 | Kubelet anonymous-auth=false | AUTO | k3s default |
| 4.2.2 | Kubelet authorization webhook | AUTO | k3s default |
| 4.2.3 | Kubelet client CA | AUTO | k3s default |
| 4.2.4 | read-only-port=0 | MANUAL | kubelet-arg (CVE-2025-46599) |
| 4.2.5 | Streaming connection idle timeout | MANUAL | kubelet-arg |
| 4.2.6 | protect-kernel-defaults | MANUAL | Layer 0 sysctls + config.yaml |
| 4.2.7 | make-iptables-util-chains | MANUAL | kubelet-arg |
| 4.2.8 | event-qps | MANUAL | kubelet-arg |
| 4.2.9 | kubelet TLS cipher suites | MANUAL | kubelet-arg |
| 4.2.11 | Rotate server certificates | MANUAL | kubelet-arg |
| 4.2.13 | seccomp-default | MANUAL | kubelet-arg |
| 5.1.1 | cluster-admin binding audit | MANUAL | RBAC audit commands (§7.3 research) |
| 5.1.2 | Minimize service account access | MANUAL | RBAC least-privilege |
| 5.1.5 | Default SA automount disabled | MANUAL | post-install patch loop |
| 5.2.x | Pod Security Standards | MANUAL | psa.yaml |
| 5.3.1–5.3.2 | Network policies | MANUAL | Layer 4 (Cilium) |
Ongoing maintenance
Key rotation
# Rotate the encryption key for all Secrets in etcd.
# The old key is retained until you explicitly remove it, so no downtime.
k3s secrets-encrypt rotate-keys
# Check status:
k3s secrets-encrypt status
Do this quarterly, or after any suspected compromise.
Certificate expiry
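k3s renews its certificates automatically, but only when the service starts: any certificate within 90 days of expiry (120 days in v1.33+) is reissued on the next start. A node that runs for a long time without a restart can therefore still reach expiry. To rotate manually: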
systemctl stop k3s
k3s certificate rotate
systemctl start k3s
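To know when a rotation is due, a certificate's remaining lifetime can be checked with openssl. The sketch below wraps `openssl x509 -checkend`; the helper name `cert_expires_within` is hypothetical, and the certificate location in the usage note is an assumption about where k3s keeps its server certificates on your install.

```shell
# Hypothetical helper: succeed (exit 0) when CERT expires within DAYS days.
# Usage: cert_expires_within CERT_FILE DAYS
cert_expires_within() {
  local cert="$1" days="$2"
  # -checkend exits 0 when the certificate is still valid after N seconds,
  # so the result is inverted to answer "does it expire inside the window?".
  # Note: an unreadable file also counts as "expiring" in this sketch.
  ! openssl x509 -in "$cert" -noout -checkend "$((days * 86400))" >/dev/null 2>&1
}
```

A possible use, assuming the certificates live under /var/lib/rancher/k3s/server/tls: `for c in /var/lib/rancher/k3s/server/tls/*.crt; do cert_expires_within "$c" 90 && echo "renew soon: $c"; done`.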
Regular audit queries
# Who has cluster-admin?
kubectl get clusterrolebindings \
-o jsonpath='{range .items[?(@.roleRef.name=="cluster-admin")]}{.subjects}{"\n"}{end}'
# What can the default service account do? (expect only the baseline self-review and discovery rules)
kubectl auth can-i --list --as=system:serviceaccount:default:default
# Any failed events?
kubectl get events --field-selector reason=Failed -A
What this layer bought you
Secrets encrypted in the database. An attacker who steals a snapshot of etcd — a common post-exploit move — gets secretbox (XSalsa20-Poly1305) ciphertext, not your secrets.
A complete audit trail. Every exec into a container, every RBAC change, every token review is logged with timestamp and identity. You can answer “who ran that command?” weeks later.
Every container sandboxed. The restricted Pod Security Standard blocks the most common container-escape techniques before any code runs. No container can run as root or escalate privileges without an explicit exemption.
Port 10255 dead. CVE-2025-46599 is closed by both the k3s version and the explicit flag.
No unnecessary network components. Traefik, the built-in load balancer, local storage, and metrics-server are gone, removing four potential attack surfaces. The cluster is ready for Cilium (Layer 4), which will provide network policies instead.
Smaller credential blast radius. No container gets a Kubernetes API token unless its service account explicitly opts in. A compromised workload cannot pivot to the Kubernetes control plane.
Kube-bench green. You have a repeatable, automated way to prove compliance to yourself (and anyone else) on demand.