Layer 2: Hardened K3s

Install Kubernetes so locked-down that it passes the CIS security benchmark out of the box — every secret encrypted, every door closed.

Layer 2 — the hardened Kubernetes control plane at the center of the stack.

By the end of this chapter you will have a running Kubernetes cluster where: secrets are encrypted in the database before they ever touch disk, every container is forced into a security sandbox, an audit trail records every sensitive action, and the cluster’s control port is invisible to the public internet. You will also know how to prove all of this with an automated scanner.

What is k3s, and why use it here?

Kubernetes is the software that decides which containers run on which machines, restarts them when they crash, and handles networking between them. A full Kubernetes installation has a dozen separate programs. k3s is Rancher’s single-binary packaging of all of them — the same technology, a fraction of the footprint. On a homelab machine it starts in seconds and uses a few hundred megabytes of RAM instead of several gigabytes.

More importantly for this guide: k3s ships with several hardening defaults that vanilla Kubernetes does not. We will build on those defaults rather than work around them.

In plain English

Kubernetes is the manager of your containers: it knows where everything is running and makes sure it keeps running. k3s is a smaller, pre-configured version of that manager, designed for machines that aren’t enormous cloud servers. Think of it as the same job title, but a leaner employee who already knows the security rules.

What k3s gives you for free

Before adding anything, k3s already does:

RBAC enabled by default. Every action inside the cluster requires an explicit permission. Nothing works by default except what k3s itself needs to function.
Node restriction — a Kubernetes node (the machine running containers) can only see and modify its own resources, not those of other nodes.
Kubelet client-certificate authentication. The kubelet (the per-node agent) proves its identity with a certificate before the API server talks to it.
Bound service account tokens. When a pod needs to talk to Kubernetes itself, it gets a short-lived, scoped token — not an old-style permanent secret.

In plain English

RBAC is a permission system: every user and every program inside the cluster has to show its badge before doing anything. k3s ships with RBAC switched on. A bare Kubernetes API server does not: unless the installer turns RBAC on (kubeadm does), it allows every request.

What needs manual work

The gaps that remain — each closed in this chapter:

Gap	Why it matters
Secrets encryption at rest	Without it, anyone who copies the database gets every password in plain text
Audit logging	Without it you have no record of who did what
Pod security enforcement	Without it any container can run as root and escape to the host
Read-only port closed	CVE-2025-46599 — left open, exposes every pod’s environment variables with zero auth
TLS cipher restriction	Prevents downgrade attacks to weak encryption
Unused components disabled	Smaller attack surface
Default service account tokens revoked	Prevents containers from talking to Kubernetes when they have no reason to

The one rule you cannot break: secrets encryption at first boot

Caution

secrets-encryption: true must be set before k3s runs for the first time. You cannot enable it on an existing cluster without a full re-encryption procedure that risks data loss. The install script enforces this. Do not run k3s manually before running the script.

When secrets encryption is on, every Kubernetes Secret object is encrypted with secretbox (XSalsa20-Poly1305) before being written to the etcd database — this is the k3s default provider since v1.32.4, and is what this guide’s v1.36 build uses. An attacker who steals a copy of the database gets ciphertext, not your actual secrets. The encryption key is stored separately on the server and can be rotated without restarting the cluster.

In plain English

Think of the database as a filing cabinet. Without encryption, the filing cabinet stores documents in plain text, so anyone who picks the lock reads everything. With secrets encryption, the filing cabinet stores scrambled text, and the decoder ring lives somewhere else. Stealing the cabinet alone gets you nothing.
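
The first-boot rule could be enforced with a small pre-flight check before k3s is ever started. This is a minimal sketch: the function name preflight_secrets_encryption and its two arguments are illustrative and are not taken from this guide's install script. It assumes that a server that has already booted leaves a server/db directory under the k3s data directory.

```shell
# preflight_secrets_encryption CONFIG_FILE DATA_DIR   (hypothetical helper)
# Returns 0 only if CONFIG_FILE enables secrets encryption and DATA_DIR
# shows no sign that k3s has already initialised a datastore.
preflight_secrets_encryption() {
  local config="$1" data_dir="$2"

  # Accept "secrets-encryption: true", optionally followed by a comment.
  # A commented-out line does not match because it starts with "#".
  if ! grep -Eq '^[[:space:]]*secrets-encryption:[[:space:]]*true([[:space:]]|#|$)' "$config" 2>/dev/null; then
    echo "refusing: secrets-encryption: true not found in $config" >&2
    return 1
  fi

  # Assumption: an existing server/db directory means k3s has booted here before.
  if [ -d "$data_dir/server/db" ]; then
    echo "refusing: $data_dir/server/db exists, k3s has already run" >&2
    return 1
  fi

  echo "ok: safe to start k3s for the first time"
}
```

A typical call would be preflight_secrets_encryption /etc/rancher/k3s/config.yaml /var/lib/rancher/k3s || exit 1, placed before anything that starts the k3s service.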

Layer 0 dependency: kernel parameters

protect-kernel-defaults: true in the k3s config tells the kubelet: “before you do anything, check that the Linux kernel is configured the way you expect. If it isn’t, refuse to start.” This is a safety net — it catches machines where someone changed a kernel setting that Kubernetes relies on.

The required kernel settings are written by Layer 0 (/etc/sysctl.d/90-kubelet.sysctl.conf). The install script checks they are applied before touching k3s. If you run this script on a fresh machine without running Layer 0 first, it will stop and tell you.

In plain English

It is like a car that checks its own oil level before the engine will start. The kernel parameters are the oil. Layer 0 put the oil in. If the oil is missing, k3s refuses to move.
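
The same check can be reproduced by hand. The sketch below compares every key = value line in a sysctl file with the value the running kernel reports. check_sysctls is a hypothetical helper, not part of the install script, and it assumes single-valued settings.

```shell
# check_sysctls FILE [READER...]   (hypothetical helper)
# Compares each "key = value" line in FILE with the live kernel value.
# READER defaults to "sysctl -n"; it can be overridden so the logic is testable.
check_sysctls() {
  local file="$1"; shift
  [ "$#" -gt 0 ] || set -- sysctl -n
  local rc=0 key want have

  while IFS='=' read -r key want || [ -n "$key" ]; do
    # Strip all whitespace; fine for single-valued settings only.
    key=$(printf '%s' "$key" | tr -d '[:space:]')
    want=$(printf '%s' "$want" | tr -d '[:space:]')
    # Skip blank lines and comments.
    case "$key" in ''|'#'*|';'*) continue ;; esac

    have=$("$@" "$key" 2>/dev/null | tr -d '[:space:]')
    if [ "$have" != "$want" ]; then
      echo "MISMATCH $key: want=$want have=${have:-unset}"
      rc=1
    fi
  done < "$file"

  return "$rc"
}
```

Running check_sysctls /etc/sysctl.d/90-kubelet.sysctl.conf on the server prints nothing and returns 0 when every setting written by Layer 0 is live.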

The datastore: embedded etcd

k3s ships two embedded datastore options: SQLite (simple, single-server only) and embedded etcd (production-grade, HA-ready). We use embedded etcd (cluster-init: true), because:

The CIS hardening guide assumes etcd — some benchmark checks only apply to etcd.
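Embedded etcd comes with built-in scheduled snapshots, and the sensitive bootstrap data inside them (including the CA private keys) is encrypted with a key derived from the cluster join token.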
If you ever add a second node for high availability, etcd supports it; SQLite does not.

In plain English

SQLite is a basic notebook. etcd is a bank vault with a backup system and a combination lock. Both store the same data; one is ready for a real security posture.

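Important

etcd snapshots contain all cluster data, including the CA private keys. The snapshot file itself is not encrypted; only the sensitive bootstrap data inside it (such as the CA keys) is encrypted, with a key derived from the cluster join token. Store the token and the snapshots in separate places, and treat both as secrets. Anyone with both can decrypt everything.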

The config file — every setting explained

K3s reads a single YAML file at startup: /etc/rancher/k3s/config.yaml. The hardened version lives at scripts/config/k3s/config.yaml. Every flag is explained below.

/etc/rancher/k3s/config.yaml (excerpt — see full file)

cluster-init: true          # use embedded etcd, not SQLite
secrets-encryption: true    # encrypt secrets in etcd — MUST be set at first boot
protect-kernel-defaults: true  # refuse to start if kernel params are wrong
write-kubeconfig-mode: "0600"  # kubeconfig readable only by root

tls-san:
  - "10.0.0.1"        # your WireGuard IP — edit this
  - "k3s.internal"    # optional hostname

disable:
  - traefik            # replace with a hardened ingress controller if needed
  - servicelb          # replace with MetalLB or kube-vip
  - local-storage      # replace with Longhorn or a CSI driver
  - metrics-server     # deploy your own if needed

# Layer 4 (Cilium) prep: disable the built-in network stack
# so Cilium can replace it completely.
flannel-backend: none
disable-network-policy: true
disable-kube-proxy: true

Note

The three lines at the bottom (flannel-backend: none, disable-network-policy: true, disable-kube-proxy: true) leave the cluster with no network until Layer 4 installs Cilium. Do not try to run workloads between Layer 2 and Layer 4.

In plain English

Flannel is k3s’s built-in “how containers talk to each other” system. We turn it off because Layer 4 installs a much more powerful replacement (Cilium). Until that replacement arrives, containers cannot talk to each other, which is fine because we haven’t put any workloads in yet.

API server hardening flags

These go under kube-apiserver-arg: — they tune the gatekeeper that every kubectl command talks to.

Flag	What it does
`anonymous-auth=false`	Rejects any request with no identity (already the k3s default; explicit for auditability)
`enable-admission-plugins=...`	Turns on four extra security checks at pod admission time (see PSA section)
`admission-control-config-file=...`	Points to the PSA + EventRateLimit config
`audit-log-path=...` and three related flags	Turns on audit logging with 30-day retention
`tls-cipher-suites=...`	Restricts TLS to forward-secret, authenticated encryption only (ECDHE + AES-GCM / ChaCha20)
`tls-min-version=VersionTLS12`	Rejects TLS 1.0 and 1.1 connections
`service-account-extend-token-expiration=false`	Disables the legacy long-lived token extension
`request-timeout=300s`	Caps how long a single API request can run

Kubelet hardening flags

These go under kubelet-arg: — they tune the per-node agent.

Flag	What it does
`read-only-port=0`	Closes port 10255 (CVE-2025-46599 — see below)
`streaming-connection-idle-timeout=5m`	Times out idle exec/attach sessions
`event-qps=5` / `event-burst=10`	Rate-limits events from kubelet
`tls-cipher-suites=...`	Same cipher restriction as the API server
`seccomp-default=true`	Applies a system-call filter to every container by default (Layer 3 depends on this)
`rotate-server-certificates=true`	Auto-renews kubelet TLS certs before they expire
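
To confirm that the flags from both tables actually made it into the config file, something like the following could be used. missing_args is a hypothetical helper. It assumes the usual k3s layout where each flag is a list item of the form flag=value under kube-apiserver-arg: or kubelet-arg:, and it does not tell the two sections apart.

```shell
# missing_args CONFIG_FILE FLAG...   (hypothetical helper)
# Prints every FLAG that has no "FLAG=" list entry in CONFIG_FILE.
# Returns 1 if at least one flag is missing.
missing_args() {
  local config="$1"; shift
  local rc=0 flag

  for flag in "$@"; do
    # Matches list items such as:   - "read-only-port=0"
    # (with double quotes, single quotes, or no quotes).
    if ! grep -Eq "^[[:space:]]*-[[:space:]]*[\"']?${flag}=" "$config"; then
      echo "missing: $flag"
      rc=1
    fi
  done

  return "$rc"
}
```

For example, missing_args /etc/rancher/k3s/config.yaml anonymous-auth tls-min-version read-only-port seccomp-default prints one line per absent flag and nothing when all are present.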

CVE-2025-46599 — the read-only port regression

Important

CVSS 7.5 High. k3s 1.32 before 1.32.4 accidentally re-enabled kubelet port 10255. Port 10255 requires zero authentication and exposes pod metadata, environment variables (which often contain secrets), and node information.

The bug: a Go programming quirk (omitempty) silently stripped the explicit 0 value from the kubelet config, reverting to the compiled default — which is 10255. The fix in k3s 1.32.4+ corrects this. We are on 1.36.1 (unaffected), but we also set read-only-port=0 explicitly as belt-and-suspenders — if a future regression repeats the bug, the explicit flag wins.

Verify port is closed (run on the server)

# "closed OK" means curl could not connect; any HTTP response means the port is open
curl -s -o /dev/null --max-time 5 http://127.0.0.1:10255/pods && echo "PORT IS OPEN, INVESTIGATE" || echo "port 10255 closed OK"

In plain English

Imagine a side entrance to your house that was supposed to be locked but kept unlocking itself due to a software glitch. The fix is to deadbolt it and also add a padlock: two independent mechanisms so a future glitch can’t undo both at once.
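
As a second, independent check that does not depend on getting an HTTP response, the listener table can be inspected. This is a sketch that assumes the ss tool from iproute2 is available; port_listening is an illustrative name.

```shell
# port_listening PORT   (hypothetical helper)
# Reads `ss -ltnH` output on stdin and returns 0 if any TCP listener
# is bound to PORT, 1 otherwise.
port_listening() {
  awk -v suffix=":$1" '
    # Column 4 is the local address, e.g. 127.0.0.1:10250, *:10250 or [::]:10250.
    # Compare only the trailing ":PORT" so that 10255 does not match 255.
    substr($4, length($4) - length(suffix) + 1) == suffix { found = 1 }
    END { exit found ? 0 : 1 }
  '
}

# Usage on the server:
#   if ss -ltnH | port_listening 10255; then
#     echo "something is listening on 10255"
#   else
#     echo "nothing listening on 10255"
#   fi
```

Checking the listener table catches the case where the port is bound but a local firewall rule makes the curl test look like a closed port.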

Audit logging

The audit log is a chronological record of every sensitive action taken inside the cluster: who ran kubectl exec, what pods were created, who changed a permission. It is the cluster’s CCTV footage.

Kubernetes audit logging requires four flags and a policy file. If any piece is missing, logging silently fails. The install script writes everything before k3s starts.

The policy (scripts/config/k3s/audit-policy.yaml) uses a layered approach:

Suppressed entirely: health checks, leader-election churn — pure noise with zero security value.
Metadata only: secrets, configmaps, service accounts — logs who accessed them but never logs the actual value (so your secrets do not appear in the audit log).
Full request + response: exec/attach/port-forward, RBAC changes, token reviews — high-risk operations where you want every detail.
Request body only: pod and workload mutations, network policy changes.
Everything else: metadata level — nothing goes unlogged.

Audit logs land at /var/lib/rancher/k3s/server/logs/audit.log, rotated at 100 MB, kept for 30 days, up to 10 files.

Tip

For alerting, forward the audit log to a central log store or SIEM, for example with Promtail → Loki or Filebeat → Elasticsearch. The critical alert rules: any pods/exec from a non-operator identity, any mutation of clusterrolebindings, any delete on secrets.
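
The three alert rules can be prototyped against the raw audit log before any log pipeline is in place. The sketch below assumes jq is installed and that the log uses the standard one-JSON-object-per-line audit event format. audit_alerts and the operator username passed to it are illustrative, not names used by this guide's scripts.

```shell
# audit_alerts OPERATOR_USER   (hypothetical helper, requires jq)
# Reads audit events (one JSON object per line) on stdin and prints a
# compact summary of every event that matches one of the three alert rules.
audit_alerts() {
  jq -c --arg operator "$1" '
    select(
      # Rule 1: exec into a pod by anyone other than the operator
      (.objectRef.resource == "pods" and .objectRef.subresource == "exec"
         and .user.username != $operator)
      # Rule 2: any change to a ClusterRoleBinding
      or (.objectRef.resource == "clusterrolebindings"
         and (.verb | test("^(create|update|patch|delete)$")))
      # Rule 3: any delete of Secrets
      or (.objectRef.resource == "secrets"
         and (.verb | test("^(delete|deletecollection)$")))
    )
    | {time: .requestReceivedTimestamp, user: .user.username, verb: .verb,
       resource: .objectRef.resource, name: .objectRef.name}
  '
}

# Usage on the server (the username is a placeholder):
#   audit_alerts my-operator < /var/lib/rancher/k3s/server/logs/audit.log
```

Once the filter produces the events you expect, the same conditions can be translated into rules for whichever alerting system you forward to.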

Pod Security Admission — every container in a restricted sandbox

Pod Security Admission (PSA) is the Kubernetes mechanism that inspects every container definition before it runs and rejects anything that violates your security policy. It replaced the older PodSecurityPolicy in Kubernetes 1.25.

We enforce the restricted standard cluster-wide. What restricted requires of every container:

Run as a non-root user
Drop all Linux capabilities
No privilege escalation allowed
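A seccomp profile must be set explicitly in the pod spec (RuntimeDefault or Localhost). The kubelet’s seccomp-default=true does not satisfy this check on its own; it is a safety net for pods in exempted namespaces that set no profile.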
No access to the host’s network, PID namespace, or IPC namespace

In plain English

Think of restricted as a strict rule: every container has to work as a regular user, cannot use any special system tools, and is completely isolated from the machine it runs on. If a container definition breaks any of these rules, Kubernetes refuses to start it.
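
For reference, here is a sketch of a pod that satisfies each requirement listed above. restricted_pod_manifest is a hypothetical helper that prints the manifest; the pod name, namespace, and image in the usage line are placeholders. It sets the seccomp profile explicitly in the pod spec, because that field is what the restricted check inspects.

```shell
# restricted_pod_manifest NAME IMAGE   (hypothetical helper)
# Prints a Pod manifest that sets the fields the `restricted` standard checks.
# Host network, PID and IPC namespaces are simply left at their default (off).
restricted_pod_manifest() {
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: $1
spec:
  securityContext:
    runAsNonRoot: true            # run as a non-root user
    seccompProfile:
      type: RuntimeDefault        # seccomp filter, set explicitly
  containers:
    - name: app
      image: $2
      securityContext:
        allowPrivilegeEscalation: false   # no privilege escalation
        capabilities:
          drop: ["ALL"]                   # drop all Linux capabilities
EOF
}

# Usage (placeholders): ask the API server to validate without creating anything
#   restricted_pod_manifest demo registry.example/app:1.0 \
#     | kubectl apply --dry-run=server -n my-namespace -f -
```

A server-side dry run goes through admission, so removing any one of the securityContext fields and re-running it shows the exact rejection message PSA produces. Note that runAsNonRoot only passes at runtime if the image itself is built to run as a non-root user.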

Three namespaces are exempted: kube-system, kube-public, and kube-node-lease. These hold the cluster’s own infrastructure pods (CoreDNS, the CNI agent), some of which legitimately need privileged access. Embedded etcd is not one of them: it runs inside the k3s process, not as a pod. Every namespace you create gets restricted enforcement automatically.

The config lives at scripts/config/k3s/psa.yaml. It is passed to the API server via admission-control-config-file.

scripts/config/k3s/psa.yaml (key lines)

defaults:
  enforce: "restricted"
  enforce-version: "latest"
  audit: "restricted"     # violations are also annotated in the audit log
  warn: "restricted"      # kubectl prints a warning, e.g. for a Deployment whose pods would be rejected
exemptions:
  namespaces:
    - kube-system
    - kube-public
    - kube-node-lease

Important

When you add a monitoring namespace that runs node-exporter (which needs hostPID: true), add that namespace to the exemptions list and tightly scope its RBAC. Do the same for any other infrastructure workload that legitimately needs elevated privileges.

TLS — what encryption actually means here

Every connection to and from the k3s API server uses TLS — the same encryption protocol as HTTPS websites. TLS has had several versions; the older ones (1.0, 1.1) have known weaknesses. We set tls-min-version=VersionTLS12 to reject them.

Even within TLS 1.2, some cipher suites (the specific algorithm combination) are weak. We explicitly list only suites that use:

ECDHE (Elliptic Curve Diffie-Hellman Ephemeral) — means even if someone records the traffic now and cracks the key later, they still cannot decrypt old sessions
AES-256-GCM or AES-128-GCM or ChaCha20-Poly1305 — authenticated encryption that detects tampering

In plain English

“Forward secrecy” means: even if an attacker records all your encrypted traffic today and somehow steals the server’s private key in 5 years, they still cannot decrypt the old traffic. Each session used a different one-time key. ECDHE is what makes that possible.
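
A quick way to sanity-check a cipher list against the two criteria above is sketched below. weak_suites is an illustrative helper that takes the comma-separated value of tls-cipher-suites and reports any suite that is not ECDHE with AES-GCM or ChaCha20-Poly1305. It also accepts TLS 1.3 suite names, which always use ephemeral key exchange.

```shell
# weak_suites "SUITE1,SUITE2,..."   (hypothetical helper)
# Prints every suite that fails the forward-secrecy + authenticated-encryption
# criteria and returns 1 if any such suite is found.
weak_suites() {
  local rc=0 suite

  # Suite names contain no spaces or glob characters, so word splitting is safe.
  for suite in $(printf '%s' "$1" | tr ',' ' '); do
    case "$suite" in
      # TLS 1.2: ECDHE key exchange with AES-GCM or ChaCha20-Poly1305
      TLS_ECDHE_*_GCM_SHA256|TLS_ECDHE_*_GCM_SHA384|TLS_ECDHE_*_CHACHA20_POLY1305*) ;;
      # TLS 1.3 suite names
      TLS_AES_*_GCM_SHA256|TLS_AES_*_GCM_SHA384|TLS_CHACHA20_POLY1305_SHA256) ;;
      *) echo "weak: $suite"; rc=1 ;;
    esac
  done

  return "$rc"
}
```

For example, weak_suites "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_RSA_WITH_AES_128_GCM_SHA256" flags the second entry, because it uses static RSA key exchange and so offers no forward secrecy.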

Default service account tokens — revoked

Every Kubernetes namespace has a default service account. Every pod that does not explicitly choose a service account uses this one. By default, Kubernetes mounts that account’s API token into every such pod, meaning any container that gets compromised can immediately start talking to the Kubernetes API.

We patch automountServiceAccountToken: false on the default service account in every namespace. Workloads that genuinely need API access create a dedicated service account with the minimum permissions required.

Patch all namespaces (run after cluster creation)

for ns in $(kubectl get ns -o jsonpath='{.items[*].metadata.name}'); do
  kubectl patch serviceaccount default -n "$ns" \
    -p '{"automountServiceAccountToken": false}'
done

The install script runs this loop automatically. Re-run it whenever you add a namespace.
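
To verify that the patch took effect everywhere, the loop can be paired with a check like this sketch. automount_offenders is a hypothetical helper; it assumes the kubectl query shown in the comment, which prints one namespace and one value per line.

```shell
# automount_offenders   (hypothetical helper)
# Reads "NAMESPACE VALUE" lines on stdin, where VALUE is the default service
# account's automountServiceAccountToken field (empty when the field is unset).
# Prints each namespace whose value is not "false"; returns 1 if any are found.
automount_offenders() {
  awk 'NF > 0 && $2 != "false" { print $1; bad = 1 } END { exit bad ? 1 : 0 }'
}

# Usage (from a machine with kubectl access):
#   kubectl get serviceaccounts -A --field-selector metadata.name=default \
#     -o jsonpath='{range .items[*]}{.metadata.namespace}{" "}{.automountServiceAccountToken}{"\n"}{end}' \
#     | automount_offenders
```

An unset field counts as an offender because Kubernetes treats a missing automountServiceAccountToken as true.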

Getting kubectl access over WireGuard

Port 6443 (the Kubernetes API server) is never exposed on a public interface. All kubectl access goes through the WireGuard tunnel set up in an earlier layer.

On the k3s server (to retrieve the kubeconfig):

On the k3s server (root)

sudo cat /etc/rancher/k3s/k3s.yaml

On your kubectl client machine:

On your client machine

mkdir -p ~/.kube
# Paste the above output into ~/.kube/config
# Then edit the server address:
sed -i 's|https://127.0.0.1:6443|https://10.0.0.1:6443|' ~/.kube/config
# Replace 10.0.0.1 with your actual WireGuard IP
chmod 600 ~/.kube/config

# Test (requires WireGuard to be up):
kubectl get nodes

Important

The WireGuard IP must match a value in tls-san in config.yaml, otherwise kubectl will reject the connection with a certificate error. If you get a TLS error, re-check tls-san, then regenerate the certificates with k3s certificate rotate (stop k3s first and start it again afterwards, as shown under Certificate expiry).

In plain English

The kubeconfig file is like an access card. It tells kubectl where the server is, carries a certificate proving the server is who it says it is, and holds the credentials proving you are who you say you are. Without a WireGuard connection to the private IP, the access card points to an address that doesn’t exist from the public internet, which is exactly what we want.
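
If kubectl reports a certificate error, it can help to look at what the API server certificate actually contains. This is a sketch that assumes OpenSSL 1.1.1 or newer (for the -ext option); san_has_ip is an illustrative name.

```shell
# san_has_ip IP   (hypothetical helper)
# Reads the output of `openssl x509 -noout -ext subjectAltName` on stdin and
# returns 0 if IP appears as an "IP Address:" entry, 1 otherwise.
san_has_ip() {
  # One SAN entry per line, trimmed, then an exact whole-line match so that
  # 10.0.0.1 does not match 10.0.0.10.
  tr ',' '\n' \
    | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' \
    | grep -Fxq "IP Address:$1"
}

# Usage from the client, with WireGuard up (replace 10.0.0.1 with your IP):
#   openssl s_client -connect 10.0.0.1:6443 </dev/null 2>/dev/null \
#     | openssl x509 -noout -ext subjectAltName \
#     | san_has_ip 10.0.0.1 && echo "SAN present" || echo "SAN missing: check tls-san"
```

If the address is missing from the certificate, the fix is on the server side (tls-san), not in the kubeconfig.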

Verifying with kube-bench

kube-bench is an open-source tool from Aqua Security that runs the full CIS Kubernetes Benchmark check list against your cluster and tells you which controls pass, which fail, and exactly what command to run to fix each failure.

kube-bench auto-detects the correct CIS profile for the k3s version it finds on the node — you do not need to pin a profile. If you ever need to override (e.g. to test against a specific baseline), use --benchmark <profile>; otherwise omit the flag and let kube-bench detect.

Option A: binary on the host (most thorough)

Running directly on the host — not inside a container — lets kube-bench read every file with root access.

On the k3s server (root)

# Download kube-bench (check https://github.com/aquasecurity/kube-bench/releases for latest)
curl -LO https://github.com/aquasecurity/kube-bench/releases/download/v0.9.0/kube-bench_0.9.0_linux_amd64.tar.gz
tar xzf kube-bench_0.9.0_linux_amd64.tar.gz

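# Run as root; the profile is auto-detected from the running k3s version.
# The tarball unpacks its check definitions into ./cfg, so point kube-bench at that directory.
out="kube-bench-$(date +%Y%m%d).txt"
./kube-bench --config-dir "$(pwd)/cfg" --config "$(pwd)/cfg/config.yaml" 2>&1 | tee "$out"

# Review summary at the end:
tail -20 "$out"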

Option B: in-cluster Job (convenient)

Because this option schedules a pod, it may stay Pending until Layer 4 has installed Cilium and the node reports Ready.

From your kubectl client (over WireGuard)

kubectl apply -f scripts/config/k3s/kube-bench-job.yaml
kubectl wait --for=condition=complete job/kube-bench-k3s -n kube-system --timeout=120s
kubectl logs -n kube-system job/kube-bench-k3s
kubectl delete -f scripts/config/k3s/kube-bench-job.yaml

Reading the results

Result	Meaning	Action
PASS	Control is compliant	None
WARN	Manual or informational control	Review; document your decision
FAIL	Non-compliant; remediation required	Follow the fix command in the output
INFO	Context only	None

After applying this chapter’s configuration, the expected results are:

Most API server, etcd, and kubelet checks: PASS
Audit logging: PASS (all four flags set)
PSA / pod security: PASS
Network policies: WARN (Layer 4 installs Cilium with network policies)
A few manual/informational WARNs that require operator documentation
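
The saved report can be turned into a simple pass/fail gate. This is a sketch that assumes the report uses kube-bench's default text output, where each check line starts with [PASS], [FAIL], [WARN], or [INFO]; kube_bench_gate is an illustrative name.

```shell
# kube_bench_gate REPORT_FILE   (hypothetical helper)
# Prints one "STATUS COUNT" line per result type and returns 1 if the
# report contains at least one [FAIL] line.
kube_bench_gate() {
  local report="$1" status count fails=0

  for status in PASS FAIL WARN INFO; do
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
    count=$(grep -c "^\[$status\]" "$report" || true)
    echo "$status $count"
    if [ "$status" = FAIL ]; then
      fails=$count
    fi
  done

  [ "$fails" -eq 0 ]
}
```

For example, kube_bench_gate "kube-bench-$(date +%Y%m%d).txt" || echo "at least one control failed" gives a one-line verdict that is easy to run from cron or a CI job.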

CIS Benchmark v1.12 coverage

CIS Control	Description	Status	Where
1.1.1	API server file permissions	AUTO	k3s manages its own binary
1.2.1	anonymous-auth=false	AUTO	k3s default
1.2.2	No basic-auth-file	AUTO	Not supported in k3s
1.2.4	Authorization mode not AlwaysAllow	AUTO	Node,RBAC default
1.2.5	NodeRestriction admission	AUTO	k3s default
1.2.6	Audit logging enabled	MANUAL	audit-policy.yaml + 4 apiserver args
1.2.7	Audit policy file set	MANUAL	audit-policy.yaml
1.2.8	EventRateLimit	MANUAL	psa.yaml + apiserver arg
1.2.9	AlwaysPullImages	MANUAL	enable-admission-plugins
1.2.10	PodSecurity admission	MANUAL	psa.yaml
1.2.12	TLS cipher suites	MANUAL	tls-cipher-suites in config.yaml
1.2.13	TLS minimum version	MANUAL	tls-min-version in config.yaml
1.2.15	Service account key file	AUTO	k3s manages
1.2.16	etcd CA/cert/key	AUTO	Embedded etcd
1.2.20	No long-lived SA token extension	MANUAL	service-account-extend-token-expiration=false
1.2.21	API request timeout	MANUAL	request-timeout=300s
1.3.1	Terminated pod GC threshold	MANUAL	kube-controller-manager-arg
1.3.2	Controller-manager profiling	N/A	Not exposed in k3s
1.4.1	Scheduler profiling	N/A	Not exposed in k3s
2.1	etcd TLS	AUTO	Embedded etcd always TLS
2.2	etcd peer TLS	AUTO	Embedded etcd
2.4	etcd data dir permissions	MANUAL	chmod 700 on /var/lib/rancher/k3s/server/db/etcd
2.7	Encryption at rest	MANUAL	secrets-encryption: true
3.1.1	Kubelet client cert auth	AUTO	k3s default
3.2.1	Audit logs present	MANUAL	audit-policy.yaml + args
4.1.1	Kubelet service file permissions	AUTO	systemd default
4.2.1	Kubelet anonymous-auth=false	AUTO	k3s default
4.2.2	Kubelet authorization webhook	AUTO	k3s default
4.2.3	Kubelet client CA	AUTO	k3s default
4.2.4	read-only-port=0	MANUAL	kubelet-arg (CVE-2025-46599)
4.2.5	Streaming connection idle timeout	MANUAL	kubelet-arg
4.2.6	protect-kernel-defaults	MANUAL	Layer 0 sysctls + config.yaml
4.2.7	make-iptables-util-chains	MANUAL	kubelet-arg
4.2.8	event-qps	MANUAL	kubelet-arg
4.2.9	kubelet TLS cipher suites	MANUAL	kubelet-arg
4.2.11	Rotate server certificates	MANUAL	kubelet-arg
4.2.13	seccomp-default	MANUAL	kubelet-arg
5.1.1	cluster-admin binding audit	MANUAL	RBAC audit commands (§7.3 research)
5.1.2	Minimize service account access	MANUAL	RBAC least-privilege
5.1.5	Default SA automount disabled	MANUAL	post-install patch loop
5.2.x	Pod Security Standards	MANUAL	psa.yaml
5.3.1–5.3.2	Network policies	MANUAL	Layer 4 (Cilium)

Ongoing maintenance

Key rotation

Rotate secrets encryption keys (run on the server)

# Rotate the encryption key for all Secrets in etcd.
# The old key is retained until you explicitly remove it, so no downtime.
k3s secrets-encrypt rotate-keys

# Check status:
k3s secrets-encrypt status

Do this quarterly, or after any suspected compromise.

Certificate expiry

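k3s renews its certificates automatically at startup when they are expired or within 90 days of expiry (120 days in v1.33+). Renewal only happens when k3s starts, so a server that is never restarted can still end up with expired certificates. To rotate manually: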

Rotate k3s TLS certificates

systemctl stop k3s
k3s certificate rotate
systemctl start k3s
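
To see how close a certificate is to expiry before deciding to rotate, openssl can be asked directly. This is a sketch; cert_expires_within is an illustrative helper, and the certificate path in the usage line is an assumption about where k3s keeps its server certificates, so adjust it to your installation.

```shell
# cert_expires_within CERT_FILE DAYS   (hypothetical helper)
# Returns 0 if the certificate expires within DAYS days (or has already
# expired), 1 if it is valid for longer, 2 if the file cannot be read.
cert_expires_within() {
  [ -r "$1" ] || return 2
  # `-checkend N` exits 0 when the certificate is still valid N seconds from now,
  # so the result is negated.
  ! openssl x509 -checkend "$(( $2 * 86400 ))" -noout -in "$1" >/dev/null
}

# Usage on the server (path is an assumption):
#   for crt in /var/lib/rancher/k3s/server/tls/*.crt; do
#     cert_expires_within "$crt" 90 && echo "expiring soon: $crt"
#   done
```

Running this from a scheduled job gives early warning on servers that stay up for months without a restart.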

Regular audit queries

Weekly security checks

# Who has cluster-admin?
kubectl get clusterrolebindings \
  -o jsonpath='{range .items[?(@.roleRef.name=="cluster-admin")]}{.subjects}{"\n"}{end}'

# What can the default service account do? (should list only the baseline self-review and discovery rules)
kubectl auth can-i --list --as=system:serviceaccount:default:default

# Any failed events?
kubectl get events --field-selector reason=Failed -A

What this layer bought you

Secrets encrypted in the database. An attacker who steals a snapshot of etcd — a common post-exploit move — gets secretbox (XSalsa20-Poly1305) ciphertext, not your secrets.

A complete audit trail. Every exec into a container, every RBAC change, every token review is logged with timestamp and identity. You can answer “who ran that command?” weeks later.

Every container sandboxed. The restricted Pod Security Standard blocks the most common container-escape techniques before any code runs. No container can run as root or escalate privileges without an explicit exemption.

Port 10255 dead. CVE-2025-46599 is closed by both the k3s version and the explicit flag.

No unnecessary components. Traefik, ServiceLB, local-storage, and metrics-server are gone: four potential attack surfaces removed. The built-in network stack is off as well, so the cluster is ready for Cilium (Layer 4), which will provide networking and network policies.

Smaller credential blast radius. No container gets a Kubernetes API token unless its service account explicitly opts in. A compromised workload cannot pivot to the Kubernetes control plane.

Kube-bench green. You have a repeatable, automated way to prove compliance to yourself (and anyone else) on demand.