Welcome to the second installment of our AI Analysis Lab. Part 1 gave a thorough walkthrough of our host-native stack. Part 2 builds the Kubernetes observability layer that turns nameless syscalls into named flows: a kind cluster running Cilium as the CNI, Hubble for the flow log, Tetragon for in-container process events, and Falco in Kubernetes mode as the runtime security engine. The host-native tools from Part 1 do not go away, but you now get pod, service, and namespace labels attached to every event. This is the difference between “process 14732 connected to 10.244.1.7:5432” and “the langgraph-orchestrator pod connected to the postgres-checkpointer service in the lab-orchestrator namespace.”
By the end of this post you will have a three-node cluster, four observability layers running, two test pods deployed, and seven separate validations that each prove one specific kind of visibility works as advertised.
Same dance as Part 1: while I am the primary architect of the AI Analysis Lab, Claude assisted with code generation. Artificial intelligence can make errors, and subtle ones at that. I strongly suggest that you always proofread, unit test, validate, and verify in your own lab, and that you confine all work to a virtual machine that can be cloned, rolled back, and thrown away, much like a malware analysis lab.
Also important: I use the root user throughout this lab, and the steps are written for that. Using root is a security risk. I am in my homelab on an ephemeral virtual machine, so I accept that risk in exchange for the efficiency it buys me.
Section 1: Install prerequisites (Docker, kind, kubectl, Helm)
Check to validate which need to be installed:
echo "--- Docker ---"
docker version --format '{{.Server.Version}}' 2>/dev/null && echo "PASS" || echo "NOT INSTALLED"
echo ""
echo "--- kind ---"
kind version 2>/dev/null && echo "PASS" || echo "NOT INSTALLED"
echo ""
echo "--- kubectl ---"
kubectl version --client --short 2>/dev/null || kubectl version --client 2>/dev/null && echo "PASS" || echo "NOT INSTALLED"
echo ""
echo "--- Helm ---"
helm version --short 2>/dev/null && echo "PASS" || echo "NOT INSTALLED"
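The four checks above can also be collapsed into one loop. This is a minimal sketch, assuming that finding the binary on PATH is enough to count as installed (it does not confirm that the Docker daemon is actually running). The check_tool name is a hypothetical helper, not part of any of these tools.

```shell
# check_tool is a hypothetical helper: prints PASS if the named
# command is found on PATH, NOT INSTALLED otherwise.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: PASS"
  else
    echo "$1: NOT INSTALLED"
    return 1
  fi
}

# Check every prerequisite; keep going even if one is missing.
for tool in docker kind kubectl helm; do
  check_tool "$tool" || true
done
```

The per-tool version commands above remain the better check when you care about which version is installed rather than whether anything is installed at all.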
Docker is the runtime kind uses to host its nodes. Each kind node is itself a Docker container running a full kubelet plus container runtime: a VM-shaped illusion built out of namespaces and cgroups. Without Docker, kind has nothing to launch nodes inside.
kind (“Kubernetes IN Docker”) is the simplest path to a real, multi-node Kubernetes cluster on a single host. kind is what the upstream Kubernetes project uses for its own CI.
kubectl is the cluster client. Every interaction with the cluster goes through it.
Helm is the package manager for Kubernetes. Tetragon and Falco install via Helm charts because that is what their maintainers ship; there is no realistic alternative.
Install Docker (skip if already PASS):
sudo apt install -y docker.io
sudo systemctl enable --now docker
sudo usermod -aG docker $USER
The usermod command adds your user to the docker group, which is what lets you run docker and kind commands without sudo. It does not take effect in your current shell: you have to log out and back in, or run newgrp docker to start a fresh shell with the new group membership.
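To confirm whether the current shell has picked up the new group, something like the following may help. It is a small sketch: in_group is a hypothetical helper name, and it relies only on the standard id, tr, and grep utilities.

```shell
# in_group is a hypothetical helper: succeeds if the current shell's
# group list already contains the named group.
in_group() {
  id -nG | tr ' ' '\n' | grep -qx "$1"
}

if in_group docker; then
  echo "docker group is active in this shell"
else
  echo "docker group not active yet: log out and back in, or run: newgrp docker"
fi
```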
Install kind (skip if already PASS):
[ $(uname -m) = x86_64 ] && curl -Lo ./kind https://kind.sigs.k8s.io/dl/v0.27.0/kind-linux-amd64
chmod +x ./kind
sudo mv ./kind /usr/local/bin/kind
kind version
Pinning to v0.27.0 is intentional. kind versions track Kubernetes versions loosely and this version is known to work with the Cilium CLI version we install later. Newer kind releases probably also work, but if something breaks during cluster creation, downgrade to v0.27.0 first before debugging anything else.
Install kubectl (skip if already PASS):
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
chmod +x kubectl
sudo mv kubectl /usr/local/bin/kubectl
kubectl version --client
Install Helm (skip if already PASS):
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
helm version --short
Section 2: Create the kind cluster config
The default CNI is disabled, host paths are mounted into each node, and each node’s container is given Docker socket access.
cat > /tmp/kind-lab-config.yaml << 'EOF'
# AI Runtime Analysis Lab — kind cluster config
# 3 nodes: 1 control-plane + 2 workers
# Default CNI disabled (Cilium will replace it)
# /proc mounted for Tetragon
# /dev and docker.sock mounted for Falco
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
    extraMounts:
      - hostPath: /proc
        containerPath: /procHost
      - hostPath: /dev
        containerPath: /dev
      - hostPath: /var/run/docker.sock
        containerPath: /var/run/docker.sock
  - role: worker
    extraMounts:
      - hostPath: /proc
        containerPath: /procHost
      - hostPath: /dev
        containerPath: /dev
      - hostPath: /var/run/docker.sock
        containerPath: /var/run/docker.sock
  - role: worker
    extraMounts:
      - hostPath: /proc
        containerPath: /procHost
      - hostPath: /dev
        containerPath: /dev
      - hostPath: /var/run/docker.sock
        containerPath: /var/run/docker.sock
networking:
  disableDefaultCNI: true
EOF
cat /tmp/kind-lab-config.yaml
/proc → /procHost is what makes Tetragon work inside kind. Each kind node is a Docker container, which means it has its own /proc filesystem that only sees processes inside that container. Tetragon needs to see the host’s process table, meaning the processes that own the cgroups it is hooking, to correlate kernel events back to the right pod. The mount exposes the host’s /proc as /procHost inside each kind node, and Tetragon is later configured to look there with tetragon.hostProcPath=/procHost. Without this, Tetragon installs but its events have no context.
/dev is mounted because Falco’s eBPF driver needs access to specific device files, particularly /dev/null and the kernel’s BPF interfaces, that are not present in a stock kind container.
/var/run/docker.sock is the Docker daemon socket. Falco uses it to enrich its alerts with container metadata (image name, container ID, labels) by querying Docker directly. Without it, Falco’s alerts still fire, but they are missing the container-identifying fields you most want.
disableDefaultCNI: true is the single line that lets Cilium replace kind’s bundled kindnet CNI. Without it, kindnet is installed first and takes over pod networking, leaving Cilium with nothing to manage. With it, the nodes come up NotReady until you install a CNI, which is exactly what we want here.
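Since all three nodes carry identical mounts, the config above could also be generated rather than written out by hand. The following is a sketch only: emit_node and gen_kind_config are hypothetical helper names, and the output is meant to match the hand-written file for the default of two workers.

```shell
# emit_node is a hypothetical helper: prints one node entry with the
# same three extraMounts used in the config above. $1 = node role.
emit_node() {
  cat <<EOF
  - role: $1
    extraMounts:
      - hostPath: /proc
        containerPath: /procHost
      - hostPath: /dev
        containerPath: /dev
      - hostPath: /var/run/docker.sock
        containerPath: /var/run/docker.sock
EOF
}

# gen_kind_config is also hypothetical: one control-plane node plus
# the requested number of workers (default 2), default CNI disabled.
gen_kind_config() {
  workers=${1:-2}
  echo "kind: Cluster"
  echo "apiVersion: kind.x-k8s.io/v1alpha4"
  echo "nodes:"
  emit_node control-plane
  i=0
  while [ "$i" -lt "$workers" ]; do
    emit_node worker
    i=$((i + 1))
  done
  echo "networking:"
  echo "  disableDefaultCNI: true"
}

gen_kind_config 2
```

Redirecting the output (gen_kind_config 2 > /tmp/kind-lab-config.yaml) would produce a file usable in the next section; changing the argument is then the only edit needed to grow or shrink the cluster.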
Section 3: Create the cluster
This takes one to three minutes:
kind create cluster --name lab --config /tmp/kind-lab-config.yaml
Verify the cluster exists and that kubectl is pointed at it:
kubectl cluster-info --context kind-lab
kubectl get nodes
Three nodes should be listed, all in NotReady status. The kubectl context name kind-lab is auto-generated from the cluster name. If you have multiple clusters, this is how you switch between them with kubectl config use-context kind-lab.
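If you would rather assert the node state in a script than read it by eye, the STATUS column of kubectl get nodes --no-headers can be counted. This is a sketch under that assumption; count_status is a hypothetical helper, and the three sample rows are illustrative rather than captured output.

```shell
# count_status is a hypothetical helper: reads `kubectl get nodes --no-headers`
# output on stdin and counts rows whose STATUS column (field 2) equals $1.
count_status() {
  awk -v want="$1" '$2 == want { n++ } END { print n + 0 }'
}

# Illustrative sample rows, not captured output. On a live cluster use:
#   kubectl get nodes --no-headers | count_status NotReady
printf '%s\n' \
  'lab-control-plane NotReady control-plane 60s v1.x.y' \
  'lab-worker NotReady <none> 45s v1.x.y' \
  'lab-worker2 NotReady <none> 45s v1.x.y' | count_status NotReady
```

The same helper works after Cilium is installed: count_status Ready should then report three.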
Section 4: Install the Cilium CLI
The Cilium CLI is a single Go binary that wraps the official Helm chart with sane defaults. For production you would typically use Helm directly; for a lab, the CLI is the shortest path, so that is what we use here.
CILIUM_CLI_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/cilium-cli/main/stable.txt)
CLI_ARCH=amd64
curl -L --fail --remote-name-all https://github.com/cilium/cilium-cli/releases/download/${CILIUM_CLI_VERSION}/cilium-linux-${CLI_ARCH}.tar.gz
sudo tar xzvfC cilium-linux-${CLI_ARCH}.tar.gz /usr/local/bin
rm cilium-linux-${CLI_ARCH}.tar.gz
cilium version --client
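The commands above hard-code amd64. If your lab VM might be ARM-based, the architecture can be derived from uname -m instead. A minimal sketch, assuming the releases you download are published with amd64 and arm64 suffixes; map_arch is a hypothetical helper name.

```shell
# map_arch is a hypothetical helper: translates `uname -m` output into
# the architecture suffix used in the release file names above.
map_arch() {
  case "$1" in
    x86_64)        echo amd64 ;;
    aarch64|arm64) echo arm64 ;;
    *)             echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

CLI_ARCH=$(map_arch "$(uname -m)") && echo "CLI_ARCH=${CLI_ARCH}"
```

The same value could be reused for HUBBLE_ARCH in the Hubble CLI install later on.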
Section 5: Install Cilium with Hubble enabled
The first command installs Cilium itself. The second waits for it to become healthy.
cilium install --set kubeProxyReplacement=true
cilium status --wait
This blocks until Cilium reports healthy. Expect two to five minutes of waiting on a fresh kind cluster, ending with Cilium OK and Operator OK. Once it returns, all three nodes should transition to Ready:
kubectl get nodes
Section 6: Enable Hubble
Hubble is Cilium’s observability layer. It does not get installed separately. It is enabled as a feature of an already-installed Cilium.
cilium hubble enable --ui
The --ui flag enables the Hubble UI deployment in addition to the Relay.
cilium status --wait
Wait again for Hubble Relay to come up. You’ll know it’s ready when cilium status shows Hubble Relay OK in addition to the previous lines.
Section 7: Install the Hubble CLI
HUBBLE_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/hubble/master/stable.txt)
HUBBLE_ARCH=amd64
curl -L --fail --remote-name-all https://github.com/cilium/hubble/releases/download/$HUBBLE_VERSION/hubble-linux-${HUBBLE_ARCH}.tar.gz
sudo tar xzvfC hubble-linux-${HUBBLE_ARCH}.tar.gz /usr/local/bin
rm hubble-linux-${HUBBLE_ARCH}.tar.gz
hubble version
After this completes, you have a hubble command on your PATH.
Section 8: Install Tetragon
Hubble shows you the network. Tetragon shows you the processes. The two are complementary: Hubble is “what flowed where,” Tetragon is “what executed inside which container, with what arguments, called by which parent.” For agent analysis, this distinction matters because a coding agent’s tool invocations are processes, not flows. Running git diff inside a containerized agent shows up in Tetragon, not Hubble, and you need both.
helm repo add cilium https://helm.cilium.io
helm repo update
helm install tetragon cilium/tetragon \
-n kube-system \
--set tetragon.hostProcPath=/procHost
Wait for the DaemonSet to roll out:
kubectl rollout status -n kube-system ds/tetragon -w
Verify pods are running:
kubectl get pods -n kube-system -l app.kubernetes.io/name=tetragon
Expect one Tetragon pod per node, three pods on this cluster, all in Running state. If pods are crashing, the most common cause is the hostProcPath flag being wrong. Double-check it matches the kind config mount.
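The "one pod per node" expectation can be checked numerically by comparing two DaemonSet status fields, numberReady and desiredNumberScheduled. The sketch below assumes the DaemonSet is named tetragon, as in the rollout command above; ds_counts and all_ready are hypothetical helper names.

```shell
# ds_counts is a hypothetical helper: prints "<ready> <desired>" for a
# DaemonSet. $1 = namespace, $2 = DaemonSet name.
ds_counts() {
  kubectl get ds -n "$1" "$2" \
    -o jsonpath='{.status.numberReady} {.status.desiredNumberScheduled}'
}

# all_ready is also hypothetical: succeeds when at least one pod is
# wanted and every wanted pod is ready. $1 = ready, $2 = desired.
all_ready() {
  [ "$1" -gt 0 ] && [ "$1" -eq "$2" ]
}

# Only query the cluster if kubectl is available.
if command -v kubectl >/dev/null 2>&1; then
  # Word-splitting of the "ready desired" pair is intentional here.
  if all_ready $(ds_counts kube-system tetragon 2>/dev/null) 2>/dev/null; then
    echo "tetragon: all pods ready"
  else
    echo "tetragon: not ready yet"
  fi
fi
```

The same check should carry over to the Falco DaemonSet in the next section by swapping in its namespace and name.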
Section 9: Install the tetra CLI
Tetragon’s events are emitted as JSON to stdout in the export-stdout container of each Tetragon pod. The tetra CLI formats those events into something readable. The version lookup below uses jq, so install it first (sudo apt install -y jq) if it is missing.
TETRA_VERSION=$(curl -s https://api.github.com/repos/cilium/tetragon/releases/latest | jq -r .tag_name)
curl -L https://github.com/cilium/tetragon/releases/download/${TETRA_VERSION}/tetra-linux-amd64.tar.gz | tar -xz
sudo mv tetra /usr/local/bin/
sudo chmod +x /usr/local/bin/tetra
tetra version
Section 10: Install Falco in Kubernetes mode
Falco from Phase 1 runs in host mode, watching syscalls on the host kernel. Falco in Kubernetes mode runs as a DaemonSet, one Falco pod per node, and watches container-scoped syscalls, decorating every alert with pod, namespace, container ID, and image. The same Falco rules you wrote on the host largely work here too; what changes is the context attached to each match.
helm repo add falcosecurity https://falcosecurity.github.io/charts
helm repo update
helm install falco falcosecurity/falco \
--namespace falco \
--create-namespace \
--set tty=true \
--set driver.kind=modern_ebpf
Wait for Falco pods to be ready:
kubectl wait pods --for=condition=Ready --all -n falco --timeout=120s
kubectl get pods -n falco
Expect one Falco pod per node, three pods total: all Running.
Section 11: Deploy test workloads
Two pods serve as targets for every validation that follows: an nginx server, created now, and a curl client that we create on demand.
kubectl create namespace lab-test
kubectl run nginx-server -n lab-test --image=nginx:alpine --port=80 --labels="app=nginx-server"
kubectl expose pod nginx-server -n lab-test --port=80 --name=nginx-svc
kubectl wait --for=condition=ready pod/nginx-server -n lab-test --timeout=60s
Verify:
kubectl get pods -n lab-test
kubectl get svc -n lab-test
Section 12: Validation of Hubble flow observation
This is the core network observability test. Generate traffic between pods, watch it appear in Hubble.
Terminal 1. Start the Hubble port-forward and stream flows from the test namespace:
cilium hubble port-forward &
sleep 2
hubble observe --namespace lab-test -f
Terminal 2. Generate traffic by spinning up a curl pod that hits the nginx service:
kubectl run curl-client -n lab-test --image=curlimages/curl --rm -it --restart=Never -- curl -s http://nginx-svc.lab-test.svc.cluster.local
Terminal 1 should now show flow entries: source pod, destination service, port, verdict (FORWARDED), and latency. This is the payoff. Every line is a structured record of one network operation, with both endpoints labeled by pod and namespace identity. When you eventually have a model gateway in this cluster and an agent calling it, this is the layer where the agent’s model calls become a single readable log.
Section 13: Validation of Hubble DNS observation
Hubble can decode L7 protocols, including DNS. This validation specifically targets DNS resolution to prove that visibility is working.
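hubble observe --namespace lab-test --protocol dns -f &
HUBBLE_PID=$!
sleep 1
kubectl run dns-test -n lab-test --image=curlimages/curl --rm -it --restart=Never -- curl -s http://nginx-svc.lab-test.svc.cluster.local
sleep 3
kill $HUBBLE_PID 2>/dev/null
wait $HUBBLE_PID 2>/dev/null
Expect DNS query/response entries showing the resolution of nginx-svc.lab-test.svc.cluster.local. If nothing appears, the likely cause is that Cilium only decodes DNS for traffic that passes through its DNS proxy, which generally requires a network policy with a DNS rule covering the namespace; the raw lookups should still be visible as UDP flows with hubble observe --namespace lab-test --port 53. DNS-layer visibility is critical for later phases: when an agent eventually resolves api.anthropic.com to call the model gateway, that DNS query is the first observable signal you’ll have that the call is happening.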
Section 14: Validation of Tetragon process execution events
Tetragon’s payoff is that it sees what happens inside containers, with full pod context attached. To watch its event stream, tail the export-stdout container of the Tetragon DaemonSet and pipe it through tetra:
Terminal 1.
kubectl logs -n kube-system -l app.kubernetes.io/name=tetragon -c export-stdout -f | tetra getevents -o compact --namespace lab-test
Terminal 2. exec into nginx and run a few commands:
kubectl exec -it nginx-server -n lab-test -- sh -c "whoami && ls /etc && cat /etc/hostname"
Terminal 1 should show exec events for sh, whoami, ls, and cat, each with the full binary path, parent process, pod name, and namespace. This is what bpftrace’s execve probe gave you in Phase 1, but with cluster identity instead of just PIDs and command names.
Section 15: Validation of Tetragon TracingPolicy for file access
Process exec events are built into Tetragon by default. File access events are not: you have to deploy a TracingPolicy resource that hooks the kernel function you want to observe. This is Tetragon’s policy surface, and it is also the mechanism for runtime enforcement (a policy can block an action, not just observe it).
cat <<'EOF' | kubectl apply -f -
apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: lab-file-access
spec:
  kprobes:
    - call: "fd_install"
      syscall: false
      args:
        - index: 0
          type: int
        - index: 1
          type: "file"
EOF
Terminal 1. Watch events:
kubectl logs -n kube-system -l app.kubernetes.io/name=tetragon -c export-stdout -f | tetra getevents -o compact --namespace lab-test
Terminal 2. Trigger file access:
kubectl exec -it nginx-server -n lab-test -- cat /etc/nginx/nginx.conf > /dev/null
Expect events showing the file access, including the full path, the process name, and the pod context. This is the building block for every “what did the agent read?” question that comes later. When you eventually deploy a coding agent into a pod and it reads a file it has no business reading, this same probe captures it.
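A hook on fd_install with no filter fires for every file descriptor installed on the node, so the stream can get noisy. Tetragon policies accept selectors that narrow what is reported; the sketch below adds a matchArgs selector with the Prefix operator on the file argument. The policy name and the /etc/nginx/ prefix are illustrative choices, not something the lab requires, and the file is only written locally so you can review it before applying.

```shell
# Write a narrower variant of the policy to a local file for review.
# The policy name and the /etc/nginx/ prefix are illustrative choices.
POLICY_FILE="${TMPDIR:-/tmp}/lab-file-access-scoped.yaml"
cat > "$POLICY_FILE" <<'EOF'
apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: lab-file-access-scoped
spec:
  kprobes:
    - call: "fd_install"
      syscall: false
      args:
        - index: 0
          type: int
        - index: 1
          type: "file"
      selectors:
        # Only report events whose file argument (index 1) starts with this prefix.
        - matchArgs:
            - index: 1
              operator: "Prefix"
              values:
                - "/etc/nginx/"
EOF
echo "Wrote $POLICY_FILE"
# Apply it once reviewed:
#   kubectl apply -f "$POLICY_FILE"
```

With this variant in place of the unfiltered policy, the cat of /etc/nginx/nginx.conf should still produce an event while unrelated file opens stay out of the stream.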
Section 16: Validation of Falco container shell detection
Falco’s default ruleset detects shell spawns inside containers, one of the most basic and useful runtime security alerts.
Terminal 1.
kubectl logs -n falco -l app.kubernetes.io/name=falco --tail=0 -f
Terminal 2.
kubectl exec -it nginx-server -n lab-test -- /bin/sh -c "whoami"
Terminal 1 should show a Falco alert about a terminal shell being spawned inside the container, including the pod name, namespace, container ID, and command line.
Section 17: Validation of Falco sensitive file read in container
A second default-ruleset alert: reading sensitive files like /etc/shadow.
kubectl logs -n falco -l app.kubernetes.io/name=falco --tail=0 -f &
FALCO_LOG_PID=$!
sleep 2
kubectl exec -it nginx-server -n lab-test -- cat /etc/shadow 2>/dev/null || true
sleep 3
kill $FALCO_LOG_PID 2>/dev/null
wait $FALCO_LOG_PID 2>/dev/null
Expect a Falco alert about a sensitive file being opened for reading inside the container. When you eventually have a coding agent in a pod that reads a file it has no business reading, this is the rule template you’ll adapt. Point it at credentials, AWS metadata endpoints, or anything else you don’t want agents touching.
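As a starting point for that adaptation, a custom rule can be supplied through the Falco chart's customRules value. This is a hedged sketch, not a tested rule: the rule name, the /root/.aws/ path, and the values file name are all illustrative, and the condition assumes the open_read and container macros from the default ruleset are loaded before it.

```shell
# Write a Helm values file carrying one custom Falco rule.
# Rule name, watched path, and file name are illustrative assumptions.
VALUES_FILE="${TMPDIR:-/tmp}/falco-lab-rules-values.yaml"
cat > "$VALUES_FILE" <<'EOF'
customRules:
  lab-rules.yaml: |-
    - rule: Lab credential file read in container
      desc: A process inside a container opened a file under a watched credentials directory
      # open_read and container are macros from the default ruleset (assumed loaded).
      condition: open_read and container and fd.name startswith /root/.aws/
      output: "Credential file read in container (file=%fd.name command=%proc.cmdline pod=%k8s.pod.name ns=%k8s.ns.name)"
      priority: WARNING
      tags: [lab]
EOF
echo "Wrote $VALUES_FILE"
# Roll it out once reviewed:
#   helm upgrade falco falcosecurity/falco -n falco --reuse-values -f "$VALUES_FILE"
```

After the upgrade, reading a file under the watched path from inside a pod would be the way to confirm the rule fires, using the same log-tailing pattern as above.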
Section 18: Validation of Hubble UI (optional)
cilium hubble ui &
echo "Hubble UI available at http://localhost:12000"
Open http://localhost:12000 in your browser. Select the lab-test namespace from the dropdown. Generate traffic in another terminal:
kubectl run curl-ui-test -n lab-test --image=curlimages/curl --rm -it --restart=Never -- curl -s http://nginx-svc.lab-test.svc.cluster.local
The flow appears in the UI as a visual line between the curl pod and the nginx service. Because the port-forward was started in the background, stop it when done by running fg and then pressing Ctrl+C.
Section 19: Validation of full Cilium status
cilium status
Section 20: Validation of Cilium connectivity test (optional but thorough)
cilium connectivity test --request-timeout 30s --connect-timeout 10s
Section 21: Clean up test workloads
kubectl delete namespace lab-test
Section 22: Save the kind config for future reference
LAB_ROOT="$HOME/ai-runtime-lab"
mkdir -p "$LAB_ROOT/configs"
cp /tmp/kind-lab-config.yaml "$LAB_ROOT/configs/kind-lab-config.yaml"
echo "Saved to $LAB_ROOT/configs/kind-lab-config.yaml"
Conclusion
You can now see flows between named pods, process executions inside named containers, and runtime security alerts with full pod context, and you can correlate any of it back to the host-level syscall traces from Phase 1. What remains is a small three-node lab cluster with four observability layers (the two test pods are now deleted), ready to host actual agent workloads.
It is worth being explicit about why both layers stay in the lab permanently rather than picking one. They answer different questions. Host-native bpftrace will show you every syscall a Tetragon pod’s eBPF program emits. Host-native strace will let you attach to a specific PID inside a container’s process namespace and watch every system call with timing data, which Tetragon’s TracingPolicy abstraction is not designed for. Host-native tcpdump will capture loopback traffic on the kind nodes themselves, including traffic between sidecars in the same pod that never crosses the CNI and is invisible to Hubble.
Kubernetes-native tooling, conversely, is the only place to get pod and namespace identity on flows, the only practical way to enforce a runtime security policy across an entire DaemonSet, and the only way to ask cluster-scoped questions like “which pods called my model gateway today?”
The next post deploys the first real workload into this cluster: LiteLLM as the model gateway, backed by Postgres for metadata, with completion requests routed to Claude. That is the first time the lab will see a real model API call leave the cluster, and the first chance to watch the full request lifecycle end-to-end.
