The Modern Logging Stack: Loki + Alloy (Why Not Promtail)
Most Loki tutorials still start with helm install loki-stack. That Helm chart bundles Promtail, which Grafana deprecated with the Loki v3.4 release and will stop supporting in March 2026. The Grafana Agent reached end-of-life in November 2025. If you deploy either today, you're installing software that no longer receives security patches.
This cluster never ran Promtail. It was built on Loki v3.6.3 and Grafana Alloy v1.12.2 from the start: one Loki pod in SingleBinary mode, three Alloy pods as a DaemonSet, and 4.17 million log lines per day compressed into 107 MB on disk. The entire logging stack is 4 pods and 640 MB of requested RAM. The deployment took 15 minutes. Finding where it breaks took a month.
This is Part 6 of the homelab series. Part 4 deployed Gateway API for traffic routing. Part 5 added Longhorn for distributed storage. Now we're collecting and storing every log line the cluster generates.
The Deprecation Chain
Grafana's collector story has consolidated three times in two years:
| Collector | Status (Feb 2026) | Replaced By |
|---|---|---|
| Promtail | Deprecated (Loki v3.4), EOL March 2026 | Grafana Alloy |
| Grafana Agent (Static/Flow) | EOL November 2025 | Grafana Alloy |
| Grafana Alloy | Active | — |
Alloy isn't a rebrand. It's Grafana's distribution of the OpenTelemetry Collector, with native support for metrics, logs, traces, and profiles in a programmable configuration language called River. For anyone migrating from Promtail, alloy convert --source-format=promtail translates YAML configs to River syntax ("best-effort" — expect manual tuning). This cluster skipped the migration by starting with Alloy.
Loki: Why SingleBinary
The Loki Helm chart defaults to SimpleScalable mode: 3 read pods, 3 write pods, 3 backend pods, a gateway, and a canary. That's 10+ pods before you store a single log line. For a homelab generating 1.8 GB/day of raw logs, it's massive overkill.
SingleBinary runs every Loki component in one process. The official docs recommend it for up to "a few tens of GB/day." This cluster generates roughly 1/10th of that lower bound.
The Helm values that matter:
deploymentMode: SingleBinary
loki:
  auth_enabled: false
  commonConfig:
    replication_factor: 1
  schemaConfig:
    configs:
      - from: "2024-01-01"
        store: tsdb
        object_store: filesystem
        schema: v13
        index:
          prefix: index_
          period: 24h
singleBinary:
  replicas: 1
  persistence:
    storageClass: longhorn
    size: 10Gi
Three decisions stand out. First, replication_factor: 1 because Longhorn already replicates the PVC across two nodes (Part 5). Adding Loki-level replication would triple write I/O for no additional protection. Second, TSDB with the v13 schema. Older guides reference BoltDB, which has been deprecated since Loki v3.5. TSDB is the current standard for new deployments. Third, retention is 90 days with the compactor enforcing it:
limits_config:
  retention_period: 2160h
  reject_old_samples: true
  reject_old_samples_max_age: 168h
compactor:
  retention_enabled: true
  compaction_interval: 10m
The reject_old_samples_max_age of 168 hours rejects any log timestamped more than 7 days ago. Without this, a misconfigured client could backfill months of old data into Loki in a single push.
Everything else is disabled. No gateway (Cilium handles routing). No canary. No caches. No self-monitoring (Prometheus handles that externally via ServiceMonitors). The full values file explicitly sets read, write, and backend replicas to zero, preventing the chart from sneaking in SimpleScalable components:
backend:
  replicas: 0
read:
  replicas: 0
write:
  replicas: 0
chunksCache:
  enabled: false
resultsCache:
  enabled: false
gateway:
  enabled: false
Install with:
helm install loki oci://ghcr.io/grafana/helm-charts/loki \
--namespace monitoring \
--version 6.49.0 \
--values helm/loki/values.yaml
What the Data Shows
That config has been running since mid-January 2026. Live cluster state:
$ kubectl get pods -n monitoring -l app.kubernetes.io/name=loki -o wide
NAME READY STATUS RESTARTS AGE NODE
loki-0 2/2 Running 0 24d k8s-cp2
$ kubectl get pods -n monitoring -l app.kubernetes.io/name=alloy -o wide
NAME READY STATUS RESTARTS AGE NODE
alloy-9pcf7 2/2 Running 0 24d k8s-cp3
alloy-pk7fh 2/2 Running 0 24d k8s-cp2
alloy-rztt4 2/2 Running 0 24d k8s-cp1
Zero restarts across all 4 pods. Prometheus metrics tell the ingestion story:
| Metric | Value |
|---|---|
| Ingestion rate | 48.3 lines/sec |
| Daily log lines | ~4.17 million |
| Raw volume (uncompressed) | ~1.8 GB/day |
| On-disk volume (Snappy compressed) | ~107 MB/day |
| Compression ratio | 17:1 |
| PVC usage (at snapshot) | 2.58 GB of 10 Gi (24.6%) |
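These figures are internally consistent. A quick back-of-the-envelope check (rounding as the table does):

```python
# Sanity-check the ingestion table above.
lines_per_sec = 48.3
raw_gb_per_day = 1.8     # uncompressed volume
disk_mb_per_day = 107    # Snappy-compressed chunks on disk

daily_lines = lines_per_sec * 86_400                  # seconds per day
compression = (raw_gb_per_day * 1000) / disk_mb_per_day

print(f"{daily_lines / 1e6:.2f}M lines/day")          # ~4.17M
print(f"{compression:.0f}:1 compression")             # ~17:1
```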
Resource consumption:
$ kubectl top pods -n monitoring -l app.kubernetes.io/name=loki
NAME CPU(cores) MEMORY(bytes)
loki-0 11m 402Mi
$ kubectl top pods -n monitoring -l app.kubernetes.io/name=alloy
NAME CPU(cores) MEMORY(bytes)
alloy-9pcf7 7m 238Mi
alloy-pk7fh 9m 147Mi
alloy-rztt4 9m 223Mi
Total actual usage: ~36m CPU and ~1,010 Mi memory across all 4 pods. For comparison, the Helm chart's default SimpleScalable mode deploys 9 Loki pods, a gateway, and a canary before collecting a single log line.
Storage Projection
At 107 MB/day on disk, 90 days of retention projects to ~9.6 GB. The 10 Gi PVC will be close to capacity once the compactor starts deleting old data at the 90-day mark. Bumping to 15-20 Gi before that point is on the to-do list.
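The projection arithmetic, including the GB-vs-GiB wrinkle (the PVC is 10 Gi, which is about 10.7 GB):

```python
# Project 90-day retention against the 10 Gi PVC.
disk_mb_per_day = 107
retention_days = 90
pvc_gib = 10

projected_gb = disk_mb_per_day * retention_days / 1000  # ~9.6 GB at steady state
pvc_gb = pvc_gib * 1024**3 / 1e9                        # 10 Gi ~= 10.74 GB

print(f"projected: {projected_gb:.1f} GB of {pvc_gb:.1f} GB "
      f"({projected_gb / pvc_gb:.0%})")                 # ~90% full
```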
Memory Pressure
Alloy's memory varies by node. On k8s-cp3, the pod uses 238 Mi against a 256 Mi limit (93%). On k8s-cp1, 223 Mi (87%). The AlloyHighMemory alert fires at 80%, so these pods have been triggering warnings. The variance likely correlates with the number of pods on each node generating logs. Adding workloads could push at least one Alloy pod into OOM territory.
Loki's working set (402 Mi) exceeds its memory request (256 Mi) while staying within the 512 Mi limit. Since the scheduler uses requests for placement, Loki would be an early eviction candidate under node memory pressure. Both the Alloy limits and the Loki request need adjusting.
Dropped Entries
Since deployment, Alloy has dropped 97,345 entries with ingester_error across all 3 pods, most likely during the Loki Helm upgrade (revision 2). Over 24 days at ~4.17 million lines/day, the cluster has ingested roughly 100 million entries, so that works out to about 0.1% data loss. No drops from rate limiting, line length, or stream limits. This is the single-replica trade-off: when Loki restarts, the pipeline has nowhere to buffer.
The Alloy Pipeline
Alloy deploys as a DaemonSet, one pod per node. Each pod collects logs only from containers on its own node, scoped by env("HOSTNAME"), which resolves to the node name in a DaemonSet. The pipeline discovers all pods in the cluster, filters to the local node, tails their logs via the Kubernetes API, and pushes them to Loki. The full config in River syntax inside the Helm values:
// Discover all pods in the cluster
discovery.kubernetes "pods" {
  role = "pod"
}

// Filter to this node + extract Kubernetes labels
discovery.relabel "pod_logs" {
  targets = discovery.kubernetes.pods.targets

  // Keep only pods on this node (DaemonSet scope)
  rule {
    source_labels = ["__meta_kubernetes_pod_node_name"]
    action        = "keep"
    regex         = env("HOSTNAME")
  }
  rule {
    source_labels = ["__meta_kubernetes_namespace"]
    target_label  = "namespace"
  }
  rule {
    source_labels = ["__meta_kubernetes_pod_name"]
    target_label  = "pod"
  }
  rule {
    source_labels = ["__meta_kubernetes_pod_container_name"]
    target_label  = "container"
  }
  rule {
    source_labels = ["__meta_kubernetes_pod_node_name"]
    target_label  = "node"
  }
}

// Tail logs via Kubernetes API
loki.source.kubernetes "pod_logs" {
  targets    = discovery.relabel.pod_logs.output
  forward_to = [loki.process.pod_logs.receiver]
}

// Add cluster label, forward to Loki
loki.process "pod_logs" {
  stage.static_labels {
    values = { cluster = "homelab" }
  }
  forward_to = [loki.write.loki.receiver]
}

loki.write "loki" {
  endpoint {
    url = "http://loki.monitoring.svc.cluster.local:3100/loki/api/v1/push"
  }
}
The discovery.relabel stage extracts Kubernetes metadata (namespace, pod, container, node) as labels and filters by node name. The loki.process stage adds a static cluster = "homelab" label before forwarding to Loki.
API Collection vs HostPath
This pipeline uses loki.source.kubernetes, which tails logs through the Kubernetes API. The alternative, loki.source.file, mounts /var/log/pods as a hostPath volume and reads log files directly from disk.
The API method avoids hostPath mounts entirely. No volume privileges, no filesystem access to the node. The trade-off is performance: the API method opens an HTTP stream per container and routes through the API server, which adds latency and network overhead. On clusters with hundreds of containers, this can measurably load the control plane.
At 48 lines per second across this cluster, the API method runs without issues. Each Alloy pod consumes 7-9m CPU. For larger clusters or higher log volumes, loki.source.file with hostPath volumes is the higher-throughput option.
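For reference, the file-based variant is a small change to the pipeline. A sketch, not this cluster's config; it assumes the Alloy Helm chart is set up to mount the host's /var/log/pods into the pod, and the path glob follows the standard kubelet layout:

```river
// Match per-container log files on the node's filesystem
// (requires a hostPath mount of /var/log/pods)
local.file_match "pod_logs" {
  path_targets = [{"__path__" = "/var/log/pods/*/*/*.log"}]
}

// Tail the files directly instead of streaming through the API server
loki.source.file "pod_logs" {
  targets    = local.file_match.pod_logs.targets
  forward_to = [loki.write.loki.receiver]
}
```

With this approach the namespace/pod/container labels would have to be recovered from the file path (or from Kubernetes discovery relabeling) rather than arriving via the API, so the relabel rules above would need reworking too.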
Kubernetes Events Without Triplicates
Kubernetes events (pod scheduling, image pulls, OOM kills) are cluster-scoped, not node-local. Every Alloy pod in the DaemonSet sees the same events. Without deduplication, 3 pods would push 3 copies of every event into Loki.
A stage.match drop rule handles this:
loki.source.kubernetes_events "cluster_events" {
  log_format = "logfmt"
  forward_to = [loki.process.cluster_events.receiver]
}

loki.process "cluster_events" {
  stage.static_labels {
    values = {
      cluster        = "homelab",
      source         = "kubernetes_events",
      collector_node = env("HOSTNAME"),
    }
  }

  // Only k8s-cp1 forwards events; other nodes drop them
  stage.match {
    selector = "{collector_node!=\"k8s-cp1\"}"
    action   = "drop"
  }

  forward_to = [loki.write.loki.receiver]
}
Each pod tags events with its node name via collector_node. Only k8s-cp1's Alloy instance passes the forwarding rule. The other two drop their events at the processing stage before they reach Loki. Query events in Grafana with {source="kubernetes_events"}.
Alloy also supports a clustering mode with a consistent hashing ring that assigns singleton workloads automatically. For three nodes, the explicit drop rule is simpler to reason about and debug.
Two Paths Into Loki
The logging stack accepts data from two sources through two different protocols:
Alloy pushes Kubernetes logs via the Loki API at /loki/api/v1/push. This is the standard path for container logs collected by the DaemonSet.
An OTel Collector (v0.144.0) pushes application events via Loki's native OTLP endpoint at /otlp. The exporter config is minimal:
exporters:
  otlphttp/loki:
    endpoint: http://loki.monitoring.svc.cluster.local:3100/otlp
Loki v3.x accepts OTLP natively with the TSDB schema. In this cluster, the OTel Collector sends Claude Code telemetry events, queryable in Grafana with {service_name="claude-code"}. Any application instrumented with an OpenTelemetry SDK can ship logs through this same path without Alloy in the middle.
The OTLP endpoint is the forward-looking ingestion path. As more workloads adopt OTel, direct-to-Loki becomes the default; Alloy handles the Kubernetes infrastructure logs that predate OTel instrumentation.
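For context, that exporter sits inside a collector pipeline along these lines. This is a minimal sketch, not the cluster's exact config; the receiver protocol and pipeline wiring are assumptions:

```yaml
receivers:
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:4318   # standard OTLP/HTTP port
exporters:
  otlphttp/loki:
    endpoint: http://loki.monitoring.svc.cluster.local:3100/otlp
service:
  pipelines:
    logs:
      receivers: [otlp]
      exporters: [otlphttp/loki]
```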
Monitoring the Monitors
Seven PrometheusRules watch the logging stack. Four for Loki, three for Alloy:
| Alert | Severity | Condition |
|---|---|---|
| LokiDown | critical | Loki unreachable for 5 min |
| LokiIngestionStopped | warning | Zero lines received for 15 min |
| LokiHighErrorRate | warning | >10% HTTP 5xx rate for 10 min |
| LokiStorageLow | warning | PVC <20% free for 30 min |
| AlloyNotOnAllNodes | warning | Fewer pods than nodes for 10 min |
| AlloyNotSendingLogs | warning | Zero bytes to Loki for 15 min |
| AlloyHighMemory | warning | Pod >80% memory limit for 10 min |
All seven are PromQL (metric-based), scraped via ServiceMonitors at 30-second intervals. They answer "is the logging pipeline working?" They don't answer "what's happening in my applications?" That distinction matters.
What's Missing
The logging stack collects and stores logs. It does not analyze them. Five gaps are worth calling out.
The biggest is log-based alerting. Loki supports ruler-based LogQL alerts, but none are configured here. You can detect "Loki is dead" but not "my application is throwing 500 errors." A rule like count_over_time({namespace="production"} |= "ERROR" [5m]) > 10 would bridge that gap. That's Part 7 material.
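Loki's ruler consumes Prometheus-style rule files with LogQL expressions. A sketch of what such a rule could look like (group and alert names here are illustrative, not from this cluster):

```yaml
groups:
  - name: application-logs
    rules:
      - alert: ProductionErrorBurst
        expr: 'count_over_time({namespace="production"} |= "ERROR" [5m]) > 10'
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "More than 10 ERROR lines in production within 5 minutes"
```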
The Alloy pipeline doesn't parse log content. It extracts Kubernetes metadata (namespace, pod, container, node) as labels, but JSON fields inside log lines aren't indexed. Finding a specific error requires full-text |= or regex matches in LogQL, not label selectors.
There's no Loki dashboard in Grafana. Dashboards exist for UPS, kube-vip, and the kube-prometheus-stack defaults, but nothing visualizes Loki's ingestion rate, storage growth, or error rate. The ServiceMonitors expose these metrics; building a dashboard from them is straightforward.
Loki is a single replica. The 97K entries dropped during upgrades show the consequence: when the pod restarts, Alloy's delivery retries eventually exhaust and entries are lost. For log completeness, run 3 SingleBinary replicas with replication_factor: 3. For a homelab where 99.9% retention is acceptable, one pod is fine.
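If completeness mattered more, the Helm values change would be small (a sketch; not what this cluster runs):

```yaml
loki:
  commonConfig:
    replication_factor: 3
singleBinary:
  replicas: 3
```

Note that with the filesystem object store, replicas don't share chunks, so a proper HA setup would also mean moving to shared object storage such as S3 or MinIO.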
No external backup exists. If the Longhorn volume suffers a double node failure, logs are gone. No S3 tier, no cross-cluster shipping. Longhorn's 2-replica protection is the only safety net.
What's Next
This post is the sixth in the "Building a Production-Grade Homelab" series:
- Why kubeadm Over k3s, RKE2, and Talos in 2026
- HA Control Plane with kube-vip: No Load Balancer Needed
- Cilium Deep Dive: What Replacing kube-proxy Actually Means
- Gateway API vs Ingress: No Ingress Controller Needed
- Distributed Storage with Longhorn: 2 Replicas Are Enough
- The Modern Logging Stack: Loki + Alloy (Why Not Promtail) (you are here)
- Alerting That Actually Wakes You Up: Discord, Email, and Dead Man's Switches
- Self-Hosted GitLab: CI/CD Without Cloud Vendor Lock-in
Part 7 covers the alerting layer: Discord webhooks, email notifications, and healthchecks.io dead man's switches for detecting silent failures.
The full Loki and Alloy Helm values, ServiceMonitors, and alert rules live in the homelab repo.