Why Centralized Logging Matters

When you run one small app on one server, reading logs is easy: you open /var/log/syslog and look. But real systems grow. Soon you have a web server, a database server, and three copies of your app behind a load balancer. Now a single user request can pass through several of those five machines, and the clues you need are scattered across five different log files. Centralized logging fixes this by collecting logs from every server into one searchable place. This page explains the why before the how: the pain it removes and the tools that do the job.

The pain of scattered logs

Imagine a customer reports an error at 10:42 AM. To investigate, you would normally do this:

ssh app-server-1
sudo tail -n 200 /var/log/myapp/app.log
exit
ssh app-server-2
sudo tail -n 200 /var/log/myapp/app.log

Output:

[2026-06-15 10:42:03] ERROR request failed id=req-8841 upstream timeout

You found one line, but was that the request the customer hit? You do not know. The load balancer might have sent them to a different app server, and the upstream timeout itself may have started on the database server, which you have not checked yet. You end up SSH-ing into machine after machine, running the same search over and over. This approach has real problems:

No single search. You cannot search all servers at once. You repeat grep everywhere by hand.
No correlation. A request that hops across servers leaves a trail you cannot follow, because each log sits alone.
Logs disappear. When a server is replaced or its disk dies, its local logs are gone for good, and a crash or restart can lose anything that was still in memory or on temporary storage. In cloud setups where servers come and go automatically, this is constant.
No alerting. Nobody is watching the files, so you only learn about errors when a customer complains.

Term check: aggregation just means gathering many separate things into one place. Centralized logging is log aggregation — pulling every server’s logs into one system. Observability is the broader goal of being able to understand what your system is doing from the outside, using logs, metrics, and traces.

What central aggregation gives you

A centralized logging system runs a small agent on each server. The agent reads the local log files and ships (sends) every new line over the network to a central store. That central store indexes the logs so you can search them instantly. The payoff is large:
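
To make the agent's job concrete, here is a minimal sketch in shell. The ship_lines function and the file standing in for the central store are hypothetical names for illustration only. Real agents send over the network, batch lines, and retry on failure, none of which is shown here.

```shell
#!/bin/sh
# Sketch of what a log agent does: read new lines, tag each one with the
# host it came from, and append it to a central store.
# ship_lines and the store path are hypothetical; a plain file stands in
# for the network-attached store.

ship_lines() {
    host="$1"    # label identifying the source server
    store="$2"   # file standing in for the central store
    # Read stdin one line at a time and prefix each line with its host.
    while IFS= read -r line; do
        printf 'host=%s %s\n' "$host" "$line" >> "$store"
    done
}

# Possible use, following the file as it grows:
#   tail -F /var/log/myapp/app.log | ship_lines "$(hostname)" /tmp/central.log
```

Tagging each line with its host at shipping time is what later lets you tell servers apart once everything sits in one store.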

Capability	Scattered logs	Centralized logs
Search across all servers	Manual `grep` on each box	One query, all servers at once
Follow a request across machines	Nearly impossible	Filter by a shared request ID
Keep logs after a server dies	Lost	Safe in the central store
Long-term retention	Limited by disk on each box	Controlled in one place
Alert on errors	None	Rules fire automatically

Correlation

The biggest win is correlation. If every part of your app tags each log line with the same request_id, you can paste that ID into one search box and see the whole journey — web server, app, and database — in time order, no matter which machine each line came from. That turns a 30-minute SSH hunt into a 5-second filter.
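
The idea behind that filter can be sketched with ordinary shell tools. This is an illustration, not how ELK or Loki work internally. The trace_request helper is hypothetical, and it assumes every line starts with a sortable timestamp like [2026-06-15 10:42:03], carries the ID as id=..., and comes from servers whose clocks are in sync.

```shell
#!/bin/sh
# trace_request is a hypothetical helper: print every line that mentions
# one request ID, merged from several log files into time order.
# Assumes lines begin with a sortable timestamp and contain id=<request id>.

trace_request() {
    id="$1"
    shift   # the remaining arguments are the log files to search
    # -h drops filename prefixes, -F matches the text literally, and
    # -w stops req-884 from matching inside req-8841.
    grep -hFw -- "id=$id" "$@" | LC_ALL=C sort
}

# Possible use, once the logs from every server sit in one directory:
#   trace_request req-8841 /srv/logs/web.log /srv/logs/app.log /srv/logs/db.log
```

Sorting works here only because the timestamp comes first and uses a year-month-day layout. A central store does the same merge for you, without copying files around.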

Retention

Retention is how long you keep logs. Central systems let you set this once, for example “keep 30 days of app logs, 1 year of audit logs,” instead of fighting disk space on every server.
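
At its core, a retention rule just means "drop anything older than N days." The sketch below shows that idea on a plain directory of .log files. The prune_old_logs helper and the directory layout are assumptions for illustration. Real stores such as Elasticsearch and Loki apply retention through their own settings instead of a script like this.

```shell
#!/bin/sh
# prune_old_logs is a hypothetical helper: delete .log files in a
# directory tree that were last modified more than N days ago.

prune_old_logs() {
    dir="$1"     # directory holding the collected logs
    days="$2"    # retention period in days
    # -mtime +N matches files whose age in whole days is greater than N.
    find "$dir" -type f -name '*.log' -mtime +"$days" -exec rm -f {} +
}

# Possible use: keep 30 days of app logs.
#   prune_old_logs /srv/logs/app 30
```

The point of setting this centrally is that one rule covers every server, instead of a separate cleanup job on each machine.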

Alerting

Because all logs flow through one place, you can write a rule like “if more than 50 ERROR lines appear in a minute, send a Slack message.” That way you often hear about a problem before your users report it.
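
The counting half of such a rule can be sketched in a few lines of shell and awk. The error_spikes helper is hypothetical, and it assumes the log format from the earlier example, where each line looks like [2026-06-15 10:42:03] ERROR .... It only prints the offending minutes; a real alert rule would also send a notification.

```shell
#!/bin/sh
# error_spikes is a hypothetical helper: read log lines on stdin and print
# each minute that contains more than <threshold> ERROR lines.
# Assumes lines look like: [2026-06-15 10:42:03] ERROR message...

error_spikes() {
    threshold="$1"
    awk -v limit="$threshold" '
        # Field 3 is the level; characters 2-17 are the date plus HH:MM.
        $3 == "ERROR" {
            minute = substr($0, 2, 16)
            count[minute]++
        }
        END {
            for (m in count)
                if (count[m] > limit)
                    print m, count[m]
        }
    ' | LC_ALL=C sort
}

# Possible use with the threshold from the text:
#   error_spikes 50 < /var/log/myapp/app.log
```

Each output line is a minute followed by its ERROR count. Grouping by minute is what turns a stream of individual lines into a rate you can compare against a threshold.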

The main stacks: ELK and Loki

Two open-source stacks are especially common in 2026. You install either one on a dedicated logging server (or use a managed cloud version).

ELK stands for Elasticsearch + Logstash + Kibana, and the bundle is now officially named the Elastic Stack. Elasticsearch stores and indexes the logs, Logstash (or the lighter Beats agents) collects and parses them, and Kibana is the web UI for searching and building dashboards. Elasticsearch indexes the full text of every log, so search is extremely powerful, but that indexing uses a lot of RAM and disk.

Grafana Loki takes a lighter approach. Instead of indexing the full text, it indexes only a few labels (like host and service) and stores the log lines themselves compressed. The agent is Grafana Alloy (older setups use Promtail, which Grafana has deprecated in favor of Alloy), and you view logs in Grafana. Loki is cheaper to run and pairs naturally with Grafana metrics dashboards.

Stack	Best for	Resource cost	Search style
ELK / Elastic	Deep full-text search, complex analytics	High (RAM/disk heavy)	Index every field
Grafana Loki	Cost-conscious teams already using Grafana	Low to medium	Filter by labels, then grep
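
The "filter by labels, then grep" style in the table can be mimicked on a flat file to show the two steps. This is only an analogy: the label_grep helper and the host=... service=... line prefix are assumptions for illustration, and Loki's real storage and query language are not shown.

```shell
#!/bin/sh
# label_grep is a hypothetical helper that mimics label-based search.
# Assumes each stored line starts with two labels, for example:
#   host=app-server-1 service=myapp [2026-06-15 10:42:03] ERROR ...

label_grep() {
    host="$1"
    service="$2"
    pattern="$3"
    store="$4"
    # Step 1: cheap exact match on the two label fields.
    # Step 2: plain text search over only the lines that survive step 1.
    awk -v h="host=$host" -v s="service=$service" '$1 == h && $2 == s' "$store" |
        grep -F -- "$pattern"
}

# Possible use:
#   label_grep app-server-1 myapp 'upstream timeout' /tmp/central.log
```

The narrower the label filter, the less text the second step has to scan, which is why this style stays cheap as long as you usually know which host or service you care about.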

When to use which: Reach for Loki if you mostly filter by host/service and want low cost. Reach for ELK if you need rich full-text search and analytics across structured fields and can afford the hardware.

Both stacks are only as useful as the logs you feed them. Before installing anything, make sure your servers actually produce useful logs; see the basics and structured logging pages linked below.

When centralized logging is overkill

Be honest about scale. Skip centralized logging for:

A single hobby server with one app — journalctl and /var/log are enough.
A short-lived experiment you will delete next week.

Running ELK or Loki means maintaining another service. Only adopt it once you have multiple servers, or when losing logs would genuinely hurt — for example anything customer-facing or anything you must keep for audits.

Security gotcha: Logs often contain sensitive data such as IP addresses, emails, and tokens, so a central log store is a juicy target. Always lock the logging server behind a firewall. With ufw, keep the default policy of denying incoming traffic and add a narrow exception, for example sudo ufw allow from 10.0.0.0/24 to any port 9200, so only your internal network can reach Elasticsearch. Never expose Kibana or Grafana to the public internet without a login.

Best Practices

Tag every log line with a shared request_id so you can correlate a request across all servers.
Decide retention up front (for example 30 days for app logs) and enforce it in the central store, not on each server.
Run the log shipper as a systemd service so it restarts automatically after reboots and crashes.
Keep logging traffic on a private network and protect the store with ufw and authentication.
Start small: ship logs from one service first, confirm they arrive and are searchable, then roll out to the rest.
Set at least one alert (such as a spike in ERROR lines) so the system actively warns you instead of waiting for complaints.