Why Centralized Logging Matters
When you run one small app on one server, reading logs is easy: you open /var/log/syslog and look. But real systems grow. Soon you have a web server, a database server, and three copies of your app behind a load balancer. Now a single user request can pass through several of those five machines, and the clues you need are scattered across five different log files. Centralized logging fixes this by collecting logs from every server into one searchable place. This page explains the why before the how: the pain it removes and the tools that do the job.
The pain of scattered logs
Imagine a customer reports an error at 10:42 AM. To investigate, you would normally do this:
ssh app-server-1
sudo tail -n 200 /var/log/myapp/app.log
exit
ssh app-server-2
sudo tail -n 200 /var/log/myapp/app.log
Output:
[2026-06-15 10:42:03] ERROR request failed id=req-8841 upstream timeout
You found one line, but was that the request the customer hit? You do not know, because the load balancer might have sent them to any of the three app servers, and the root cause might sit on the database server instead. You end up SSH-ing into machine after machine, grepping for the same thing over and over. This approach has real problems:
- No single search. You cannot search all servers at once. You repeat grep everywhere by hand.
- No correlation. A request that hops across servers leaves a trail you cannot follow, because each log sits alone.
- Logs disappear. When a server is replaced, restarted, or crashes, its local logs may be lost forever. In cloud setups where servers come and go automatically, this is constant.
- No alerting. Nobody is watching the files, so you only learn about errors when a customer complains.
Term check: aggregation just means gathering many separate things into one place. Centralized logging is log aggregation — pulling every server’s logs into one system. Observability is the broader goal of being able to understand what your system is doing from the outside, using logs, metrics, and traces.
What central aggregation gives you
A centralized logging system runs a small agent on each server. The agent reads the local log files and ships (sends) every new line over the network to a central store. That central store indexes the logs so you can search them instantly. The payoff is large:
| Capability | Scattered logs | Centralized logs |
|---|---|---|
| Search across all servers | Manual grep on each box | One query, all servers at once |
| Follow a request across machines | Nearly impossible | Filter by a shared request ID |
| Keep logs after a server dies | Lost | Safe in the central store |
| Long-term retention | Limited by disk on each box | Controlled in one place |
| Alert on errors | None | Rules fire automatically |
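To make the agent's job concrete, here is a minimal sketch in Python of the "read new lines, then ship them" loop described above. It is an illustration only, not how any particular agent is implemented: the function names, the JSON field names, and the host name are assumptions, and printing stands in for the network send. Real agents also handle log rotation, batching, and retries.

```python
import json
import os
import tempfile


def read_new_lines(path, offset):
    """Return (lines, new_offset) for everything appended to `path` since `offset`."""
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    # Only ship complete lines; a half-written last line waits for the next pass.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8", errors="replace").splitlines()
    return lines, offset + end


def to_record(host, path, line):
    """Wrap a raw line with the metadata the central store needs for searching."""
    return json.dumps({"host": host, "file": path, "message": line})


# Demo with a throwaway file standing in for /var/log/myapp/app.log.
with tempfile.NamedTemporaryFile("wb", suffix=".log", delete=False) as f:
    f.write(b"[2026-06-15 10:42:03] ERROR request failed id=req-8841 upstream timeout\n")
    demo_path = f.name

lines, offset = read_new_lines(demo_path, 0)
for line in lines:
    # A real agent would send this over the network instead of printing it.
    print(to_record("app-server-2", demo_path, line))
os.remove(demo_path)
```

Remembering the offset between passes is what lets the agent send each line once instead of re-sending the whole file.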
Correlation
The biggest win is correlation. If every part of your app tags each log line with the same request_id, you can paste that ID into one search box and see the whole journey — web server, app, and database — in time order, no matter which machine each line came from. That turns a 30-minute SSH hunt into a 5-second filter.
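As a rough illustration of what that search does, the sketch below filters a small in-memory list of records by request ID and sorts the matches by time. The record layout and the host names web-1 and db-1 are made up for the example; a real central store runs the equivalent query against its index.

```python
from datetime import datetime

# Hypothetical records as a central store might hold them, in arrival order.
RECORDS = [
    {"host": "app-server-2", "ts": "2026-06-15 10:42:03",
     "request_id": "req-8841", "message": "ERROR request failed upstream timeout"},
    {"host": "web-1", "ts": "2026-06-15 10:42:01",
     "request_id": "req-8841", "message": "GET /checkout"},
    {"host": "db-1", "ts": "2026-06-15 10:42:02",
     "request_id": "req-8841", "message": "slow query 1800ms"},
    {"host": "app-server-1", "ts": "2026-06-15 10:42:02",
     "request_id": "req-9000", "message": "GET /health"},
]


def trace_request(records, request_id):
    """Return every record for one request, oldest first, regardless of host."""
    matches = [r for r in records if r.get("request_id") == request_id]
    return sorted(matches, key=lambda r: datetime.strptime(r["ts"], "%Y-%m-%d %H:%M:%S"))


for r in trace_request(RECORDS, "req-8841"):
    print(r["ts"], r["host"], r["message"])
```

The output lists the web server, database, and app server lines in time order, which is the "whole journey" view that scattered log files cannot give you.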
Retention
Retention is how long you keep logs. Central systems let you set this once, for example “keep 30 days of app logs, 1 year of audit logs,” instead of fighting disk space on every server.
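A retention policy boils down to "drop anything older than N days for its category". The sketch below shows that logic in Python under simple assumptions: the category names and the record layout are invented here, and one year is approximated as 365 days. Real stacks express this as a configuration setting rather than code.

```python
from datetime import datetime, timedelta

# Days to keep each kind of log (category names are illustrative).
RETENTION_DAYS = {"app": 30, "audit": 365}


def expired(record, now, policy=RETENTION_DAYS):
    """True if the record is older than its category's retention window."""
    keep_for = timedelta(days=policy[record["category"]])
    return now - record["ts"] > keep_for


def apply_retention(records, now):
    """Return only the records that are still inside their retention window."""
    return [r for r in records if not expired(r, now)]


now = datetime(2026, 6, 15)
records = [
    {"category": "app", "ts": now - timedelta(days=31), "message": "old app line"},
    {"category": "app", "ts": now - timedelta(days=29), "message": "recent app line"},
    {"category": "audit", "ts": now - timedelta(days=31), "message": "audit line"},
]
for r in apply_retention(records, now):
    print(r["category"], r["message"])
```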
Alerting
Because all logs flow through one place, you can write a rule like “if more than 50 ERROR lines appear in a minute, send a Slack message.” You find out about problems before your users do.
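The rule above can be sketched in a few lines of Python. This version assumes log lines start with a timestamp in the same "[YYYY-MM-DD HH:MM:SS]" format as the sample earlier on this page, and it counts per calendar minute rather than over a sliding window. The function name is made up, and the Slack call is replaced by a print.

```python
from collections import Counter

ERROR_THRESHOLD = 50  # "more than 50 ERROR lines in a minute"


def minutes_over_threshold(lines, threshold=ERROR_THRESHOLD):
    """Return the minutes (sorted) in which ERROR lines exceeded the threshold."""
    per_minute = Counter()
    for line in lines:
        if line.startswith("[") and "] ERROR " in line:
            minute = line[1:17]  # e.g. "2026-06-15 10:42"
            per_minute[minute] += 1
    return sorted(m for m, count in per_minute.items() if count > threshold)


sample = ["[2026-06-15 10:42:%02d] ERROR request failed" % s for s in range(51)]
sample += ["[2026-06-15 10:43:00] INFO request ok"] * 200

for minute in minutes_over_threshold(sample):
    # A real rule would notify a chat channel or pager here.
    print("ALERT: error spike at", minute)
```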
The main stacks: ELK and Loki
Two open-source stacks dominate in 2026. You install them on a dedicated logging server (or use a managed cloud version).
ELK / Elastic Stack stands for Elasticsearch + Logstash + Kibana. Elasticsearch stores and indexes the logs, Logstash (or the lighter Beats agents) collects and parses them, and Kibana is the web UI for searching and building dashboards. ELK indexes the full text of every log, so search is extremely powerful — but that indexing uses a lot of RAM and disk.
Grafana Loki takes a lighter approach. Instead of indexing the full text, it indexes only a few labels (like host and service) and stores the rest compressed. The agent has traditionally been Promtail, which Grafana Labs has deprecated in favor of the newer Grafana Alloy, and you view logs in Grafana. Loki is cheaper to run and pairs naturally with Grafana metrics dashboards.
| Stack | Best for | Resource cost | Search style |
|---|---|---|---|
| ELK / Elastic | Deep full-text search, complex analytics | High (RAM/disk heavy) | Index every field |
| Grafana Loki | Cost-conscious teams already using Grafana | Low to medium | Filter by labels, then grep |
When to use which: Reach for Loki if you mostly filter by host/service and want low cost. Reach for ELK if you need rich full-text search and analytics across structured fields and can afford the hardware.
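The difference between the two search styles is easier to see in a toy model. The Python sketch below is not how Elasticsearch or Loki are implemented; it only contrasts "index every word" with "index a few labels, then scan the matching lines". The sample entries, including the db-1 host and the postgres service, are invented.

```python
from collections import defaultdict

LOGS = [
    {"labels": {"host": "app-server-1", "service": "myapp"},
     "line": "ERROR request failed id=req-8841 upstream timeout"},
    {"labels": {"host": "app-server-2", "service": "myapp"},
     "line": "INFO request ok id=req-8842"},
    {"labels": {"host": "db-1", "service": "postgres"},
     "line": "ERROR connection timeout"},
]


def build_fulltext_index(logs):
    """ELK-style idea: map every word to the entries that contain it."""
    index = defaultdict(set)
    for i, entry in enumerate(logs):
        for word in entry["line"].lower().split():
            index[word].add(i)
    return index


def build_label_index(logs):
    """Loki-style idea: index only label pairs; line text stays unindexed."""
    index = defaultdict(set)
    for i, entry in enumerate(logs):
        for pair in entry["labels"].items():
            index[pair].add(i)
    return index


def fulltext_search(index, word):
    """One index lookup answers the query."""
    return sorted(index.get(word.lower(), set()))


def label_then_grep(logs, index, label, value, needle):
    """Narrow by label first, then scan only those lines for the text."""
    candidates = index.get((label, value), set())
    return sorted(i for i in candidates if needle in logs[i]["line"])


fulltext = build_fulltext_index(LOGS)
labels = build_label_index(LOGS)
print(len(fulltext), "full-text index keys vs", len(labels), "label index keys")
print(fulltext_search(fulltext, "timeout"))
print(label_then_grep(LOGS, labels, "service", "myapp", "timeout"))
```

Even on three lines the full-text index has twice as many keys as the label index, which hints at why the ELK approach costs more RAM and disk while answering text queries without scanning.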
Both stacks are only as useful as the logs you send them. Before installing anything, make sure your servers actually produce useful logs; see the basics and structured logging pages linked below.
When centralized logging is overkill
Be honest about scale. When NOT to do this:
- A single hobby server with one app: journalctl and /var/log are enough.
- A short-lived experiment you will delete next week.
Running ELK or Loki means maintaining another service. Only adopt it once you have multiple servers, or when losing logs would genuinely hurt — for example anything customer-facing or anything you must keep for audits.
Security gotcha: Logs often contain sensitive data such as IP addresses, emails, and tokens. A central log store is a juicy target. Always lock the logging server behind your firewall with ufw, for example sudo ufw allow from 10.0.0.0/24 to any port 9200, so only your internal network can reach Elasticsearch, and never expose Kibana or Grafana to the public internet without a login.
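If the 10.0.0.0/24 notation is unfamiliar, the short Python sketch below shows which source addresses that range covers. It only models the address match and does not talk to ufw; the sample addresses are arbitrary.

```python
import ipaddress

# The internal network from the firewall rule above.
ALLOWED = ipaddress.ip_network("10.0.0.0/24")


def is_allowed(source_ip):
    """True if the address falls inside the allowed internal range."""
    return ipaddress.ip_address(source_ip) in ALLOWED


print(ALLOWED.num_addresses, "addresses, from", ALLOWED[0], "to", ALLOWED[-1])
for ip in ["10.0.0.17", "10.0.1.5", "203.0.113.9"]:
    print(ip, "allowed" if is_allowed(ip) else "blocked")
```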
Best Practices
- Tag every log line with a shared request_id so you can correlate a request across all servers.
- Decide retention up front (for example 30 days for app logs) and enforce it in the central store, not on each server.
- Run the log shipper as a systemd service so it restarts automatically after reboots and crashes.
- Keep logging traffic on a private network and protect the store with ufw and authentication.
- Start small: ship logs from one service first, confirm they arrive and are searchable, then roll out to the rest.
- Set at least one alert (such as a spike in ERROR lines) so the system actively warns you instead of waiting for complaints.
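As one possible way to follow the first practice, the sketch below uses Python's standard logging module to stamp every line with the same request_id and emit it as one JSON object per line. The class and function names, the JSON field names, and the "req-" ID format are assumptions for illustration; your app's language and logging library may differ.

```python
import json
import logging
import uuid


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line, including request_id."""

    def format(self, record):
        return json.dumps({
            "ts": self.formatTime(record, "%Y-%m-%d %H:%M:%S"),
            "level": record.levelname,
            "request_id": getattr(record, "request_id", None),
            "message": record.getMessage(),
        })


def make_logger(name="myapp", stream=None):
    """Build a logger that writes JSON lines (stream=None means stderr)."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def for_request(logger, request_id=None):
    """Return an adapter that adds the same request_id to every line it logs."""
    request_id = request_id or "req-" + uuid.uuid4().hex[:8]
    return logging.LoggerAdapter(logger, {"request_id": request_id})


log = for_request(make_logger(), "req-8841")
log.info("request received")
log.error("upstream timeout")
```

For correlation to work across machines, each service has to reuse the ID it received from the previous hop instead of generating a new one.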