Building Grafana Dashboards

A dashboard is a single screen that shows you the health of your servers at a glance. In Grafana (the open-source tool for turning metrics into charts), a dashboard is made of panels, and each panel is one graph, number, or table. On this page you will build a dashboard from scratch: you will add panels, write the queries that feed them, pick how each one looks, add a drop-down to switch between servers, and save your work. This builds directly on the Grafana intro, so make sure Grafana is running and connected to Prometheus first.

What you need before you start

This page assumes you already have:

Prometheus (a tool that collects and stores time-series metrics — numbers measured over time) running and scraping at least one target.
Node Exporter (an agent that exposes Linux server metrics like CPU and memory) running on the server you want to watch.
Grafana installed and open in your browser at http://your-server-ip:3000, with Prometheus added as a data source.

You can confirm Grafana is up on Ubuntu 22.04 / 24.04 LTS with:

sudo systemctl status grafana-server

Output:

● grafana-server.service - Grafana instance
     Loaded: loaded (/lib/systemd/system/grafana-server.service; enabled)
     Active: active (running) since Mon 2026-06-15 09:12:44 UTC; 3h ago

If the data source is missing, go to Connections → Data sources → Add data source → Prometheus and set the URL to http://localhost:9090 (use localhost only if Prometheus runs on the same machine as Grafana).

Create an empty dashboard

In the Grafana web interface:

Click the + (plus) icon in the top-right and choose New dashboard.
Click Add visualization.
When asked, select your Prometheus data source.

You now have one empty panel and an edit screen. The big area is the graph; below it is the query editor where you tell Grafana what data to show.

Write your first PromQL query

PromQL (Prometheus Query Language) is how you ask Prometheus for data. The query editor has a Code mode and a Builder mode — switch to Code to type queries directly, which is faster once you learn the syntax.

A raw metric like node_cpu_seconds_total is a counter (a number that only ever goes up, apart from resetting to zero when the exporter restarts). A raw counter is rarely useful on a graph, so you almost always wrap it in rate(), which gives you the average per-second increase over a time window.

CPU usage as a percentage. Paste this into the query box:

100 - (avg(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)

Here is what each part does:

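node_cpu_seconds_total{mode="idle"}: the running total of seconds each CPU core has spent idle.
rate(...[5m]): the average per-second increase over the last 5 minutes. For this metric that is the fraction of each second a core was idle, between 0 and 1.
avg(...): averages every matching series into one number. That means all CPU cores and, until you filter by instance later on this page, all servers.
100 - (... * 100): turns the idle fraction into a percentage and flips it into "busy", giving you CPU usage as a percent.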
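To make the arithmetic concrete, here is a minimal Python sketch of what the query computes. It is a simplification: the real rate() also handles counter resets and extrapolates to the edges of the window, and this sketch does neither. The sample values and the helper names simple_rate and cpu_busy_percent are invented for illustration and are not part of Prometheus or Grafana.

```python
# Simplified model of:
#   100 - (avg(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
# Assumption: each core has (timestamp_seconds, idle_seconds_counter) samples
# covering one 5-minute window. All numbers below are made up.


def simple_rate(samples):
    """Per-second increase between the first and last sample in the window.

    Unlike PromQL's rate(), this ignores counter resets and extrapolation.
    """
    (t_first, v_first), (t_last, v_last) = samples[0], samples[-1]
    return (v_last - v_first) / (t_last - t_first)


def cpu_busy_percent(idle_samples_per_core):
    """Average the idle rate across cores, then flip idle into busy."""
    idle_rates = [simple_rate(s) for s in idle_samples_per_core.values()]
    avg_idle = sum(idle_rates) / len(idle_rates)
    return 100 - avg_idle * 100


# Two hypothetical cores, sampled at t=0s and t=300s.
samples = {
    "cpu0": [(0, 1000.0), (300, 1270.0)],  # idle for 270s of 300s -> rate 0.9
    "cpu1": [(0, 2000.0), (300, 2210.0)],  # idle for 210s of 300s -> rate 0.7
}

# Average idle fraction is 0.8, so the CPU was about 20% busy.
print(round(cpu_busy_percent(samples), 1))
```

Because the idle rate is a fraction between 0 and 1, multiplying by 100 and subtracting from 100 is all it takes to get a usage percentage.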

Press Run queries, or press Shift+Enter in the query box. You should see a line climbing and dipping.

Pick a visualization

On the right-hand panel options, the top drop-down chooses the panel type (the visual style). Each type fits a different kind of data.

Panel type	Best for	Example
Time series	A value that changes over time	CPU %, request rate
Stat	One current number, big and bold	Current memory used
Gauge	A value against a min/max range	Disk usage out of 100%
Bar gauge	Several values side by side	Usage per disk mount
Table	Raw rows and labels	List of all targets

For the CPU query, keep Time series. Then set:

Panel title (right side, under “Panel options”) to CPU Usage.
Standard options → Unit to Percent (0-100) so the Y-axis reads %.

Click Apply (top-right) to drop the panel onto the dashboard.

Add more panels

Click Add → Visualization to create the next panel. Here are two more useful queries.

Memory used as a percentage:

100 * (1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes))

Set its unit to Percent (0-100) and title it Memory Usage. A Gauge panel works nicely here.
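If the formula looks opaque, this small Python sketch runs the same arithmetic on made-up numbers. The function name memory_used_percent and the 8 GiB server are assumptions for illustration only.

```python
# Mirrors: 100 * (1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes))


def memory_used_percent(mem_available_bytes, mem_total_bytes):
    """Percent of RAM that is not available for new work."""
    return 100 * (1 - mem_available_bytes / mem_total_bytes)


GIB = 1024 ** 3

# Hypothetical server: 8 GiB total, 2 GiB still available.
print(memory_used_percent(2 * GIB, 8 * GIB))  # 75.0
```

The query subtracts the available share from 1 rather than adding up "used" fields, so memory the kernel can reclaim (such as page cache) is not counted as used.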

Disk space used on the root filesystem:

100 * (1 - (node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"}))

Title it Disk Usage (/) and use a Stat or Gauge panel with the Percent (0-100) unit. In Thresholds, add a red step at 90 so the number turns red when the disk is nearly full.
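The sketch below shows the same disk calculation together with a rough model of the threshold. It assumes a value at or above a threshold step takes that step's color and that the base color is green. The names disk_used_percent and threshold_color are hypothetical, and the disk sizes are invented.

```python
# Mirrors the disk query for mountpoint="/", plus a single red threshold step.


def disk_used_percent(avail_bytes, size_bytes):
    """Percent of the filesystem that is no longer available."""
    return 100 * (1 - avail_bytes / size_bytes)


def threshold_color(value, red_at=90, base_color="green"):
    """Pick a color for a value, assuming one red step at `red_at`."""
    return "red" if value >= red_at else base_color


GB = 1000 ** 3

# Hypothetical 50 GB root filesystem with 4 GB left.
used = disk_used_percent(avail_bytes=4 * GB, size_bytes=50 * GB)
print(round(used, 1), threshold_color(used))  # 92.0 red
```

With 20 GB free instead of 4 GB, the same functions give 60% and the base color, so the panel stays calm until the disk really is close to full.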

Add a variable to switch between servers

A variable is a drop-down at the top of the dashboard. Instead of building one dashboard per server, you make one dashboard and switch servers with a click.

Click the gear / settings icon at the top of the dashboard.
Go to Variables → Add variable.
Set Type to Query, Name to instance, and Data source to Prometheus.
In Query, enter:

label_values(node_uname_info, instance)

This lists every value of the instance label on the node_uname_info metric, which gives you one entry per server running Node Exporter. Click Apply, then go back to the dashboard. You now have an instance drop-down.
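Conceptually, label_values collects the distinct values of one label across all series of a metric. Here is a minimal Python sketch of that idea. It is not how Grafana implements it; the series data is invented, and the results are sorted only to make the output predictable.

```python
# Hypothetical label sets for node_uname_info, one per scraped series.
series = [
    {"__name__": "node_uname_info", "instance": "10.0.0.5:9100", "nodename": "web-1"},
    {"__name__": "node_uname_info", "instance": "10.0.0.6:9100", "nodename": "web-2"},
    {"__name__": "node_uname_info", "instance": "10.0.0.5:9100", "nodename": "web-1"},
]


def label_values(series, label):
    """Distinct values of one label across the given series, sorted."""
    return sorted({s[label] for s in series if label in s})


# Duplicates collapse, so each server appears once in the drop-down.
print(label_values(series, "instance"))  # ['10.0.0.5:9100', '10.0.0.6:9100']
```

Using node_uname_info as the source metric works well because Node Exporter exposes it once per server, so the list contains only machines you can actually chart.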

Now make your panels respond to it. Edit the CPU panel and change the query to filter by the variable:

100 - (avg(rate(node_cpu_seconds_total{mode="idle", instance="$instance"}[5m])) * 100)

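$instance is replaced with whatever you pick in the drop-down. Do the same for the other panels. Where a metric already has curly braces, add instance="$instance" inside them; where it has none, as in the memory query, append {instance="$instance"} to each metric name. If you turn on Multi-value or Include All option for the variable, use instance=~"$instance" instead, so the match accepts several servers.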
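The substitution is plain text replacement that happens before the query reaches Prometheus. This Python sketch imitates it with the standard library's string.Template, which happens to use the same $name syntax. It is only an analogy for a single-value variable; the instance address is made up and interpolate is a hypothetical helper.

```python
from string import Template

# The panel query exactly as typed in the editor, variable included.
QUERY = (
    '100 - (avg(rate(node_cpu_seconds_total'
    '{mode="idle", instance="$instance"}[5m])) * 100)'
)


def interpolate(query, **variables):
    """Replace $name placeholders with the currently selected values."""
    return Template(query).substitute(**variables)


# Pretend the drop-down is set to this (invented) server.
final_query = interpolate(QUERY, instance="10.0.0.5:9100")
print(final_query)
```

Prometheus never sees $instance. It only receives the finished query with the concrete label value, which is why one dashboard can serve every server.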

Save the dashboard

Click the save icon (top-right), give the dashboard a name like Server Overview, choose a folder, and click Save. Grafana stores the dashboard as JSON.

To keep a backup or move it to another Grafana, export it: open the dashboard, click Share → Export → Save to file. This downloads a .json file you can commit to Git or re-import later via + → Import.
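Before committing an export, you may want to tidy it so Git diffs stay small. The Python sketch below is one possible approach, not an official Grafana tool. It assumes the export has a top-level "id" field that belongs to one Grafana installation, and the sample JSON is a tiny stand-in for a real export, which has many more fields.

```python
import json


def normalize_dashboard(raw_json):
    """Return dashboard JSON with stable formatting for cleaner Git diffs."""
    dashboard = json.loads(raw_json)
    # Assumption: "id" is specific to the Grafana that produced the export,
    # so clear it and let the importing Grafana assign its own.
    if "id" in dashboard:
        dashboard["id"] = None
    # Sorted keys and fixed indentation keep unrelated lines from moving.
    return json.dumps(dashboard, indent=2, sort_keys=True) + "\n"


# Minimal invented example of an exported dashboard.
exported = '{"title": "Server Overview", "id": 12, "panels": []}'
print(normalize_dashboard(exported))
```

Running every export through the same normalizing step means a commit shows only the panels or queries you actually changed.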

Best Practices

Always wrap counters in rate() — raw counters only go up and tell you nothing on their own.
Set the correct unit on every panel (percent, bytes, seconds) so axes and tooltips read clearly.
Use a $instance variable instead of duplicating dashboards per server — one dashboard scales to your whole fleet.
Add thresholds (for example red at 90%) so problems jump out visually without you reading numbers.
Keep query windows like [5m] consistent across panels so graphs are comparable.
Export dashboards to JSON and commit them to Git so they are version-controlled and survive a Grafana rebuild.
Give every panel a clear title — a graph with no title is useless during an incident at 3 a.m.