DevOps Roles & Skills
DevOps is not a single job you step into overnight. It is a set of skills that sit between writing code and running it in production. The good news is that you do not need all of them on day one. This page lists the core skill areas, explains why each one matters and when you reach for it, and describes the real job titles you will see in the field so you know what you are aiming at.
The core skill areas
Think of DevOps skills as layers. Each layer rests on the one below it. If you skip a lower layer, the higher ones feel like magic you cannot debug. Here is a realistic order in which to learn them, and why each one matters.
| Skill area | Why it matters | When you use it |
|---|---|---|
| Linux command line | Almost every server runs Linux. You manage it through a terminal, not a mouse. | Every single day — it is the floor everything else stands on. |
| Networking basics | Apps talk over the network. Broken DNS or a closed port stops everything. | Debugging “it works locally but not in production”. |
| Scripting (Bash + Python) | You automate repeat tasks so humans do not do them by hand. | Backups, deploys, glue between tools. |
| Version control (Git) | All code, config, and infrastructure live in Git. | Every change you make, all day. |
| CI/CD pipelines | Continuous Integration / Continuous Delivery — automatically test and ship code. | Turning a commit into a running release. |
| Cloud (AWS / Azure / GCP) | Most companies rent servers instead of owning them. | Creating servers, databases, networks on demand. |
| Containers (Docker) | A container packages an app with everything it needs so it runs the same everywhere. | Shipping apps consistently across machines. |
| Orchestration (Kubernetes) | Kubernetes runs and heals many containers across many machines. | Larger systems with lots of moving parts. |
| Infrastructure as Code (IaC) | Define servers in code (e.g. Terraform) instead of clicking buttons. | Repeatable, reviewable infrastructure. |
| Monitoring & observability | Knowing what is happening inside a running system. | Catching problems before users do. |
Tip: Do not try to learn all of these at once. The fastest path to feeling lost is opening Kubernetes before you are comfortable in a Linux terminal. Master the bottom of the table first.
Linux: the foundation
Linux is the operating system most servers run, and you control it from a terminal. Before anything else, get comfortable navigating, reading files, and checking running services. On Ubuntu (this site targets Ubuntu 22.04 / 24.04 LTS), the package manager is apt and services are managed by systemd.
sudo apt update && sudo apt upgrade -y
systemctl status ssh
df -h
Output:
● ssh.service - OpenBSD Secure Shell server
Loaded: loaded (/usr/lib/systemd/system/ssh.service; enabled; preset: enabled)
Active: active (running) since Mon 2026-06-15 09:14:02 UTC; 3 days ago
Filesystem Size Used Avail Use% Mounted on
/dev/root 39G 12G 27G 31% /
When to use this: every time you log into a server. When not to: do not memorise every flag — learn to read man <command> and the --help output instead.
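Once reading df -h by hand feels natural, the same check can be scripted. The sketch below is a minimal illustration in Python, assuming you only want a used-space percentage for one mount point. The names percent_used and check_disk and the 80% threshold are hypothetical choices for this example, and the figure can differ slightly from what df reports because the two count reserved space differently.

```python
import shutil


def percent_used(used: int, total: int) -> int:
    """Return used space as a whole percentage, rounded up."""
    if total <= 0:
        raise ValueError("total must be positive")
    # Integer ceiling division avoids floating-point rounding surprises.
    return (used * 100 + total - 1) // total


def check_disk(path: str = "/", threshold: int = 80) -> bool:
    """Return True when the filesystem holding `path` is below the threshold."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    return percent_used(usage.used, usage.total) < threshold


if __name__ == "__main__":
    status = "ok" if check_disk("/") else "running low"
    print(f"Disk space on /: {status}")
```

A script like this is the kind of thing you might later run on a schedule, but only after you can do the same check manually in the terminal.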
Networking: how machines talk
Networking is how one computer reaches another. You need to understand ports (numbered doors on a machine), DNS (the system that turns example.com into an IP address), and firewalls. On Ubuntu the simple firewall is ufw (Uncomplicated Firewall).
sudo ufw allow 22/tcp # SSH so you do not lock yourself out
sudo ufw allow 443/tcp # HTTPS web traffic
sudo ufw enable
sudo ufw status
Output:
Status: active
To Action From
-- ------ ----
22/tcp ALLOW Anywhere
443/tcp ALLOW Anywhere
Gotcha: Always allow port 22 (SSH) before you enable the firewall on a remote server. If you enable ufw first, you can lock yourself out and lose access entirely.
When to use this: opening access to a new service. When not to: do not open ports you are not actively using — every open port is a way in for attackers.
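When something "works locally but not in production", the first two questions are usually whether the name resolves and whether the port is reachable. The following sketch shows one way to ask both from Python using only the standard library. The helper names resolve and port_open and the 2-second timeout are assumptions made for illustration.

```python
import socket


def resolve(hostname: str) -> list:
    """Return the sorted, de-duplicated IP addresses DNS gives for a hostname."""
    try:
        infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []  # the name did not resolve
    return sorted({info[4][0] for info in infos})


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False  # refused, blocked by a firewall, or timed out


if __name__ == "__main__":
    print("localhost resolves to:", resolve("localhost"))
    print("port 22 reachable on localhost:", port_open("127.0.0.1", 22))
```

An empty list from resolve points at DNS. A False from port_open points at a closed port, a firewall rule, or a service that is not running.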
Scripting: stop doing it by hand
If you do something twice, script it. Bash is perfect for short server tasks; Python is better when logic gets complex. A tiny example — a backup script:
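#!/usr/bin/env bash
# Exit on errors, unset variables, and failed pipeline stages.
set -euo pipefail
BACKUP_DIR="/var/backups/app"
# Build the archive path once so tar and echo always agree, even across midnight.
ARCHIVE="$BACKUP_DIR/app-$(date +%F).tar.gz"
mkdir -p "$BACKUP_DIR"
# -C stores paths relative to /var/www, which avoids tar's leading-slash warning.
tar -czf "$ARCHIVE" -C /var/www app
echo "Backup complete: $ARCHIVE"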
Output:
Backup complete: /var/backups/app/app-2026-06-15.tar.gz
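The set -euo pipefail line makes the script fail fast instead of silently continuing: -e exits on the first failed command, -u treats unset variables as errors, and -o pipefail makes a pipeline fail if any command in it fails. It is a habit that saves you from broken half-run scripts.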
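For comparison, here is roughly how the same backup might look in Python, the language this page suggests once the logic grows. It is a sketch under stated assumptions, not a drop-in replacement: the function name make_backup, its parameters, and the demo directories are all hypothetical, and the demo writes only to a temporary directory.

```python
import tarfile
import tempfile
from datetime import date
from pathlib import Path
from typing import Optional


def make_backup(source_dir: Path, backup_dir: Path, today: Optional[date] = None) -> Path:
    """Archive source_dir into backup_dir as app-YYYY-MM-DD.tar.gz and return the path."""
    if not source_dir.is_dir():
        raise FileNotFoundError(f"source directory not found: {source_dir}")
    stamp = (today or date.today()).isoformat()  # same format as `date +%F`
    backup_dir.mkdir(parents=True, exist_ok=True)
    archive = backup_dir / f"app-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # Store files under the directory's own name rather than its full path.
        tar.add(source_dir, arcname=source_dir.name)
    return archive


if __name__ == "__main__":
    # Demo against throwaway directories so nothing real is touched.
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "app"
        src.mkdir()
        (src / "index.html").write_text("hello")
        print("Backup complete:", make_backup(src, Path(tmp) / "backups").name)
```

Exceptions play the role that set -e plays in Bash: a missing source directory stops the run immediately instead of producing an empty archive.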
Git, CI/CD, cloud, and containers
These are the day-to-day tools of the trade. Git stores every change. CI/CD runs your tests and deploys automatically when you push. Cloud providers rent you servers and databases on demand, billed by usage rather than bought up front. Docker packages your app so it runs identically on your laptop and in production.
docker run -d --name web -p 8080:80 nginx:latest
docker ps
Output:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
a1b2c3d4e5f6 nginx:latest "/docker-entrypoint.…" 3 seconds ago Up 2 seconds 0.0.0.0:8080->80/tcp web
When to use containers: when you want the same environment everywhere. When not to: a tiny static site on one server does not need Docker — do not add complexity you will not use.
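The CI/CD idea described in this section, running stages in order and refusing to ship when one fails, can be shown in a few lines. This is a toy model and not how any particular CI product is configured: run_pipeline and the stage names are hypothetical, and each stage is just a function that returns True or False.

```python
from typing import Callable, List, Tuple

Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order, stop at the first failure, and return the names that passed."""
    passed = []
    for name, step in stages:
        if not step():
            print(f"{name}: FAILED, stopping pipeline")
            break
        print(f"{name}: ok")
        passed.append(name)
    return passed


if __name__ == "__main__":
    # Hypothetical stages; real ones would run a test suite, build an image, and so on.
    stages = [
        ("test", lambda: True),
        ("build", lambda: True),
        ("deploy", lambda: True),
    ]
    run_pipeline(stages)
```

The important property is the early stop: if the test stage fails, the deploy stage never runs, so a broken commit does not reach production.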
The roles in the field
The titles below overlap a lot, and small companies often roll them all into one “DevOps engineer”. Here is what each typically focuses on.
| Role | Core focus | Typical priorities |
|---|---|---|
| DevOps engineer | Build and run the pipeline that ships code | CI/CD, automation, cloud setup |
| Site Reliability Engineer (SRE) | Keep production reliable and measurable | Uptime, monitoring, incident response, error budgets |
| Platform engineer | Build internal tools so other developers ship easily | Self-service platforms, paved-road tooling |
| Cloud engineer | Design and manage cloud infrastructure | AWS/Azure/GCP architecture, networking, cost |
- DevOps engineer is the generalist. You connect development and operations and automate the path from commit to production.
- SRE (a role Google popularised) applies a software-engineering mindset to operations. SREs measure reliability with hard numbers, such as service level objectives (SLOs) and error budgets, and protect it.
- Platform engineer builds the “golden path” so other teams do not each reinvent deployment.
- Cloud engineer goes deep on one cloud provider’s services, networking, and billing.
For a deeper comparison of DevOps and SRE specifically, see the dedicated page linked below.
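The error budgets listed under SRE priorities above come down to simple arithmetic: an availability target over a time window implies a fixed amount of permitted downtime. The sketch below works that out. The function name error_budget_minutes and the 30-day default window are assumptions chosen for illustration, and real teams pick their own targets and windows.

```python
def error_budget_minutes(slo_percent: float, window_days: int = 30) -> float:
    """Return the downtime, in minutes, that an availability target allows over the window."""
    if not 0 < slo_percent <= 100:
        raise ValueError("slo_percent must be greater than 0 and at most 100")
    window_minutes = window_days * 24 * 60
    return window_minutes * (100 - slo_percent) / 100


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99):
        minutes = error_budget_minutes(target)
        print(f"{target}% over 30 days allows {minutes:.1f} minutes of downtime")
```

A 99.9% target over 30 days, for example, leaves about 43 minutes of downtime. Each extra nine cuts the budget by a factor of ten, which is why higher targets get expensive quickly.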
Best Practices
- Learn the layers bottom-up: Linux and networking first, Kubernetes much later.
- Automate anything you find yourself doing twice; your future self will thank you.
- Put everything in Git, including config and infrastructure, not just application code.
- Never enable a firewall remotely without first allowing your SSH port.
- Pick one cloud provider to learn deeply rather than skimming all three.
- Treat monitoring as a first-class skill, not an afterthought — you cannot fix what you cannot see.
- Do not chase every tool; depth in fundamentals beats a shallow tour of trendy software.