Text Processing with sed & awk

When you administer Linux servers you spend a huge amount of time reading and editing text: config files, log files, and the output of other commands. Two classic tools make this fast and repeatable. sed (short for “stream editor”, a tool that edits text as it flows through) is best for find-and-replace and line deletion. awk (named after its three authors, Aho, Weinberger, and Kernighan) is best for pulling out columns and building small reports. Both come pre-installed on Ubuntu 22.04 and 24.04 LTS, so there is nothing to install.

sed vs awk — when to use which

Both tools read text line by line, but they are good at different jobs. Reach for the right one and your scripts stay short and clear.

Task	Use	Why
Replace text in a file or stream	`sed`	One short `s/old/new/` command
Delete or print specific lines	`sed`	Line addressing is built in
Edit a config file in place	`sed -i`	Writes changes back to the file
Print column 3 from a log	`awk`	Splits each line into fields automatically
Sum or count values in a report	`awk`	Has variables, math, and `END` blocks
Filter rows by a condition	`awk`	`awk '$3 > 100'` reads like a sentence

A simple rule: if you are changing text, start with sed; if you are extracting or calculating from columns, start with awk.

Find and replace with sed

The core of sed is the substitute command, written s/old/new/. The s means substitute, the text between the first and second slash is what to find, and the text between the second and third slash is the replacement. By default it only changes the first match on each line. Add the g flag (for “global”) to change every match on the line.

echo "cat dog cat" | sed 's/cat/bird/'

Output:

bird dog cat

Now with the global flag so both cat words change:

echo "cat dog cat" | sed 's/cat/bird/g'

Output:

bird dog bird

You can also run sed on a whole file. This reads app.conf and prints the result to the screen without touching the file on disk:

sed 's/localhost/127.0.0.1/g' app.conf

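The slash / is just the most common delimiter, not a magic one. When your text contains slashes (like file paths), use a different separator to avoid escaping every slash: sed 's#/var/www#/srv/www#g' nginx.conf. Whatever character follows the s becomes the delimiter (any single character except a backslash or a newline), and the same character must then separate all three parts of the command.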
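To see why the alternate delimiter helps, here is a minimal sketch that runs the same replacement both ways on a throwaway file. The file contents are invented for the demo; only the sed commands come from the text above.

```shell
# Hypothetical sample file in a temp directory (contents are made up for this demo).
workdir=$(mktemp -d)
printf 'root /var/www/html;\nalias /var/www/static;\n' > "$workdir/nginx.conf"

# '#' as the delimiter: the slashes in the paths need no escaping.
with_hash=$(sed 's#/var/www#/srv/www#g' "$workdir/nginx.conf")

# The same edit with '/' as the delimiter needs a backslash before every slash.
with_slash=$(sed 's/\/var\/www/\/srv\/www/g' "$workdir/nginx.conf")

printf '%s\n' "$with_hash"
rm -r "$workdir"
```

Both commands produce identical text; the first is simply much easier to read and to get right.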

Editing a config file in place with sed -i

Printing to the screen is safe for testing, but eventually you want to actually save the change. The -i flag means “in place” — it writes the edited text back into the original file. When to use this: automating a config change across many servers, or in a provisioning script where you cannot open an editor by hand. When NOT to: on a file you have not backed up, because -i overwrites it immediately.

The safe habit is to make a backup at the same time. Adding a suffix after -i tells sed to save the original with that suffix first.

Say /etc/ssh/sshd_config contains the line #PasswordAuthentication yes and you want to turn password logins off. Run:

sudo sed -i.bak 's/^#*PasswordAuthentication.*/PasswordAuthentication no/' /etc/ssh/sshd_config

Here -i.bak saves the original as sshd_config.bak before editing. The pattern ^#*PasswordAuthentication.* matches the line whether or not it is commented out: ^ anchors to the start of the line, #* allows zero or more # characters, and .* matches the rest of the line, so the whole line is replaced. It does not match a line that has a space after the # or leading indentation. After editing, apply the change:

sudo systemctl restart ssh

Always keep a backup when using sudo sed -i on files under /etc. A bad pattern can silently break a service. The .bak copy lets you restore with sudo mv /etc/ssh/sshd_config.bak /etc/ssh/sshd_config.
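The preview, edit, review, and restore steps can be rehearsed on a throwaway copy before you touch anything under /etc. This is a sketch under assumptions: the sample file contents are invented, and it runs without sudo because it only writes to a temp directory.

```shell
# Throwaway copy; the contents are assumptions for this demo, not a real sshd_config.
workdir=$(mktemp -d)
conf="$workdir/sshd_config"
printf '#Port 22\n#PasswordAuthentication yes\nUsePAM yes\n' > "$conf"

# Single quotes keep the shell from touching the sed program.
pattern='s/^#*PasswordAuthentication.*/PasswordAuthentication no/'

# 1. Preview: without -i nothing on disk changes.
sed "$pattern" "$conf"

# 2. Apply in place, keeping the original as sshd_config.bak.
sed -i.bak "$pattern" "$conf"

# 3. Review what changed. diff exits non-zero when files differ, hence "|| true".
diff "$conf.bak" "$conf" || true
changed_line=$(grep 'PasswordAuthentication' "$conf")

# 4. Roll back by moving the backup over the edited file.
mv "$conf.bak" "$conf"
restored_line=$(grep 'PasswordAuthentication' "$conf")

rm -r "$workdir"
```

On a real server, running sudo sshd -t before the restart asks sshd to check the configuration file for errors without starting the service, which is a cheap extra safety net.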

Deleting lines with sed

sed can also remove lines. The d command deletes lines that match a pattern or a line number.

Delete every blank line from a file:

sed '/^$/d' messy.conf

The pattern /^$/ means a line with nothing between its start (^) and end ($), that is, a completely empty line. A line that contains only spaces or tabs does not match. To delete all comment lines (lines starting with #):

sed '/^#/d' app.conf

You can also target line numbers. This deletes the first line (useful for stripping a header):

sed '1d' data.csv
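The deletions above can also be combined. The sketch below goes slightly beyond the text: it uses -e to pass two d commands in one run and a line range (1,3) instead of a single line number. The sample file is made up for the demo.

```shell
# Hypothetical config file (contents invented for this demo).
workdir=$(mktemp -d)
cat > "$workdir/app.conf" <<'EOF'
# database settings
host=db1

# cache settings
ttl=60
EOF

# Two d commands in one pass: drop comment lines, then drop empty lines.
cleaned=$(sed -e '/^#/d' -e '/^$/d' "$workdir/app.conf")
printf '%s\n' "$cleaned"

# A line range: delete lines 1 through 3 and keep the rest.
tail_part=$(sed '1,3d' "$workdir/app.conf")
printf '%s\n' "$tail_part"

rm -r "$workdir"
```

The first command leaves only host=db1 and ttl=60. As with every sed example so far, nothing is written back to the file unless you add -i.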

Extracting columns with awk

awk automatically splits each line into fields separated by whitespace. You refer to them as $1, $2, $3, and so on. $0 means the whole line. This makes pulling a single column trivial.

Imagine an Nginx access log line in /var/log/nginx/access.log:

203.0.113.5 - - [15/Jun/2026:10:22:01 +0000] "GET /home HTTP/1.1" 200 1024

The first field ($1) is the visitor’s IP address. To print every IP that hit your server:

awk '{print $1}' /var/log/nginx/access.log

Output:

203.0.113.5
198.51.100.7
203.0.113.5

To find the most frequent visitors, pipe that into sort and uniq:

awk '{print $1}' /var/log/nginx/access.log | sort | uniq -c | sort -rn | head

Output:

     42 203.0.113.5
     19 198.51.100.7
      8 192.0.2.44
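If you want to try the pipeline without a real server, the sketch below builds a three-line log in a temp directory. Only the first log line comes from the example above; the other two are invented so that one IP appears twice.

```shell
# Sample log; lines 2 and 3 are made up for this demo.
workdir=$(mktemp -d)
log="$workdir/access.log"
cat > "$log" <<'EOF'
203.0.113.5 - - [15/Jun/2026:10:22:01 +0000] "GET /home HTTP/1.1" 200 1024
198.51.100.7 - - [15/Jun/2026:10:22:03 +0000] "GET /about HTTP/1.1" 200 2048
203.0.113.5 - - [15/Jun/2026:10:22:09 +0000] "GET /missing HTTP/1.1" 404 512
EOF

# awk prints the IP column, sort groups identical IPs together,
# uniq -c counts each group, sort -rn puts the largest count first,
# and head -n 3 keeps the top three.
top_ips=$(awk '{print $1}' "$log" | sort | uniq -c | sort -rn | head -n 3)
printf '%s\n' "$top_ips"

rm -r "$workdir"
```

The first sort matters: uniq only merges lines that are next to each other, so unsorted input would give several separate counts for the same IP.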

Choosing a different field separator

Not all files use spaces. CSV files (comma-separated values) use commas. The -F flag sets the separator. To print the email column (field 2) from a comma-separated file:

awk -F',' '{print $2}' users.csv

For files separated by colons, like /etc/passwd, use -F':'. This prints each username (the first field):

awk -F':' '{print $1}' /etc/passwd
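Once the separator is set, you can print more than one field at a time. A small sketch, using an invented file in the same colon-separated layout as /etc/passwd (the accounts listed are assumptions for the demo):

```shell
# Sample file in /etc/passwd layout; the entries are made up for this demo.
workdir=$(mktemp -d)
cat > "$workdir/passwd.sample" <<'EOF'
root:x:0:0:root:/root:/bin/bash
deploy:x:1001:1001::/home/deploy:/bin/bash
www-data:x:33:33:www-data:/var/www:/usr/sbin/nologin
EOF

# Print the username (field 1) and the login shell (field 7).
# The comma in print puts a single space between the two values.
accounts=$(awk -F':' '{print $1, $7}' "$workdir/passwd.sample")
printf '%s\n' "$accounts"

rm -r "$workdir"
```

One caveat for CSV: -F',' splits on every comma, including commas inside quoted values such as "Smith, Ana". For files like that, a real CSV parser is the safer choice.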

Filtering and simple reports with awk

awk shines when you add a condition. Put a test before the { } block and only matching lines run it. This prints only log lines where the HTTP status code (field 9 in the log above) is 404:

awk '$9 == 404 {print $7}' /var/log/nginx/access.log

That gives you the list of missing URLs people requested. You can also do math across all lines using a special END block, which runs once after the last line. This sums the bytes-sent column (field 10) to report total traffic:

awk '{total += $10} END {print "Total bytes:", total}' /var/log/nginx/access.log

Output:

Total bytes: 89231044

Here total += $10 adds each line’s value to a running variable, and END prints the final sum. When to use this: quick one-off reports straight from a log, before reaching for heavier monitoring tools.
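A condition and an END block also work together. The sketch below reuses the two commands from this section on a small invented log, then adds a third that counts and sums only the successful (200) requests. All log lines except the first are made up for the demo.

```shell
# Sample log; lines 2 and 3 are invented for this demo.
workdir=$(mktemp -d)
log="$workdir/access.log"
cat > "$log" <<'EOF'
203.0.113.5 - - [15/Jun/2026:10:22:01 +0000] "GET /home HTTP/1.1" 200 1024
198.51.100.7 - - [15/Jun/2026:10:22:03 +0000] "GET /about HTTP/1.1" 200 2048
203.0.113.5 - - [15/Jun/2026:10:22:09 +0000] "GET /missing HTTP/1.1" 404 512
EOF

# URLs that returned 404.
missing=$(awk '$9 == 404 {print $7}' "$log")

# Total bytes across every line.
total=$(awk '{total += $10} END {print "Total bytes:", total}' "$log")

# Condition plus END: count and sum only the 200 responses.
ok_report=$(awk '$9 == 200 {n++; sum += $10} END {print n, "requests,", sum, "bytes"}' "$log")

printf '%s\n%s\n%s\n' "$missing" "$total" "$ok_report"
rm -r "$workdir"
```

With this sample the three results are /missing, Total bytes: 3584, and 2 requests, 3072 bytes. awk variables start out empty and are treated as zero in arithmetic, which is why n and sum need no setup line.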

Best Practices

Test sed without -i first; once the screen output looks right, add -i.bak to save safely.
Always keep a backup (-i.bak) when editing files under /etc, and verify the service still starts afterward.
Use a non-slash delimiter (s#a#b#) when your text contains file paths to avoid messy escaping.
Anchor patterns with ^ and $ so you match exactly the line you mean, not a substring elsewhere.
Pick awk for columns and math, sed for substitution and deletion — combining them in a pipe is often cleaner than forcing one tool to do everything.
Quote your sed and awk programs in single quotes so the shell does not expand $1, $2, and other symbols before the tool sees them.
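As an illustration of combining the two tools in a pipe, here is a minimal sketch: sed strips the CSV header, then awk sums a column. The file name, column layout, and values are all invented for the demo.

```shell
# Hypothetical CSV with a header row (contents made up for this demo).
workdir=$(mktemp -d)
cat > "$workdir/data.csv" <<'EOF'
name,email,logins
ana,ana@example.com,12
ben,ben@example.com,7
EOF

# sed removes line 1 (the header); awk splits on commas and sums field 3.
report=$(sed '1d' "$workdir/data.csv" | awk -F',' '{sum += $3} END {print "Total logins:", sum}')
printf '%s\n' "$report"

rm -r "$workdir"
```

This prints Total logins: 19. Each tool does the one job it is best at, and both programs stay short enough to read at a glance.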
