awk

awk is a programming language for processing text field by field (column by column). It excels at aggregating, extracting, and transforming structured text such as CSV files and log files.

Syntax

awk [options] 'program' [file...]

awk -F delimiter 'program' [file...]

Syntax and Variables

Syntax / Variable	Description
'{print $1}'	Prints the first field ($0 refers to the entire line).
-F ':'	Specifies the field delimiter (default is whitespace).
NR	The number of records read so far; with the default record separator, this is the current line number.
NF	The number of fields in the current line. Use $NF to reference the last field.
FS	The input field separator; -F sets it from the command line.
OFS	The output field separator, inserted between comma-separated arguments to print (default is a single space).
RS	The input record separator (default is a newline).
BEGIN { }	A block that runs once before any input is read.
END { }	A block that runs once after all input has been read.
/pattern/ { }	Applies the action only to lines matching the pattern.

Sample Code

The following files are used in the examples below.

scores.txt

alice 85 Tokyo
bob 92 Osaka
charlie 78 Tokyo
diana 95 Nagoya
eve 88 Osaka

sales.csv

name,amount,region
alice,1200,east
bob,800,west
charlie,1500,east

Prints the first field of each line.

awk '{print $1}' scores.txt
alice
bob
charlie
diana
eve

Prints only the rows whose second column is 90 or greater. A pattern with no action prints the whole matching line.

awk '$2 >= 90' scores.txt
bob 92 Osaka
diana 95 Nagoya

Prints specific fields only for rows that match the condition.

awk '$2 >= 90 {print $1, $2}' scores.txt
bob 92
diana 95

Uses NR to print each line with its line number.

awk '{print NR": "$0}' scores.txt
1: alice 85 Tokyo
2: bob 92 Osaka
3: charlie 78 Tokyo
4: diana 95 Nagoya
5: eve 88 Osaka
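
NF and $NF from the table above can be tried the same way. The following is a minimal sketch that recreates scores.txt so it runs on its own; it prints the field count of each line followed by its last field.

```shell
# Recreate the sample file so the snippet runs on its own.
cat > scores.txt <<'EOF'
alice 85 Tokyo
bob 92 Osaka
charlie 78 Tokyo
diana 95 Nagoya
eve 88 Osaka
EOF

# NF is the number of fields on the current line; $NF is the last field.
awk '{print NF, $NF}' scores.txt
```

Every line of scores.txt has three fields, so each output line is 3 followed by the city name.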

Sums the second column and prints the total in the END block.

awk '{sum += $2} END {print "Total:", sum}' scores.txt
Total: 438
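
The same pattern extends to an average, since NR still holds the number of records read when the END block runs. This is a sketch rather than the only way to do it; the NR > 0 check is an added guard against dividing by zero on empty input.

```shell
# Recreate the sample file so the snippet runs on its own.
cat > scores.txt <<'EOF'
alice 85 Tokyo
bob 92 Osaka
charlie 78 Tokyo
diana 95 Nagoya
eve 88 Osaka
EOF

# Accumulate the second column, then divide by the record count in END.
awk '{sum += $2} END {if (NR > 0) print "Average:", sum / NR}' scores.txt
```

With the five rows above, the total is 438, so this prints Average: 87.6.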

Uses -F to set a comma as the delimiter for simple CSV input. Splitting on commas does not handle quoted fields that themselves contain commas.

awk -F',' '{print $1, $2}' sales.csv
name amount
alice 1200
bob 800
charlie 1500
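
FS and OFS can also be set inside a BEGIN block, and NR > 1 skips the header row. The sketch below assumes the same sales.csv and uses " | " as an arbitrary output separator chosen for illustration.

```shell
# Recreate the sample file so the snippet runs on its own.
cat > sales.csv <<'EOF'
name,amount,region
alice,1200,east
bob,800,west
charlie,1500,east
EOF

# FS splits input on commas; OFS is placed between the arguments of print.
# NR > 1 skips the first (header) line; $NF is the last field (region).
awk 'BEGIN {FS = ","; OFS = " | "} NR > 1 {print $1, $NF}' sales.csv
```

This prints alice | east, bob | west, and charlie | east, one per line.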

Uses BEGIN and END to add a header and footer.

awk 'BEGIN{print "=== Results ==="} {print $1, $2} END{print "=== End ==="}' scores.txt
=== Results ===
alice 85
bob 92
charlie 78
diana 95
eve 88
=== End ===

Processes only lines matching a pattern (rows where the third column is Tokyo).

awk '$3 == "Tokyo" {print $1, $2}' scores.txt
alice 85
charlie 78
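
The /pattern/ { } form from the table takes a regular expression instead of a comparison. As a minimal sketch, the first command matches the regex against the whole line, and the second uses the ~ operator to match against a single field.

```shell
# Recreate the sample file so the snippet runs on its own.
cat > scores.txt <<'EOF'
alice 85 Tokyo
bob 92 Osaka
charlie 78 Tokyo
diana 95 Nagoya
eve 88 Osaka
EOF

# Lines containing "Osaka" anywhere.
awk '/Osaka/ {print $1, $2}' scores.txt

# Lines whose first field starts with a, b, or c.
awk '$1 ~ /^[a-c]/ {print $1}' scores.txt
```

The first command prints the rows for bob and eve; the second prints alice, bob, and charlie.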

Uses an associative array to print each distinct value of the third column once, in order of first appearance.

awk '!seen[$3]++ {print $3}' scores.txt
Tokyo
Osaka
Nagoya
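
An associative array can also count occurrences per key. The sketch below counts rows per city; because the iteration order of for (key in array) is unspecified in awk, the output is piped through sort to make it stable.

```shell
# Recreate the sample file so the snippet runs on its own.
cat > scores.txt <<'EOF'
alice 85 Tokyo
bob 92 Osaka
charlie 78 Tokyo
diana 95 Nagoya
eve 88 Osaka
EOF

# count[city] is incremented once per row; END prints every key and its count.
awk '{count[$3]++} END {for (city in count) print city, count[city]}' scores.txt | sort
```

After sorting, this prints Nagoya 1, Osaka 2, and Tokyo 2, one per line.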

Notes

awk is a small but complete programming language that is usually invoked from the shell. It supports variables, conditionals, loops, arrays, and built-in functions such as gsub, split, substr, and sprintf. For complex processing, Python or Perl may be more readable, but for one-line aggregations and transformations, awk is often the quickest tool to reach for.
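
The built-in functions named above can be combined in a single action. The following is only an illustrative sketch: the input line is invented for this example and is not one of the sample files.

```shell
# The input line below is invented purely for illustration.
echo "2024-01-15 alice login_ok" | awk '{
    split($1, d, "-")              # d[1]="2024", d[2]="01", d[3]="15"
    user = substr($2, 1, 3)        # first three characters: "ali"
    status = $3
    gsub(/_/, " ", status)         # replace every underscore with a space
    line = sprintf("%s/%s/%s %s %s", d[3], d[2], d[1], user, status)
    print line
}'
```

This prints 15/01/2024 ali login ok. Copying $3 into status before calling gsub leaves the original field untouched.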

For simple line searching, see grep. For text substitution, see sed.
