wc / sort / uniq

wc counts the number of lines, words, and bytes in a file. sort sorts lines, and uniq removes consecutive duplicate lines. These commands are frequently combined in text-processing pipelines.

Syntax

wc [options] [file...]
sort [options] [file...]
uniq [options] [file...]

Options

Command / Option	Description
wc file	Displays the line count, word count, and byte count.
wc -l file	Displays only the line count.
wc -w file	Displays only the word count.
wc -c file	Displays only the byte count.
wc -m file	Displays only the character count (multibyte-aware).
sort file	Sorts lines lexicographically (the exact order depends on the current locale).
sort -r file	Sorts in reverse order.
sort -n file	Sorts numerically (so 10 comes after 9).
sort -h file	Sorts by human-readable numbers such as 1K, 2M, and 3G.
sort -k N file	Sorts by a key that starts at the Nth field and runs to the end of the line (use -k N,N to limit the key to that field).
sort -u file	Removes duplicate lines after sorting.
sort -t delimiter file	Specifies the field delimiter character.
uniq file	Collapses consecutive duplicate lines into one (sort the input first to catch duplicates that are not adjacent).
uniq -c file	Prefixes each output line with the number of consecutive times it appeared.
uniq -d file	Displays one copy of each line that was repeated.
uniq -u file	Displays only lines that were not repeated.

Sample Code

The following files are used in the examples below.

fruits.txt

banana
apple
cherry
apple
banana
cherry
apple

scores.txt (one number per line; the values are 3, 9, 10, 42, and 100)

access.log

192.168.1.100
192.168.1.101
192.168.1.100
10.0.0.1
192.168.1.100
192.168.1.101

Use wc -l to count the number of lines in a file.

wc -l fruits.txt
7 fruits.txt

Use wc -w to count words and wc -c to count bytes.

echo "Hello World" | wc -w
2
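
The byte count can be checked with the same input. This is a minimal sketch; the variable name bytes is only illustrative. echo appends a newline, so the 11 visible characters are counted as 12 bytes, and some wc implementations pad the number with leading spaces.

```shell
# "Hello World" is 11 characters; echo adds a trailing newline, so wc -c sees 12 bytes.
bytes=$(echo "Hello World" | wc -c)
echo "$bytes"   # 12 (possibly padded with spaces on some systems)
```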

Use sort to sort lines alphabetically.

sort fruits.txt
apple
apple
apple
banana
banana
cherry
cherry

Use sort -n to sort numerically. Without -n, lines are sorted lexicographically, so "10" would come before "3".

sort -n scores.txt
3
9
10
42
100
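
The difference between the two orderings is easiest to see side by side. The sketch below uses three values supplied inline (chosen for illustration, not read from scores.txt) and sorts them with and without -n.

```shell
# Without -n, sort compares character by character, so "10" precedes "3".
lexical=$(printf '10\n3\n9\n' | sort)
# With -n, sort compares numeric values.
numeric=$(printf '10\n3\n9\n' | sort -n)

printf '%s\n' "$lexical"   # 10, 3, 9 (one per line)
printf '%s\n' "$numeric"   # 3, 9, 10 (one per line)
```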

Use sort | uniq to remove duplicates. Use uniq -c to show how many times each line appears.

sort fruits.txt | uniq -c
      3 apple
      2 banana
      2 cherry
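
When only the deduplicated list is needed, sort -u from the options table should give the same result as sort | uniq in a single command. A minimal sketch with sample data supplied inline (the variable names are illustrative):

```shell
# Sample data defined inline so the sketch is self-contained.
fruits=$(printf 'banana\napple\ncherry\napple\nbanana\n')

via_uniq=$(printf '%s\n' "$fruits" | sort | uniq)
via_sort_u=$(printf '%s\n' "$fruits" | sort -u)

printf '%s\n' "$via_sort_u"
# apple
# banana
# cherry
```

sort -u cannot report counts, so the sort | uniq -c form is still needed when the number of occurrences matters.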

sort | uniq -c | sort -rn is a classic pattern for building a frequency ranking.

sort access.log | uniq -c | sort -rn
      3 192.168.1.100
      2 192.168.1.101
      1 10.0.0.1
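
To keep only the top entry of such a ranking, the pipeline can be followed by head, a separate standard command not otherwise covered on this page. A minimal sketch, with the access.log addresses supplied inline so it runs on its own:

```shell
# The six addresses from access.log, supplied inline.
top=$(printf '192.168.1.100\n192.168.1.101\n192.168.1.100\n10.0.0.1\n192.168.1.100\n192.168.1.101\n' \
  | sort | uniq -c | sort -rn | head -n 1)

echo "$top"   # 3 192.168.1.100 (the amount of leading padding varies by implementation)
```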

Use uniq -d to show one copy of each line that appears more than once, and uniq -u to show only lines that appear exactly once.

sort fruits.txt | uniq -d
apple
banana
cherry
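
Every fruit in fruits.txt appears at least twice, so uniq -u would print nothing for that file. The access.log data shows the option better, since one address occurs only once. A minimal sketch with the addresses supplied inline:

```shell
# The six addresses from access.log, supplied inline.
once=$(printf '192.168.1.100\n192.168.1.101\n192.168.1.100\n10.0.0.1\n192.168.1.100\n192.168.1.101\n' \
  | sort | uniq -u)

echo "$once"   # 10.0.0.1
```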

Use sort -r to sort in reverse order. Combined with -n, it sorts numerically from largest to smallest.

sort -rn scores.txt
100
42
10
9
3
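
The -t and -k options from the table can be combined to sort delimited records by one field. The sketch below uses hypothetical name:score records invented for illustration. The key -k2,2n limits the comparison to the second field and treats it as a number.

```shell
# Hypothetical name:score records, invented for this sketch.
records=$(printf 'carol:25\nalice:30\nbob:4\n')

# -t: sets ":" as the field delimiter; -k2,2n sorts on field 2 only, numerically.
by_score=$(printf '%s\n' "$records" | sort -t: -k2,2n)

printf '%s\n' "$by_score"
# bob:4
# carol:25
# alice:30
```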

Notes

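sort | uniq -c | sort -rn is a common frequency-ranking pipeline in log analysis: the first sort groups identical lines together, uniq -c counts each group, and sort -rn orders the counts from highest to lowest.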

Note that uniq only collapses consecutive duplicate lines. To remove all duplicates across an entire file, you must run sort first. You can also combine these commands with grep for text filtering.
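
The effect of skipping the sort can be seen by counting the surviving lines with wc -l. A minimal sketch with three lines supplied inline, where the two apple lines are not adjacent (the variable names are illustrative):

```shell
# uniq alone: the duplicates are not adjacent, so nothing is collapsed.
without_sort=$(printf 'apple\nbanana\napple\n' | uniq | wc -l)
# sort first: the duplicates become adjacent and collapse into one line.
with_sort=$(printf 'apple\nbanana\napple\n' | sort | uniq | wc -l)

echo "$without_sort"   # 3
echo "$with_sort"      # 2
```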
