wc / sort / uniq
| Since: | All Linux |
|---|---|
| | macOS (2001 Cheetah) |
| | Bash 1.0 (1989) |
wc counts the number of lines, words, and bytes in a file. sort sorts lines, and uniq removes consecutive duplicate lines. These commands are frequently combined in text-processing pipelines.
Syntax
wc [options] [file...]
sort [options] [file...]
uniq [options] [file...]
Options
| Command / Option | Description |
|---|---|
| wc file | Displays the line count, word count, and byte count. |
| wc -l file | Displays only the line count. |
| wc -w file | Displays only the word count. |
| wc -c file | Displays only the byte count. |
| wc -m file | Displays only the character count (multibyte-aware). |
| sort file | Sorts lines in alphabetical order. |
| sort -r file | Sorts in reverse order. |
| sort -n file | Sorts numerically (so 10 comes after 9). |
| sort -h file | Sorts by human-readable numbers such as 1K, 2M, and 3G. |
| sort -k N file | Sorts by a key that starts at the Nth field and runs to the end of the line (use -k N,N to compare the Nth field only). |
| sort -u file | Removes duplicate lines after sorting. |
| sort -t delimiter file | Specifies the field delimiter character. |
| uniq file | Collapses consecutive duplicate lines into one (sort the input first to remove duplicates that are not adjacent). |
| uniq -c file | Prefixes each line with the number of times it appeared. |
| uniq -d file | Displays only lines that were duplicated. |
| uniq -u file | Displays only lines that were not duplicated. |
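The -t and -k options are easiest to understand together. The sketch below assumes a hypothetical comma-separated file, results.csv (name,score), which is not one of the sample files on this page, and sorts it by the second field as a number. Writing the key as -k2,2 limits the comparison to the second field; the trailing n applies numeric comparison to that key.

```shell
# Hypothetical sample data: name,score (not one of the sample files on this page)
tmpdir=$(mktemp -d)
printf 'Terry Bogard,42\nYagami Iori,100\nKusanagi Kyo,9\n' > "$tmpdir/results.csv"

# -t, sets the field delimiter to a comma
# -k2,2n sorts on field 2 only, comparing it numerically
by_score=$(sort -t, -k2,2n "$tmpdir/results.csv")
printf '%s\n' "$by_score"
# Kusanagi Kyo,9
# Terry Bogard,42
# Yagami Iori,100

rm -r "$tmpdir"
```

Without the n modifier, the second field would be compared as text and "100" would sort before "42".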
Sample Code
The following files are used in the examples below.
fighters.txt
Terry Bogard
Yagami Iori
Kusanagi Kyo
Yagami Iori
Terry Bogard
Kusanagi Kyo
Yagami Iori
scores.txt
100
9
42
10
3
access.log
192.168.1.100
192.168.1.101
192.168.1.100
10.0.0.1
192.168.1.100
192.168.1.101
Use wc -l to count the number of lines in a file.
wc -l fighters.txt
7 fighters.txt
Use wc -w to count words and wc -c to count bytes.
echo "Hello World" | wc -w
2
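The byte count works the same way. A minimal sketch using the same "Hello World" input: the newline that echo appends is counted, so the result is 12 rather than 11. Some wc implementations pad the number with leading spaces, so the tr call strips them before the value is stored.

```shell
# Count bytes, including the newline that echo appends
bytes=$(echo "Hello World" | wc -c | tr -d ' ')
echo "$bytes"    # 12

# Reading from standard input keeps the file name out of the output
lines=$(printf 'a\nb\nc\n' | wc -l | tr -d ' ')
echo "$lines"    # 3
```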
Use sort to sort lines alphabetically.
sort fighters.txt
Kusanagi Kyo
Kusanagi Kyo
Terry Bogard
Terry Bogard
Yagami Iori
Yagami Iori
Yagami Iori
Use sort -n to sort numerically. Without -n, lines are sorted lexicographically, so "10" would come before "3".
sort -n scores.txt
3
9
10
42
100
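The -h option from the options table is a relative of -n for values that carry unit suffixes. The sketch below uses made-up sizes and assumes a sort that supports -h (GNU coreutils does; POSIX does not require it). Plain -n would look only at the leading digits and ignore the suffix.

```shell
# Made-up size values for illustration
sizes=$(printf '2M\n1K\n3G\n512K\n' | sort -h)
printf '%s\n' "$sizes"
# 1K
# 512K
# 2M
# 3G
```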
Use sort | uniq to remove duplicates. Use uniq -c to show how many times each line appears.
sort fighters.txt | uniq -c
2 Kusanagi Kyo
2 Terry Bogard
3 Yagami Iori
sort | uniq -c | sort -rn is a classic pattern for building a frequency ranking.
sort access.log | uniq -c | sort -rn
3 192.168.1.100
2 192.168.1.101
1 10.0.0.1
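Real log lines usually carry more than an address, so the field of interest has to be extracted before counting. The following is a sketch under an assumed, hypothetical log format ("client IP, space, request path") in a file named requests.log; neither the format nor the file comes from this page. cut keeps the first field, and head limits the ranking to the top two entries.

```shell
# Hypothetical log lines: "<client IP> <request path>"
tmpdir=$(mktemp -d)
printf '%s\n' \
  '192.168.1.100 /index.html' \
  '192.168.1.101 /about.html' \
  '192.168.1.100 /index.html' \
  '10.0.0.1 /index.html' \
  '192.168.1.100 /contact.html' \
  '192.168.1.101 /index.html' > "$tmpdir/requests.log"

# Keep field 1 (the IP), count each address, rank by count, show the top two
top=$(cut -d' ' -f1 "$tmpdir/requests.log" | sort | uniq -c | sort -rn | head -n 2)
printf '%s\n' "$top"

rm -r "$tmpdir"
```

The width of the count column that uniq -c prints differs between implementations, so scripts that parse this output should split on whitespace rather than rely on fixed columns.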
Use uniq -d to show only duplicated lines, and uniq -u to show only unique lines.
sort fighters.txt | uniq -d
Kusanagi Kyo
Terry Bogard
Yagami Iori
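Every name in fighters.txt appears at least twice, so uniq -u would print nothing for it. As a small sketch of -u, the block below recreates the access.log sample in a temporary directory, where one address occurs only once.

```shell
# Recreate the access.log sample from this page in a temporary directory
tmpdir=$(mktemp -d)
printf '%s\n' 192.168.1.100 192.168.1.101 192.168.1.100 \
  10.0.0.1 192.168.1.100 192.168.1.101 > "$tmpdir/access.log"

# -u keeps only lines that occur exactly once in the sorted input
once=$(sort "$tmpdir/access.log" | uniq -u)
printf '%s\n' "$once"    # 10.0.0.1

rm -r "$tmpdir"
```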
Use sort -r to sort in reverse order. Combined with -n, it sorts numbers from largest to smallest.
sort -rn scores.txt
100
42
10
9
3
Common Mistakes
Common Mistake 1: Using uniq without sort first (it removes consecutive duplicates only)
uniq only removes consecutive duplicate lines. If duplicates are not adjacent, they are not removed. Always run sort before uniq.
cat unsorted.txt
Yagami Iori
Kusanagi Kyo
Yagami Iori
uniq unsorted.txt
Yagami Iori
Kusanagi Kyo
Yagami Iori
(The duplicate "Yagami Iori" lines were not adjacent, so uniq did not remove them.)
Sort the input first so that the duplicates become adjacent:
sort unsorted.txt | uniq
Kusanagi Kyo
Yagami Iori
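When only deduplication is needed, sort -u from the options table does the same job in one step. The sketch below feeds the same three lines as unsorted.txt through both forms and compares the results.

```shell
# Same three lines as unsorted.txt above
input=$(printf 'Yagami Iori\nKusanagi Kyo\nYagami Iori\n')

via_uniq=$(printf '%s\n' "$input" | sort | uniq)
via_sort_u=$(printf '%s\n' "$input" | sort -u)

printf '%s\n' "$via_sort_u"
# Kusanagi Kyo
# Yagami Iori

# Both forms produce the same two lines
[ "$via_uniq" = "$via_sort_u" ] && echo "identical"
```

The sort | uniq form is still the one to use when the uniq options -c, -d, or -u are needed.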
Common Mistake 2: sort without -n sorts numbers lexicographically
By default, sort compares lines as strings. For numbers, this means "10" sorts before "3" because "1" < "3" lexicographically. Use -n for correct numeric sorting.
printf "10\n3\n42\n9\n" | sort
10
3
42
9
(Lexicographic order, which is wrong for numbers.)
Add -n so that the values are compared as numbers:
printf "10\n3\n42\n9\n" | sort -n
3
9
10
42
Notes
sort | uniq -c | sort -rn is a classic pipeline pattern for frequency ranking, commonly used in log analysis.
Note that uniq only collapses consecutive duplicate lines. To remove all duplicates across an entire file, you must run sort first. You can also combine these commands with grep for text filtering.