wc / sort / uniq
wc counts the number of lines, words, and bytes in a file. sort sorts lines, and uniq removes consecutive duplicate lines. These commands are frequently combined in text-processing pipelines.
Syntax
wc [options] [file...]
sort [options] [file...]
uniq [options] [file...]
Options
| Command / Option | Description |
|---|---|
| wc file | Displays the line count, word count, and byte count. |
| wc -l file | Displays only the line count. |
| wc -w file | Displays only the word count. |
| wc -c file | Displays only the byte count. |
| wc -m file | Displays only the character count (multibyte-aware). |
| sort file | Sorts lines in alphabetical order. |
| sort -r file | Sorts in reverse order. |
| sort -n file | Sorts numerically (so 10 comes after 9). |
| sort -h file | Sorts by human-readable numbers such as 1K, 2M, and 3G. |
| sort -k N file | Sorts using a key that starts at the Nth field (use -k N,N to limit the key to that field alone). |
| sort -u file | Removes duplicate lines after sorting. |
| sort -t delimiter file | Specifies the field delimiter character. |
| uniq file | Collapses consecutive duplicate lines into one (sort the input first to catch duplicates that are not adjacent). |
| uniq -c file | Prefixes each line with the number of times it appeared. |
| uniq -d file | Displays only lines that were duplicated. |
| uniq -u file | Displays only lines that were not duplicated. |
Sample Code
The following files are used in the examples below.
fruits.txt
banana
apple
cherry
apple
banana
cherry
apple
scores.txt
100
9
42
10
3
access.log
192.168.1.100
192.168.1.101
192.168.1.100
10.0.0.1
192.168.1.100
192.168.1.101
Use wc -l to count the number of lines in a file.
wc -l fruits.txt
7 fruits.txt
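When a script needs only the number, one common approach is to feed the file to wc on standard input, so there is no filename to print. The following is a minimal sketch that recreates the sample fruits.txt in a scratch directory; the variable names tmpdir and line_count are illustrative, not part of the original examples.

```shell
# Recreate the sample fruits.txt in a scratch directory (illustrative setup).
tmpdir=$(mktemp -d)
printf '%s\n' banana apple cherry apple banana cherry apple > "$tmpdir/fruits.txt"

# Reading from standard input means wc has no filename to print.
# Some wc implementations pad the number with spaces, so strip them.
line_count=$(wc -l < "$tmpdir/fruits.txt" | tr -d ' ')
echo "$line_count"   # 7

rm -r "$tmpdir"
```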
Use wc -w to count words and wc -c to count bytes.
echo "Hello World" | wc -w
2
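The byte count from wc -c includes the newline that echo appends, which can be surprising. The sketch below compares echo with printf on the same string; the variable names are illustrative, and tr is used only to strip the padding that some wc implementations add.

```shell
# echo appends a newline, so the count is 11 characters plus 1.
byte_count=$(echo "Hello World" | wc -c | tr -d ' ')
echo "$byte_count"         # 12

# printf without a trailing \n produces exactly 11 bytes.
no_newline_count=$(printf 'Hello World' | wc -c | tr -d ' ')
echo "$no_newline_count"   # 11
```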
Use sort to sort lines alphabetically.
sort fruits.txt
apple
apple
apple
banana
banana
cherry
cherry
Use sort -n to sort numerically. Without -n, lines are sorted lexicographically, so "10" would come before "3".
sort -n scores.txt
3
9
10
42
100
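To see the difference between the two orderings side by side, the sketch below sorts the scores.txt values both ways. The values are supplied inline so it runs on its own. LC_ALL=C is an added assumption that pins the collation order, and paste is used only to join the result onto one line for display.

```shell
# Plain sort compares character by character, so "10" and "100" precede "3".
lexicographic=$(printf '%s\n' 100 9 42 10 3 | LC_ALL=C sort | paste -sd ' ' -)
echo "$lexicographic"   # 10 100 3 42 9

# sort -n compares the lines as numbers.
numeric=$(printf '%s\n' 100 9 42 10 3 | sort -n | paste -sd ' ' -)
echo "$numeric"         # 3 9 10 42 100
```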
Use sort | uniq to remove duplicates. Use uniq -c to show how many times each line appears.
sort fruits.txt | uniq -c
3 apple
2 banana
2 cherry
sort | uniq -c | sort -rn is a classic pattern for building a frequency ranking.
sort access.log | uniq -c | sort -rn
3 192.168.1.100
2 192.168.1.101
1 10.0.0.1
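A frequent follow-up is to pull out just the top entry from the ranking. The sketch below is one way to do that, assuming the standard head and awk utilities are available; neither appears in the original examples, and the variable names are illustrative.

```shell
# Same addresses as access.log, supplied inline so the sketch runs on its own.
addresses=$(printf '%s\n' \
  192.168.1.100 192.168.1.101 192.168.1.100 \
  10.0.0.1 192.168.1.100 192.168.1.101)

# Rank by frequency, keep the first (highest-count) line,
# then use awk to drop the count column and keep the address.
top_address=$(printf '%s\n' "$addresses" | sort | uniq -c | sort -rn | head -n 1 | awk '{print $2}')
echo "$top_address"   # 192.168.1.100
```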
Use uniq -d to show one copy of each line that appears more than once, and uniq -u to show only lines that appear exactly once.
sort fruits.txt | uniq -d
apple
banana
cherry
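For uniq -u, the access.log data makes a clearer illustration, because every line in fruits.txt repeats and uniq -u would print nothing for it. A minimal sketch with the sample data supplied inline (variable names are illustrative):

```shell
# access.log data: only 10.0.0.1 appears exactly once.
once_only=$(printf '%s\n' \
  192.168.1.100 192.168.1.101 192.168.1.100 \
  10.0.0.1 192.168.1.100 192.168.1.101 | sort | uniq -u)
echo "$once_only"   # 10.0.0.1

# fruits.txt data: every fruit repeats, so the result is empty.
fruits_once=$(printf '%s\n' banana apple cherry apple banana cherry apple | sort | uniq -u)
echo "[$fruits_once]"   # []
```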
Use sort -r to sort in reverse order. Combined with -n, it sorts numerically from largest to smallest.
sort -rn scores.txt
100
42
10
9
3
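The -t and -k options from the table are typically used together to sort delimited records by one column. The sketch below uses hypothetical name:score records that are not part of the sample files. The key 2,2n restricts the sort to the second field and compares it numerically.

```shell
# Hypothetical name:score records (not from the sample files).
# -t ':' sets the field delimiter; -k 2,2n sorts on field 2 only, numerically.
ranked=$(printf '%s\n' 'carol:30' 'alice:25' 'bob:100' | sort -t ':' -k 2,2n)
echo "$ranked"
# alice:25
# carol:30
# bob:100
```

Without the n modifier, the second field would be compared as text and bob:100 would sort first.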
Notes
sort | uniq -c | sort -rn is a classic pipeline pattern for frequency ranking, commonly used in log analysis.
Note that uniq only collapses consecutive duplicate lines. To remove all duplicates across an entire file, you must run sort first. You can also combine these commands with grep for text filtering.
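The effect of skipping the sort can be seen with a small hypothetical input in which the same value appears in two separate runs. This is a sketch for illustration only. The input and variable names are made up, and paste is used only to join the output onto one line.

```shell
# Without sort, the two runs of "a" are not adjacent, so both survive.
unsorted_result=$(printf '%s\n' a b a a | uniq | paste -sd ' ' -)
echo "$unsorted_result"   # a b a

# Sorting first brings every "a" together, so uniq can collapse them.
sorted_result=$(printf '%s\n' a b a a | sort | uniq | paste -sd ' ' -)
echo "$sorted_result"     # a b

# sort -u gives the same result in a single command.
sort_u_result=$(printf '%s\n' a b a a | sort -u | paste -sd ' ' -)
echo "$sort_u_result"     # a b
```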