collections.defaultdict()
defaultdict is a subclass of the built-in dict that automatically generates a default value when a missing key is looked up with d[key]. It lets you aggregate or group data without worrying about KeyError.
Syntax
from collections import defaultdict

# Pass a callable (the default factory) that produces the default value.
d = defaultdict(list)  # Generates an empty list for missing keys.
d = defaultdict(int)   # Generates 0 for missing keys.
d = defaultdict(set)   # Generates an empty set for missing keys.
d = defaultdict(str)   # Generates an empty string for missing keys.
d = defaultdict(lambda: default_value)  # Specify any default value with a lambda.

# Accessing a missing key with d[key] automatically generates the default value.
d = defaultdict(list)
d['new_key'].append(value)  # No KeyError is raised.
Constructor Arguments and Attributes
| Argument / Attribute | Description |
|---|---|
| defaultdict(list) | Generates an empty list [] as the default value when a missing key is accessed. Useful for grouping data. |
| defaultdict(int) | Generates the integer 0 as the default value when a missing key is accessed. Useful for counting. |
| defaultdict(set) | Generates an empty set set() as the default value when a missing key is accessed. Useful for managing unique collections. |
| defaultdict(lambda: value) | Specifies any default value using a lambda. Use this when you want a default other than 0 or an empty container. |
| d.default_factory | An attribute that holds the callable (the first constructor argument) used to generate default values. It is None if no callable was passed, in which case missing keys raise KeyError as usual. |
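The following is a minimal sketch of how the default_factory attribute behaves at runtime. It assumes you want to stop automatic key creation after a build phase; the variable names are illustrative only.

```python
from collections import defaultdict

d = defaultdict(list)
d["a"].append(1)            # "a" is created on first access

# Setting default_factory to None turns off automatic defaults.
d.default_factory = None
try:
    d["b"]
    raised = False
except KeyError:
    raised = True

print(raised)               # True
print(dict(d))              # {'a': [1]}

# Assigning a new callable turns automatic defaults back on.
d.default_factory = int
print(d["c"])               # 0
```

Resetting default_factory to None is one way to make later typos in key names fail loudly instead of silently creating empty entries.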
Sample Code
from collections import defaultdict
# Group data using defaultdict(list).
students = [('Tanaka', 'Math'), ('Sato', 'Japanese'), ('Tanaka', 'English'), ('Sato', 'Math')]
by_student = defaultdict(list)
for name, subject in students:
    by_student[name].append(subject)  # No KeyError even for new keys.
print(dict(by_student))
# Outputs: {'Tanaka': ['Math', 'English'], 'Sato': ['Japanese', 'Math']}
# With a regular dict, you must initialize keys manually (for comparison).
by_student_normal = {}
for name, subject in students:
    if name not in by_student_normal:
        by_student_normal[name] = []
    by_student_normal[name].append(subject)
# Count items using defaultdict(int).
words = ['apple', 'banana', 'apple', 'cherry', 'banana', 'apple']
word_count = defaultdict(int)
for word in words:
    word_count[word] += 1  # Missing keys start at 0.
print(dict(word_count))
# Outputs: {'apple': 3, 'banana': 2, 'cherry': 1}
# Group without duplicates using defaultdict(set).
visits = [('Tanaka', 'Tokyo'), ('Sato', 'Osaka'), ('Tanaka', 'Tokyo'), ('Tanaka', 'Kyoto')]
visited = defaultdict(set)
for name, city in visits:
    visited[name].add(city)  # Duplicates are automatically excluded.
print(dict(visited))
# Outputs: {'Tanaka': {'Tokyo', 'Kyoto'}, 'Sato': {'Osaka'}} (the order of elements inside a set may vary)
# Set a custom default value using a lambda.
prices = defaultdict(lambda: 9999) # Unlisted items default to 9999
prices['apple'] = 120
print(prices['apple']) # Outputs: 120
print(prices['grape']) # Not listed, so outputs: 9999
# Check the default_factory attribute.
d = defaultdict(list)
print(d.default_factory) # Outputs: <class 'list'>
Notes
defaultdict behaves like a regular dictionary, except that looking up a missing key with d[key] does not raise KeyError. Instead, it calls default_factory, stores the result under that key, and returns it. This eliminates the boilerplate of checking whether a key exists before using it.
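To make the lookup behavior concrete, here is a small sketch that records when the factory runs. The make_list function and the calls list are hypothetical helpers added only for illustration.

```python
from collections import defaultdict

calls = []  # Records each time the factory runs (illustrative helper).

def make_list():
    calls.append("called")
    return []

d = defaultdict(make_list)
d["a"].append(1)   # Missing key: the factory runs and the new list is stored under "a".
d["a"].append(2)   # Existing key: the factory does not run again.
d["b"]             # A bare read of a missing key also stores a value.

print(len(calls))        # 2
print(dict(d))           # {'a': [1, 2], 'b': []}
print(d["a"] is d["b"])  # False: each key gets its own list.
```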
Counting with defaultdict(int) produces the same counts as collections.Counter. For simple counting, Counter gives more concise code; defaultdict is more flexible when counting has to be combined with other aggregation in the same loop.
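As a rough comparison, the sketch below counts the same list both ways and then shows one case where defaultdict may be the more natural fit. The per-letter statistics (the "count" and "chars" fields) are an invented example, not something Counter or defaultdict prescribes.

```python
from collections import Counter, defaultdict

words = ['apple', 'banana', 'apple', 'cherry', 'banana', 'apple']

# Simple counting: both approaches give the same result.
counts = defaultdict(int)
for word in words:
    counts[word] += 1
same = dict(counts) == dict(Counter(words))
print(same)  # True

# Counting combined with another aggregation in a single pass:
# per first letter, track how many words were seen and their total length.
stats = defaultdict(lambda: {"count": 0, "chars": 0})
for word in words:
    entry = stats[word[0]]
    entry["count"] += 1
    entry["chars"] += len(word)

print(dict(stats))
# {'a': {'count': 3, 'chars': 15}, 'b': {'count': 2, 'chars': 12}, 'c': {'count': 1, 'chars': 6}}
```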
When defaultdict generates a default value, the key is added to the dictionary, even if you only read it. If you want a fallback value without adding the key, use get() with a default argument, or test membership with the in operator. Note that setdefault() also adds the key, just as it does on a regular dictionary.
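The short sketch below shows which access patterns store a key and which do not. The stock dictionary and its keys are made up for the example.

```python
from collections import defaultdict

stock = defaultdict(int)
stock["apple"] = 3

# get() returns a fallback without touching the dictionary.
print(stock.get("grape", 0))   # 0
print("grape" in stock)        # False

# Indexing a missing key stores the generated default.
print(stock["grape"])          # 0
print("grape" in stock)        # True

# setdefault() stores the key too, using the value you pass in.
stock.setdefault("melon", 5)
print(sorted(stock))           # ['apple', 'grape', 'melon']
```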
For counting elements, see collections.Counter().