itertools.groupby() / itertools.starmap() / itertools.accumulate()
| Function | Since |
|---|---|
| starmap() | Python 2.3 (2003) |
| groupby() | Python 2.4 (2004) |
| accumulate() | Python 3.2 (2011) |
itertools.groupby() groups consecutive elements that share the same key, itertools.starmap() unpacks each element into arguments for a function call, and itertools.accumulate() yields running cumulative results. Because groupby() only groups consecutive elements, it is normally applied to data that has already been sorted by the same key.
Syntax
import itertools
# Group consecutive elements by key
itertools.groupby(iterable, key=None)
# Unpack arguments and map to a function
itertools.starmap(function, iterable)
# Cumulative calculation
itertools.accumulate(iterable, func=operator.add, initial=None)
Functions
| Function | Description |
|---|---|
| groupby(iterable, key) | Groups consecutive elements with the same key and returns an iterator. |
| starmap(func, iterable) | Unpacks each element of the iterable and passes it as arguments to func, returning an iterator of results. |
| accumulate(iterable, func) | Returns the running cumulative results. Omitting func defaults to cumulative sum. |
Sample Code
sample_itertools_groupby.py
import itertools
import operator
# groupby: group by key (must be sorted first)
data = [
    {'name': 'Kiryu Kazuma', 'org': 'Tojo Clan'},
    {'name': 'Nishikiyama Akira', 'org': 'Tojo Clan'},
    {'name': 'Akiyama Shun', 'org': 'Sky Finance'},
    {'name': 'Saejima Taiga', 'org': 'Sky Finance'},
    {'name': 'Majima Goro', 'org': 'Tojo Clan'},  # out of order: without sorting, this would form a separate 'Tojo Clan' group
]
# Sort by org, then group
sorted_data = sorted(data, key=lambda x: x['org'])
for org, members in itertools.groupby(sorted_data, key=lambda x: x['org']):
    names = [m['name'] for m in members]
    print(f"{org}: {names}")
# Sky Finance: ['Akiyama Shun', 'Saejima Taiga']
# Tojo Clan: ['Kiryu Kazuma', 'Nishikiyama Akira', 'Majima Goro']
# groupby: group consecutive identical characters in a string
for char, group in itertools.groupby('AAABBCCDD'):
    print(char, list(group))
# A ['A', 'A', 'A']
# B ['B', 'B']
# C ['C', 'C']
# D ['D', 'D']
# starmap: unpack tuples and pass as arguments to a function
pairs = [(2, 3), (4, 2), (10, 3)]
result = list(itertools.starmap(pow, pairs))
print(result) # [8, 16, 1000] (2**3, 4**2, 10**3)
# accumulate: running sum
nums = [1, 2, 3, 4, 5]
acc_sum = list(itertools.accumulate(nums))
print(acc_sum) # [1, 3, 6, 10, 15]
# accumulate: running product
acc_prod = list(itertools.accumulate(nums, operator.mul))
print(acc_prod) # [1, 2, 6, 24, 120] (1!, 2!, 3!, 4!, 5!)
# accumulate: running maximum
nums2 = [3, 1, 4, 1, 5, 9, 2, 6]
acc_max = list(itertools.accumulate(nums2, max))
print(acc_max) # [3, 3, 4, 4, 5, 9, 9, 9]
python3 sample_itertools_groupby.py
Sky Finance: ['Akiyama Shun', 'Saejima Taiga']
Tojo Clan: ['Kiryu Kazuma', 'Nishikiyama Akira', 'Majima Goro']
A ['A', 'A', 'A']
B ['B', 'B']
C ['C', 'C']
D ['D', 'D']
[8, 16, 1000]
[1, 3, 6, 10, 15]
[1, 2, 6, 24, 120]
[3, 3, 4, 4, 5, 9, 9, 9]
Details
groupby() resembles SQL's GROUP BY, but it only groups consecutive elements, so sort the data with the same key function beforehand. Without sorting, elements that share a key but sit at non-adjacent positions land in separate groups. Each group is an iterator that shares the underlying iterable with groupby() itself, so it is exhausted once you advance to the next group; convert it with list() if you need the values later.
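The two pitfalls described for groupby() can be seen in a short sketch. The `orgs` list and the variable names are illustrative only and are not part of the sample file above.

```python
import itertools

orgs = ['Tojo Clan', 'Sky Finance', 'Tojo Clan']

# Without sorting, the two 'Tojo Clan' entries are not adjacent,
# so they end up in two separate groups.
unsorted_keys = [key for key, _ in itertools.groupby(orgs)]
print(unsorted_keys)  # ['Tojo Clan', 'Sky Finance', 'Tojo Clan']

# After sorting, each key appears exactly once.
sorted_keys = [key for key, _ in itertools.groupby(sorted(orgs))]
print(sorted_keys)  # ['Sky Finance', 'Tojo Clan']

# Advancing groupby() to the next key invalidates the previous group iterator.
groups = itertools.groupby(sorted(orgs))
first_key, first_group = next(groups)
second_key, second_group = next(groups)  # first_group is now exhausted
lost = list(first_group)
kept = list(second_group)
print(lost)  # []
print(kept)  # ['Tojo Clan', 'Tojo Clan']
```

Calling list() on each group inside the loop body, before the loop moves on, avoids the empty result.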
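starmap() is similar to map(), but it unpacks each element (typically a tuple, though any iterable works) and passes its items as separate arguments to the function. It is a cleaner alternative to patterns like map(lambda args: func(*args), iterable). When the arguments come from separate iterables, map(func, a, b) gives the same result as starmap(func, zip(a, b)).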
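As a rough comparison, the sketch below computes the same powers three ways: with starmap(), with map() plus an unpacking lambda, and with map() over separate iterables. The `bases` and `exponents` lists are made up for illustration.

```python
import itertools

pairs = [(2, 3), (4, 2), (10, 3)]

# starmap unpacks each tuple into positional arguments: pow(2, 3), pow(4, 2), ...
via_starmap = list(itertools.starmap(pow, pairs))

# The same result with plain map() needs an explicit unpacking lambda.
via_lambda = list(map(lambda args: pow(*args), pairs))

# If the arguments live in separate iterables, map() can take them directly.
bases = [2, 4, 10]
exponents = [3, 2, 3]
via_map = list(map(pow, bases, exponents))

print(via_starmap)  # [8, 16, 1000]
print(via_starmap == via_lambda == via_map)  # True
```

starmap() is most convenient when the data already arrives as pre-paired tuples, for example rows read from a file or the output of zip().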
accumulate() accepts a custom func since Python 3.3, and supports a keyword-only initial parameter for a starting value since Python 3.8. When initial is given, the output begins with that value and therefore has one more element than the input. It is useful for a wide variety of running aggregations, including cumulative sums, products, and running maximums.
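A minimal sketch of the initial parameter and a custom func, assuming Python 3.8 or later. The `deposits` list is invented for the example.

```python
import itertools

deposits = [100, 50, -30, 20]

# Without initial, the first output is the first element itself.
running = list(itertools.accumulate(deposits))
print(running)  # [100, 150, 120, 140]

# With initial (Python 3.8+), the output starts with that value
# and is one element longer than the input.
with_opening = list(itertools.accumulate(deposits, initial=1000))
print(with_opening)  # [1000, 1100, 1150, 1120, 1140]

# A custom func (Python 3.3+) receives (accumulated_so_far, next_element).
running_min = list(itertools.accumulate(deposits, min))
print(running_min)  # [100, 50, -30, -30]
```

Because initial is keyword-only, it has to be passed as initial=..., not as a third positional argument.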