re.findall() / re.finditer()

Functions that extract every non-overlapping part of a string that matches a regular expression pattern. re.findall() returns a list, while re.finditer() returns an iterator of match objects.

Syntax

import re

# Returns all matching strings as a list.
match_list = re.findall(pattern, string)

# Returns an iterator that yields one match object per match.
for m in re.finditer(pattern, string):
    print(m.group())
    print(m.start(), m.end())  # You can also get the position of the match.

Functions

Function	Description
re.findall(pattern, string)	Returns all matching strings as a list. If the pattern contains one capture group, returns a list of that group's contents instead; with multiple groups, returns a list of tuples. Returns an empty list if there are no matches.
re.finditer(pattern, string)	Returns an iterator of match objects for all matches. You can retrieve the match position and groups from each object.

Sample Code

import re

# Use findall() to extract all numbers.
text = 'Item A: 1200, Item B: 850, Item C: 3000'
prices = re.findall(r'\d+', text)
print(prices)  # Outputs: ['1200', '850', '3000']

# Calculate the total.
total = sum(int(p) for p in prices)
print(f'Total: {total}')  # Outputs: 'Total: 5050'

# With a capture group, returns the group contents.
html = '<a href="https://example.com">Site 1</a> <a href="https://test.jp">Site 2</a>'
urls = re.findall(r'href="([^"]+)"', html)
print(urls)  # Outputs: ['https://example.com', 'https://test.jp']

# With multiple groups, returns a list of tuples.
log = '2025-04-01 ERROR: timeout\n2025-04-02 INFO: started\n2025-04-03 ERROR: crash'
errors = re.findall(r'(\d{4}-\d{2}-\d{2}) (ERROR): (.+)', log)
print(errors)
# Outputs: [('2025-04-01', 'ERROR', 'timeout'), ('2025-04-03', 'ERROR', 'crash')]

# Use finditer() to also get match positions.
text2 = 'apple and orange and banana'
for m in re.finditer(r'[a-z]+', text2):
    print(f'{m.group()} (position: {m.start()}-{m.end()})')
# Prints each lowercase word with its start and end positions, e.g. 'apple (position: 0-5)'.

# finditer() yields matches one at a time, so counting them does not build a list of results.
big_text = 'A1 B2 C3 D4 E5 ' * 10000
count = sum(1 for _ in re.finditer(r'[A-Z]\d', big_text))
print(f'Match count: {count}')  # Outputs: 'Match count: 50000'

Notes

Use re.findall() when you simply want a list of matched strings. The return type varies depending on whether the pattern contains capture groups: no groups returns a list of strings, one group returns a list of the group's strings, and multiple groups returns a list of tuples.
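
The three return shapes described above can be compared side by side. The following is a minimal sketch; the sample string and variable names are made up for illustration. It also shows a non-capturing group, (?:...), which groups part of the pattern without changing what findall() returns.

```python
import re

# Hypothetical sample data for illustration.
text = 'cat.png dog.jpg bird.gif'

# No groups: each item is the whole match.
whole = re.findall(r'\w+\.\w+', text)
print(whole)       # ['cat.png', 'dog.jpg', 'bird.gif']

# One group: each item is only that group's text.
one_group = re.findall(r'\w+\.(\w+)', text)
print(one_group)   # ['png', 'jpg', 'gif']

# Two groups: each item is a tuple with one entry per group.
two_groups = re.findall(r'(\w+)\.(\w+)', text)
print(two_groups)  # [('cat', 'png'), ('dog', 'jpg'), ('bird', 'gif')]

# Non-capturing group: alternation without a capture,
# so the whole match is returned again.
non_capturing = re.findall(r'\w+\.(?:png|jpg)', text)
print(non_capturing)  # ['cat.png', 'dog.jpg']
```

If you need grouping only for alternation or repetition, a non-capturing group avoids an accidental change in the return type.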

re.finditer() returns an iterator of match objects, so you can retrieve not only the matched text but also position information (start() and end()) and capture groups. Because it yields matches one at a time instead of building a list of all results, it uses less memory than findall() when there are many matches. The input string itself must still be held in memory.
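
As a sketch of what each match object offers, the example below collects named groups and positions from a short log string. The log text and the group names date, level, and msg are assumptions chosen for illustration.

```python
import re

# Hypothetical log text for illustration.
log = '2025-04-01 ERROR: timeout\n2025-04-02 INFO: started'

# Named groups (?P<name>...) make each field readable by name.
pattern = r'(?P<date>\d{4}-\d{2}-\d{2}) (?P<level>[A-Z]+): (?P<msg>.+)'

records = []
for m in re.finditer(pattern, log):
    # m.span() returns the (start, end) position of the whole match.
    records.append((m.group('date'), m.group('level'), m.group('msg'), m.span()))

for record in records:
    print(record)
# ('2025-04-01', 'ERROR', 'timeout', (0, 25))
# ('2025-04-02', 'INFO', 'started', (26, 50))
```

findall() with the same pattern would return only tuples of the three group strings, without the positions.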

Both functions return only non-overlapping matches: each search resumes where the previous match ended, so a match that would overlap an earlier one is skipped. If you need overlapping matches, put the pattern inside a lookahead assertion with a capture group, such as (?=(...)).
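
A minimal sketch of the lookahead technique, using a made-up sample word: the lookahead itself matches an empty string at each position, so the scan advances one character at a time, and the capture group inside it holds the text found there.

```python
import re

# Hypothetical sample word for illustration.
text = 'banana'

# Plain pattern: the second 'ana' overlaps the first, so it is skipped.
plain = re.findall(r'ana', text)
print(plain)        # ['ana']

# Lookahead with a capture group: every starting position is tested.
overlapping = re.findall(r'(?=(ana))', text)
print(overlapping)  # ['ana', 'ana']

# With finditer(), m.start(1) gives where each captured text begins.
starts = [m.start(1) for m in re.finditer(r'(?=(ana))', text)]
print(starts)       # [1, 3]
```

Note that m.group() is an empty string for these matches; the matched text is in group 1.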

For single-match searches, see re.match() / re.search() / re.fullmatch().
