re.sub() / re.split() / re.compile()

Functions for replacing and splitting strings with regular expressions, and for compiling patterns for reuse. Commonly used in text processing and parsing tasks.

Syntax

import re

# Replaces parts of the string that match the pattern.
new_string = re.sub(pattern, replacement, string)
new_string, count = re.subn(pattern, replacement, string)

# Splits the string at pattern matches.
parts = re.split(pattern, string)  # Avoid naming the result 'list', which would shadow the built-in.

# Compiles a pattern for reuse.
compiled = re.compile(pattern)
compiled.search(string)
compiled.findall(string)

Function List

Function | Description
re.sub(pattern, repl, string, count=0, flags=0) | Replaces all non-overlapping occurrences of the pattern in the string with the replacement. Use count to limit the maximum number of replacements (0 means no limit). Returns the new string.
re.subn(pattern, repl, string, count=0, flags=0) | Same as re.sub(), but returns a tuple of (new_string, number_of_replacements).
re.split(pattern, string, maxsplit=0, flags=0) | Splits the string at each match of the pattern and returns a list. Use maxsplit to limit the number of splits (0 means no limit). If the pattern contains a capturing group, the matched separators are also included in the result.
re.compile(pattern, flags=0) | Compiles a regular expression pattern into a reusable pattern object. Efficient when the same pattern is used repeatedly.
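
The count and maxsplit parameters listed above are easiest to understand from a small example. The following is a minimal sketch; the sample string and variable names are made up for illustration.

```python
import re

text = 'a1 b2 c3 d4'  # Illustrative input, not from the reference above.

# count=2 limits re.sub() to the first two matches.
limited = re.sub(r'\d', '#', text, count=2)
print(limited)  # a# b# c3 d4

# maxsplit=1 makes re.split() stop after the first split;
# the rest of the string is returned as the last element.
head_tail = re.split(r'\s+', text, maxsplit=1)
print(head_tail)  # ['a1', 'b2 c3 d4']
```

Passing count and maxsplit as keyword arguments keeps them from being confused with flags, which follows them in the signature.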

Sample Code

import re

# Replace matches using sub().
text = 'Hello World hello python'
result = re.sub(r'hello', 'Hi', text, flags=re.IGNORECASE)
print(result)  # Outputs: 'Hi World Hi python'

# Collapse consecutive whitespace into a single space.
messy = 'This    is    a    test.'
clean = re.sub(r'\s+', ' ', messy)
print(clean)  # Outputs: 'This is a test.'

# Use capturing groups in the replacement string (referenced with \1).
date = '2025/04/15'
iso = re.sub(r'(\d{4})/(\d{2})/(\d{2})', r'\1-\2-\3', date)
print(iso)  # Outputs: '2025-04-15'

# Use subn() to also get the number of replacements.
new_text, count = re.subn(r'\d+', 'N', 'item1 and item2 and item3')
print(new_text)  # Outputs: 'itemN and itemN and itemN'
print(f'{count} replacement(s) made.')  # Outputs: '3 replacement(s) made.'

# Use split() to split a string by a regular expression pattern.
csv_like = 'Alice,Bob、Charlie Dave'  # Mixed delimiters
names = re.split(r'[,、\s]+', csv_like)
print(names)  # Outputs: ['Alice', 'Bob', 'Charlie', 'Dave']

# Use compile() to reuse a pattern.
email_pattern = re.compile(r'[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}')
emails = [
    'user@example.com',
    'not-an-email',
    'info@test.co.jp',
]
for e in emails:
    if email_pattern.fullmatch(e):
        print(f'{e} is valid.')
    else:
        print(f'{e} is invalid.')

Notes

The replacement string in re.sub() can reference capturing groups using \1, \2, and so on. You can also pass a function instead of a replacement string to generate the replacement dynamically for each match.
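
As a rough illustration of the two replacement styles described above, the sketch below passes a function to re.sub() and then uses a named-group reference in a replacement string. The helper name double and the sample strings are hypothetical.

```python
import re

def double(match):
    # re.sub() calls this once per match with a match object;
    # group(0) is the whole matched text.
    return str(int(match.group(0)) * 2)

prices = 'apple 120 yen, orange 80 yen'
doubled = re.sub(r'\d+', double, prices)
print(doubled)  # apple 240 yen, orange 160 yen

# Named groups can be referenced with \g<name> in the replacement string.
swapped = re.sub(r'(?P<year>\d{4})/(?P<month>\d{2})', r'\g<month>-\g<year>', '2025/04')
print(swapped)  # 04-2025
```

The function must return a string; here the matched digits are converted to int, doubled, and converted back.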

When re.split() is used with a pattern that contains a capturing group, the matched separator is included in the returned list. To split without including the separator, use a non-capturing group ((?:...)).
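
The difference between the two kinds of group can be seen side by side in a short sketch. The expression string is an arbitrary example.

```python
import re

expr = '1+2-3'  # Illustrative input.

# Capturing group: the matched separators are kept in the result.
with_sep = re.split(r'([+-])', expr)
print(with_sep)  # ['1', '+', '2', '-', '3']

# Non-capturing group: the separators are dropped.
without_sep = re.split(r'(?:[+-])', expr)
print(without_sep)  # ['1', '2', '3']
```

Keeping the separators is handy when writing a simple tokenizer, since the operators are needed later.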

re.compile() avoids looking up or rebuilding the pattern on every call, which helps when the same pattern is used repeatedly. Note that the re module internally caches recently compiled patterns, so the benefit is small when a pattern is used only a few times. It is most effective when the pattern is applied many times, for example inside a loop.
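
A compiled pattern object offers sub(), split(), findall() and the other functions as methods, so it can be created once and reused inside a loop. The following is a minimal sketch with made-up input lines; it shows the usage pattern only and makes no claim about measured speed.

```python
import re

# Compile once, outside the loop.
whitespace = re.compile(r'\s+')

lines = ['  a   b ', 'c\t\td']  # Illustrative input.

# Reuse the compiled object for every line.
cleaned = [whitespace.sub(' ', line).strip() for line in lines]
print(cleaned)  # ['a b', 'c d']

# The same object can also split.
fields = whitespace.split('x  y')
print(fields)  # ['x', 'y']
```

Giving the compiled pattern a descriptive name also documents what the regular expression is for.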

For pattern matching, see re.match() / re.search() / re.fullmatch().
