How to match non-word characters in Python using Regular Expressions?



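A non-word character is any character that is not a letter, digit, or underscore, i.e., anything outside the class [a-zA-Z0-9_]. Examples of non-word characters are symbols, punctuation marks, spaces, and tabs. Note that for str patterns in Python 3, word characters are Unicode-aware by default, so letters and digits from other scripts also count as word characters unless the re.ASCII flag is used.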

In Python, Regular Expressions (RegEx) are used to match patterns in input strings. The re module provides functions for working with regular expressions, which lets developers search, validate, and filter text based on string patterns.

To match a non-word character in Python, we can use the special character class \W inside a raw string with any of the re module's functions. In this article, let's see several ways to match non-word characters in Python using Regular Expressions.
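Before the examples, it may help to see how the Unicode-aware default affects \W. The following is a minimal sketch on a made-up sample string (the variable names are illustrative): accented letters are treated as word characters by default, but match \W once re.ASCII is passed.

```python
import re

# Sample string with two accented letters (written as escapes to avoid encoding issues).
text = "caf\u00e9_1 na\u00efve!"

# Default behaviour for str patterns: accented letters are word characters,
# so only the space and the exclamation mark match \W.
unicode_matches = re.findall(r"\W", text)
print(unicode_matches)   # [' ', '!']

# With re.ASCII, only [a-zA-Z0-9_] count as word characters,
# so the accented letters are reported as non-word characters too.
ascii_matches = re.findall(r"\W", text, re.ASCII)
print(ascii_matches)     # ['é', ' ', 'ï', '!']
```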

Example: Finding First Non-Word Character

In this example, we will use the re.search() function with \W inside a raw string to find the first non-word character and its position (counted from zero) -

import re

text = "Python_3.10@2025!"
match = re.search(r"\W", text)

if match:
    print("First non-word character:", match.group())
    print("Character position in string:", match.start())
else:
    print("No non-word character found.")

Here is the output of the above example -

First non-word character: .
Character position in string: 8

Example: Finding All Non-Word Characters

In this example, we will use the re.findall() function with \W inside a raw string to return a list of all non-word characters in the input string -

import re

text = "Hello@World! 2025_Year#AI"
matches = re.findall(r"\W", text)

print("All non-word characters:", matches)

Following is the output of the above example -

All non-word characters: ['@', '!', ' ', '#']
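
Example: Replacing and Splitting on Non-Word Characters

The same \W class also works with other re functions. As a rough sketch, reusing the input string from the example above, re.sub() can replace each run of non-word characters with a space, and re.split() can break the string at those runs. The pattern \W+ (one or more non-word characters) is a choice made here so that "! " is treated as a single separator.

```python
import re

text = "Hello@World! 2025_Year#AI"

# Replace each run of non-word characters with a single space.
cleaned = re.sub(r"\W+", " ", text)
print(cleaned)   # Hello World 2025_Year AI

# Split the string wherever a run of non-word characters occurs.
tokens = re.split(r"\W+", text)
print(tokens)    # ['Hello', 'World', '2025_Year', 'AI']
```

Note that the underscore in 2025_Year is kept, because the underscore is a word character.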

Word vs Non-Word Patterns

In Python regular expressions, different character classes are used to identify word and non-word characters. Below is a comparison of common patterns used to work with word-based data -

| Pattern | Description | Matches |
|---------|-------------|---------|
| \w | Matches any word character | Letters, digits, underscore (_) |
| \W | Matches any non-word character | Spaces, symbols, punctuation |
| \s | Matches any whitespace character | Space, tab, newline, carriage return |
| \S | Matches any non-whitespace character | Letters, digits, punctuation, symbols |
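
One way to see how these four classes differ is to run each of them against the same string. The sketch below uses a small made-up sample containing a letter, an underscore, a digit, a space, a punctuation mark, and a tab.

```python
import re

# Letter, underscore, digit, space, exclamation mark, tab.
sample = "a_1 !\t"

# Collect the matches of each character class for the same input.
results = {}
for pattern in (r"\w", r"\W", r"\s", r"\S"):
    results[pattern] = re.findall(pattern, sample)
    print(pattern, results[pattern])

# \w finds ['a', '_', '1']
# \W finds [' ', '!', '\t']
# \s finds [' ', '\t']
# \S finds ['a', '_', '1', '!']
```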

These patterns are extremely useful when we want to validate, extract or clean text data based on the presence or absence of word characters.
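
As an illustration of validation and cleaning, here is a minimal sketch. The helper names has_only_word_chars and strip_non_word are hypothetical and not part of the re module; they simply wrap re.search() and re.sub() with the \W class.

```python
import re

def has_only_word_chars(value):
    """Return True if value is non-empty and contains no non-word character."""
    return bool(value) and re.search(r"\W", value) is None

def strip_non_word(value):
    """Return value with every non-word character removed."""
    return re.sub(r"\W", "", value)

print(has_only_word_chars("user_name42"))   # True
print(has_only_word_chars("user name!"))    # False
print(strip_non_word("user name!"))         # username
```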

Updated on: 2025-06-08T10:26:33+05:30
