
How to match non-word characters in Python using Regular Expression?
A non-word character is any character that is not a letter, digit, or underscore, i.e., anything outside the set [a-zA-Z0-9_]. Examples of non-word characters are symbols, punctuation marks, spaces, and tabs. Note that for str patterns Python 3 also treats Unicode letters and digits as word characters unless the re.ASCII flag is set.
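The definition above can be checked with a short sketch. The sample string and variable names are illustrative assumptions, not part of the original article; the snippet compares \W in the default Unicode mode, \W with the re.ASCII flag, and the explicit negated set.

```python
import re

# Hypothetical sample text containing a non-ASCII letter
text = "café_1 ok!"

# Default (Unicode) mode: 'é' counts as a word character
unicode_matches = re.findall(r"\W", text)

# ASCII mode: \W behaves like [^a-zA-Z0-9_], so 'é' is a non-word character
ascii_matches = re.findall(r"\W", text, re.ASCII)

# The explicit negated character set gives the same result as ASCII mode
explicit_matches = re.findall(r"[^a-zA-Z0-9_]", text)

print(unicode_matches)   # [' ', '!']
print(ascii_matches)     # ['é', ' ', '!']
print(explicit_matches)  # ['é', ' ', '!']
```

If the input may contain accented or non-Latin letters, choosing between the default mode and re.ASCII decides whether those letters are kept or treated as non-word characters.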
In Python, regular expressions (RegEx) are used to match patterns in input strings. The re module provides functions such as re.search(), re.findall(), and re.sub() for working with them, which lets developers search, validate, and filter text based on string patterns.
To match a non-word character in Python, we can use the special character class \W, written inside a raw string (for example, r"\W") so that the backslash is passed to the regex engine unchanged. The pattern works with any function in the re module. In this article, we look at two common ways to match non-word characters: finding the first one and finding all of them.
Example: Finding First Non-Word Character
In this example, we pass the raw-string pattern r"\W" to re.search() to find the first non-word character:

    import re

    text = "Python_3.10@2025!"
    match = re.search(r"\W", text)

    if match:
        print("First non-word character:", match.group())
        print("Character position in string:", match.start())
    else:
        print("No non-word character found.")

Here is the output of the above example. The underscore is a word character, so the first match is the dot at index 8:

    First non-word character: .
    Character position in string: 8

Example: Finding All Non-Word Characters
In this example, we pass the raw-string pattern r"\W" to re.findall(), which returns a list of all non-word characters in the input string:

    import re

    text = "Hello@World! 2025_Year#AI"
    matches = re.findall(r"\W", text)

    print("All non-word characters:", matches)

Following is the output of the above example:
All non-word characters: ['@', '!', ' ', '#']
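When the position of each non-word character is also needed, re.finditer() may be a better fit than re.findall(), because it yields match objects rather than plain strings. The following is a minimal sketch that reuses the input string from the example above; the variable name positions is an illustrative choice.

```python
import re

text = "Hello@World! 2025_Year#AI"

# finditer yields match objects, so each character's index is available
positions = [(m.group(), m.start()) for m in re.finditer(r"\W", text)]

print(positions)  # [('@', 5), ('!', 11), (' ', 12), ('#', 22)]
```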
Word vs Non-Word Patterns
In Python regular expressions, separate character classes identify word, non-word, whitespace, and non-whitespace characters. Below is a comparison of these commonly used patterns:
Pattern | Description | Matches |
---|---|---|
\w | Matches any word character | Letters, digits, underscore (_) |
\W | Matches any non-word character | Spaces, symbols, punctuation |
\s | Matches any whitespace character | Space, tab, newline, carriage return, form feed, vertical tab |
\S | Matches any non-whitespace character | Letters, digits, punctuation, symbols |
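
The four classes in the table can be compared side by side on one input. This is a small sketch with an assumed sample string (a letter, an underscore, a digit, a space, a tab, and a question mark); the names sample and results are illustrative only.

```python
import re

# Hypothetical sample: 'a', '_', '1', space, tab, '?'
sample = "a_1 \t?"

# Collect the matches of each character class from the table
results = {p: re.findall(p, sample) for p in (r"\w", r"\W", r"\s", r"\S")}

for pattern, found in results.items():
    print(pattern, found)
# \w ['a', '_', '1']
# \W [' ', '\t', '?']
# \s [' ', '\t']
# \S ['a', '_', '1', '?']
```

Note that whitespace characters appear in both the \W and \s results, since every whitespace character is also a non-word character.
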
These patterns are useful when we want to validate, extract, or clean text data based on the presence or absence of word characters.
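The three uses mentioned above (cleaning, extracting, and validating) can be sketched with \W as follows. The input string, the helper is_valid_username, and the rule it enforces are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical raw input
raw = "  Hello, World! It's 2025_AI.  "

# Clean: replace each run of non-word characters with a single space
cleaned = re.sub(r"\W+", " ", raw).strip()

# Extract: split on runs of non-word characters and drop empty pieces
tokens = [t for t in re.split(r"\W+", raw) if t]

# Validate: assumed rule that a username is non-empty and has no non-word characters
def is_valid_username(name):
    return bool(name) and re.search(r"\W", name) is None

print(cleaned)                         # Hello World It s 2025_AI
print(tokens)                          # ['Hello', 'World', 'It', 's', '2025_AI']
print(is_valid_username("user_01"))    # True
print(is_valid_username("user 01!"))   # False
```

The apostrophe in "It's" is a non-word character, so the word is split into "It" and "s". A real tokenizer would likely need a more specific pattern.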