“Anything But” Matching
Character sets are usually used to specify a list of characters, any one of which must match. But occasionally, you’ll want the reverse: a list of characters that you don’t want to match. In other words, anything but the characters listed.
Rather than having to enumerate every character you want (which could get rather lengthy if you want all but a few), character sets can be negated using the ^ metacharacter. Here’s an example:
Text
sales1.xls orders3.xls sales2.xls sales3.xls apac1.xls europe2.xls sam.xls na1.xls na2.xls sa1.xls ca1.xls
RegEx
[ns]a[^0-9]\.xls
Result
sales1.xls orders3.xls sales2.xls sales3.xls apac1.xls europe2.xls sam.xls na1.xls na2.xls sa1.xls ca1.xls
Analysis
The pattern used in this example is the exact opposite of the one used previously. [0-9] matches all digits (and only digits). [^0-9] matches anything but the specified range of digits. As such, [ns]a[^0-9]\.xls matches sam.xls but not na1.xls, na2.xls, or sa1.xls.
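To try this yourself, here is a minimal sketch using Python's standard re module. The choice of Python is an assumption for illustration only (the pattern itself is not tied to any language), and the variable names are hypothetical. It runs the negated pattern and its non-negated counterpart against the same text so the two results can be compared.

```python
import re

# Sample text from the example above.
text = ("sales1.xls orders3.xls sales2.xls sales3.xls apac1.xls "
        "europe2.xls sam.xls na1.xls na2.xls sa1.xls ca1.xls")

# Negated set: the third character may be anything except a digit.
negated = re.findall(r"[ns]a[^0-9]\.xls", text)

# Non-negated counterpart: the third character must be a digit.
digits = re.findall(r"[ns]a[0-9]\.xls", text)

# A negated set still has to consume one character, so a name with
# nothing between "sa" and ".xls" does not match.
no_third_char = re.search(r"[ns]a[^0-9]\.xls", "sa.xls")

print(negated)        # ['sam.xls']
print(digits)         # ['na1.xls', 'na2.xls', 'sa1.xls']
print(no_third_char)  # None
```

Note the last check: [^0-9] means "one character that is not a digit," not "no digit here," so the set must always match exactly one character.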