Character Class Expressions
A character class expression provides a way to match any one character from a set of characters. Barring a quantifier, a character class expression matches exactly one character. It may contain positive character groups (inclusive) or negative character groups (exclusive). In addition, an expression may subtract one character class expression (which is a set of characters) from another; the result is a mathematical set subtraction. The characters '[' and ']' delimit a character class expression.
Character class expressions require a lengthy discussion that is not appropriate for this article. However, as an introduction, an example of each type of character class expression follows.
The following positive character groups might be used to match a Latin-based identifier in a programming language (here, one that must begin with an uppercase letter):
[A-Z][A-Za-z0-9_]*
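As a rough illustration, the identifier expression above can be tried out in Python. This is only a sketch: Python's `re` module is not an XML Schema processor, and the helper name `is_identifier` is made up for this example. Because a schema pattern has to match the whole value, `re.fullmatch` is used as the closest analogue.

```python
import re

# The same expression as in the text. A schema pattern must match the
# entire value, so fullmatch() is used rather than search().
IDENTIFIER = re.compile(r"[A-Z][A-Za-z0-9_]*")

def is_identifier(text: str) -> bool:
    """Hypothetical helper: True if the whole string matches the pattern."""
    return IDENTIFIER.fullmatch(text) is not None

for candidate in ["Total_2", "X", "total", "9Lives", "Has-Dash"]:
    print(candidate, is_identifier(candidate))
```

Only the first two candidates are accepted. "total" and "9Lives" fail on the first character group, and "Has-Dash" fails because '-' is not in the second group.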
The following negative character group matches any single character except an uppercase Latin letter ('A' through 'Z'):
[^A-Z]
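A small Python sketch can show both the exclusion and the "exactly one character" rule. It assumes Python's `re` module as a stand-in for a schema processor; the helper name `is_single_non_upper` is hypothetical.

```python
import re

# Negative character group: any one character outside 'A'..'Z'.
NOT_UPPER = re.compile(r"[^A-Z]")

def is_single_non_upper(text: str) -> bool:
    """Hypothetical helper: True if text is one character not in A-Z."""
    return NOT_UPPER.fullmatch(text) is not None

print(is_single_non_upper("a"))   # lowercase letter is outside A-Z
print(is_single_non_upper("Q"))   # excluded by the negative group
print(is_single_non_upper("ab"))  # two characters, and no quantifier
```

Without a quantifier such as '*' or '+', the two-character value "ab" is rejected even though neither character is uppercase.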
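The following expression matches any Greek uppercase letter (specifically, the set of characters in the Greek block, minus the set of characters that are not uppercase letters). Note that the subtracted character class expression is nested inside the outer brackets:
[\p{IsGreek}-[\P{Lu}]]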
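Python's `re` module supports neither `\p{...}` escapes nor character class subtraction, so the following sketch emulates the set arithmetic directly instead of running the expression. It assumes the Greek block spans U+0370 through U+03FF, and it intersects the "not an uppercase letter" set with that block, which gives the same result as subtracting the full set. The names `GREEK_BLOCK`, `non_upper`, and `greek_upper` are illustrative only.

```python
import unicodedata

# Assumption: the Greek block covers code points U+0370..U+03FF.
GREEK_BLOCK = {chr(cp) for cp in range(0x0370, 0x0400)}

# Characters in the block whose general category is NOT "Lu"
# (uppercase letter); this plays the role of \P{Lu}.
non_upper = {ch for ch in GREEK_BLOCK if unicodedata.category(ch) != "Lu"}

# Set subtraction: Greek block minus the non-uppercase characters.
greek_upper = GREEK_BLOCK - non_upper

print("\u03a9" in greek_upper)  # GREEK CAPITAL LETTER OMEGA
print("\u03c9" in greek_upper)  # GREEK SMALL LETTER OMEGA
print("A" in greek_upper)       # Latin letter, not in the Greek block
```

Subtracting "everything that is not uppercase" is a double negative; the same set could be described as the Greek block restricted to uppercase letters, but the subtraction form mirrors the expression in the text.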
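The various types can be combined. The following probably useless expression matches any character in the Basic Latin block, except for the uppercase characters 'G' through 'L' (the block name is written IsBasicLatin because XML Schema defines no block called IsLatin):
[\p{IsBasicLatin}-[G-L]]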