Repetition
We can create regular expressions that match repeated sequences of characters by using the metacharacters shown in Table 2. Each one applies to the single character, or group of characters, that precedes it.
Table 2
Metacharacters Used in Repetition
Character | Meaning | Example
? | Matches zero or one of the preceding character. | pythonl?y matches: pythony, pythonly
* | Matches zero or more of the preceding character. | pythonl*y matches: pythony, pythonly, pythonlly, and so on
+ | Matches one or more of the preceding character. | pythonl+y matches: pythonly, pythonlly, and so on
{n,m} | Matches n to m repetitions of the preceding character. | fo{1,2} matches: fo, foo
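To see how the patterns in Table 2 behave, we can try each one against a few candidate strings. This is a minimal sketch, assuming Python 3.4 or later: re.fullmatch is used here (our choice, not something the table prescribes) so that a string counts only when the whole of it matches, and the candidate strings and the names cases and results are illustrative.

```python
import re

# Each entry: metacharacter -> (pattern from Table 2, strings to try).
# The candidate strings are illustrative choices, not part of the table.
cases = {
    "?": (r"pythonl?y", ["pythony", "pythonly", "pythonlly"]),
    "*": (r"pythonl*y", ["pythony", "pythonly", "pythonlly"]),
    "+": (r"pythonl+y", ["pythony", "pythonly", "pythonlly"]),
    "{1,2}": (r"fo{1,2}", ["f", "fo", "foo", "fooo"]),
}

# re.fullmatch succeeds only if the entire string matches the pattern,
# so each list holds exactly the candidates the pattern accepts in full.
results = {
    meta: [s for s in candidates if re.fullmatch(pattern, s)]
    for meta, (pattern, candidates) in cases.items()
}

for meta, accepted in results.items():
    print(meta, accepted)
```

With ? the string with two l's is rejected, with + the string with no l is rejected, and * accepts all three.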
All of these repetition characters can be applied to groups of characters, too. Thus:
>>> import re # Python's reg exp module - implicit import in all examples
>>> re.match('(.an){1,2}s', 'cans')
<re.MatchObject instance at 862760>
The same pattern will also match 'cancans' or 'pans' or 'canpans', but not 'bananas'.
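The claims above can be checked in one pass. This is a small sketch, assuming Python 3; the names pattern, words, and matches are ours. Note that re.match only tries to match at the start of the string.

```python
import re

pattern = r"(.an){1,2}s"
words = ["cans", "cancans", "pans", "canpans", "bananas"]

# re.match returns a match object on success and None on failure,
# so comparing against None gives a simple True/False per word.
matches = {word: re.match(pattern, word) is not None for word in words}

for word, ok in matches.items():
    print(word, ok)

# When a group repeats, only its last repetition is kept in the group.
last_repetition = re.match(pattern, "canpans").group(1)
print(last_repetition)
```

The word 'bananas' fails because after 'ban' the next three characters are 'ana', which does not fit .an, and the character after 'ban' is not an s.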
There is one caveat with the {n,m} form of repetition: it does not limit the match to at most m units. In the example in Table 2, fo{1,2} successfully matches fooo because it matches the foo at the beginning of fooo. If you want to limit how many characters are matched, you need to follow the repetition expression with an anchor or a negated range. In our case, fo{1,2}[^o] prevents fooo from matching because it says to match one or two o's followed by any character other than an o.
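The caveat and both remedies can be sketched as follows, assuming Python 3. The variable names and the extra test strings 'foo' and 'food' are illustrative. One detail worth noticing is that the negated range must consume a character, so fo{1,2}[^o] does not match a bare 'foo', whereas the $ anchor does.

```python
import re

# Without a limit, the pattern matches the leading 'foo' of 'fooo'.
loose = re.match(r"fo{1,2}", "fooo")
print(loose.group())

# Remedy 1: a negated range. The character after the o's must exist
# and must not be an o, so 'fooo' and a bare 'foo' both fail.
negated_fooo = re.match(r"fo{1,2}[^o]", "fooo")
negated_foo = re.match(r"fo{1,2}[^o]", "foo")
negated_food = re.match(r"fo{1,2}[^o]", "food")

# Remedy 2: an anchor. $ requires the string to end after the o's,
# so 'foo' matches and 'fooo' does not.
anchored_fooo = re.match(r"fo{1,2}$", "fooo")
anchored_foo = re.match(r"fo{1,2}$", "foo")

print(negated_fooo, negated_foo, negated_food.group())
print(anchored_fooo, anchored_foo.group())
```

Which remedy fits depends on whether more text is expected after the repeated characters: the anchor suits whole-string checks, while the negated range suits patterns embedded in longer input.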