- Python Libraries
- Python Services
- The String Group
- Miscellaneous
- Generic Operational System
- Optional Operational System
- Debugger
- Profiler
- Internet Protocol and Support
- Internet Data Handling
- Restricted Execution
- Multimedia
- Cryptographic
- UNIX Specific
- SGI IRIX Specific
- Sun OS Specific
- MS Windows Specific
- Macintosh Specific
- Undocumented Modules
- Summary
The String Group
This group provides Python's string services. Its modules give access to several types of string manipulation operations.
Note that since release 2.0, most of these functions are also available directly on string objects, as methods. The string module is still around mainly for backward compatibility.
string
The string module supports common string operations by providing several functions and constants that manipulate Python strings.
string.split()
This function splits a string into a list. If the delimiter is omitted, the string is split on whitespace.
basic syntax: string.split(string [,delimiter])
>>> import string
>>> print string.split("a b c")
['a', 'b', 'c']
string.atof()
It converts a string to a floating-point number.
basic syntax: string.atof(string)
string.atoi()
It converts a string to an integer. atoi takes an optional second argument: base. If omitted, base 10 is used. If base is 0, the start of the string (for instance, 0x for hexadecimal) is used to determine the base.
basic syntax: string.atoi(string[, base])
string.atol()
It converts a string to a long integer. atol takes an optional second argument: base. If omitted, base 10 is used. If base is 0, the start of the string (for instance, 0x for hexadecimal) is used to determine the base.
basic syntax: string.atol(string[, base])
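As a rough illustration of these conversions, here is a minimal sketch. It assumes a Python 3 interpreter, where atof(), atoi(), and atol() are no longer available and the built-in float() and int() take their place. The variable names are made up for this example.

```python
# Assumption: Python 3, where float() and int() stand in for
# string.atof(), string.atoi(), and string.atol().

as_float = float("3.14")        # counterpart of string.atof("3.14")
as_decimal = int("42")          # base defaults to 10
as_hex = int("ff", 16)          # explicit base 16
as_prefixed = int("0x1f", 0)    # base 0: the "0x" prefix selects base 16

# Python 3 has a single integer type, so int() also covers the atol() case.
as_long = int("12345678901234567890")

print(as_float, as_decimal, as_hex, as_prefixed, as_long)
```

Passing base 0 is what makes the prefix matter; with the default base, a string such as "0x1f" is rejected.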
string.upper()
It converts a string to uppercase.
basic syntax: string.upper(string)
string.find()
It returns the lowest index at which substring is found within string, or -1 if it is not found. Optionally, you can restrict the search to the range of the string between start and end.
basic syntax: string.find(string, substring[, start [,end]])
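The effect of the optional range can be sketched as follows. This assumes the method form, text.find(...), which takes the same substring, start, and end arguments; the sample text is invented for the example.

```python
# Sketch of find() with and without a search range (method form).
text = "spam and eggs and spam"

first = text.find("spam")            # search the whole string
second = text.find("spam", 1)        # start searching at index 1
in_range = text.find("and", 0, 8)    # search only text[0:8]
out_of_range = text.find("eggs", 0, 8)  # "eggs" lies outside text[0:8]
missing = text.find("ham")           # substring does not occur at all

print(first, second, in_range, out_of_range, missing)
```

A result of -1 signals "not found", so test for it explicitly rather than relying on truthiness: index 0 is a valid match but evaluates as false.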
string.join()
This function joins the string elements of a list using separator to separate them. If separator is omitted, a single space is used.
basic syntax: string.join(list[, separator])
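Splitting and joining are inverse operations, which the following sketch tries to show. It assumes the method forms, where the separator is the object and the list is the argument (separator.join(list)); all names are illustrative.

```python
# Sketch: split a string into a list, then join the pieces back together.
words = "a b c".split()          # no delimiter: split on whitespace
dashed = "-".join(words)         # the separator is the string being called

fields = "x,y,,z".split(",")     # explicit delimiter keeps empty fields
round_trip = ",".join(fields)    # joining with the same delimiter restores the input

print(words, dashed, fields, round_trip)
```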
string.capitalize()
It capitalizes the first character of string.
basic syntax: string.capitalize(string)
string.capwords()
This function capitalizes the first letter of each word in string and removes repeated, leading, and trailing whitespace.
basic syntax: string.capwords(string)
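The difference between capitalize() and capwords() is easy to miss, so here is a small sketch of both. It assumes Python 3, where capwords() is still a function in the string module while capitalize() is used as a string method.

```python
import string

# capitalize() touches only the first character of the whole string
# (and lowercases the rest).
sentence_case = "hello WORLD".capitalize()

# capwords() capitalizes every word and also normalizes whitespace:
# leading, trailing, and repeated blanks collapse to single spaces.
title_case = string.capwords("  hello   python  world ")

print(repr(sentence_case), repr(title_case))
```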
string.lower()
It converts all characters in string to lowercase.
basic syntax: string.lower(string)
string.lstrip(), string.rstrip(), and string.strip()
These functions remove leading and/or trailing whitespace from string.
basic syntaxes:
string.lstrip(string)
string.rstrip(string)
string.strip(string)
string.ljust(), string.rjust(), and string.center()
These functions left-justify, right-justify, or center string in a field that is width characters wide, padding it with spaces. If string is already longer than width, it is returned unchanged.
basic syntaxes:
string.ljust(string, width)
string.rjust(string, width)
string.center(string, width)
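A quick way to see the padding is to wrap each result in brackets, as in the sketch below (method forms assumed; the field width of 6 is arbitrary).

```python
# Sketch: align "py" in a field of 6 characters, padded with spaces.
word = "py"

left = "[" + word.ljust(6) + "]"
right = "[" + word.rjust(6) + "]"
middle = "[" + word.center(6) + "]"

# A string longer than the field is never truncated.
too_long = "python!!".ljust(6)

print(left, right, middle, too_long)
```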
string.replace()
It replaces occurrences of oldtext with newtext in string, up to at most maximum replacements. If maximum is omitted, all occurrences are replaced.
basic syntax: string.replace(string, oldtext, newtext [,maximum])
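The role of the optional maximum argument can be sketched like this, again using the method form and an invented sample sentence.

```python
# Sketch: replace() with and without a limit on the number of replacements.
sentence = "one fish, two fish, red fish"

replace_all = sentence.replace("fish", "cat")       # no limit
first_two = sentence.replace("fish", "cat", 2)      # at most 2, counted from the left

print(replace_all)
print(first_two)
```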
string.zfill()
It pads a numeric string on the left with zeros until it is width characters long. A leading sign is kept in front of the zeros.
basic syntax: string.zfill(string, width)
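For instance, zero-padding might be used to build fixed-width identifiers. The sketch below uses the method form; the values are arbitrary.

```python
# Sketch: zero-padding numeric strings to a fixed width.
padded = "42".zfill(5)
signed = "-42".zfill(5)        # the sign stays in front of the zeros
unchanged = "123456".zfill(3)  # already wider than 3: returned as is

print(padded, signed, unchanged)
```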
Next, I list a few constants that can be used to test whether a certain variable is part of a specific domain:
>>> import string
>>> string.digits
'0123456789'
>>> string.octdigits
'01234567'
>>> string.uppercase
'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
>>> string.hexdigits
'0123456789abcdefABCDEF'
>>> string.lowercase
'abcdefghijklmnopqrstuvwxyz'
Let's write an example that uses string.uppercase:
>>> text = "F"
>>> if text in string.uppercase:
...     print "%s is in uppercase format" % text
...
F is in uppercase format
string.maketrans()
Returns a translation table that maps each character in the from string into the character at the same position in the to string. Then this table is passed to the translate function. Note that both from and to must have the same length.
basic syntax: string.maketrans(from, to)
string.translate()
It returns a copy of string in which each character has been replaced according to table, a translation table created by the string.maketrans function. Optionally, it first deletes from the given string all characters that appear in charstodelete.
basic syntax: string.translate(string, table[, charstodelete])
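The two-step pattern (build a table, then translate) can be sketched as follows. This assumes Python 3, where maketrans() is reached as str.maketrans() and the characters to delete are given as its third argument instead of being passed to translate(); the names are illustrative.

```python
# Assumption: Python 3 spelling, str.maketrans(from, to[, delete]).

# Map each lowercase vowel to the uppercase vowel at the same position.
vowel_table = str.maketrans("aeiou", "AEIOU")
shouted = "regular expression".translate(vowel_table)

# Same mapping, but additionally delete every hyphen.
strip_table = str.maketrans("aeiou", "AEIOU", "-")
cleaned = "a-b-e".translate(strip_table)

print(shouted, cleaned)
```

Because the mapping is positional, the first two arguments must have the same length; otherwise maketrans() raises an error.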
re
The re module performs Perl-style regular expression operations on strings, such as matching and replacement.
TIP
As a suggestion, always use raw string syntax for the pattern when working with regular expressions because it makes handling special characters, such as backslashes, simpler.
>>> import re
>>> data = "Andre Lessa"
>>> data = re.sub(r"Lessa", "L.", data)
>>> print data
Andre L.
See Chapter 9, "Other Advanced Topics," for more details about creating regular expression patterns.
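To make the point about raw strings concrete, the sketch below compares a plain and a raw literal and then uses a pattern containing backslashes. The pattern itself is an invented example, not one taken from the text.

```python
import re

# In a plain literal "\n" is one character (a newline); in a raw literal
# the backslash is kept, so r"\n" is two characters.
plain_len = len("\n")
raw_len = len(r"\n")

# \b (word boundary) and \w (word character) reach the re module intact
# only because the pattern is a raw string; in a plain literal "\b"
# would be turned into a backspace character first.
pattern = r"\bL\w+"
shortened = re.sub(pattern, "L.", "Andre Lessa")

print(plain_len, raw_len, shortened)
```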
NOTE
It is expected that in version 1.6, the re module will be changed to a front end to the new sre module.
regex
The regex module has been obsolete since Python version 1.5. This module used to support regular expression search and match operations.
If necessary, you can use the regex-to-re HOWTO to learn how to migrate from the regex module to the re module. Check out the address http://py-howto.sourceforge.net/regex-to-re/.
regsub
The regsub module is another obsolete module. It also handles string operations (such as substitution and splitting) by using regular expressions. The functions in this module are not thread-safe, so be careful.
struct
The struct module interprets strings as packed binary data. It processes binary files using the functions pack(), unpack(), and calcsize(). This module allows users to write platform-independent, binary-file manipulation code when using the big-endian or little-endian format characters. Using the native formats does not guarantee platform independence.
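A minimal sketch of the three functions follows. It assumes Python 3, where packed data is a bytes object, and an invented two-field record (an unsigned short followed by an unsigned int).

```python
import struct

# ">" selects big-endian with standard sizes: H = 2 bytes, I = 4 bytes.
record_format = ">HI"

packed = struct.pack(record_format, 1, 2)
size = struct.calcsize(record_format)
unpacked = struct.unpack(record_format, packed)

# "<" selects little-endian: same values, bytes in the opposite order.
little = struct.pack("<HI", 1, 2)

# Native format ("@", the default) may add padding and follows the host's
# byte order, so its size is deliberately not assumed here.
native_size = struct.calcsize("@HI")

print(packed, size, unpacked, little, native_size)
```

Choosing an explicit byte-order character is what keeps the layout identical across machines; the native size printed at the end can differ from one platform to another.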
fpformat
The fpformat module provides functions for formatting floating-point numbers: fix() for fixed-point notation and sci() for scientific notation.
StringIO
The StringIO module creates a string object that behaves like a file but actually reads and writes data from string buffers. The StringIO class, which is exposed by the StringIO module, supports all the standard file methods.
>>> import StringIO
>>> str = StringIO.StringIO("Line 1\nLine 2\nLine 3")
>>> str.readlines()
['Line 1\012', 'Line 2\012', 'Line 3']
An additional method provided by this class is StringIO.getvalue().
It returns the entire contents of the buffer as a string. It can be called at any time before the object's close() method is called.
basic syntax: variable = stringobject.getvalue()
>>> import StringIO
>>> text = "Line 1\nLine 2\nLine 3"
>>> str = StringIO.StringIO()
>>> str.write(text)
>>> result = str.getvalue()
>>> result
'Line 1\012Line 2\012Line 3'
cStringIO
The cStringIO module is a faster version of the StringIO module, implemented in C. The difference is that its objects cannot be subclassed. If you need to subclass, it is necessary to use StringIO instead.
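For readers on Python 3, where the StringIO and cStringIO modules have been replaced by the io module, the same file-like behavior can be sketched with io.StringIO. The variable names below are illustrative.

```python
import io

# Assumption: Python 3, using io.StringIO in place of StringIO.StringIO.
buffer = io.StringIO()
buffer.write("Line 1\nLine 2\nLine 3")

# getvalue() returns everything written so far, wherever the file
# position happens to be.
contents = buffer.getvalue()
still_open = not buffer.closed   # record whether the buffer is still usable

# Rewind and read the buffer back through the ordinary file interface.
buffer.seek(0)
lines = buffer.readlines()

buffer.close()
print(repr(contents), still_open, lines)
```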