- 2.1 Representing Ordinary Strings
- 2.2 Representing Strings with Alternate Notations
- 2.3 Using Here-Documents
- 2.4 Finding the Length of a String
- 2.5 Processing a Line at a Time
- 2.6 Processing a Byte at a Time
- 2.7 Performing Specialized String Comparisons
- 2.8 Tokenizing a String
- 2.9 Formatting a String
- 2.10 Using Strings As IO Objects
- 2.11 Controlling Uppercase and Lowercase
- 2.12 Accessing and Assigning Substrings
- 2.13 Substituting in Strings
- 2.14 Searching a String
- 2.15 Converting Between Characters and ASCII Codes
- 2.16 Implicit and Explicit Conversion
- 2.17 Appending an Item Onto a String
- 2.18 Removing Trailing Newlines and Other Characters
- 2.19 Trimming Whitespace from a String
- 2.20 Repeating Strings
- 2.21 Embedding Expressions Within Strings
- 2.22 Delayed Interpolation of Strings
- 2.23 Parsing Comma-Separated Data
- 2.24 Converting Strings to Numbers (Decimal and Otherwise)
- 2.25 Encoding and Decoding rot13 Text
- 2.26 Encrypting Strings
- 2.27 Compressing Strings
- 2.28 Counting Characters in Strings
- 2.29 Reversing a String
- 2.30 Removing Duplicate Characters
- 2.31 Removing Specific Characters
- 2.32 Printing Special Characters
- 2.33 Generating Successive Strings
- 2.34 Calculating a 32-Bit CRC
- 2.35 Calculating the MD5 Hash of a String
- 2.36 Calculating the Levenshtein Distance Between Two Strings
- 2.37 Encoding and Decoding base64 Strings
- 2.38 Encoding and Decoding Strings (uuencode/uudecode)
- 2.39 Expanding and Compressing Tab Characters
- 2.40 Wrapping Lines of Text
- 2.41 Conclusion
2.7 Performing Specialized String Comparisons
Ruby has built-in ideas about comparing strings; comparisons are done lexicographically as we have come to expect (that is, based on character set order). But if we want, we can introduce rules of our own for string comparisons, and these can be of arbitrary complexity.
For example, suppose that we want to ignore the English articles a, an, and the at the front of a string, and we also want to ignore most common punctuation marks. We can do this by overriding the built-in method <=> (which is called for <, <=, >, and >=). Listing 2.1 shows how we do this.
Listing 2.1. Specialized String Comparisons
class String alias old_compare <=> def <=>(other) a = self.dup b = other.dup # Remove punctuation a.gsub!(/[\,\.\?\!\:\;]/, "") b.gsub!(/[\,\.\?\!\:\;]/, "") # Remove initial articles a.gsub!(/^(a |an |the )/i, "") b.gsub!(/^(a |an |the )/i, "") # Remove leading/trailing whitespace a.strip! b.strip! # Use the old <=> a.old_compare(b) end end title1 = "Calling All Cars" title2 = "The Call of the Wild" # Ordinarily this would print "yes" if title1 < title2 puts "yes" else puts "no" # But now it prints "no" end
Note that we "save" the old <=> with an alias and then call it at the end. This is because if we tried to use the < method, it would call the new <=> rather than the old one, resulting in infinite recursion and a program crash.
Note also that the == operator does not call the <=> method (mixed in from Comparable). This means that if we need to check equality in some specialized way, we will have to override the == method separately. But in this case, == works as we want it to anyhow.
Suppose that we wanted to do case-insensitive string comparisons. The built-in method casecmp will do this; we just have to make sure that it is used instead of the usual comparison.
Here is one way:
class String def <=>(other) casecmp(other) end end
But there is a slightly easier way:
class String alias <=> casecmp end
However, we haven't finished. We need to redefine == so that it will behave in the same way:
class String def ==(other) casecmp(other) == 0 end end
Now all string comparisons will be strictly case-insensitive. Any sorting operation that depends on <=> will likewise be case-insensitive.