- 2.1 Representing Ordinary Strings
- 2.2 Representing Strings with Alternate Notations
- 2.3 Using Here-Documents
- 2.4 Finding the Length of a String
- 2.5 Processing a Line at a Time
- 2.6 Processing a Byte at a Time
- 2.7 Performing Specialized String Comparisons
- 2.8 Tokenizing a String
- 2.9 Formatting a String
- 2.10 Using Strings As IO Objects
- 2.11 Controlling Uppercase and Lowercase
- 2.12 Accessing and Assigning Substrings
- 2.13 Substituting in Strings
- 2.14 Searching a String
- 2.15 Converting Between Characters and ASCII Codes
- 2.16 Implicit and Explicit Conversion
- 2.17 Appending an Item Onto a String
- 2.18 Removing Trailing Newlines and Other Characters
- 2.19 Trimming Whitespace from a String
- 2.20 Repeating Strings
- 2.21 Embedding Expressions Within Strings
- 2.22 Delayed Interpolation of Strings
- 2.23 Parsing Comma-Separated Data
- 2.24 Converting Strings to Numbers (Decimal and Otherwise)
- 2.25 Encoding and Decoding rot13 Text
- 2.26 Encrypting Strings
- 2.27 Compressing Strings
- 2.28 Counting Characters in Strings
- 2.29 Reversing a String
- 2.30 Removing Duplicate Characters
- 2.31 Removing Specific Characters
- 2.32 Printing Special Characters
- 2.33 Generating Successive Strings
- 2.34 Calculating a 32-Bit CRC
- 2.35 Calculating the MD5 Hash of a String
- 2.36 Calculating the Levenshtein Distance Between Two Strings
- 2.37 Encoding and Decoding base64 Strings
- 2.38 Encoding and Decoding Strings (uuencode/uudecode)
- 2.39 Expanding and Compressing Tab Characters
- 2.40 Wrapping Lines of Text
- 2.41 Conclusion
2.16 Implicit and Explicit Conversion
At first glance, the to_s and to_str methods seem confusing. They both convert an object into a string representation, don't they?
There are several differences. First, any object can in principle be converted to some kind of string representation; that is why nearly every core class has a to_s method. But the to_str method is almost never implemented in the core; the obvious exception is String itself, whose to_str simply returns the receiver.
As a rule, to_str is for objects that are really very much like strings—that can "masquerade" as strings. Better yet, think of the short name to_s as being explicit conversion and the longer name to_str as being implicit conversion.
Although the core defines almost no to_str methods of its own, core methods do sometimes call to_str (if it exists for a given class).
The first case we might think of is a subclass of String; but in reality, any object of a subclass of String already "is-a" String, so to_str is unnecessary there.
Here is a real-life example. The Pathname class is defined for convenience in manipulating filesystem pathnames (for example, concatenating them). However, a pathname maps naturally to a string (even though it does not inherit from String).
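    require 'pathname'

    path = Pathname.new("/tmp/myfile")
    name = path.to_s     # "/tmp/myfile"
    name = path.to_str   # "/tmp/myfile"  (So what?)

    # Here's why it matters...
    # (Pathname#to_str exists in Ruby 1.8 but was removed in 1.9; on later
    # versions both the to_str call above and the + below fail.)
    heading = "File name is " + path
    puts heading         # "File name is /tmp/myfile"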
Notice that in the preceding code fragment, we take a string "File name is" and directly append a path onto it. Normally this would give us a runtime error, since the + operator expects the second operand to be another string. But because Pathname has a to_str method, that method is called. A Pathname can "masquerade" as a String; it is implicitly converted to a String in this case.
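The same mechanism can be observed with a class of our own. The following is only a sketch: Label and Opaque are hypothetical classes invented for illustration, and only Label defines to_str, so only a Label can be appended to a string with +.

```ruby
# Hypothetical class that can masquerade as a string.
class Label
  def initialize(text)
    @text = text
  end

  # Implicit conversion: called by String#+ on a non-String operand.
  def to_str
    @text
  end
end

# Hypothetical class with no to_str; it has only the inherited to_s.
class Opaque
end

joined = "Name: " + Label.new("alpha")

error = begin
  "Name: " + Opaque.new
  nil
rescue TypeError => e
  e   # no implicit conversion is available for Opaque
end

puts joined        # Name: alpha
puts error.class   # TypeError
```

Note that Opaque still has a to_s method inherited from Object; the + operator simply never consults it.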
In real life, to_s and to_str usually return the same value; but they don't have to do so. The implicit conversion should result in the "real string value" of the object; the explicit conversion can be thought of as a "forced" conversion.
The puts method calls an object's to_s method in order to find a string representation. This behavior might be thought of as an implicit call of an explicit conversion. The same is true for string interpolation. Here's a crude example:
    class Helium
      def to_s
        "He"
      end

      def to_str
        "helium"
      end
    end

    e = Helium.new
    print "Element is "
    puts e                      # Element is He
    puts "Element is " + e      # Element is helium
    puts "Element is #{e}"      # Element is He
So you can see how defining these appropriately in your own classes can give you a little extra flexibility. But what about honoring the definitions of the objects passed into your methods?
For example, suppose that you have a method that is "supposed" to take a String as a parameter. Despite our "duck typing" philosophy, this is frequently done and is often completely appropriate. For example, the first parameter of File.new is "expected" to be a string.
The way to handle this is simple. When you expect a string, check for the existence of to_str and call it as needed.
    def set_title(title)
      if title.respond_to? :to_str
        title = title.to_str
      end
      # ...
    end
Now, what if an object doesn't respond to to_str? We could do several things. We could force a call to to_s; we could check the class to see whether it is a String or a subclass thereof; or we could simply keep going, knowing that if we apply some meaningless operation to this object, we will eventually get an exception (typically a NoMethodError or TypeError) anyway.
A shorter way to do this is
    title = title.to_str rescue title
which depends on an unimplemented to_str raising an exception. The rescue modifiers can of course be nested:
    # Handle the unlikely case that even to_s isn't there
    title = title.to_str rescue title.to_s rescue title
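For comparison, here is a sketch that avoids the rescue modifier, which also swallows any unrelated StandardError raised inside to_str. It assumes Ruby 1.9 or later, where String.try_convert returns the result of to_str, or nil when the object has no such method. The names normalize_title and Chapter are hypothetical, introduced only for this illustration.

```ruby
# Hypothetical helper: prefer implicit conversion, fall back to explicit.
def normalize_title(title)
  # String.try_convert calls to_str when the object responds to it;
  # otherwise it returns nil, and we force a conversion with to_s.
  String.try_convert(title) || title.to_s
end

# Hypothetical class that defines the two conversions differently.
class Chapter
  def to_s
    "chapter"
  end

  def to_str
    "Chapter One"
  end
end

puts normalize_title("Plain")       # Plain
puts normalize_title(Chapter.new)   # Chapter One
puts normalize_title(42)            # 42
```

The design choice here is the same as in the text: the "real string value" from to_str wins when it exists, and to_s is used only as a forced conversion.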
Implicit conversion would allow you to make strings and numbers essentially equivalent. You could do this:
    # Reopen Integer (Fixnum was folded into Integer in Ruby 2.4)
    class Integer
      def to_str
        to_s
      end
    end

    str = "The number is " + 345    # The number is 345
However, I don't recommend this sort of thing. There is such a thing as "too much magic"; Ruby, like most languages, considers strings and numbers to be different, and I believe that most conversions should be explicit for the sake of clarity.
Another thing to remember: There is nothing magical about the to_str method. It is intended to return a string, but if you code your own, it is your responsibility to see that it does.
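As a small illustration of that responsibility, consider a to_str that returns something other than a string. In this sketch, Broken is a hypothetical class; the exact wording of the error message varies between Ruby versions, so it is printed rather than assumed.

```ruby
# Hypothetical class whose to_str breaks the contract by returning an Integer.
class Broken
  def to_str
    42
  end
end

begin
  "Value: " + Broken.new
rescue TypeError => e
  # Ruby checks the result of to_str and rejects anything that is not a String.
  puts "TypeError: #{e.message}"
end
```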