Characters
Character variables (type char) occupy 1 byte, which is typically 8 bits, enough to hold 256 values (see Appendix C). A char can be interpreted as a small number (0-255 when it is unsigned; whether plain char is signed depends on the compiler) or as a member of the ASCII set. ASCII stands for the American Standard Code for Information Interchange. The ASCII character set and its ISO (International Organization for Standardization) equivalent are a way to encode all the letters, numerals, and punctuation marks.
NOTE
Computers do not know about letters, punctuation, or sentences. All they understand are numbers. In fact, all they really know about is whether a sufficient amount of electricity is at a particular junction of wires. If so, it is represented symbolically as a 1; if not, it is represented as a 0. By grouping ones and zeros, the computer is able to generate patterns that can be interpreted as numbers, and these, in turn, can be assigned to letters and punctuation.
In the ASCII code, the lowercase letter "a" is assigned the value 97. ASCII defines the values 0 through 127. All the lower- and uppercase letters, all the numerals, and all the punctuation marks are assigned values between 32 and 126; the rest are nonprinting control characters. The remaining 128 values (128 through 255) are not defined by ASCII and are reserved for use by the computer maker, although the IBM extended character set has become something of a standard.
NOTE
ASCII is usually pronounced "Ask-ee."
Characters and Numbers
When you put a character, for example, "a", into a char variable, what is actually stored is a number between 0 and 255. The compiler knows, however, how to translate back and forth between characters (written as a single quotation mark, then a letter, numeral, or punctuation mark, followed by a closing single quotation mark) and the corresponding ASCII values.
The value/letter relationship is arbitrary; there is no particular reason that the lowercase "a" is assigned the value 97. As long as everyone (your keyboard, compiler, and screen) agrees, no problem occurs. It is important to realize, however, that a big difference exists between the value 5 and the character "5". The latter is actually valued at 53, much as the letter "a" is valued at 97. This is illustrated in Listing 3.6.
Listing 3.6 Printing Characters Based on Numbers
    #include <iostream>
    int main()
    {
        for (int i = 32; i<128; i++)
            std::cout << (char) i;
        return 0;
    }

Output:

     !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~_
This simple program prints the character values for the integers 32 through 127.
Special Printing Characters
The C++ compiler recognizes some special characters for formatting. Table 3.2 shows the most common ones. You put these into your code by typing the backslash (called the escape character), followed by the character. Thus, to put a tab character into your code, you would enter a single quotation mark, the backslash, the letter t, and then a closing single quotation mark:
char tabCharacter = '\t';
This example declares a char variable (tabCharacter) and initializes it with the character value \t, which is recognized as a tab. The special printing characters are used when printing either to the screen or to a file or other output device.
An escape character changes the meaning of the character that follows it. For example, normally the character n means the letter n, but when it is preceded by the escape character (\) it means new line.
Table 3.2 The Escape Characters
| Character | What It Means |
|---|---|
| \a | Bell (alert) |
| \b | Backspace |
| \f | Form feed |
| \n | New line |
| \r | Carriage return |
| \t | Tab |
| \v | Vertical tab |
| \' | Single quote |
| \" | Double quote |
| \? | Question mark |
| \\ | Backslash |
| \000 | Octal notation |
| \xhhh | Hexadecimal |