Character Literals
Character literals are enclosed in single quotation marks.
Any printable character, other than a backslash (\) or a single quote ('), can be specified as the single character itself enclosed in single quotes. Some examples of these literals are 'a', 'A', '9', '+', '_', and '~'.
Some characters, such as the backspace, cannot be written out like this, so these characters are represented by escape sequences. Escape sequences, like all character literals, are enclosed within single quotes. They consist of a backslash followed by one of the following:
- A single character (b, t, n, f, r, ", ', or \)
- An octal number between 000 and 377
- A u followed by four hexadecimal digits specifying a Unicode character
The escape sequences built from single characters are shown in Table 3.7.
Table 3.7 Escape Sequences
Escape Sequence | Unicode | Meaning
'\b' | \u0008 | Backspace
'\t' | \u0009 | Horizontal tab
'\n' | \u000a | Linefeed
'\f' | \u000c | Form feed
'\r' | \u000d | Carriage return
'\"' | \u0022 | Double quotation mark
'\'' | \u0027 | Single quotation mark
'\\' | \u005c | Backslash
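As a quick check of the table above, the following minimal sketch prints the numeric value of each single-character escape as four hexadecimal digits. The class name EscapeSequenceDemo and the helper codeOf are hypothetical names chosen for this illustration.

```java
public class EscapeSequenceDemo {
    // Hypothetical helper: returns the numeric (Unicode) value of a char.
    static int codeOf(char c) {
        return c;
    }

    public static void main(String[] args) {
        char[] escapes = { '\b', '\t', '\n', '\f', '\r', '\"', '\'', '\\' };
        String[] names = {
            "backspace", "horizontal tab", "linefeed", "form feed",
            "carriage return", "double quote", "single quote", "backslash"
        };
        for (int i = 0; i < escapes.length; i++) {
            // Print each value as four hex digits, matching the table's Unicode column.
            System.out.printf("%-16s %04x%n", names[i], codeOf(escapes[i]));
        }
    }
}
```

Each line of output should pair a name with the hexadecimal value listed in the Unicode column of the table.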
Caution - Don't use the Unicode format to express an end-of-line character. Use the '\n' or '\r' characters instead.
The octal values allowed in character literals support the Unicode values from '\u0000' to '\u00ff' (the Latin-1 range, whose lower half is the traditional ASCII range). Table 3.8 shows some examples of octal character literals.
Table 3.8 Octal Character Literals
Octal Literal | Unicode | Meaning
'\007' | \u0007 | Bell
'\101' | \u0041 | 'A'
'\141' | \u0061 | 'a'
'\071' | \u0039 | '9'
'\042' | \u0022 | Double quotation mark
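The octal literals above can be compared directly with their ordinary character forms. This is a small sketch under that assumption; the class name OctalLiteralDemo and the helper same are hypothetical.

```java
public class OctalLiteralDemo {
    // Hypothetical helper: true when both chars hold the same value.
    static boolean same(char a, char b) {
        return a == b;
    }

    public static void main(String[] args) {
        System.out.println(same('\101', 'A'));          // true
        System.out.println(same('\141', 'a'));
        System.out.println(same('\071', '9'));
        System.out.println(same('\042', '"'));
        // The largest octal escape, 377, corresponds to decimal 255.
        System.out.println((int) '\377');               // 255
        // Going the other way: the octal digits for 'A'.
        System.out.println(Integer.toOctalString('A')); // 101
    }
}
```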
You can use Unicode escape sequences anywhere in your Java code, not just in character literals. As indicated earlier, identifiers can contain Unicode letters and digits, not only ASCII ones. In fact, comments, identifiers, and the contents of character and string literals can all be expressed using Unicode escapes. You must use caution, however, because the compiler translates these escapes very early, before it breaks the source into tokens. For example, if you were to use the Unicode representation for a linefeed ('\u000a') as part of a print statement, it would cause a compiler error. This is because the compiler would see this as an actual linefeed in your source code that occurs before the closing single quote of a character literal. This is the reason for the earlier caution to always use '\n' and '\r' for line termination literals.
For an example of using Unicode, look at the following statements that declare and reference a variable using an identifier specified with a Unicode sequence:
int \u0074\u0065\u0073\u0074 = 3;
System.out.println( test );
System.out.println( \u0074\u0065\u0073\u0074 );
This code probably looks strange to you, but the first statement in this example declares and initializes an integer variable named test ('\u0074' equates to 't', '\u0065' equates to 'e', and so on). Although quite different in appearance, both println statements are equivalent; they each display the value assigned to test when executed.
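To try the statements above, they can be wrapped in a complete class. The following is a minimal sketch; the class name UnicodeIdentifierDemo and the method value are hypothetical, chosen only to make the fragment compilable.

```java
public class UnicodeIdentifierDemo {
    // Hypothetical wrapper method around the declaration from the text.
    static int value() {
        // The four escapes below spell the identifier "test".
        int \u0074\u0065\u0073\u0074 = 3;
        // The plain spelling refers to the same variable.
        return test;
    }

    public static void main(String[] args) {
        System.out.println(value());                    // prints 3
        // Inside a string literal, the same escapes produce the text itself.
        System.out.println("\u0074\u0065\u0073\u0074"); // prints test
    }
}
```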
Now look at two attempts to output a linefeed using different representations:
System.out.print( "\n" ); // OK
System.out.print( '\u000a' ); // a compiler error
The first statement is valid and has much the same effect as calling System.out.println(), although println() writes the platform's line separator, which is not "\n" on every system. The second statement, however, causes a compiler error. As mentioned previously, the Unicode sequence is interpreted early, and it appears to the compiler that the argument to print is a character literal that is prematurely terminated by a linefeed.
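A linefeed character can be obtained safely without any Unicode escape in the source. The sketch below shows two ways, assuming Java 7 or later for System.lineSeparator(); the class name LinefeedDemo and the method linefeed are hypothetical. Note that the Unicode escape for a linefeed is deliberately absent even from the comments, because the compiler translates such escapes there too.

```java
public class LinefeedDemo {
    // Hypothetical helper: a linefeed written with the safe escape sequence.
    static char linefeed() {
        return '\n';
    }

    public static void main(String[] args) {
        System.out.println((int) linefeed());          // 10
        // A cast from the numeric value is another safe spelling.
        System.out.println(linefeed() == (char) 0x0a); // true
        // println() uses the platform line separator, which may or may not be "\n".
        System.out.println("\n".equals(System.lineSeparator()));
    }
}
```

The last line prints true or false depending on the platform, which is why print( "\n" ) and println() are not guaranteed to produce identical output.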