Data: Data-Type Keywords
Beyond the distinction between variable and constant is the distinction between different types of data. Some types of data are numbers. Some are letters or, more generally, characters. The computer needs a way to identify and use these different kinds. C does this by recognizing several fundamental data types. If a datum is a constant, the compiler can usually tell its type just by the way it looks: 42 is an integer, and 42.100 is floating point. A variable, however, needs to have its type announced in a declaration statement. You'll learn the details of declaring variables as you move along. First, though, take a look at the fundamental types recognized by C. K&R C recognized seven keywords relating to types. The C90 standard added two to the list. The C99 standard adds yet another three (see Table 3.1).
Table 3.1 C Data Keywords
Original K&R Keywords |
C90 Keywords |
C99 Keywords |
int |
signed |
_Bool |
long |
void |
_Complex |
short |
|
_Imaginary |
unsigned |
|
|
char |
|
|
float |
|
|
double |
|
|
The int keyword provides the basic class of integers used in C. The next three keywords (long, short, and unsigned) and the ANSI addition signed are used to provide variations of the basic type. Next, the char keyword designates the type used for letters of the alphabet and for other characters, such as #, $, %, and *. The char type also can be used to represent small integers. Next, float, double, and the combination long double are used to represent numbers with decimal points. The _Bool type is for Boolean values (true and false), and _Complex and _Imaginary represent complex and imaginary numbers, respectively.
The types created with these keywords can be divided into two families on the basis of how they are stored in the computer: integer types and floating-point types.
Bits, Bytes, and Words
The terms bit, byte, and word can be used to describe units of computer data or to describe units of computer memory. We'll concentrate on the second usage here.
The smallest unit of memory is called a bit. It can hold one of two values: 0 or 1. (Or you can say that the bit is set to "off" or "on.") You can't store much information in one bit, but a computer has a tremendous stock of them. The bit is the basic building block of computer memory.
The byte is the usual unit of computer memory. For nearly all machines, a byte is 8 bits, and that is the standard definition, at least when used to measure storage. (The C language, however, has a different definition, as discussed in the "Using Characters: Type char" section later in this chapter.) Because each bit can be either 0 or 1, there are 256 (that's 2 times itself 8 times) possible bit patterns of 0s and 1s that can fit in an 8-bit byte. These patterns can be used, for example, to represent the integers from 0 to 255 or to represent a set of characters. Representation can be accomplished with binary code, which uses (conveniently enough) just 0s and 1s to represent numbers. (Chap- ter 15, "Bit Fiddling," discusses binary code, but you can read through the introductory material of that chapter now if you like.)
A word is the natural unit of memory for a given computer design. For 8-bit microcomputers, such as the original Apples, a word is just 8 bits. Early IBM compatibles using the 80286 processor are 16-bit machines. This means that they have a word size of 16 bits. Machines such as the Pentium-based PCs and the Macintosh PowerPCs have 32-bit words. More powerful computers can have 64-bit words or even larger.
Integer Versus Floating-Point Types
Integer types? Floating-point types? If you find these terms disturbingly unfamiliar, relax. We are about to give you a brief rundown of their meanings. If you are unfamiliar with bits, bytes, and words, you might want to read the nearby sidebar about them first. Do you have to learn all the details? Not really, not any more than you have to learn the principles of internal combustion engines to drive a car, but knowing a little about what goes on inside a computer or engine can help you occasionally.
For a human, the difference between integers and floating-point numbers is reflected in the way they can be written. For a computer, the difference is reflected in the way they are stored. Let's look at each of the two classes in turn.
The Integer
An integer is a number with no fractional part. In C, an integer is never written with a decimal point. Examples are 2, 23, and 2456. Numbers such as 3.14, 0.22, and 2.000 are not integers. Integers are stored as binary numbers. The integer 7, for example, is written 111 in binary. Therefore, to store this number in an 8-bit byte, just set the first 5 bits to 0 and the last 3 bits to 1 (see Figure 3.2).
Figure 3.2 Storing the integer 7 using a binary code.
The Floating-Point Number
A floating-point number more or less corresponds to what mathematicians call a real number. Real numbers include the numbers between the integers. Some floating-point numbers are 2.75, 3.16E7, 7.00, and 2e8. Notice that adding a decimal point makes a value a floating-point value. So 7 is an integer type but 7.00 is a floating-point type. Obviously, there is more than one way to write a floating-point number. We will discuss the e-notation more fully later, but, in brief, the notation 3.16E7 means to multiply 3.16 by 10 to the 7th power; that is, by 1 followed by 7 zeros. The 7 would be termed the exponent of 10.
The key point here is that the scheme used to store a floating-point number is different from the one used to store an integer. Floating-point representation involves breaking up a number into a fractional part and an exponent part and storing the parts separately. Therefore, the 7.00 in this list would not be stored in the same manner as the integer 7, even though both have the same value. The decimal analogy would be to write 7.0 as 0.7E1. Here, 0.7 is the fractional part, and the 1 is the exponent part. Figure 3.3 shows another example of floating-point storage. A computer, of course, would use binary numbers and powers of two instead of powers of 10 for internal storage. You'll find more on this topic in Chapter 15. Now, let's concentrate on the practical differences:
An integer has no fractional part; a floating-point number can have a fractional part.
Floating-point numbers can represent a much larger range of values than integers can. See Table 3.3 near the end of this chapter.
For some arithmetic operations, such as subtracting one large number from another, floating-point numbers are subject to greater loss of precision.
Because there is an infinite number of real numbers in any rangefor example, in the range between 1.0 and 2.0computer floating-point numbers can't represent all the values in the range. Instead, floating-point values are often approximations of a true value. For example, 7.0 might be stored as a 6.99999 float valuemore about precision later.
Floating-point operations are normally slower than integer operations. However, microprocessors developed specifically to handle floating-point operations are now available, and they have closed the gap.
Figure 3.3 Storing the number pi in floating-point format (decimal version).