- 2.1. The class File Format
- 2.2. Data Types
- 2.3. Primitive Types and Values
- 2.4. Reference Types and Values
- 2.5. Run-Time Data Areas
- 2.6. Frames
- 2.7. Representation of Objects
- 2.8. Floating-Point Arithmetic
- 2.9. Special Methods
- 2.10. Exceptions
- 2.11. Instruction Set Summary
- 2.12. Class Libraries
- 2.13. Public Design, Private Implementation
2.11. Instruction Set Summary
A Java Virtual Machine instruction consists of a one-byte opcode specifying the operation to be performed, followed by zero or more operands supplying arguments or data that are used by the operation. Many instructions have no operands and consist only of an opcode.
Ignoring exceptions, the inner loop of a Java Virtual Machine interpreter is effectively
do { atomically calculate pc and fetch opcode at pc; if (operands) fetch operands; execute the action for the opcode; } while (there is more to do);
The number and size of the operands are determined by the opcode. If an operand is more than one byte in size, then it is stored in big-endian order - high-order byte first. For example, an unsigned 16-bit index into the local variables is stored as two unsigned bytes, byte1 and byte2, such that its value is (byte1 << 8) | byte2.
The bytecode instruction stream is only single-byte aligned. The two exceptions are the lookupswitch and tableswitch instructions (§lookupswitch, §tableswitch), which are padded to force internal alignment of some of their operands on 4-byte boundaries.
- The decision to limit the Java Virtual Machine opcode to a byte and to forgo data alignment within compiled code reflects a conscious bias in favor of compactness, possibly at the cost of some performance in naive implementations. A one-byte opcode also limits the size of the instruction set. Not assuming data alignment means that immediate data larger than a byte must be constructed from bytes at run time on many machines.
2.11.1. Types and the Java Virtual Machine
Most of the instructions in the Java Virtual Machine instruction set encode type information about the operations they perform. For instance, the iload instruction (§iload) loads the contents of a local variable, which must be an int, onto the operand stack. The fload instruction (§fload) does the same with a float value. The two instructions may have identical implementations, but have distinct opcodes.
For the majority of typed instructions, the instruction type is represented explicitly in the opcode mnemonic by a letter: i for an int operation, l for long, s for short, b for byte, c for char, f for float, d for double, and a for reference. Some instructions for which the type is unambiguous do not have a type letter in their mnemonic. For instance, arraylength always operates on an object that is an array. Some instructions, such as goto, an unconditional control transfer, do not operate on typed operands.
Given the Java Virtual Machine’s one-byte opcode size, encoding types into opcodes places pressure on the design of its instruction set. If each typed instruction supported all of the Java Virtual Machine’s run-time data types, there would be more instructions than could be represented in a byte. Instead, the instruction set of the Java Virtual Machine provides a reduced level of type support for certain operations. In other words, the instruction set is intentionally not orthogonal. Separate instructions can be used to convert between unsupported and supported data types as necessary.
Table 2.2 summarizes the type support in the instruction set of the Java Virtual Machine. A specific instruction, with type information, is built by replacing the T in the instruction template in the opcode column by the letter in the type column. If the type column for some instruction template and type is blank, then no instruction exists supporting that type of operation. For instance, there is a load instruction for type int, iload, but there is no load instruction for type byte.
Table 2.2. Type support in the Java Virtual Machine instruction set
opcode |
byte |
short |
int |
long |
float |
double |
char |
reference |
Tipush |
bipush |
sipush |
|
|
|
|
|
|
Tconst |
|
|
iconst |
lconst |
fconst |
dconst |
|
aconst |
Tload |
|
|
iload |
lload |
fload |
dload |
|
aload |
Tstore |
|
|
istore |
lstore |
fstore |
dstore |
|
astore |
Tinc |
|
|
iinc |
|
|
|
|
|
Taload |
baload |
saload |
iaload |
laload |
faload |
daload |
caload |
aaload |
Tastore |
bastore |
sastore |
iastore |
lastore |
fastore |
dastore |
castore |
aastore |
Tadd |
|
|
iadd |
ladd |
fadd |
dadd |
|
|
Tsub |
|
|
isub |
lsub |
fsub |
dsub |
|
|
Tmul |
|
|
imul |
lmul |
fmul |
dmul |
|
|
Tdiv |
|
|
idiv |
ldiv |
fdiv |
ddiv |
|
|
Trem |
|
|
irem |
lrem |
frem |
drem |
|
|
Tneg |
|
|
ineg |
lneg |
fneg |
dneg |
|
|
Tshl |
|
|
ishl |
lshl |
|
|
|
|
Tshr |
|
|
ishr |
lshr |
|
|
|
|
Tushr |
|
|
iushr |
lushr |
|
|
|
|
Tand |
|
|
iand |
land |
|
|
|
|
Tor |
|
|
ior |
lor |
|
|
|
|
Txor |
|
|
ixor |
lxor |
|
|
|
|
i2T |
i2b |
i2s |
|
i2l |
i2f |
i2d |
|
|
l2T |
|
|
l2i |
|
l2f |
l2d |
|
|
f2T |
|
|
f2i |
f2l |
|
f2d |
|
|
d2T |
|
|
d2i |
d2l |
d2f |
|
|
|
Tcmp |
|
|
|
lcmp |
|
|
|
|
Tcmpl |
|
|
|
|
fcmpl |
dcmpl |
|
|
Tcmpg |
|
|
|
|
fcmpg |
dcmpg |
|
|
if_TcmpOP |
|
|
if_icmpOP |
|
|
|
|
if_acmpOP |
Treturn |
|
|
ireturn |
lreturn |
freturn |
dreturn |
|
areturn |
Note that most instructions in Table 2.2 do not have forms for the integral types byte, char, and short. None have forms for the boolean type. A compiler encodes loads of literal values of types byte and short using Java Virtual Machine instructions that sign-extend those values to values of type int at compile-time or run-time. Loads of literal values of types boolean and char are encoded using instructions that zero-extend the literal to a value of type int at compile-time or run-time. Likewise, loads from arrays of values of type boolean, byte, short, and char are encoded using Java Virtual Machine instructions that sign-extend or zero-extend the values to values of type int. Thus, most operations on values of actual types boolean, byte, char, and short are correctly performed by instructions operating on values of computational type int.
The mapping between Java Virtual Machine actual types and Java Virtual Machine computational types is summarized by Table 2.3.
Table 2.3. Actual and Computational types in the Java Virtual Machine
Actual type |
Computational type |
Category |
boolean |
int |
1 |
byte |
int |
1 |
char |
int |
1 |
short |
int |
1 |
int |
int |
1 |
float |
float |
1 |
reference |
reference |
1 |
returnAddress |
returnAddress |
1 |
long |
long |
2 |
double |
double |
2 |
Certain Java Virtual Machine instructions such as pop and swap operate on the operand stack without regard to type; however, such instructions are constrained to use only on values of certain categories of computational types, also given in Table 2.3.
2.11.2. Load and Store Instructions
The load and store instructions transfer values between the local variables (§2.6.1) and the operand stack (§2.6.2) of a Java Virtual Machine frame (§2.6):
- Load a local variable onto the operand stack: iload, iload_<n>, lload, lload_<n>, fload, fload_<n>, dload, dload_<n>, aload, aload_<n>.
- Store a value from the operand stack into a local variable: istore, istore_<n>, lstore, lstore_<n>, fstore, fstore_<n>, dstore, dstore_<n>, astore, astore_<n>.
- Load a constant on to the operand stack: bipush, sipush, ldc, ldc_w, ldc2_w, aconst_null, iconst_m1, iconst_<i>, lconst_<l>, fconst_<f>, dconst_<d>.
- Gain access to more local variables using a wider index, or to a larger immediate operand: wide.
Instructions that access fields of objects and elements of arrays (§2.11.5) also transfer data to and from the operand stack.
Instruction mnemonics shown above with trailing letters between angle brackets (for instance, iload_<n>) denote families of instructions (with members iload_0, iload_1, iload_2, and iload_3 in the case of iload_<n>). Such families of instructions are specializations of an additional generic instruction (iload) that takes one operand. For the specialized instructions, the operand is implicit and does not need to be stored or fetched. The semantics are otherwise the same (iload_0 means the same thing as iload with the operand 0). The letter between the angle brackets specifies the type of the implicit operand for that family of instructions: for <n>, a nonnegative integer; for <i>, an int; for <l>, a long; for <f>, a float; and for <d>, a double. Forms for type int are used in many cases to perform operations on values of type byte, char, and short (§2.11.1).
This notation for instruction families is used throughout this specification.
2.11.3. Arithmetic Instructions
The arithmetic instructions compute a result that is typically a function of two values on the operand stack, pushing the result back on the operand stack. There are two main kinds of arithmetic instructions: those operating on integer values and those operating on floating-point values. Within each of these kinds, the arithmetic instructions are specialized to Java Virtual Machine numeric types. There is no direct support for integer arithmetic on values of the byte, short, and char types (§2.11.1), or for values of the boolean type; those operations are handled by instructions operating on type int. Integer and floating-point instructions also differ in their behavior on overflow and divide-by-zero. The arithmetic instructions are as follows:
- Add: iadd, ladd, fadd, dadd.
- Subtract: isub, lsub, fsub, dsub.
- Multiply: imul, lmul, fmul, dmul.
- Divide: idiv, ldiv, fdiv, ddiv.
- Remainder: irem, lrem, frem, drem.
- Negate: ineg, lneg, fneg, dneg.
- Shift: ishl, ishr, iushr, lshl, lshr, lushr.
- Bitwise OR: ior, lor.
- Bitwise AND: iand, land.
- Bitwise exclusive OR: ixor, lxor.
- Local variable increment: iinc.
- Comparison: dcmpg, dcmpl, fcmpg, fcmpl, lcmp.
The semantics of the Java programming language operators on integer and floating-point values (JLS §4.2.2, JLS §4.2.4) are directly supported by the semantics of the Java Virtual Machine instruction set.
The Java Virtual Machine does not indicate overflow during operations on integer data types. The only integer operations that can throw an exception are the integer divide instructions (idiv and ldiv) and the integer remainder instructions (irem and lrem), which throw an ArithmeticException if the divisor is zero.
Java Virtual Machine operations on floating-point numbers behave as specified in IEEE 754. In particular, the Java Virtual Machine requires full support of IEEE 754 denormalized floating-point numbers and gradual underflow, which make it easier to prove desirable properties of particular numerical algorithms.
The Java Virtual Machine requires that floating-point arithmetic behave as if every floating-point operator rounded its floating-point result to the result precision. Inexact results must be rounded to the representable value nearest to the infinitely precise result; if the two nearest representable values are equally near, the one having a least significant bit of zero is chosen. This is the IEEE 754 standard’s default rounding mode, known as round to nearest mode.
The Java Virtual Machine uses the IEEE 754 round towards zero mode when converting a floating-point value to an integer. This results in the number being truncated; any bits of the significand that represent the fractional part of the operand value are discarded. Round towards zero mode chooses as its result the type’s value closest to, but no greater in magnitude than, the infinitely precise result.
The Java Virtual Machine’s floating-point operators do not throw run-time exceptions (not to be confused with IEEE 754 floating-point exceptions). An operation that overflows produces a signed infinity, an operation that underflows produces a denormalized value or a signed zero, and an operation that has no mathematically definite result produces NaN. All numeric operations with NaN as an operand produce NaN as a result.
Comparisons on values of type long (lcmp) perform a signed comparison. Comparisons on values of floating-point types (dcmpg, dcmpl, fcmpg, fcmpl) are performed using IEEE 754 nonsignaling comparisons.
2.11.4. Type Conversion Instructions
The type conversion instructions allow conversion between Java Virtual Machine numeric types. These may be used to implement explicit conversions in user code or to mitigate the lack of orthogonality in the instruction set of the Java Virtual Machine.
The Java Virtual Machine directly supports the following widening numeric conversions:
- int to long, float, or double
- long to float or double
- float to double
The widening numeric conversion instructions are i2l, i2f, i2d, l2f, l2d, and f2d. The mnemonics for these opcodes are straightforward given the naming conventions for typed instructions and the punning use of 2 to mean “to.” For instance, the i2d instruction converts an int value to a double. Widening numeric conversions do not lose information about the overall magnitude of a numeric value. Indeed, conversions widening from int to long and int to double do not lose any information at all; the numeric value is preserved exactly. Conversions widening from float to double that are FP-strict (§2.8.2) also preserve the numeric value exactly; however, such conversions that are not FP-strict may lose information about the overall magnitude of the converted value.
Conversion of an int or a long value to float, or of a long value to double, may lose precision, that is, may lose some of the least significant bits of the value; the resulting floating-point value is a correctly rounded version of the integer value, using IEEE 754 round to nearest mode.
A widening numeric conversion of an int to a long simply sign-extends the two’s-complement representation of the int value to fill the wider format. A widening numeric conversion of a char to an integral type zero-extends the representation of the char value to fill the wider format.
Despite the fact that loss of precision may occur, widening numeric conversions never cause the Java Virtual Machine to throw a run-time exception (not to be confused with an IEEE 754 floating-point exception).
Note that widening numeric conversions do not exist from integral types byte, char, and short to type int. As noted in §2.11.1, values of type byte, char, and short are internally widened to type int, making these conversions implicit.
The Java Virtual Machine also directly supports the following narrowing numeric conversions:
- int to byte, short, or char
- long to int
- float to int or long
- double to int, long, or float
The narrowing numeric conversion instructions are i2b, i2c, i2s, l2i, f2i, f2l, d2i, d2l, and d2f. A narrowing numeric conversion can result in a value of different sign, a different order of magnitude, or both; it may thereby lose precision.
A narrowing numeric conversion of an int or long to an integral type T simply discards all but the N lowest-order bits, where N is the number of bits used to represent type T. This may cause the resulting value not to have the same sign as the input value.
In a narrowing numeric conversion of a floating-point value to an integral type T, where T is either int or long, the floating-point value is converted as follows:
- If the floating-point value is NaN, the result of the conversion is an int or long 0.
- Otherwise, if the floating-point value is not an infinity, the floating-point value is rounded to an integer value V using IEEE 754 round towards zero mode. There are two cases:
- If T is long and this integer value can be represented as a long, then the result is the long value V.
- If T is of type int and this integer value can be represented as an int, then the result is the int value V.
- Otherwise:
- Either the value must be too small (a negative value of large magnitude or negative infinity), and the result is the smallest representable value of type int or long.
- Or the value must be too large (a positive value of large magnitude or positive infinity), and the result is the largest representable value of type int or long.
A narrowing numeric conversion from double to float behaves in accordance with IEEE 754. The result is correctly rounded using IEEE 754 round to nearest mode. A value too small to be represented as a float is converted to a positive or negative zero of type float; a value too large to be represented as a float is converted to a positive or negative infinity. A double NaN is always converted to a float NaN.
Despite the fact that overflow, underflow, or loss of precision may occur, narrowing conversions among numeric types never cause the Java Virtual Machine to throw a run-time exception (not to be confused with an IEEE 754 floating-point exception).
2.11.5. Object Creation and Manipulation
Although both class instances and arrays are objects, the Java Virtual Machine creates and manipulates class instances and arrays using distinct sets of instructions:
- Create a new class instance: new.
- Create a new array: newarray, anewarray, multianewarray.
- Access fields of classes (static fields, known as class variables) and fields of class instances (non-static fields, known as instance variables): getfield, putfield, getstatic, putstatic.
- Load an array component onto the operand stack: baload, caload, saload, iaload, laload, faload, daload, aaload.
- Store a value from the operand stack as an array component: bastore, castore, sastore, iastore, lastore, fastore, dastore, aastore.
- Get the length of array: arraylength.
- Check properties of class instances or arrays: instanceof, checkcast.
2.11.6. Operand Stack Management Instructions
A number of instructions are provided for the direct manipulation of the operand stack: pop, pop2, dup, dup2, dup_x1, dup2_x1, dup_x2, dup2_x2, swap.
2.11.7. Control Transfer Instructions
The control transfer instructions conditionally or unconditionally cause the Java Virtual Machine to continue execution with an instruction other than the one following the control transfer instruction. They are:
- Conditional branch: ifeq, ifne, iflt, ifle, ifgt, ifge, ifnull, ifnonnull, if_icmpeq, if_icmpne, if_icmplt, if_icmple, if_icmpgt if_icmpge, if_acmpeq, if_acmpne.
- Compound conditional branch: tableswitch, lookupswitch.
- Unconditional branch: goto, goto_w, jsr, jsr_w, ret.
The Java Virtual Machine has distinct sets of instructions that conditionally branch on comparison with data of int and reference types. It also has distinct conditional branch instructions that test for the null reference and thus it is not required to specify a concrete value for null (§2.4).
Conditional branches on comparisons between data of types boolean, byte, char, and short are performed using int comparison instructions (§2.11.1). A conditional branch on a comparison between data of types long, float, or double is initiated using an instruction that compares the data and produces an int result of the comparison (§2.11.3). A subsequent int comparison instruction tests this result and effects the conditional branch. Because of its emphasis on int comparisons, the Java Virtual Machine provides a rich complement of conditional branch instructions for type int.
All int conditional control transfer instructions perform signed comparisons.
2.11.8. Method Invocation and Return Instructions
The following five instructions invoke methods:
- invokevirtual invokes an instance method of an object, dispatching on the (virtual) type of the object. This is the normal method dispatch in the Java programming language.
- invokeinterface invokes an interface method, searching the methods implemented by the particular run-time object to find the appropriate method.
- invokespecial invokes an instance method requiring special handling, whether an instance initialization method (§2.9), a private method, or a superclass method.
- invokestatic invokes a class (static) method in a named class.
- invokedynamic invokes the method which is the target of the call site object bound to the invokedynamic instruction. The call site object was bound to a specific lexical occurrence of the invokedynamic instruction by the Java Virtual Machine as a result of running a bootstrap method before the first execution of the instruction. Therefore, each occurrence of an invokedynamic instruction has a unique linkage state, unlike the other instructions which invoke methods.
The method return instructions, which are distinguished by return type, are ireturn (used to return values of type boolean, byte, char, short, or int), lreturn, freturn, dreturn, and areturn. In addition, the return instruction is used to return from methods declared to be void, instance initialization methods, and class or interface initialization methods.
2.11.9. Throwing Exceptions
An exception is thrown programmatically using the athrow instruction. Exceptions can also be thrown by various Java Virtual Machine instructions if they detect an abnormal condition.
2.11.10. Synchronization
The Java Virtual Machine supports synchronization of both methods and sequences of instructions within a method by a single synchronization construct: the monitor.
Method-level synchronization is performed implicitly, as part of method invocation and return (§2.11.8). A synchronized method is distinguished in the run-time constant pool’s method_info structure (§4.6) by the ACC_SYNCHRONIZED flag, which is checked by the method invocation instructions. When invoking a method for which ACC_SYNCHRONIZED is set, the executing thread enters a monitor, invokes the method itself, and exits the monitor whether the method invocation completes normally or abruptly. During the time the executing thread owns the monitor, no other thread may enter it. If an exception is thrown during invocation of the synchronized method and the synchronized method does not handle the exception, the monitor for the method is automatically exited before the exception is rethrown out of the synchronized method.
Synchronization of sequences of instructions is typically used to encode the synchronized block of the Java programming language. The Java Virtual Machine supplies the monitorenter and monitorexit instructions to support such language constructs. Proper implementation of synchronized blocks requires cooperation from a compiler targeting the Java Virtual Machine (§3.14).
Structured locking is the situation when, during a method invocation, every exit on a given monitor matches a preceding entry on that monitor. Since there is no assurance that all code submitted to the Java Virtual Machine will perform structured locking, implementations of the Java Virtual Machine are permitted but not required to enforce both of the following two rules guaranteeing structured locking. Let T be a thread and M be a monitor. Then:
- The number of monitor entries performed by T on M during a method invocation must equal the number of monitor exits performed by T on M during the method invocation whether the method invocation completes normally or abruptly.
- At no point during a method invocation may the number of monitor exits performed by T on M since the method invocation exceed the number of monitor entries performed by T on M since the method invocation.
Note that the monitor entry and exit automatically performed by the Java Virtual Machine when invoking a synchronized method are considered to occur during the calling method’s invocation.