- Part 2: Rules and examples
Late binding
The secret to understanding binary compatibility is to understand late binding. Late binding means that Java looks at the names of classes, fields, and methods at runtime. That's unlike most C/C++ compilers, which remove the names and replace them with numerical offsets. It's the reason that binary compatibility works in Java.
With late binding, method names, field names, and class names are resolved at run time. That means that you can provide any class you like, as long as the names (and types) of its fields and methods are identical. There are other rules about access (private, public, etc.) and abstract-ness (you can't call a method that isn't really there), but late binding is the heart of binary compatibility.
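Reflection gives a rough feel for what resolving names at run time looks like. The sketch below is only an analogy, not a description of how the JVM links classes internally, and the names LateBindingSketch and callByName are made up for illustration. It finds a class and a no-argument method purely from strings, then calls the method:

```java
import java.lang.reflect.Method;

public class LateBindingSketch {
    // Hypothetical helper: resolve a class and a no-argument method by name
    // at run time, create an instance, and invoke the method on it.
    public static Object callByName(String className, String methodName) {
        try {
            Class<?> type = Class.forName(className);
            Object instance = type.getDeclaredConstructor().newInstance();
            Method method = type.getMethod(methodName);
            return method.invoke(instance);
        } catch (ReflectiveOperationException e) {
            // Roughly where a real linkage failure would surface as an Error.
            throw new IllegalStateException(
                "Could not resolve " + className + "." + methodName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(callByName("java.util.ArrayList", "size"));    // 0
        System.out.println(callByName("java.util.ArrayList", "isEmpty")); // true
    }
}
```

Nothing in this code mentions ArrayList as a type; only the names matter, which is the property that lets one class file be swapped for another.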
You can use late binding to try to confuse the JVM. For example, if you start with these classes:
class Tinker { }
class Evers extends Tinker { }
class Chance extends Evers { }
In a different directory (or on another computer), create different versions of Tinker and Chance:
class Tinker extends Chance { Tinker() { System.out.println("Double play!"); } }
class Chance { }
These two sets of code compile independently. But if you take version 1 of Evers and Chance, and version 2 of Tinker, you get a circular hierarchy: Tinker extends Chance, Chance extends Evers, and Evers extends Tinker.
Really, Chance refers to two different things in versions 1 and 2. But when you mix and match the code, late binding means that the two versions appear to be the same.
Obviously this would be a violation in Java, but the Java compiler gets no say in the matter: we're just putting together three class files that worked independently. However, this breaks binary compatibility, and when you try to create an instance of Tinker, you get this error:
java.lang.ClassCircularityError: Tinker
This example may seem trivial and silly, but it's not all that hard to accidentally introduce circularities or other binary incompatibilities. It's easy to re-implement a class (for performance reasons, or because you lost the source code, or because you didn't like the way it was written the first time) and leave something out.
But conversely, once you understand the rules, you can substantially re-implement classes without disturbing the other binaries, with all the advantages we talked about in the first installment of this article.
For example, let's start out with the bare outlines of a couple of related E-mail programs we'll call FrodoMail and SamMail:
abstract class Message implements Classifiable { }
class EmailMessage extends Message { public boolean isJunk() { return false; } }
interface Classifiable { boolean isJunk(); }
class FrodoMail { public static void main(String[] a) { Classifiable m = new EmailMessage(); System.out.println(m.isJunk()); } }
class SamMail { public static void main(String[] a) { EmailMessage m = new EmailMessage(); System.out.println(m.isJunk()); } }
I can choose to reimplement Message without Classifiable:
class Message { }
and SamMail will continue to run just fine. However, FrodoMail will fail with an error:
java.lang.IncompatibleClassChangeError at FrodoMail.main
That's because SamMail never depended on an EmailMessage being Classifiable, but FrodoMail did. The binary of FrodoMail mentions the Classifiable interface by name. The methods for Classifiable are still there, but because the class doesn't name the interface anymore, it's not a Classifiable as far as the JVM is concerned.
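To see that having the right methods is not the same as naming the interface, here is a small sketch that you can compile in one piece. It places both versions side by side under made-up names (MessageV1, MessageV2, and so on), so it shows the type relationship rather than reproducing the actual error:

```java
public class InterfaceByNameSketch {
    public interface Classifiable { boolean isJunk(); }

    // Version 1: the hierarchy names Classifiable.
    public static abstract class MessageV1 implements Classifiable { }
    public static class EmailMessageV1 extends MessageV1 {
        public boolean isJunk() { return false; }
    }

    // Version 2: the same method is present, but nothing names the interface.
    public static class MessageV2 { }
    public static class EmailMessageV2 extends MessageV2 {
        public boolean isJunk() { return false; }
    }

    // The JVM's view: is this object's class declared to be a Classifiable?
    public static boolean isClassifiable(Object o) {
        return o instanceof Classifiable;
    }

    public static void main(String[] args) {
        System.out.println(isClassifiable(new EmailMessageV1())); // true
        System.out.println(isClassifiable(new EmailMessageV2())); // false
    }
}
```

EmailMessageV2 has a perfectly good isJunk method, yet it fails the check, which is the situation FrodoMail runs into.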
Other common causes of binary incompatibility include:
Changing a method from static to non-static (or vice versa) (IncompatibleClassChangeError)
Changing a member (field or method) to be private (IllegalAccessError)
Changing a class to be abstract (InstantiationError, a subclass of IncompatibleClassChangeError)
Changing a method to be abstract (AbstractMethodError)
Removing a field (NoSuchFieldError)
Removing an interface (IncompatibleClassChangeError)
IncompatibleClassChangeError is the root of most of the errors you'll see caused by incompatible binaries. ClassCircularityError is actually a sibling class of IncompatibleClassChangeError. Both are subclasses of LinkageError, which reports the sorts of errors caused by altering a class without recompiling, as well as other kinds of errors that occur when a class is being loaded:
ClassFormatError (corrupt class files)
NoClassDefFoundError (missing class file)
UnsatisfiedLinkError (missing library for native definitions)
VerifyError (invalid bytecodes)
This last one is hard to trigger by writing Java; it would indicate a serious compiler bug, or class files produced by some means other than a Java compiler.
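If you'd like to check the family tree yourself, the following sketch walks the superclass chain of a few of these errors. The class name LinkageErrorFamily and the helper chain are made up for illustration; the error classes themselves are the standard ones in java.lang:

```java
public class LinkageErrorFamily {
    // Hypothetical helper: list a class and its superclasses up to,
    // but not including, Throwable.
    public static String chain(Class<?> type) {
        StringBuilder sb = new StringBuilder(type.getSimpleName());
        for (Class<?> c = type.getSuperclass();
             c != null && c != Throwable.class;
             c = c.getSuperclass()) {
            sb.append(" -> ").append(c.getSimpleName());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Class<?>[] errors = {
            ClassCircularityError.class, NoSuchFieldError.class,
            NoSuchMethodError.class, IllegalAccessError.class,
            AbstractMethodError.class, InstantiationError.class,
            ClassFormatError.class, NoClassDefFoundError.class,
            UnsatisfiedLinkError.class, VerifyError.class
        };
        for (Class<?> error : errors) {
            System.out.println(chain(error));
        }
    }
}
```

Because everything here ends up under LinkageError, a program that wants to fail gracefully when its binaries are out of step can catch that one type.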
Methods
From a binary compatibility standpoint, a method is made up of four things:
The name
The return type
The arguments
Whether or not it's static
If you change any of these, it is a different method as far as the JVM is concerned. Take a look at this class:
class Ticket { boolean isValid() { ... } }
If you decide to change the Ticket class to accept a Date:
class Ticket { boolean isValid(Date when) { ... } }
then you can't substitute the new class for the old one. Anybody expecting to find the isValid() method will get cranky, and you'll get an error message like this:
java.lang.NoSuchMethodError: Ticket.isValid()Z
(The ()Z is how the JVM represents a method which takes nothing and returns a boolean; we'll discuss it in greater detail in the final installment of this article.)
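If you want to see these descriptor strings for yourself in the meantime, the standard java.lang.invoke.MethodType class can produce them. This is just a sketch; the class name DescriptorSketch and the helper descriptor are made up for illustration:

```java
import java.lang.invoke.MethodType;
import java.util.Date;

public class DescriptorSketch {
    // Hypothetical helper: build the JVM descriptor for a method with the
    // given return type and parameter types.
    public static String descriptor(Class<?> returnType, Class<?>... parameterTypes) {
        return MethodType.methodType(returnType, parameterTypes)
                         .toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // boolean isValid()
        System.out.println(descriptor(boolean.class));             // ()Z
        // boolean isValid(Date when)
        System.out.println(descriptor(boolean.class, Date.class)); // (Ljava/util/Date;)Z
    }
}
```

The two isValid variants have different descriptors, which is why the JVM treats them as unrelated methods.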
Personally, I recommend either adding a valid approximation:
boolean isValid() { return isValid(new Date()); }
or at least providing a meaningful warning that the program has gotten dangerously out of synch:
boolean isValid() { throw new Error("Program out of synch. Call 1-800-555-CODE to complain."); }
If nobody is actually using isValid, you can feel free to change it. But you can't be sure who's using isValid until you actually run or recompile the other code.
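Here is what the first option might look like as a complete class. The original method bodies aren't shown, so the expires field and the validity rule below are assumptions made up purely so the sketch runs; the point is only that both signatures exist side by side:

```java
import java.util.Date;

public class TicketSketch {
    public static class Ticket {
        // Hypothetical state; the real Ticket's fields are not shown.
        private final Date expires;

        public Ticket(Date expires) {
            this.expires = expires;
        }

        // New signature: callers compiled against the new class use this.
        public boolean isValid(Date when) {
            return !when.after(expires);
        }

        // Old signature kept as an approximation, so that callers compiled
        // against the old class still find isValid()Z at run time.
        public boolean isValid() {
            return isValid(new Date());
        }
    }

    public static void main(String[] args) {
        Ticket open = new Ticket(new Date(Long.MAX_VALUE));
        Ticket lapsed = new Ticket(new Date(0L));
        System.out.println(open.isValid());   // true
        System.out.println(lapsed.isValid()); // false
    }
}
```

Old binaries keep linking against the no-argument method, and new code can move to the Date version at its own pace.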
The JVM figures out which method body to call by using a technique called virtual method dispatch. It decides which method body to use based on the exact instance on which the method is called. Think of it as very late binding.
As a Java programmer, you're already intimately familiar with virtual method dispatch. Java uses the method body found in the class to which the instance belongs. If a particular class doesn't have a method with exactly that name, arguments, and return type, it inherits one from the superclass. (It doesn't inherit the static property, since static things aren't inherited the way virtual methods are.)
Java's binary compatibility rules mean that the inheritance is managed at run time rather than at compile time. Suppose you start with:
class Play {
    void perform() { System.out.println("Oklahoma! Where the wind comes sweeping down the plain"); }
}
class ShakespearePlay extends Play {
    void perform() { System.out.println("To be or not to be."); }
}
class Hamlet extends ShakespearePlay { }
class RichardIII extends ShakespearePlay {
    void perform() { System.out.println("Now is the winter of our discontent."); }
}
class Othello extends ShakespearePlay {
    void perform() { System.out.println("Chaos is come again."); }
}
Then
Play play = new Hamlet(); play.perform();
prints
To be or not to be.
That's because the method body for perform is chosen at run time. Although Hamlet doesn't have a method body for perform, it inherits one from ShakespearePlay. It doesn't use the one from the generic Play, because the perform in ShakespearePlay overrides it.
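For anyone who wants to run this, the classes can be wrapped into a single file like the sketch below. The wrapper class PlayDispatchSketch is made up for illustration, and Othello is left out for brevity; the nested classes otherwise mirror the ones above:

```java
public class PlayDispatchSketch {
    static class Play {
        void perform() {
            System.out.println("Oklahoma! Where the wind comes sweeping down the plain");
        }
    }

    static class ShakespearePlay extends Play {
        @Override
        void perform() { System.out.println("To be or not to be."); }
    }

    // No body of its own, so it inherits perform from ShakespearePlay.
    static class Hamlet extends ShakespearePlay { }

    static class RichardIII extends ShakespearePlay {
        @Override
        void perform() { System.out.println("Now is the winter of our discontent."); }
    }

    public static void main(String[] args) {
        // Every variable has the static type Play; the body that runs
        // depends on the class of the instance.
        Play[] plays = { new Hamlet(), new RichardIII(), new Play() };
        for (Play play : plays) {
            play.perform();
        }
        // To be or not to be.
        // Now is the winter of our discontent.
        // Oklahoma! Where the wind comes sweeping down the plain
    }
}
```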
We can change Hamlet dynamically, without recompiling ShakespearePlay, to pick another quote:
class Hamlet extends ShakespearePlay { void perform() { System.out.println("Get thee to a nunnery"); } }
And now:
Play play = new Hamlet(); play.perform();
prints
Get thee to a nunnery
But
Play play = new ShakespearePlay(); play.perform();
still prints
To be or not to be.
You can remove the body of ShakespearePlay:
class ShakespearePlay extends Play { }
and now that same code prints:
Oklahoma! Where the wind comes sweeping down the plain
which is, of course, very wrong. So, perhaps, you should forbid this case by making ShakespearePlay abstract:
abstract class ShakespearePlay extends Play { }
Now, if you run the perform example without recompiling, you'll get:
java.lang.InstantiationError: ShakespearePlay
Because ShakespearePlay is abstract and cannot be instantiated, the JVM detects it at runtime and won't allow the code to run.
Fields
Fields are different from methods. When you remove a method, it's possible for your Java class to inherit a different method with the same name and arguments, and to override parent class methods. Fields can't be overridden, which has an effect on their binary compatibility.
For example, let's look at these three classes:
class Language { String greeting = "Hello"; }
class German extends Language { String greeting = "Guten tag"; }
class French extends Language { String greeting = "Bon jour"; }
The code:
void test1() { System.out.println(new French().greeting); }
prints
Bon jour
but
void test2() { System.out.println(((Language) new French()).greeting); }
prints
Hello
That's because the field selected depends on the class you think you're accessing, that is, the compile-time type of the expression rather than the run-time class of the object. In the first case, test1 knows it's accessing a French object, so it prints the French greeting. In the second case, even though it's really accessing a French object, test2 prints the standard (in this case, English) greeting.
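The two tests can be put into one runnable file, as in the sketch below. The wrapper class FieldHidingSketch and the helpers viaFrench and viaLanguage are made up for illustration; they correspond to test1 and test2:

```java
public class FieldHidingSketch {
    static class Language { String greeting = "Hello"; }
    static class French extends Language { String greeting = "Bon jour"; }

    // Like test1: the expression has the static type French.
    public static String viaFrench() {
        return new French().greeting;
    }

    // Like test2: same kind of object, but viewed as a Language.
    public static String viaLanguage() {
        return ((Language) new French()).greeting;
    }

    public static void main(String[] args) {
        System.out.println(viaFrench());   // Bon jour
        System.out.println(viaLanguage()); // Hello
    }
}
```

A French object really carries two greeting fields, one declared in each class, and the cast picks which one the code refers to.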
You might choose to eliminate the standard greeting:
class Language { }
But if you try to run test2 without recompiling, you get:
java.lang.NoSuchFieldError: greeting
That's the JVM's way of detecting the error you'd get if you recompiled test2:
cannot resolve symbol
symbol  : variable greeting
location: class Language
System.out.println(((Language) new French()).greeting);
The test1 code continues to run just fine, without recompilation, because it doesn't depend on the existence of a greeting in Language.