- Protecting the Ideas Behind Your Code
- Obfuscation As a Protection of Intellectual Property
- Transformations Performed by Obfuscators
- Knowing the Best Obfuscators
- Potential Problems and Common Solutions
- Using Zelix KlassMaster to Obfuscate a Chat Application
- Cracking Obfuscated Code
- Quick Quiz
- In Brief
Potential Problems and Common Solutions
Obfuscation is a reasonably safe process that should preserve application functionality. However, in certain cases the transformations performed by obfuscators can inadvertently break code that used to work. The following sections look at the common problems and recommended solutions.
Dynamic Class Loading
The renaming of packages, classes, methods, and variables works fine as long as the name is changed consistently throughout the system. Obfuscators ensure that any static references within the bytecode are updated to reflect the new name. However, if the code performs dynamic class loading using Class.forName() or ClassLoader.loadClass() and passes an original class name, a ClassNotFoundException can result. Modern obfuscators are good at handling such cases, and they attempt to change the strings to reflect the new names. If the string is created at runtime or read from a properties file, though, the obfuscator is incapable of handling it. Good obfuscators produce a log file with warnings pointing out the code that has potential for runtime problems.
The simplest solution is to configure the obfuscator to preserve the names of dynamically loaded classes. The content of the class, such as the methods, variables, and code, can still be transformed.
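The failure mode can be illustrated with a minimal sketch. The class names here (DynamicLoadingDemo, Plugin, and the stale name no.such.pkg.OriginalName) are hypothetical and exist only for this example; the stale name stands in for a string that was built at runtime or read from a properties file and so was never updated by the obfuscator.

```java
class DynamicLoadingDemo {
    // Hypothetical class standing in for an application class that is loaded by name.
    public static class Plugin {}

    // Returns true if the given class name can be resolved at run time.
    public static boolean canLoad(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A static reference: an obfuscator can see it and keep it consistent with a rename.
        String current = Plugin.class.getName();
        System.out.println(current + " loads: " + canLoad(current));

        // Simulates a name held outside the bytecode (for example, in a properties file)
        // that no longer matches any class after renaming.
        String stale = "no.such.pkg.OriginalName";
        System.out.println(stale + " loads: " + canLoad(stale));
    }
}
```

Excluding the dynamically loaded class from renaming keeps the second kind of lookup working, because the externally stored name then continues to match.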
Stories From the Trenches
The most innovative product from CreamTec is WebCream, which is available for a free download from the Web. The free edition is limited to five concurrent users; to get more users, you must buy a commercial license. Having grown up in the Ukraine, I knew many people who would prefer to crack the licensing to turn the free edition into an unlimited edition that would normally be worth thousands of dollars. At CreamTec, we used a simple, free obfuscator that didn't do much more than name mangling. We thought it was good enough until a friend of mine, who views limited-functionality commercial software as a personal insult, cracked our licensing code in less than 15 minutes. The message was clear enough, and we decided to purchase Zelix KlassMaster to protect the product as well as we could. After we used the aggressive control flow obfuscation with a few extra tricks, our friend has not been able to get to the licensing code with the same ease as before, and because he didn't want to spend days figuring it out, he has given up.
Reflection
Reflection looks up methods and fields by name at runtime, and those names are usually written into the code as strings at compile time, so it is also affected by obfuscation. Be sure to use a good obfuscator and to review the log file for warnings. Just as with dynamic class loading, if runtime errors are caused by obfuscation, you must exclude from obfuscation the method or field names that are referenced in Class.getMethod() or Class.getField().
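A minimal sketch of the problem follows. The Service class and the callByName helper are hypothetical names introduced for this illustration. Looking up a name that is not present in the class ("start" here) fails in the same way that looking up "init" would fail after an obfuscator had renamed the method.

```java
import java.lang.reflect.Method;

class ReflectionNameDemo {
    // Hypothetical service class; imagine an obfuscator renaming init() to b().
    public static class Service {
        public String init() {
            return "initialized";
        }
    }

    // Invokes a public no-argument method by name; returns null if no such method exists.
    public static Object callByName(Object target, String methodName) {
        try {
            Method m = target.getClass().getMethod(methodName);
            return m.invoke(target);
        } catch (NoSuchMethodException e) {
            // This is the branch taken when the string no longer matches the method name.
            return null;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Service service = new Service();
        System.out.println(callByName(service, "init"));  // name present in the class
        System.out.println(callByName(service, "start")); // name absent from the class
    }
}
```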
Serialization
Serialized Java objects include instance data and information about the class, including its name. If the version of the class or its structure changes, a deserialization exception can result. Obfuscated classes can be serialized and deserialized, but an attempt by an obfuscated class to deserialize an instance that was serialized by the nonobfuscated class will fail, because the class information recorded in the stream no longer matches. This is not a very common problem, and it can usually be solved by excluding the serializable classes from obfuscation or by not mixing data serialized by obfuscated and nonobfuscated versions of a class.
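To see why renaming matters here, the following sketch serializes an object and checks that the class name is written into the stream. Payload and streamContainsClassName are hypothetical names used only for this illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;

class SerializedNameDemo {
    // Hypothetical serializable class used only for this demonstration.
    public static class Payload implements Serializable {
        private static final long serialVersionUID = 1L;
        String text = "hello";
    }

    // Serializes the object and reports whether its class name appears in the stream.
    public static boolean streamContainsClassName(Serializable obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            // ISO-8859-1 maps every byte to exactly one char, so the search is byte-accurate.
            String raw = new String(bytes.toByteArray(), StandardCharsets.ISO_8859_1);
            return raw.contains(obj.getClass().getName());
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(streamContainsClassName(new Payload()));
    }
}
```

Because the name travels with the data, a stream written before obfuscation refers to a class name that the obfuscated application no longer contains.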
Naming Conventions Violation
The renaming of methods can violate the naming conventions of frameworks such as Enterprise JavaBeans (EJB), where the bean developer is required to provide methods with certain names and signatures. EJB callback methods such as ejbCreate and ejbRemove are not defined by a superclass or an interface. Providing these methods with a specific signature is a mere convention prescribed by the EJB specification and enforced by the container. Changing callback method names violates the naming convention and makes the bean unusable. You should always be sure to exclude the names of such methods from obfuscation.
Maintenance Difficulties
Last, but not least, obfuscation makes maintaining and troubleshooting applications more difficult. Java exception handling is an effective way of isolating the faulty code, and looking at the stack trace can generally give you a good idea of what went wrong and where. Keeping the debugging information for source filenames and line numbers enables the runtime to report the exact location in code where the error occurred. If done carelessly, obfuscation can inhibit this feature and make debugging harder because the developer sees only the obfuscated class names instead of the real class names and line numbers.
You should preserve at least the line number information in the obfuscated code. Good obfuscators produce a log of the transformations, including the mapping between the original class names and methods and the obfuscated counterparts. The following is an excerpt from the log file generated by Zelix KlassMaster for the ChatServer class:
Class: public covertjava.chat.ChatServer => covertjava.chat.d
    Source: "ChatServer.java"
    FieldsOf: covertjava.chat.ChatServer
        hostName => e
        protected static instance => a
        messageListener => d
        protected registry => c
        protected registryPort => b
        userName => f
    MethodsOf: covertjava.chat.ChatServer
        public static getInstance() => a
        public getRegistry(int) => a
        public init() => b
        public receiveMessage(java.lang.String, covertjava.chat.MessageInfo) _NameNotChanged
        public sendMessage(java.lang.String, java.lang.String) => a
        public setMessageListener(covertjava.chat.MessageListener) => a
So, if an exception stack trace shows the covertjava.chat.d.b method, you can use the log and find out that it was originally called "init" in a class that was originally called covertjava.chat.ChatServer. If the exception occurred in covertjava.chat.d.a, you would not know the original method name for sure because multiple mappings exist (witness the power of overloading). That's why line numbers are so important. By using the log file and the line number in the original source file, you can quickly locate the problem area in the application code.
Some obfuscators provide a utility that reconstructs the stack traces. This is a convenient way of getting the real stack trace for the obfuscated stack trace. The utility typically uses the same method as we used earlier, but it automates the job, so why not save ourselves some time? Because the utility can reverse the mapping, the obfuscator can also scramble the line numbers for extra protection.
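The manual lookup described above can be sketched in a few lines. This is not the utility shipped with any particular obfuscator; RetraceSketch and its methods are hypothetical, and the mapping is copied by hand from the ChatServer log excerpt rather than parsed from a real log file.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RetraceSketch {
    // Obfuscated "class.method" -> original candidates, hand-copied from the log excerpt.
    private static final Map<String, List<String>> MAPPING = new LinkedHashMap<>();

    static {
        add("covertjava.chat.d.b", "covertjava.chat.ChatServer.init()");
        add("covertjava.chat.d.a", "covertjava.chat.ChatServer.getInstance()");
        add("covertjava.chat.d.a", "covertjava.chat.ChatServer.getRegistry(int)");
        add("covertjava.chat.d.a",
            "covertjava.chat.ChatServer.sendMessage(java.lang.String, java.lang.String)");
        add("covertjava.chat.d.a",
            "covertjava.chat.ChatServer.setMessageListener(covertjava.chat.MessageListener)");
    }

    private static void add(String obfuscated, String original) {
        MAPPING.computeIfAbsent(obfuscated, key -> new ArrayList<>()).add(original);
    }

    // Returns every original method that was renamed to the given obfuscated name.
    public static List<String> candidates(String obfuscatedFrame) {
        List<String> found = MAPPING.get(obfuscatedFrame);
        return found == null ? Collections.<String>emptyList() : found;
    }

    public static void main(String[] args) {
        // One candidate: the frame can be resolved from the name alone.
        System.out.println(candidates("covertjava.chat.d.b"));
        // Several candidates: the line number is needed to pick the right one.
        System.out.println(candidates("covertjava.chat.d.a"));
    }
}
```

A lookup that returns more than one candidate is exactly the case in which the preserved line number is needed to tell the overloaded names apart.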