Comparing Python Object-Oriented Code with Java
Outline: Object Orientation
Programmers are creatures of habit, and we tend to stick with established language features unless we have some compelling reason to embrace new ones. Object-oriented (OO) features are a good example of this issue. PHP programmers consider PHP's OO features to be a good idea—but use them sparingly, if at all. The story is similar with many Python programmers, who prefer not to use Python's OO features.
Java sits at the other end of the language spectrum: It's an OO language, so there's no getting away from classes when you use Java. Despite Java's OO pedigree, however, a lot of Java code is still written in a procedural manner.
Why this bias against (or possible misuse of) OO? I think it boils down to a combination of personal inclination and engineering judgment. If a PHP or Python programmer has extensive experience with one of these languages and hasn't used the OO features often, the disinclination may be due to simple inertia. But certain development tasks might be better implemented in an OO context than in the familiar procedural/functional paradigm.
It's true that OO programming can introduce issues such as heap fragmentation and other nondeterministic platform behavior, including performance deterioration. Indeed, heap use was one reason why C++ took many years to replace C in embedded development projects. Back in the 1990s, disk, CPU, and memory were at such a premium that, at least in the minds of designers, they precluded the use of OO languages (and with them the potential productivity gains that these emerging languages offered).
I think it's still fair to say that many Python programmers avoid OO features unless no other option exists. In this article, I compare Python and Java to show how they stack up against each other in terms of complexity and speed. I hope this will allow for an objective assessment!
Let's take a look at some code, starting with Python.
A Python Class
As is the case with Python in general, the Python OO paradigm is pretty concise, as the simple class in Listing 1 illustrates.
Listing 1: A Python class.
class Student:
    def __init__(self, name, age, major):
        self.name = name
        self.age = age
        self.major = major

    def is_old(self):
        return self.age > 100
The Student class has three data members: name, age, and major subject.
The __init__() method is the closest thing Python has to a constructor. Notice the use of self.name to initialize the state of the instance. Also included is the simple method is_old() to determine (in a slightly "ageist" manner) whether the underlying student is young or old (with "old" being over 100 years).
The code in Listing 1 illustrates one of the great merits of OO programming: Code and data reside in close proximity to each other. Data is of course the repository of state, so the use of OO brings code, data, and state together in a manner useful to programmers. Clearly, you can do all of this without OO code, but OO makes it a matter of rather beautiful simplicity.
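To make the point about code and data living together concrete, here is a minimal sketch of the Listing 1 class in use. The second student ("Ada", aged 101) is a hypothetical value I've added purely to exercise both outcomes of is_old().

```python
class Student:
    """Same class as Listing 1: state and behavior live together."""

    def __init__(self, name, age, major):
        self.name = name
        self.age = age
        self.major = major

    def is_old(self):
        # The article defines "old" as over 100 years.
        return self.age > 100


# Hypothetical sample values, chosen only for illustration.
john = Student("John", 23, "Physics")
ada = Student("Ada", 101, "Mathematics")

print(john.name, john.is_old())  # John False
print(ada.name, ada.is_old())    # Ada True
```

Each object carries its own state, and the method that interprets that state travels with it.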
Remember: Most source code on the planet exists to model some real-world entity or process. OO can be a very clear, minimum-impedance technique for such modeling. This might even be a compelling reason for using the OO approach at all costs!
An Equivalent Java Class
Not to be outdone by our Python coding effort, Listing 2 shows an equivalent Java class.
Listing 2: A Java student class.
public class Student {
    String name;
    int age;
    String major;

    public Student() {
        // TODO Auto-generated constructor stub
    }

    public Student(String name, int age, String major) {
        this.name = name;
        this.age = age;
        this.major = major;
    }
}
The Java code in Listing 2 is very similar to the Python code in Listing 1. Notice that the use of OO can produce quite readable code in either language. Listing 1 is not likely to baffle a Java programmer, even without a background in Python. Likewise, a Python programmer well versed in the Python OO features would easily understand the Java code in Listing 2.
So here's our first takeaway: Well-written OO code can help to promote inter-language comprehensibility.
Why is this important? In our multi-language era, such comprehensibility is a prize worth pursuing. (For more on this topic, interested readers can check out my blog and my most recent eBook.)
The modern era of software can be defined by the rapid adoption of application deployment on the Web and the concomitant use of browsers to access those applications. Users now routinely demand from web-hosted applications what used to be called "desktop features." Such usability generally can't be delivered using just one programming language. Programmers must increasingly be comfortable in numerous languages: Java, Scala, JavaScript, HTML, CSS, Python, SQL, and so on.
A Matter of Speed: Python Versus Java Code
Speed is always an issue. Let's modify Listing 1 so that we can get a feel for the speed of the underlying code.
Running the Python Code
Listing 3 illustrates a simple (toy) program that attempts to "stress" the platform a little.
Listing 3: A timed program run.
import time

class Student:
    def __init__(self, name, age, major):
        self.name = name
        self.age = age
        self.major = major

    def is_old(self):
        return self.age > 100

start = time.clock()
for x in xrange(500000):
    s = Student('John', 23, 'Physics')
    print 'Student %s is %s years old and is studying %s' %(s.name, s.age, s.major)
    print 'Student is old? %d ' %(s.is_old())
stop = time.clock()
print stop - start
Listing 3 is a slightly augmented version of Listing 1. This revised code does the following:
- Import the time module.
- Create a time snapshot at the beginning of the program.
- Instantiate a large number of Student objects.
- Access the data inside each object.
- Take a time snapshot and subtract the original time.
- Display the time required to run the program.
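Listing 3 is written in Python 2 (print statements, xrange, and time.clock(), which was removed in Python 3.8). The steps above could be sketched in Python 3 roughly as follows. This is not the program that produced the timings in this article; the function name timed_run and its out parameter are my own assumptions, added so that the output can be redirected.

```python
import io
import sys
import time


class Student:
    def __init__(self, name, age, major):
        self.name = name
        self.age = age
        self.major = major

    def is_old(self):
        return self.age > 100


def timed_run(count, out=sys.stdout):
    """Create `count` Student objects, print their data, return elapsed seconds."""
    start = time.perf_counter()  # time snapshot before the loop
    for _ in range(count):
        s = Student("John", 23, "Physics")
        print(f"Student {s.name} is {s.age} years old and is studying {s.major}", file=out)
        print(f"Student is old? {int(s.is_old())}", file=out)
    return time.perf_counter() - start  # elapsed wall-clock seconds


if __name__ == "__main__":
    # A small count keeps the demonstration quick; the article uses 500,000.
    # Writing to an in-memory buffer avoids flooding the console.
    elapsed = timed_run(1000, out=io.StringIO())
    print(f"Elapsed: {elapsed:.4f} seconds")
```

Calling timed_run(500000) with the default out=sys.stdout comes closest to the original run. Note that time.perf_counter() measures wall-clock time, so the numbers are not directly comparable with those from time.clock().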
Admittedly, this is a pretty crude test. But let's see an example run that creates 500,000 objects. This is an excerpt from the full program run:
Student John is 23 years old and is studying Physics
Student is old? 0
29.8887370933
We can think of this as a baseline test: It takes about 30 seconds for a program run of 500,000 objects. Now let's raise the number of objects created to 800,000:
Student John is 23 years old and is studying Physics
Student is old? 0
48.2298926572
From this, we see that a program run of 800,000 objects takes about 48 seconds. Let's double the number of objects created, to 1,600,000:
Student John is 23 years old and is studying Physics
Student is old? 0
97.3272409408
That's 97 seconds for 1,600,000 objects.
Now let's do a comparative run using Java.
Running the Java Code
Listing 4 illustrates a simple Java program that also attempts to stress the platform a little.
Listing 4: The Java test program.
public class Student {
    String name;
    int age;
    String major;

    public Student() {
        // TODO Auto-generated constructor stub
    }

    public Student(String name, int age, String major) {
        this.name = name;
        this.age = age;
        this.major = major;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public int getAge() {
        return age;
    }

    public void setAge(int age) {
        this.age = age;
    }

    public String getMajor() {
        return major;
    }

    public void setMajor(String major) {
        this.major = major;
    }

    // Mirrors is_old() in the Python listing; the sample output prints its result.
    public boolean isOld() {
        return age > 100;
    }

    public static void main(String[] args) {
        long startTime = System.currentTimeMillis();
        for (int i = 0; i < 500000; i++) {
            Student student = new Student("John", 23, "Physics");
            System.out.println("Student " + student.getName() + " is "
                    + student.getAge() + " years old and is studying "
                    + student.getMajor());
            System.out.println("Student is old: " + student.isOld());
        }
        // Integer division truncates the elapsed time to whole seconds.
        long estimatedTime = System.currentTimeMillis() - startTime;
        System.out.println("Time estimate: " + estimatedTime / 1000);
    }
}
Notice in Listing 4 that I've included getter and setter methods. Experienced Eclipse Java developers use this feature all the time: Eclipse generates the getters and setters automatically, and the same is true for the two constructors. This type of productivity enhancement is really handy, and because such code is machine-generated, it avoids the typing mistakes that tend to creep into hand-written boilerplate.
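Python programmers can get a comparable saving without an IDE. As a rough analogue (not something the tests in this article used), the standard library's dataclasses module generates the constructor and a few other methods from the field declarations. The sketch below assumes Python 3.7 or later.

```python
from dataclasses import dataclass


@dataclass
class Student:
    # The decorator generates __init__, __repr__, and __eq__ from these fields.
    name: str
    age: int
    major: str

    def is_old(self) -> bool:
        return self.age > 100


s = Student("John", 23, "Physics")
print(s)  # Student(name='John', age=23, major='Physics')
print(s == Student("John", 23, "Physics"))  # True
```

Python attributes are public by default, so explicit getters and setters like those in Listing 4 are usually unnecessary.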
Let's run the Java code with 500,000 objects, just as we did for the Python case:
Student John is 23 years old and is studying Physics
Student is old: false
Time estimate: 31
That's 31 seconds for 500,000 objects. Now we run it with 800,000 objects:
Student John is 23 years old and is studying Physics
Student is old: false
Time estimate: 50
That's 50 seconds for 800,000 objects. Now we run our final Java test with 1,600,000 objects:
Student John is 23 years old and is studying Physics
Student is old: false
Time estimate: 104
That's 104 seconds for 1,600,000 objects.
Let's tabulate the results for comparison.
Comparative Speed Test
Number of Objects | Java Time (seconds) | Python Time (seconds)
500,000 | 31 | 30
800,000 | 50 | 48
1,600,000 | 104 | 97
The test results show that the Python code outperforms the Java code by a small margin. This is not unexpected. Java might be called a "heavyweight" mainstream language; it comes with a certain amount of baggage, including but not limited to the following:
- Portability. This simply means that Java bytecode will run on any platform with an appropriate Java virtual machine (JVM).
- Type safety. Type safety is closely related to memory safety. This language feature helps to avoid situations in which an attempt is made to copy an invalid bit pattern into a given memory area.
- Built-in security. The Java security model is based on a sandbox in which code can run safely with minimal negative effects on the underlying platform.
As with any technology, this feature set comes at a cost; however, as the table shows, the cost in the current test context is not exactly exorbitant.
This is our second takeaway: OO does have a cost, but it's relatively cheap considering all the extra capabilities you get.
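To put numbers on "not exactly exorbitant," the tabulated timings can be reduced to a per-object cost and a Java-to-Python ratio. The short sketch below does only that arithmetic; the figures are copied from the table (the Java times are truncated to whole seconds), and the helper names are my own.

```python
# Timings in seconds, copied from the Comparative Speed Test table.
results = {
    500_000: {"java": 31, "python": 30},
    800_000: {"java": 50, "python": 48},
    1_600_000: {"java": 104, "python": 97},
}


def per_object_microseconds(seconds, count):
    """Average cost of one loop iteration, in microseconds."""
    return seconds / count * 1_000_000


def summarize(table):
    """Return one summary row per test size."""
    rows = []
    for count, t in table.items():
        rows.append({
            "objects": count,
            "java_us": per_object_microseconds(t["java"], count),
            "python_us": per_object_microseconds(t["python"], count),
            "java_vs_python": t["java"] / t["python"],
        })
    return rows


for row in summarize(results):
    print(f"{row['objects']:>9,} objects: "
          f"Java {row['java_us']:.1f} us, Python {row['python_us']:.1f} us, "
          f"ratio {row['java_vs_python']:.2f}")
```

Both languages come out at roughly 60 to 65 microseconds per iteration, with the Java run about 3 to 7 percent slower across the three sizes.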
Extending the Test
The tests I ran for this example are pretty simple. A more realistic test might use objects that read and write to a database, or send and receive network traffic. If the data in such programs is derived from files, that would help in stressing the application with disk I/O.
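As one possible starting point for the file-based variant, the following sketch writes sample records to a temporary CSV file, reads them back, and builds one Student per row. The function names, the CSV layout, and the record count are all assumptions of mine rather than part of the original tests, and operating system file caching means the run may not reach the physical disk.

```python
import csv
import os
import tempfile
import time


class Student:
    def __init__(self, name, age, major):
        self.name = name
        self.age = age
        self.major = major

    def is_old(self):
        return self.age > 100


def write_records(path, count):
    """Write `count` identical sample rows to a CSV file."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        for _ in range(count):
            writer.writerow(["John", 23, "Physics"])


def load_students(path):
    """Read the CSV file back and build one Student per row."""
    with open(path, newline="") as f:
        return [Student(name, int(age), major)
                for name, age, major in csv.reader(f)]


def timed_file_run(count):
    """Return (students loaded, elapsed seconds) for a write/read round trip."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    os.close(fd)  # reopen by path below, so release the raw descriptor
    try:
        start = time.perf_counter()
        write_records(path, count)
        students = load_students(path)
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)  # always clean up the temporary file
    return len(students), elapsed


loaded, elapsed = timed_file_run(10_000)
print(f"Loaded {loaded} students in {elapsed:.3f} seconds")
```

Timing write_records() and load_students() separately would show how much of the total is file I/O and how much is object construction.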
Running the Code
Python can be run from the command line; more conveniently, you can run it from within an integrated development environment (IDE) such as Eclipse. I prefer to use an IDE because of the many productivity enhancements that IDEs offer: code generation, unit testing, package and module creation, and so on.
Getting started with Python and Eclipse is easy: Install Eclipse and then use the Eclipse Marketplace to install the PyDev plug-in. Create a Python (or PyDev) module, and you're all set to start creating your Python code.
Of course, it's even easier to run the Java code in Eclipse, because the default installation already includes support for Java. And let's not forget all the ancillary Java productivity enhancements: code completion, code generation (getters, setters, constructors, etc.), refactoring, and so on.
Regardless of your language choice or programming model (OO versus procedural or functional), there is no denying that the use of a modern IDE such as Eclipse is a major productivity enhancement. This type of tool facilitates agile development in the form of code generation, refactoring, and tool integration via plug-ins.
Final Thoughts
OO languages were the subject of a certain amount of mistrust back in the 1990s. In those days, many organizations preferred to stick with mainstream languages such as C, rather than adopting the new C++. Then along came Java, and I think it's fair to say that C++ was no longer the de facto OO language.
Nowadays, OO languages are used in embedded platforms pretty much as a matter of course. However, there is still some resistance to using the OO features in languages such as Python and PHP. The reasons for this resistance might have more to do with programmer preferences than with reality!
One interesting aspect of a comparison between OO code in different languages is the commonality between such languages. Python OO code is not vastly different from equivalent code in Java. This could be considered an advantage of using OO features in the multi-language era, helping programmers to produce good code. Simpler code is generally well received by maintenance programmers and production support staff.
The speed of such broadly equivalent Java and Python code is pretty similar, as the simple tests here illustrate and as the comparable results in my article "Database Development: Comparing Python and Java ORM Performance" also suggest.
OO offers many advantages in any language. The potential ease of understanding that OO provides could be a strong motivation for its use. Given this and the other advantages, OO seems to offer too many pluses right now, and potentially in the future, for smart programmers to keep avoiding it.