Database Development: Comparing Python and Java ORM Performance
In the late 1990s, it was commonplace to hear quite emotional arguments about the dangers of C programming versus other languages such as Pascal. The Pascal proponents pointed out the pitfalls in C: unchecked array access, pointer manipulation, complexity, and so on. The C supporters, on the other hand, regarded Pascal as a language for learning programming, rather than one for professional development.
Then along came C++ and Java, and a lot of the old arguments were consigned to the history books. I think it's fair to say that Java has become the mainstream language of choice. But other languages such as C and C++ still have an important role to play. For example, if you need to develop on small embedded devices, C is a good choice. For larger embedded systems, C++ is routinely used.
One interesting development in the Java world has been the plethora of programming frameworks, including offerings such as Spring, Struts, and so on. Although very useful to the programmer, these frameworks haven't always been to everyone's taste. Indeed, getting up and running with a Spring-based tool chain is most decidedly not an easy undertaking. The same can be said for other frameworks such as Seam. This level of complexity is somewhat at variance with modern trends.
As organizations seek to simplify their IT development, a move is underway toward more lightweight technologies. As part of this effort, some very large corporate and governmental organizations are moving away from the complex and framework-rich environments of Java, embracing languages such as Python. This trend can be seen all the way out to the front end, in the way JavaScript is sometimes being used aggressively as a platform for business logic.
Despite all these changes in programming languages, relational databases still tend to sit at the center of most systems, from large websites down to small mobile applications. A separate movement is headed toward NoSQL databases such as MongoDB. But, for the moment, relational technology still dominates. Getting database access from a chosen programming language is therefore a key part of the developer's toolkit. Not surprisingly, a range of options is available for programmatic database access. Let's see how Python and Java stack up.
Python Versus Java
In this article, I compare the object-relational mapping (ORM) facilities of Python and Java. While preparing the code for this article, I was struck by how different the effort of setting up ORM is in the two languages. For Python, the setup is pretty straightforward. Indeed, the online Python tutorials are of high quality, which also helps greatly in speeding up user adoption.
On the other hand, the Java tutorials are somewhat more distributed across the Web, and getting up and running with Java ORM is much more difficult than with Python. I have a good deal of experience with Java, which helped me in getting the Java ORM working, but a beginner might not be able to get past the setup difficulties.
However, having used and written about both languages, I should note that despite these occasional difficulties my preference would generally be to use Java. Why? Well, Java is a proven technology; out of the box, Java is structured, secure, strongly typed, and has built-in support for multithreading.
Anyway, that's enough grumbling. Let's see how to set up some ORM code in these two languages.
Building a Python ORM Module
Listing 1 has a bare-bones Python ORM module, based on the example in the SQLAlchemy documentation. This Python module uses a class definition, which contains the database mapping.
Listing 1. A complete Python ORM module.

    import sqlalchemy
    from sqlalchemy import create_engine
    from sqlalchemy.ext.declarative import declarative_base
    from sqlalchemy import Column, Integer, String
    from sqlalchemy.orm import sessionmaker
    import time

    if __name__ == '__main__':
        sqlalchemy.__version__
        engine = create_engine('mysql://userName:password@localhost', echo=False)
        engine.execute('USE sqlAlchemyJavaORM')
        Base = declarative_base()

        # The mapped class: each Column corresponds to a database column
        class User(Base):
            __tablename__ = 'DBPythonUsers'
            id = Column(Integer, primary_key=True, nullable=False)
            name = Column(String(30), nullable=False)
            fullname = Column(String(30), nullable=False)
            password = Column(String(20), nullable=False)

            def greeting(self):
                return 'Hello world from class name ' + self.__class__.__name__

            def __repr__(self):
                return "<User(name='%s', fullname='%s', password='%s')>" % (
                    self.name, self.fullname, self.password)

        x = User()
        print(x.greeting())

        startTime = time.time()
        for i in range(0, 100000):
            # Schema creation and session setup are repeated on every pass
            Base.metadata.create_all(engine)
            Session = sessionmaker(bind=engine)
            session = Session()
            ed_user = User(name='ed' + str(i), fullname='Ed Jones',
                           password='edspassword')
            # Now we can talk to the database
            session.add(ed_user)
            session.commit()
        print('%s seconds' % (time.time() - startTime))
Let's break down Listing 1 into its constituent parts:
- Create an engine with which to communicate with the database:
    engine = create_engine('mysql://userName:password@localhost', echo=False)
  I used MySQL Server, but to get started quickly you can use the in-memory SQLite. To use the line above, substitute your login name and password in place of userName and password.
- The class User provides the main ORM mapping. Notice the multiple uses of Column; each of them maps to an actual database column.
- Next up, the line x = User() creates an instance of the above class.
- Then I call a test method on that instance: x.greeting().
- Next, the schema is created with Base.metadata.create_all(engine).
- Using a session instance, I insert data into the database.
- The last step is to roughly record the time taken to execute the database code.
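If you don't have a MySQL server to hand, the following minimal sketch shows the same kind of mapping running against in-memory SQLite. It is not the code behind the timings in this article: the sqlite:///:memory: URL, the import of declarative_base from sqlalchemy.orm (which assumes SQLAlchemy 1.4 or later), and the read-back query at the end are my assumptions for illustration.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class User(Base):
    # Same columns as the User class in Listing 1
    __tablename__ = 'DBPythonUsers'
    id = Column(Integer, primary_key=True, nullable=False)
    name = Column(String(30), nullable=False)
    fullname = Column(String(30), nullable=False)
    password = Column(String(20), nullable=False)


# Assumption: in-memory SQLite instead of the MySQL URL used in Listing 1
engine = create_engine('sqlite:///:memory:', echo=False)
Base.metadata.create_all(engine)

Session = sessionmaker(bind=engine)
session = Session()
session.add(User(name='ed', fullname='Ed Jones', password='edspassword'))
session.commit()

# Read the row back to confirm the mapping round-trips
stored = session.query(User).filter_by(name='ed').one()
stored_id, stored_fullname = stored.id, stored.fullname
print(stored_id, stored_fullname)
session.close()
```

Nothing is written to disk here, so the table disappears when the process exits. That makes this variant handy for experiments but unsuitable for the timing comparison later on.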
That's the overall view. Now let's dig a little deeper into this interesting Python ORM technology. The Python code used here is based on the SQLAlchemy tutorial, so I won't go into too much detail beyond the main points above.
One aspect of Listing 1 that requires some explanation is the way the code interacts with the database engine. That is, I allow the Python code to create the table as follows:
Base.metadata.create_all(engine)
Listing 2 illustrates the MySQL database table created by the Python module.
Listing 2. The newly created database table.

    +----------+-------------+------+-----+---------+----------------+
    | Field    | Type        | Null | Key | Default | Extra          |
    +----------+-------------+------+-----+---------+----------------+
    | id       | int(11)     | NO   | PRI | NULL    | auto_increment |
    | name     | varchar(30) | NO   |     | NULL    |                |
    | fullname | varchar(30) | NO   |     | NULL    |                |
    | password | varchar(20) | NO   |     | NULL    |                |
    +----------+-------------+------+-----+---------+----------------+
A careful comparison between Listings 1 and 2 illustrates the power of ORM: One Python class and module does all this work. But remember that this power brings a lot of responsibility! Notice also the auto-incrementing primary key in Listing 2. This is a very convenient and standard MySQL column attribute, where the id value is automatically incremented each time a row is inserted into the table.
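To see auto-incrementing keys in action without MySQL, here is a small sketch using Python's built-in sqlite3 module. SQLite is only a stand-in, and the users table is a hypothetical name of my own; note that SQLite spells the keyword AUTOINCREMENT, whereas MySQL uses AUTO_INCREMENT.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the MySQL table in Listing 2
conn = sqlite3.connect(':memory:')
conn.execute(
    'CREATE TABLE users ('
    ' id INTEGER PRIMARY KEY AUTOINCREMENT,'
    ' name VARCHAR(30) NOT NULL)'
)

# No id is supplied; the database assigns one per inserted row
for name in ('ed0', 'ed1', 'ed2'):
    conn.execute('INSERT INTO users (name) VALUES (?)', (name,))
conn.commit()

rows = conn.execute('SELECT id, name FROM users ORDER BY id').fetchall()
print(rows)
conn.close()
```

The inserts never mention the id column, yet each row comes back with its own id, which is the behavior the ORM relies on in Listing 1.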
The last noteworthy section in Listing 1 is the data insertion loop. I use a loop to insert 100,000 rows into the database table.
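The measurement pattern itself is simple: note the clock, run the inserts, note the clock again. The sketch below isolates that pattern using only the standard library. The time_inserts helper, the users table, and the use of in-memory SQLite are all hypothetical choices of mine, so the numbers it prints say nothing about MySQL or SQLAlchemy.

```python
import sqlite3
import time


def time_inserts(row_count):
    """Insert row_count rows with one commit per row.

    Returns a tuple of (elapsed seconds, number of rows stored).
    """
    conn = sqlite3.connect(':memory:')
    conn.execute(
        'CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)')
    start = time.perf_counter()
    for i in range(row_count):
        conn.execute('INSERT INTO users (name) VALUES (?)',
                     ('ed' + str(i),))
        conn.commit()  # commit per row, mirroring the loop in Listing 1
    elapsed = time.perf_counter() - start
    inserted = conn.execute('SELECT COUNT(*) FROM users').fetchone()[0]
    conn.close()
    return elapsed, inserted


elapsed, inserted = time_inserts(1000)
print('%d rows in %.3f seconds' % (inserted, elapsed))
```

I kept the row count small here so the sketch finishes quickly; raise it to 100,000 to match the loop in Listing 1.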
The Python code is done. Now how do we go about doing something similar in Java?
Building a Java ORM Facility
We've seen the approach to Python ORM. Let's look at how an equivalent Java solution might work. Unfortunately, as I mentioned earlier, getting set up with a Java ORM project is not so trivial an undertaking. Interested readers can look at some of my earlier JPA articles. For this specific case, look at this EclipseLink example for some guidance.
Listing 3 illustrates a basic Java entity class JavaORMClass. Notice the heavy use of Java annotations, such as @Entity, @Table, @Column, and so on. These annotations provide the mapping between the class data members and the underlying database table. Just as in the Python case, Java code can create the database schema.
Listing 3. A Java JPA class.

    // Assumes the javax.persistence (JPA 2.x) annotations
    import javax.persistence.Column;
    import javax.persistence.Entity;
    import javax.persistence.GeneratedValue;
    import javax.persistence.Id;
    import javax.persistence.Table;

    @Entity
    @Table(name = "DBJavaUsers")
    public class JavaORMClass {
        @Id
        @GeneratedValue
        @Column(name = "IDENT_PARAMS_ID")
        private Long id;

        @Column(name = "name")
        private String name;

        @Column(name = "fullname")
        private String fullName;

        @Column(name = "password")
        private String password;

        // JPA requires a no-argument constructor
        public JavaORMClass() {
        }

        public JavaORMClass(String name, String fullName, String password) {
            super();
            this.name = name;
            this.fullName = fullName;
            this.password = password;
        }

        public Long getId() {
            return id;
        }

        public String getName() {
            return name;
        }

        public void setName(String name) {
            this.name = name;
        }

        public String getFullName() {
            return fullName;
        }

        public void setFullName(String fullName) {
            this.fullName = fullName;
        }

        public String getPassword() {
            return password;
        }

        public void setPassword(String password) {
            this.password = password;
        }
    }
Listing 4 illustrates a simple Java class to write to the database. This Java JPA code is all pretty standard stuff, and the bulk of the code in Listing 3 is automatically generated by Eclipse. Notice the use of a transaction in Listing 4 and the call to persist().
Listing 4. A Java client class.

    public void saveJavaOrm(JavaORMClass javaORMClassObject) {
        try {
            // Start EntityManagerFactory
            EntityManagerFactory emf = Persistence
                    .createEntityManagerFactory("helloworld");

            // First unit of work
            EntityManager entityManager = emf.createEntityManager();
            EntityTransaction entityTransaction = entityManager
                    .getTransaction();
            entityTransaction.begin();
            entityManager.persist(javaORMClassObject);
            entityTransaction.commit();
            entityManager.close();
            emf.close();
        } catch (Exception e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
    }
Finally, Listing 5 illustrates invocation of the client class to persist the data to the database.
Listing 5. A Java client class to persist the required data.

    ClientCreate clientCreate = new ClientCreate();
    long startTime = System.nanoTime();
    for (int i = 0; i < 100000; i++) {
        JavaORMClass javaORMClassObject = new JavaORMClass(
                "ed" + i, "Ed Jones" + i, "password");
        // saveJavaOrm() in Listing 4 takes the entity as its only argument
        clientCreate.saveJavaOrm(javaORMClassObject);
    }
In Listing 5, I use a for loop to add the JavaORMClass objects to the database.
Okay, we've seen how to produce the code. How about the execution times? Listing 6 illustrates the Python and Java execution times in seconds.
Listing 6. Execution times.

    Python ------> Execution time: 6332
    Java --------> Execution time: 7353

Python looks a good bit faster, finishing in roughly 14 percent less time than Java in this test. Obviously, the use case here is a little extreme: inserting 100,000 rows isn't going to happen every day. But it shows that Python is considered a lightweight language for good reasons.
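To put the figures in Listing 6 in perspective, the short calculation below converts them into per-row costs and a relative difference. The two totals and the row count come straight from this article; the variable names are mine.

```python
ROWS = 100000            # rows inserted by each program
python_seconds = 6332.0  # from Listing 6
java_seconds = 7353.0    # from Listing 6

# Average cost of one insert, in milliseconds
python_ms_per_row = python_seconds / ROWS * 1000
java_ms_per_row = java_seconds / ROWS * 1000

# How much less time Python needed, relative to Java
time_saved_pct = (java_seconds - python_seconds) / java_seconds * 100

print('Python: %.1f ms per row' % python_ms_per_row)
print('Java:   %.1f ms per row' % java_ms_per_row)
print('Python needed %.1f%% less time' % time_saved_pct)
```

Both programs spend tens of milliseconds on every row, which suggests that per-insert overhead (a commit per row in Python, a new EntityManagerFactory per row in Java) accounts for much of the total in each case.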