Adding Workflow Control to Your Java Applications
In a factory job, workers are assigned to very specific tasks in assembly lines. Each worker does the same repetitive task, over and over, and no one worker completes a single product. One product is built from many smaller units, which are completed in a specific order. When the occasional custom order comes down the assembly line, it's handled differently than the standard construction.
In software development, a workflow functions much like an assembly line. Each product or workflow is broken down into many smaller functional pieces (units). Each unit of work can then be assigned to an individual worker or machine, which may handle one or more units of work. Thus, one of the first steps in designing a workflow is to break the product into its smallest functional pieces or units of work. From there, the flow of the work can be mapped, and variances from the normal flow can be plotted.
Sound complicated? Not necessarily. Workflow can be as simple as emailing a document to a coworker or customer, or as complicated as triggering multi-step processing over a huge computing network.
This article explains how an efficient workflow functions, and how the right workflow can help you with your Java development projects.
NOTE
The e-Workflow site is a great reference site for anyone interested in implementing workflow applications. It lists books, links, white papers, and more details to get you started.
Workflow Basics
Let's start with terminology. Every workflow consists of destinations (queues). At each destination are activities that need to occur (tasks). These activities may or may not include a document, object, or unit of work (the payload).
For simplicity, let's apply this terminology in a typical business workflow: routing a memo via interoffice mail to a group of employees. Assume for this example that the memo goes into an interoffice mail envelope with several employees' names listed as recipients. Here's the workflow:
The envelope is passed to the first person in the queue.
The recipient reads the contents of the envelope (the payload) and performs some activity or task based on that content. (The task may simply be to read and absorb the information enclosed in the envelope.)
When the first recipient is done reading the memo, he checks or crosses off his name on the envelope and puts the envelope back into the interoffice mail to be routed to the next person.
When all the steps in a workflow have been completed, that flow has reached its "end of life." Depending on the flow, this could mean that the data is filed away, destroyed, or otherwise retired. In our interoffice memo example, the last person to receive the memo may route it back to its starting point, throw it away, shred it, or perform whatever "end of life" activity is dictated by office policy or by the memo's originator.
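The memo example can be sketched in Java to show how the three terms fit together. This is only an illustrative sketch: the class and method names (MemoRouting, Memo, route, performTask) are hypothetical, and "end of life" is reduced to setting a flag on the memo.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Illustrative sketch only: all names here are hypothetical.
class MemoRouting {

    // The payload: the memo inside the envelope.
    static final class Memo {
        final String text;
        boolean retired = false; // set once the flow reaches end of life
        Memo(String text) { this.text = text; }
    }

    // Passes the memo to each recipient in turn and returns the names
    // crossed off the envelope, in the order they were crossed off.
    static List<String> route(Memo memo, List<String> recipients) {
        Deque<String> envelope = new ArrayDeque<>(recipients); // names still to visit
        List<String> crossedOff = new ArrayList<>();
        while (!envelope.isEmpty()) {
            String recipient = envelope.poll(); // next person in the queue
            performTask(recipient, memo);       // the task at this destination
            crossedOff.add(recipient);          // cross the name off the envelope
        }
        memo.retired = true; // end of life: here the memo is simply marked retired
        return crossedOff;
    }

    // Assumed task: the recipient just reads the memo.
    static void performTask(String recipient, Memo memo) {
        System.out.println(recipient + " read: " + memo.text);
    }

    public static void main(String[] args) {
        Memo memo = new Memo("Office closed Friday");
        System.out.println(route(memo, Arrays.asList("Ana", "Ben", "Chi")));
    }
}
```

The envelope plays the role of the queue, reading the memo is the task, and the memo is the payload. A real system would replace performTask with whatever activity each destination requires.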
To better understand how to build a good workflow, let's imagine a fictitious company. ABC Collectors handles billing and collects payments from the customers of other companies. Figure 1 shows the normal workflow for receiving payments. This workflow is fairly straightforward and easy to follow.
Figure 1 A basic payment workflow.
There are two basic types of application workflow systems: independent workflow and integrated workflow. Let's examine both.
Independent Workflow
An independent workflow application tracks the work that a user has done and the work that the user still needs to do. As units of work or tasks are completed, the user updates the application to record that completion and consults it to find out what task(s) to do next. The major weakness of this type of workflow application is that it's cumbersome. It's easy to forget to update the application (or to simply ignore the workflow), and then the application gets out of sync with reality.
A typical independent workflow application is a GUI that connects to a back-end database. For more complex decision-making in the workflow process, such an application would need to attach some form of rules engine to handle the branching. In the ABC Collectors example, the first person in the chain would both enter the payment information into the accounts receivable application and create an entry in the workflow application. This duplication of effort means entering the same data into at least two different systems. The best way to avoid the duplication is to integrate the workflow directly into the application.
Integrated Workflow
An integrated workflow doesn't need a separate application designed specifically to manage the workflow. Instead, the primary application (for ABC Collectors, the finance application) handles workflow automatically on the back end of the system. The workflow stays in sync automatically, and users don't need to worry about keeping multiple applications up to date.
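A minimal sketch of the idea, assuming a hypothetical FinanceApp class: each business operation updates the workflow as a side effect, so there is no second system for the user to keep current. The stage names (RECEIVED, POSTED) are invented for illustration and are not taken from Figure 1.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: FinanceApp and its stages are hypothetical.
class FinanceApp {

    // Assumed stages for a payment; a real flow would have more.
    enum Stage { RECEIVED, POSTED }

    private final Map<String, Long> ledgerCents = new HashMap<>(); // business data
    private final Map<String, Stage> workflow = new HashMap<>();   // workflow data

    // Entering the payment also creates its workflow entry.
    void recordPayment(String paymentId, long amountCents) {
        ledgerCents.put(paymentId, amountCents);
        workflow.put(paymentId, Stage.RECEIVED);
    }

    // Posting the payment advances the workflow in the same call.
    void postToAccount(String paymentId) {
        if (workflow.get(paymentId) != Stage.RECEIVED) {
            throw new IllegalStateException(paymentId + " is not waiting to be posted");
        }
        workflow.put(paymentId, Stage.POSTED);
    }

    // Returns the current stage, or null if the payment is unknown.
    Stage stageOf(String paymentId) {
        return workflow.get(paymentId);
    }

    public static void main(String[] args) {
        FinanceApp app = new FinanceApp();
        app.recordPayment("P-1", 12500L);
        app.postToAccount("P-1");
        System.out.println(app.stageOf("P-1"));
    }
}
```

Because the user only ever calls the business operations, the workflow record cannot drift away from what was actually done.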
There are two basic types of integrated workflows:
In a stateless workflow, the workflow doesn't know which task the payload came from or which task is next. The workflow determines the next task by querying the data contained in the payload itself. The idea behind a stateless workflow is that the system doesn't need to have one copy of the workflow for every unit of work. If the workflow is simple and/or doesn't change very often, this can be an efficient means of handling the flow. The advantage, which may also be a disadvantage, is that changing the workflow can (depending on design) change the flow for every existing unit of work.
In a stateful workflow, the workflow holds all the state for each unit of work, and the payload itself doesn't contain any history or "stateful" information. This tends to make the workflow subsystem heavier than in a stateless workflow, but it offers some advantages. The payload doesn't have to know who it's assigned to or who worked on it previously, because all of this information is stored in the workflow. The payload can therefore be accessed by more than one person at a time, which allows tasks to be processed in parallel. Changing a particular workflow also doesn't affect any tasks that are already in process, so management can implement a policy change without worrying about affecting existing work.
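One way to picture a stateful workflow is a tracker that stores position and history per unit of work and snapshots the workflow definition when the work starts. The sketch below assumes that snapshot design; StatefulWorkflow, its methods, and the step names in main are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: all names here are hypothetical.
class StatefulWorkflow {

    // State kept by the workflow, not by the payload.
    private static final class Tracking {
        final List<String> steps;                       // definition snapshot taken at start
        final List<String> history = new ArrayList<>(); // who did which step
        int position = 0;
        Tracking(List<String> steps) { this.steps = steps; }
    }

    private List<String> definition;
    private final Map<String, Tracking> tracking = new HashMap<>();

    StatefulWorkflow(List<String> definition) {
        this.definition = new ArrayList<>(definition);
    }

    // New work copies the current definition, so later changes don't touch it.
    void start(String workId) {
        tracking.put(workId, new Tracking(new ArrayList<>(definition)));
    }

    // A policy change: only work started afterwards sees the new steps.
    void redefine(List<String> newDefinition) {
        definition = new ArrayList<>(newDefinition);
    }

    // Returns the pending step, or null when the work is finished.
    String currentTask(String workId) {
        Tracking t = lookup(workId);
        return t.position < t.steps.size() ? t.steps.get(t.position) : null;
    }

    // Records who completed the pending step and moves to the next one.
    void complete(String workId, String worker) {
        Tracking t = lookup(workId);
        if (t.position >= t.steps.size()) {
            throw new IllegalStateException(workId + " has no pending task");
        }
        t.history.add(worker + ":" + t.steps.get(t.position));
        t.position++;
    }

    List<String> historyOf(String workId) {
        return new ArrayList<>(lookup(workId).history);
    }

    private Tracking lookup(String workId) {
        Tracking t = tracking.get(workId);
        if (t == null) throw new IllegalArgumentException("unknown work item: " + workId);
        return t;
    }

    public static void main(String[] args) {
        StatefulWorkflow wf = new StatefulWorkflow(Arrays.asList("receive", "post", "deposit"));
        wf.start("A");
        wf.redefine(Arrays.asList("receive", "audit", "post", "deposit"));
        wf.complete("A", "ana");
        System.out.println(wf.currentTask("A")); // still follows the original steps
    }
}
```

The payload never appears in this class beyond its identifier, which is the point: everything about assignment and history lives in the workflow subsystem.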
By contrast, all state information in a stateless workflow is contained within the unit of work itself. The data therefore needs to indicate who is assigned to that unit of work and to record the history of who has worked on it in the past. One limitation of a stateless workflow is that it doesn't lend itself easily to parallel processing of an individual unit of work. A stateless workflow is a good choice when the tasks follow a single, linear path without concurrent activities.
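A stateless workflow can be sketched as a pure function over the payload. In this hypothetical example, the Payment fields and the task names ("post", "deposit", "done") are assumptions chosen for illustration; the workflow itself stores nothing.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: all names here are hypothetical.
class StatelessWorkflow {

    // The payload carries its own state, assignee, and history.
    static final class Payment {
        boolean posted;
        boolean deposited;
        String assignee;
        final List<String> history = new ArrayList<>();
    }

    // The next task is derived purely by querying the payload's data.
    static String nextTask(Payment p) {
        if (!p.posted) return "post";
        if (!p.deposited) return "deposit";
        return "done";
    }

    // Performs the pending task and records it on the payload itself.
    static void perform(Payment p, String worker) {
        String task = nextTask(p);
        if (task.equals("done")) {
            throw new IllegalStateException("nothing left to do");
        }
        p.assignee = worker;
        if (task.equals("post")) {
            p.posted = true;
        } else {
            p.deposited = true;
        }
        p.history.add(worker + ":" + task);
    }

    public static void main(String[] args) {
        Payment p = new Payment();
        perform(p, "ana");
        perform(p, "ben");
        System.out.println(nextTask(p) + " " + p.history);
    }
}
```

Because nextTask is the only copy of the rules, editing it changes the path for every payment, including those already partway through. The single assignee field also hints at why parallel work on one payment is awkward in this design.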