1.4 Estimating
Many porting projects go over budget and miss schedules because risks were not managed properly. Risks play a major part in estimating both the schedule and the resources needed for a porting project. In an application porting project, these risks arise from several aspects of the work, including the following:
- Skill levels and porting experience
- Compiler used
- Programming language used
- Third-party and middleware product availability
- Build environment and tools
- Platform-dependent constructs
- Platform- and hardware-dependent code
- Test environment setup needed
- User interface requirements
Depending on the application to be ported, each of these aspects presents varying levels of complexity and risk to the project. Assessing the complexity and risk level of each aspect helps you determine whether the risks are manageable.
1.4.1 Skill Levels and Porting Experience
The most glaring difference between porting applications and software development is the skill set of the programmers. Although software application developers tend to be more specialized in their knowledge domain, software developers who do porting and migration need broader and more generalized skill sets. An application developer can be someone who is an expert in Java and works on a Windows development environment. Another may be a programmer specializing in database programming on a Sun Solaris operating system environment. On the other hand, engineers who port code are expected to be experts in two or more operating system platforms, programming languages, compilers, debuggers, databases, middleware, and the latest Web-based technologies. They are expected to know how to install and configure third-party database applications and middleware.
Whereas application developers tend to be specialists, porting engineers need to be generalists. Application developers may work on an application for about 18 months (the typical development cycle), whereas porting engineers work on a 3- to 6-month project cycle and are ready to start on a new port as soon as the last one is finished. Finding skilled porting engineers who fit the exact requirements of a porting project may be difficult at times. This is especially true when the effort involves porting from legacy technologies to newer ones.
1.4.2 Compiler
The compiler and compiler framework used by the source platform make a big difference when porting applications to the Linux environment. If the same compiler is used on both the source and target platforms, the task becomes easier. An example is when the GNU compiler is used on both the source and the target platform. Apart from common flags such as -g and -c, different brands of compilers use flags differently. Compiler differences become harder to manage if the programming language used is C++.
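As a minimal sketch of how ported code can make compiler differences visible, the following C fragment uses macros that the respective compilers predefine (__SUNPRO_C, __xlC__, __GNUC__). The function names compiler_name and print_compiler, and the labels they return, are hypothetical and chosen only for illustration.

```c
#include <stdio.h>

/* Return a short label for the compiler building this file.
 * The tested macros are predefined by the compilers themselves;
 * the returned labels are illustrative only. */
const char *compiler_name(void)
{
#if defined(__SUNPRO_C)
    return "Sun Studio C";
#elif defined(__xlC__)
    return "IBM XL C";
#elif defined(__GNUC__)
    /* Also reached by compilers that emulate GCC, such as Clang. */
    return "GNU-compatible";
#else
    return "unknown";
#endif
}

/* Print the label, for example at the start of a build self-check. */
void print_compiler(void)
{
    printf("built with: %s\n", compiler_name());
}
```

Calling print_compiler() from a small test program on each platform gives a quick record of which compiler produced a given binary, which can help when the same source behaves differently on the source and target systems.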
Another thing to consider is the version of the compilers. Source code compiled a few years back with older versions of compilers may have syntax that does not conform to present syntax checking by compilers (because of standards conformity). This makes it more difficult to port on different compilers because these compilers may or may not support backward compatibility for standards. Even if the compilers support older standards, different compilers may implement support for these standards differently.
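To illustrate the kind of older syntax that can trip up a current compiler, the sketch below shows a pre-standard (K&R style) function definition in a comment, followed by an equivalent definition in prototype form. The function add is a hypothetical example, not code from any particular application.

```c
/* Pre-standard (K&R) style, shown only as a comment because current
 * compilers may warn about it or reject it outright:
 *
 *     add(a, b)
 *         int a, b;
 *     {
 *         return a + b;
 *     }
 *
 * The return type is implicit and the parameter types are declared
 * separately from the parameter list.
 */

/* Equivalent definition with an explicit return type and a
 * prototype-style parameter list, valid in C89 and later standards. */
int add(int a, int b)
{
    return a + b;
}
```

Rewriting such definitions is mechanical, but in a large code base the number of occurrences can still affect the schedule.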
1.4.3 Third-Party and Middleware Product Availability
The complexity of the port increases when the application to be ported uses third-party and middleware products. Complexity increases even more when the versions of those products are not available for the target platform. In rare cases, some third-party products may not even be available on the target platform. In the past few years, middleware vendors such as IBM, Oracle, Sybase, and many more have ported their middleware products to Linux. They have made efforts to make their products available for companies that are ready and willing to make Linux their enterprise platform. This is partly why we are seeing more and more companies willing to port their applications to Linux.
1.4.4 Build Environment and Tools
The simpler the build environment, the less time the porting team takes to understand how to build the source code. More complicated build environments include multipass builds where objects are built with different compiler flags to get around dependencies between modules. After a first pass builds some modules, a second pass compiles more modules on top of the previously compiled ones, and so on until the build completes. Sometimes the build scripts call other scripts that automatically generate files on-the-fly based on nonstandard configuration files. Most of these generated files may be mistaken for ordinary files to be ported, but in fact it is the tools and the configuration files that need to be ported to work on the target platform. These types of tools are often missed during assessment and analysis, which can result in inaccurate schedules.
One build environment that came into fashion sometime in the early 1990s was the use of imake. Imake is a makefile generator intended to ease build environment portability issues. Instead of writing makefiles, one writes imakefiles. Imakefiles are files that contain machine-independent descriptions of the application build environment. Imake creates makefiles by reading the imakefile and combining it with machine-dependent configuration files on the target platform. The whole concept of architecting a build environment around the imake facility to make it portable is a noble idea. Of the most recent 50 porting projects done by our porting group, only a few were built around the imake facility. Unfortunately, all of them needed many modifications to their own environment and imakefiles, negating the ease that imake is intended to provide. And because imake was rarely used, anyone who needed to do a port had to "relearn" the utility every time.
Nowadays, makefiles are written for the open source make facility called GNU Make (or gmake for short). Most makefile syntax used on different UNIX platforms is syntax that gmake can understand or has an easy equivalent for. Architecting applications around the gmake [6] facility is the most accepted and widely used method in build environments today.
Source code control is another aspect that must be considered. In the Linux environment, CVS is the most frequently used source code control environment. Other source code control environments are Subversion [7] and Arch [8]. A porting project needs to have source code control if there is a possibility of several porting engineers working on the same modules at the same time.
1.4.5 Platform-Dependent Constructs
When porting from UNIX platforms based on RISC architectures to x86-based platforms, application code may need to be assessed for byte-order (endian) dependencies. For example, the application may implement functions that use byte swapping for calculations or data manipulation. Porting code that uses byte-swapping logic to an Intel-based machine, which has a little-endian architecture, requires that the code be modified to adhere to little-endian semantics. If byte-swapping logic is left unchanged between platforms, debugging becomes a chore because it is difficult to track where data corruption takes place: the corruption often happens several instructions before the application fails. Make sure to identify platform-dependent constructs in the scoping and analysis steps.
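One way to reduce byte-order dependencies is to assemble multibyte values from individual bytes instead of relying on the host's memory layout. The following C sketch assumes data stored in big-endian order, as would be produced on many RISC-based UNIX systems. The function names host_is_little_endian, read_be32, and write_be32 are hypothetical helpers introduced for this example.

```c
#include <stdint.h>

/* Return 1 on a little-endian host and 0 on a big-endian host by
 * inspecting the first byte of a known 16-bit value. */
int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}

/* Decode a 32-bit value stored in big-endian order. Shifting the
 * individual bytes gives the same result on any host byte order. */
uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
            (uint32_t)p[3];
}

/* Encode a 32-bit value in big-endian order, again independent of
 * the host byte order. */
void write_be32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}
```

Code written this way does not need conditional byte swapping, so the same source behaves identically on the RISC source platform and on an x86 Linux target. host_is_little_endian is included only as a diagnostic aid for locating existing code that assumes a particular byte order.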
1.4.6 Platform- and Hardware-Dependent Code
Applications that require kernel extensions and device drivers to operate are the hardest to port. Kernel APIs do not follow a common standard across platforms. The API calls, the number of arguments, and even the way extensions are loaded into the kernel differ from one platform to another. In most cases, new code that runs on Linux must be developed. One thing is for certain: Kernel code will usually, if not always, be written in C or platform-dependent assembler language.
1.4.7 Test Environment and Setup
When the port is done and the project scope includes system tests and verification, equipment required to test the code may contribute to the port’s complexity. If the application needs to be tested on specialized devices that need to be ordered or leased, this can add complexity to the project. If the application needs to be tested in a complicated clustered and networked environment, setting up the environment and scheduling the resource adds time to the project.
Testing may also include performance testing. Normally, performance tests require the maximum configuration setup one can afford to be able to load up the application to see how it scales in big installations.
Porting of the application test harness needs to be included in the porting schedule. A test harness that includes both software tests and specialized hardware configurations needs to be assessed as to how much time it will take to port and configure.
1.4.8 User Interface Requirements
Porting user interfaces can range from easy to complex. User interfaces based on Java technology are easy to port, whereas user interfaces based on X11 technology are more complex. In X11-based applications, the user interface code may be written in C or C++.
Table 1-2 summarizes the various aspects involved. Each aspect is ranked as easy, medium, or high in complexity, depending on the technology used for that aspect.
TABLE 1-2 Software Porting Risks and Complexities
Experience shows that the real effort needed to port the application is uncovered during the first two or three weeks of the porting project. This is the time when the code is delivered to the project team and porting engineers get a first glimpse of the software application. In other cases, this is the first time porting engineers work on the application in a Linux environment. They begin to dissect the application components and start preliminary build environment setups. Within the first two to three weeks, the porting engineers will have configured the build environment and started compiling some modules using GNU-based tools and compilers.
After two to three weeks, it is good to revisit the estimated schedule to determine whether it needs to be adjusted based on the preliminary, real experience of porting the application to the Linux platform.