Reverse Engineering Inputs
A complete reverse engineering process must consider all available inputs. The information available varies widely from problem to problem.
Database structure. The database structure is normally the dominant input. It specifies the data structure and many constraints precisely and explicitly. The structure varies according to the kind of DBMS, both by paradigm and by product. Relational DBMSs (RDBMSs) have a declarative structure that can be readily inspected. Network and hierarchical DBMSs express fewer constraints. COBOL files often contain extensive data declarations.
Reverse engineering can be difficult because information is often missing from the database structure. Reverse engineers must conjecture and augment the structure by considering other inputs.
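As a minimal sketch of what "readily inspected" can look like in practice, the following reads the declared structure back out of a relational catalog. It assumes SQLite (via Python's standard `sqlite3` module) purely for illustration; other products expose their catalogs differently. The function name `describe_schema` and the `department`/`employee` tables are hypothetical.

```python
import sqlite3

def describe_schema(conn):
    """Return {table: {"columns", "primary_key", "foreign_keys"}} from the catalog."""
    schema = {}
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name")]
    for table in tables:
        # table_info rows: (cid, name, type, notnull, dflt_value, pk)
        cols = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        # foreign_key_list rows: (id, seq, table, from, to, on_update, on_delete, match)
        fks = conn.execute(f'PRAGMA foreign_key_list("{table}")').fetchall()
        schema[table] = {
            "columns": [(c[1], c[2]) for c in cols],
            "primary_key": [c[1] for c in sorted(cols, key=lambda c: c[5]) if c[5] > 0],
            "foreign_keys": [(fk[3], fk[2], fk[4]) for fk in fks],
        }
    return schema

# Hypothetical two-table database. The "boss" column carries no declared
# constraint, which is the kind of gap a reverse engineer must fill by conjecture.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        dept_id INTEGER REFERENCES department(dept_id),
        boss    INTEGER
    );
""")
schema = describe_schema(conn)
for table, info in schema.items():
    print(table, info)
```

The catalog reports the declared reference from `employee.dept_id` to `department.dept_id`, but says nothing about `boss`, so any relationship there has to be inferred from the other inputs.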
Data. If data are available, you can discover much of the data structure. A database structure may be badly flawed, yet the data may still be quite good. A thorough application program or disciplined users may yield data of better quality than the structure enforces. The downside is that data are much more time-consuming to study than database structure.
For large databases, you may have to sample the data to reach tentative conclusions and then explore the full database for verification. Examination cannot prove many propositions, but the more data you examine without finding a counterexample, the more likely the conclusion is to hold.
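Two hypotheses that data can support or refute are "this column is a candidate key" and "this column refers to that one". The sketch below tests both with plain SQL, again assuming SQLite for illustration; the helper names and the `customer`/`orders` tables are hypothetical. A passing check only supports a hypothesis for the rows examined, whereas a single counterexample refutes it.

```python
import sqlite3

def is_candidate_key(conn, table, column):
    """True if every observed value of the column is non-null and unique."""
    total, non_null, distinct = conn.execute(
        f'SELECT COUNT(*), COUNT("{column}"), COUNT(DISTINCT "{column}") '
        f'FROM "{table}"'
    ).fetchone()
    return total > 0 and total == non_null == distinct

def orphan_count(conn, child, child_col, parent, parent_col):
    """Count non-null child values that have no match in the parent column."""
    return conn.execute(
        f'SELECT COUNT(*) FROM "{child}" c '
        f'WHERE c."{child_col}" IS NOT NULL AND NOT EXISTS '
        f'(SELECT 1 FROM "{parent}" p WHERE p."{parent_col}" = c."{child_col}")'
    ).fetchone()[0]

# Hypothetical tables with no declared keys at all.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER, city TEXT);
    CREATE TABLE orders (order_no INTEGER, cust INTEGER);
    INSERT INTO customer VALUES (1, 'Oslo'), (2, 'Lima'), (3, 'Oslo');
    INSERT INTO orders VALUES (10, 1), (11, 2), (12, 2), (13, 9), (14, NULL);
""")
id_is_key = is_candidate_key(conn, "customer", "id")
city_is_key = is_candidate_key(conn, "customer", "city")
orphans = orphan_count(conn, "orders", "cust", "customer", "id")
print(id_is_key, city_is_key, orphans)
```

These functions scan whole tables. On a large database, the same queries could first be run against a sampled copy to form a tentative conclusion, then against the full tables to verify it.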
Queries. You can scan application code and look for clues from queries that manipulate data. Relational database views can also be suggestive. For example, a join of two fields might indicate a foreign-to-candidate key relationship.
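One rough way to harvest such clues is to scan query text for equality conditions between columns of different tables and count how often each pair recurs. The following is a sketch under strong assumptions: queries qualify columns with full table names (aliases are not resolved), and a regular expression stands in for a real SQL parser. The name `join_hints` and the sample queries are hypothetical.

```python
import re
from collections import Counter

IDENT = r"[a-z_]\w*"
# Matches conditions of the form table1.col1 = table2.col2 in lowercased SQL.
JOIN_EQ = re.compile(rf"\b({IDENT})\.({IDENT})\s*=\s*({IDENT})\.({IDENT})\b")

def join_hints(queries):
    """Count column pairs that are equated across two different tables."""
    hints = Counter()
    for sql in queries:
        for t1, c1, t2, c2 in JOIN_EQ.findall(sql.lower()):
            if t1 != t2:  # same-table comparisons are skipped in this sketch
                # Sort so that a.x = b.y and b.y = a.x count as the same pair.
                hints[tuple(sorted([(t1, c1), (t2, c2)]))] += 1
    return hints

queries = [
    "SELECT * FROM employee JOIN department "
    "ON employee.dept_id = department.dept_id",
    "SELECT name FROM department, employee "
    "WHERE department.dept_id = employee.dept_id AND employee.salary > 100",
]
hints = join_hints(queries)
for pair, count in hints.most_common():
    print(pair, count)
```

A pair that recurs across many queries is only a candidate relationship. Which side is the foreign key and which the candidate key still has to be settled from the data or the declared structure.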
Forms and reports. Suggestive titles and layouts can clarify a database structure. Form and report definitions are especially helpful if their binding to database structure is available. A more empirical approach is to enter known unusual values to establish the binding between forms and the underlying structure.
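The empirical approach can be partly automated: after typing a distinctive value into a form field, search every column of every table for it. The sketch below assumes SQLite and simulates the form entry with a direct insert; `find_sentinel`, the cryptically named tables, and the sentinel string are all hypothetical.

```python
import sqlite3

def find_sentinel(conn, sentinel):
    """Return (table, column) pairs whose data contain the sentinel value."""
    hits = []
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name")]
    for table in tables:
        columns = [c[1] for c in conn.execute(f'PRAGMA table_info("{table}")').fetchall()]
        for column in columns:
            count = conn.execute(
                f'SELECT COUNT(*) FROM "{table}" WHERE "{column}" = ?', (sentinel,)
            ).fetchone()[0]
            if count:
                hits.append((table, column))
    return hits

# Hypothetical database with uninformative names. The second INSERT stands in
# for a value typed into the form field under investigation.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t_017 (c1 TEXT, c2 TEXT);
    CREATE TABLE t_018 (k INTEGER, v TEXT);
    INSERT INTO t_018 VALUES (1, 'ordinary');
    INSERT INTO t_017 VALUES ('ordinary', 'Zzyzx-7731');
""")
hits = find_sentinel(conn, "Zzyzx-7731")
print(hits)
```

An exact-match search like this misses values that the application transforms before storing them (trimming, case folding, encoding), so an empty result does not rule out a binding.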
Documentation. Problems vary in the quality, quantity, and kind of documentation available. Documentation provides context for reverse engineering. User manuals are especially helpful. Data dictionaries (lists of important entities and their definitions) may be available. Use data dictionaries with care, however, because they often become stale as the underlying database changes.
Application understanding. If you understand an application well, you can make better inferences. Application experts may be available to answer questions and explain rationale. You may be able to leverage models from related applications.
Table 2 summarizes the strengths and weaknesses of the various input sources.
Table 2. Input sources. The reverse engineering inputs have different tradeoffs.
Input Source | Strengths | Weaknesses
Database structure | Specifies data structure precisely and explicitly. Often available. | Information may be missing, and there are often errors.
Data | Can clarify ambiguities. | Tedious to study.
Queries | Can clarify ambiguities. | May not be available.
Forms and reports | Provides meaningful names. | Can be tedious to tie to the database.
Documentation | Provides context. | Can be difficult to reconcile with the database structure.
Application understanding | Provides context. | Sometimes not available. Application experts are busy.