SQL Standards and Guidelines for DB2 Developers
Every IT shop that has applications involving DB2 should have a set of SQL standards and guidelines for its developers to follow. This chapter is a start for developers and project managers to use as part of their development. Once you have a set of standards and guidelines, be sure to enforce them. Every program should have code walkthroughs to ensure that standards and guidelines are being followed.
The standards and guidelines that follow serve multiple purposes:
- Relate to performance
- Alleviate abends and/or production incident reporting
- Reduce I/O and CPU costs
- Increase productivity
- Improve client satisfaction
- Improve readability and understandability
The standards and guidelines that follow are grouped into two separate areas: one specific to COBOL SQL developers and the other specific to all SQL developers (no matter the language in which they are embedding their SQL code).
For COBOL Developers
- The SQLCODE must be checked after every SQL statement. The Declare cursor statement is only a declarative, and it gets no return code from DB2. All other SQL calls get some return code. Return code data from the DB2 database system gets automatically loaded in the SQLCA communications area.
- Every program must include the SQLCA and a DCLGEN for each table being coded against. The DCLGEN is predefined with host variables that match the column definitions. They are used to select data into, insert and update from, and serve as the host variables in any Where clause.
If DCLGEN fields are not being used, then any program declaring variables in the code must make sure that the variable being declared exactly matches the definition in DB2. If it doesn’t, then there is a possibility that DB2 may not choose an index to process. For example, if Column1 is defined as an Integer, then the host variable in COBOL should be defined as S9(9) comp.
- Every program must have a consistent DB2 abend routine. For batch programs, it is easiest to have a called program that handles the display of the SQLCA fields and calls the DSNTIAR DB2 routine to display further DB2 messages. For online programs, sometimes it is good to write out the SQLCA and DSNTIAR information to a file or table in order to fall back on errors that occur. The SQLCA contains a lot of information specific to a call that is critical to troubleshooting an error. It is important to write out all the information captured. Make sure that at least the SQLSTATE is displayed, along with the SQLCODE.
- Never code Select * in a program. Only code for the columns needed. If a program needs all the columns, then code each one. This will prevent an abend if a new column is ever added to the table. The fewer columns being brought into the program, the more efficient the processing. (See tuning tip #3 and tuning tip #29 in Chapter 1, “SQL Optimization Top 100+.”) More columns can have an effect on performance due to larger sort sizes, possible index-only processing, and join types. When DB2 looks at which join type is best, part of its analysis is the number of columns from each table being selected.
- Make sure any columns defined as Nullable contain a null indicator host variable as part of the Select, Insert, or Update statements. This is most important in Select statements because DB2 will return an invalid -305 SQLCODE when it returns a column of null to the program and there is no null indicator specified. These null indicators must be defined in working storage as Pic S9(4) Comp.
It is preferable to code the VALUE, COALESCE, or IFNULL SQL scalar function for any nullable columns because the program will not receive null indicators from DB2. This will alleviate -305 SQL errors where a program is not set up to handle the null indicator. It will also spare the program from having to define the null indicators in working storage.
For example, Select COALESCE(PK_ID, 0) will return the PK_ID value if there is one, or it will return a zero if it is null. This could also be coded with the VALUE and COALESCE functions. All three would return the same result. The default specified must match the column definition. For example, since PK_ID is numeric, then the default must be a numeric—in this case, zero.
- Any SQL statement that contains one of the following aggregate functions should have a Null-Indicator host variable as part of the select (MIN MAX, AVG, SUM). DB2 will return a null indicator to the program if it finds no data to process these functions, and the COBOL program will have to define a null indicator. If the program is not set up with a null indicator, an invalid -305 SQLCODE is returned. It is preferable to code the VALUE, COALESCE, or IFNULL function to alleviate any null indicator logic. For example:
SELECT IFNULL(AVG(SALARY), 0) FROM EMP WHERE WORKDEPT = 'XYZ'
This will either return the average if rows are found or a zero if no rows were met in order to calculate an average.
- Minimize the number of times cursors are opened and closed during execution. If most of the time the open cursor and fetch retrieves only one row, then code a simple Select statement and execute the cursor processing only when a -811 (duplicate rows) SQLCODE is returned.
Do not break up processing into multiple cursors unless performance seems to be an issue. If it takes a seven-table join, then code all seven tables in one cursor and let DB2 do the work. When you break it up, the process usually takes longer due to the extra times DB2 is sent SQL statements to process. So break up the join only when all other tuning efforts have been applied. Typically it would be more efficient to execute a seven-table join.
- CASE expressions should always contain an ELSE clause. If none of the conditions in the CASE are met, then DB2 will return a null (via a null indicator) to the program. If the program is not set up to handle a null being returned from the CASE expression, then a -305 SQLCODE is returned, which usually causes the program to abend.
- Always display counts for the number of Selects, Inserts, Updates, Deletes, and Open cursors that have been executed in the program. The overhead in COBOL to define the counters and increment them through the processing is minimal to the overall runtime of the program. Displaying these counts provides invaluable information when problems occur, helping a developer figure out which program to look into. Make sure the counts are displayed on every abend and at the end of processing.
- Always display the values in host variables for a SQL statement that has an invalid SQL return code and the program goes into its abend error routine. Every developer knows how frustrating it is to have a program error out or even abend and not know what values were being processed.
- Watch out for any SQL warnings that may occur in an SQL statement. Most programs seem to ignore warnings that many times help to detect potential problems. There are two indications of a warning message in the SQLCA: One is a positive SQLCODE other than +100; the other is a W in the SQLCA’s SQLWARN0 field. When either of these exists, DB2 is issuing a warning that something worrisome happened on the prior call and that while you may have received data back, it may not be what you expected. When SQLWARN0 is a W, DB2 also provides helpful information about the problem in one or more of the other SQLWARNn fields. Also check warnings on every SQL statement return. For example:
Evaluate SQLCODE When 0 If SQLWARN0 = 'W' Display '*** Warning error ***' Display 'Sqlstate = ' Sqlstate End-If When Other ... End-Evaluate
- Take advantage of the SQLERRD (3) out of the SQLCA. The third occurrence of the SQLERRD array is one of the most useful fields in the SQLCA. This field is populated after a successful insert, update, or delete with a count of the number of rows inserted, updated, or deleted. This is not populated when a mass delete with no Where logic is coded or populated due to deletes affected by delete cascade.
- Take advantage of fetching rowsets in your cursor processing. (See tuning tip #46 in Chapter 1.) This should be strictly enforced for large cursors because of the runtime savings.
- Apply all calculations within the COBOL code and then move the value to a host variable. Then reference the host variable in the SQL statement. Keep calculations out of SQL statements whenever possible.
- Hard code any and all values known within an SQL statement. For example, if a program always processes the terminated rows on a table, then use the SQL statement Where Status_Code = ‘T’. This is extremely helpful especially if frequency value statistics are present for the different values of Status_Code in the catalog tables. (See tuning tip #10 in Chapter 1.)