CHAPTER 17 ----- PROCEDURE DIVISION

RULE 11 - Use the Balanced Line algorithm for file "matching" instead of the key matching algorithm.

Of all the COBOL coding constructs that could make programs simpler and more maintainable, this is the foremost. As was shown in Chapter 8, this algorithm has been available since the mid-1970s, and yet most textbook authors do not appear to be aware of its existence. Instead they generally offer some variation of the "less than, greater than, equal" key matching algorithm, in spite of the difficulties it presents to them.

One author goes into great detail (19 pages of text) in outlining how to handle the nine cases involved in matching two files during an update (three key match conditions - less than, greater than, and equal to, combined with three transaction types - add, change, and delete).1 If the balanced line algorithm were being used, there would be only six cases (two key existence conditions - key exists, and key does not exist, combined with the three transaction types). In this latter scenario, whether a key exists is just one more type of edit condition to be applied in editing the transaction.

Of course, no mention is made of what happens when an add transaction and a change transaction happen to occur in the same batch of transactions, or what would happen if there were more than two files. The necessary logic to explain those situations would be far more than the 19 pages the simple situation consumes.

Another author uses this key matching algorithm as the example in his chapter on "Refinement".2 The chapter includes such statements as "As we start to work backward from the required output to the required input, we notice (perhaps for the first time) that nothing guarantees that a record will be available in each input area." and "This was probably a wise decision, but it has created a problem with some code already written. If we invoke these routines from two higher levels, how shall we place the definitions of the FLAGS?" and "There are a number of potential solutions to this paradox." The author seems to regard this file update algorithm as a sufficiently complex problem to require a detailed discussion of how to code it.

In addition, the authors of texts which specify the key matching algorithm for sequential file updating generally also assume that their programs are processing only valid transactions ("If T-KEY = N-KEY and the transaction code is a deletion (neither an insert nor a change), the new master file record, which contains the item to be deleted, is simply not written onto disk."3). Adding editing for such things as invalid transaction codes or changes with invalid values would unduly complicate the issue. But with the balanced line algorithm, such editing is trivial, since the existence or non-existence of a master record, which is checked simply by interrogating a flag, is just another type of edit check.
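
To make the contrast concrete, the following is a minimal sketch of the balanced line logic as it is commonly described. It is not a reproduction of any cited textbook's code, nor necessarily of the version in Chapter 8. It is self-contained only because two small tables stand in for the sorted master and transaction files; the program name, data names, paragraph numbers, and sample data are all invented for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. BALLINE.
      * Hypothetical sketch. Two tables stand in for the sorted
      * master and transaction files; all names are invented.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  MASTER-VALUES.
           05  FILLER            PIC X(8) VALUE "100ALPHA".
           05  FILLER            PIC X(8) VALUE "300GAMMA".
       01  MASTER-TABLE REDEFINES MASTER-VALUES.
           05  M-ENTRY           OCCURS 2 TIMES.
               10  M-KEY         PIC X(3).
               10  M-NAME        PIC X(5).
       01  TRANS-VALUES.
           05  FILLER            PIC X(9) VALUE "100CALPH2".
           05  FILLER            PIC X(9) VALUE "200ABETA ".
           05  FILLER            PIC X(9) VALUE "200CBETA2".
           05  FILLER            PIC X(9) VALUE "300D     ".
           05  FILLER            PIC X(9) VALUE "400CDELTA".
       01  TRANS-TABLE REDEFINES TRANS-VALUES.
           05  T-ENTRY           OCCURS 5 TIMES.
               10  T-KEY         PIC X(3).
               10  T-CODE        PIC X.
               10  T-NAME        PIC X(5).
       01  M-IDX                 PIC 9(2) VALUE 0.
       01  T-IDX                 PIC 9(2) VALUE 0.
       01  MASTER-KEY            PIC X(3).
       01  TRANS-KEY             PIC X(3).
       01  CURRENT-KEY           PIC X(3).
       01  NEW-MASTER.
           05  NEW-KEY           PIC X(3).
           05  NEW-NAME          PIC X(5).
       01  MASTER-EXISTS-SW      PIC X.
           88  MASTER-EXISTS     VALUE "Y".
       PROCEDURE DIVISION.
       0000-MAINLINE.
           PERFORM 8000-READ-MASTER.
           PERFORM 8100-READ-TRANS.
           PERFORM 7000-CHOOSE-CURRENT-KEY.
           PERFORM 1000-PROCESS-KEY
               UNTIL CURRENT-KEY = HIGH-VALUES.
           STOP RUN.
      * One pass per distinct key found on either file.
       1000-PROCESS-KEY.
           MOVE "N" TO MASTER-EXISTS-SW.
           IF MASTER-KEY = CURRENT-KEY
               MOVE M-ENTRY (M-IDX) TO NEW-MASTER
               MOVE "Y" TO MASTER-EXISTS-SW
               PERFORM 8000-READ-MASTER.
           PERFORM 2000-APPLY-TRANS
               UNTIL TRANS-KEY NOT = CURRENT-KEY.
           IF MASTER-EXISTS
               DISPLAY "NEW MASTER: " NEW-MASTER.
           PERFORM 7000-CHOOSE-CURRENT-KEY.
      * Existence of the master is one more edit on the
      * transaction; no key comparison is made here.
       2000-APPLY-TRANS.
           IF T-CODE (T-IDX) = "A" AND NOT MASTER-EXISTS
               MOVE TRANS-KEY TO NEW-KEY
               MOVE T-NAME (T-IDX) TO NEW-NAME
               MOVE "Y" TO MASTER-EXISTS-SW
           ELSE IF T-CODE (T-IDX) = "C" AND MASTER-EXISTS
               MOVE T-NAME (T-IDX) TO NEW-NAME
           ELSE IF T-CODE (T-IDX) = "D" AND MASTER-EXISTS
               MOVE "N" TO MASTER-EXISTS-SW
           ELSE
               DISPLAY "REJECTED:   " T-ENTRY (T-IDX).
           PERFORM 8100-READ-TRANS.
      * The current key is the lower of the two pending keys.
       7000-CHOOSE-CURRENT-KEY.
           IF TRANS-KEY < MASTER-KEY
               MOVE TRANS-KEY TO CURRENT-KEY
           ELSE
               MOVE MASTER-KEY TO CURRENT-KEY.
      * End of data is signalled by a key of HIGH-VALUES.
       8000-READ-MASTER.
           ADD 1 TO M-IDX.
           IF M-IDX > 2
               MOVE HIGH-VALUES TO MASTER-KEY
           ELSE
               MOVE M-KEY (M-IDX) TO MASTER-KEY.
       8100-READ-TRANS.
           ADD 1 TO T-IDX.
           IF T-IDX > 5
               MOVE HIGH-VALUES TO TRANS-KEY
           ELSE
               MOVE T-KEY (T-IDX) TO TRANS-KEY.
```

In this sample data the add and the change for key 200 arrive in the same batch and are both applied, the delete for key 300 simply turns the flag off, and the change for the nonexistent key 400 is rejected by the same ELSE that would catch an invalid transaction code.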

One text went beyond simply using the key matching algorithm as the assumed way that file matching must be done, and put it in a standard as if to say, this is the only way it can be done, so everyone should do it this way.4

Finally, one author used the key match algorithm as his example of how to create a structured program design complete with structure charts and flowcharts.5

Fortunately, there were a few good examples of the correct algorithm available. Two out of the eleven texts which presented a file update algorithm used the balanced line algorithm. Popkin simply assumes that this is the correct way and presents it with no explanation.6 Lim gives a more elaborate explanation that is well worth reading by anyone who may not know of the algorithm.7 An entire chapter is devoted to it, although the entire logic flow and explanation only takes five pages, the rest of the chapter being devoted to an extensive example. His opening remarks are worth repeating:

"Sequential update programs can be standardized through what is know [sic] as the Balanced Line algorithm; although this algorithm was discovered more than a decade ago, only a few of the more experienced programmers are aware of its existence. This is unfortunate, since the algorithm makes it possible to write update programs with ease."

This was written almost ten years ago, yet few other authors seem to have "discovered" the existence of this algorithm so they could pass it on to their readers.

RULE 12 - Use the PERFORM UNTIL structure instead of the historical control break structure.

This control structure is as important as the balanced line algorithm discussed above, and it is just as misunderstood. Interestingly, neither of the authors who presented the balanced line algorithm presented a proper solution to the control break situation. This reinforces my belief that the authors do not really understand the algorithm behind the code, but simply pass on those coding constructs that they have personally encountered, much as the students who learn from their courses will do.

Lim, who gave such an excellent treatise on the balanced line algorithm, simply lists the pseudo code for handling control breaks without any explanation of why it should be that way.8 Others are more elaborate and may even include structure charts showing all the detail on how to implement a multi-level control break.9,10 Those who use the historical control break algorithm generally also have to note some of the problems that an incorrect algorithm generates.

"At the end of the file, we need to print the subtotals since there can be no control break generated without more records."11

"There are two tricky things about testing for control breaks. At the beginning of program processing, the program logic must bypass the false control break that will occur when the first record is read. Then after all the input records have been processed and end-of-file has been reached, the program logic must force out the final control total line. Failure to provide for these requirements will result in the common programming control-break program bugs shown [above]."12

"Forcing the Last Store Footing ... This kind of situation usually occurs in control-break programs, and the explicit documentation is therefore not necessary."13

Only a single author presents the correct algorithm for a multi-level control break problem.14 It is no wonder that few programmers ever learn it.
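
As an illustration, the following is a minimal sketch of a single-level control break written as nested PERFORM ... UNTIL loops. It is one plausible reading of what this rule asks for, not code taken from Feingold or any other cited text. A small table stands in for a sales file sorted by store, and every name and value in it is invented.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CTLBRK.
      * Hypothetical sketch. A table stands in for a sales file
      * sorted by store; all names are invented.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  SALES-VALUES.
           05  FILLER            PIC X(5) VALUE "A0100".
           05  FILLER            PIC X(5) VALUE "A0250".
           05  FILLER            PIC X(5) VALUE "B0075".
       01  SALES-TABLE REDEFINES SALES-VALUES.
           05  S-ENTRY           OCCURS 3 TIMES.
               10  S-STORE       PIC X.
               10  S-AMOUNT      PIC 9(4).
       01  S-IDX                 PIC 9(2) VALUE 0.
       01  IN-STORE              PIC X.
       01  IN-AMOUNT             PIC 9(4).
       01  PREV-STORE            PIC X.
       01  STORE-TOTAL           PIC 9(6).
       01  GRAND-TOTAL           PIC 9(6) VALUE 0.
       01  EOF-SW                PIC X VALUE "N".
           88  END-OF-SALES      VALUE "Y".
       PROCEDURE DIVISION.
      * Report level: begin, loop over stores, end.
       0000-MAINLINE.
           PERFORM 8000-READ-SALES.
           PERFORM 1000-PROCESS-STORE
               UNTIL END-OF-SALES.
           DISPLAY "GRAND TOTAL " GRAND-TOTAL.
           STOP RUN.
      * Store level: begin, loop over that store's sales, end.
       1000-PROCESS-STORE.
           MOVE IN-STORE TO PREV-STORE.
           MOVE 0 TO STORE-TOTAL.
           PERFORM 2000-PROCESS-SALE
               UNTIL END-OF-SALES
                  OR IN-STORE NOT = PREV-STORE.
           DISPLAY "STORE " PREV-STORE " TOTAL " STORE-TOTAL.
           ADD STORE-TOTAL TO GRAND-TOTAL.
       2000-PROCESS-SALE.
           ADD IN-AMOUNT TO STORE-TOTAL.
           PERFORM 8000-READ-SALES.
       8000-READ-SALES.
           ADD 1 TO S-IDX.
           IF S-IDX > 3
               MOVE "Y" TO EOF-SW
           ELSE
               MOVE S-STORE (S-IDX) TO IN-STORE
               MOVE S-AMOUNT (S-IDX) TO IN-AMOUNT.
```

Because the inner loop ends on either end of data or a change of store, the first record needs no special bypass, and the last store's total is printed by the same statement as every other store's. Further break levels should nest in the same way, each adding its own saved key to the UNTIL conditions beneath it.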

RULE 13 - Use the "triform" structure as the main control structure in your program.

This rule is really a generalization of the preceding one, where it is applied specifically to control breaks. Since only one author presented the correct algorithm there, it is not surprising that none of the authors presents a good argument for the triform structure. It is interesting, however, that all of the authors who show any program examples use the triform structure for the "mainline" paragraph of their programs. They never extend this structure to the lower levels, where it would be useful in the control break and file update programs mentioned earlier in this chapter. Given this lack of attention to so central a control structure, one cannot fault the average programmer for being unaware of it.

RULE 14 - Do not indent when coding a "linear" nested IF to implement a CASE structure. Code the ELSE IF on the same line as if it were a single verb.

A key part of any COBOL textbook is a discussion of the IF verb. Equally important is a discussion of how control structures, especially the IF statement, may be "nested". However, there is a lot of confusion among the various textbook authors about the implementation of the CASE structure. Most of those that discuss it limit their remarks to the GO TO DEPENDING ON statement in COBOL and do not include the IF ... ELSE IF ... ELSE IF ... ELSE construct in their remarks. Following are some of the obvious examples of an n-way IF that is really an instance of a CASE construct, but which are improperly nested.

       IF PRODUCT-CODE-INPUT = 'H'
           MOVE HARDWARE-CONSTANT TO PRODUCT-TYPE-REPORT
       ELSE
           IF PRODUCT-CODE-INPUT = 'S'
               MOVE SOFTWARE-CONSTANT TO PRODUCT-TYPE-REPORT
           ELSE
               MOVE INVALID-CONSTANT TO PRODUCT-TYPE-REPORT.15

       IF K = 1
           MOVE "FRESHMAN" TO LINE1
       ELSE
           IF K = 2
               MOVE "SOPH" TO LINE1
           ELSE
               IF K = 3
                   MOVE "JUNIOR" TO LINE1
               ELSE
                   MOVE "ERROR" TO LINE1.16

       IF NEW-YORK
       THEN ADD TRANS-AMOUNT TO NEW-YORK-CTR
       ELSE IF WASHINGTON
            THEN ADD TRANS-AMOUNT TO WASHINGTON-CTR
            ELSE IF BOSTON
                 THEN ADD TRANS-AMOUNT TO BOSTON-CTR
                 ELSE ADD TRANS-AMOUNT TO OTHER-CITY-CTR.17

Some even make comments on the "difficulty" of understanding an IF statement if it is nested more than 3 levels deep.

"The reason we recommend avoiding more than three levels of nesting is that more often than not, code involving complex nested IF statements is hard to understand."18

Fortunately, some authors seem to realize that a series of IF statements in a single sentence does not necessarily have to be "nested". McClure calls this structure an "n-Way Branch" and gives the following example:

       IF FORMAT-TYPE = 01
           MOVE 01 TO OUTPUT-TYPE
       ELSE IF FORMAT-TYPE = 02
           MOVE 02 TO OUTPUT-TYPE
       ELSE IF FORMAT-TYPE = 03
           MOVE 03 TO OUTPUT-TYPE
       ELSE
           PERFORM ERROR-PROCESSING.19

This example is probably a little too simplistic. Another states it a little better. "The IF...ELSE combination can also be used to generate another version of the CASE structure."20 The following example is used:

       IF EDIT-CODE = 1
           PERFORM 4110-EDIT-1
       ELSE IF EDIT-CODE = 2
           PERFORM 4120-EDIT-2
       ELSE IF EDIT-CODE = 3
           PERFORM 4130-EDIT-3
       ELSE IF EDIT-CODE = 4
           PERFORM 4140-EDIT-4
       ELSE
           PERFORM 4150-EDIT-5.

One author calls this type of structure "A Special Kind of IF Statement" and states "Some programmers feel that since one of the True paths at most will be executed, it is clearer to write it this way."21

Finally, one author calls the two kinds of nested IF statements "Linear Nested IF Statements" and "Nonlinear Nested IF Statements" and states "Before the development of structured programming concepts, use of nested IF statements was usually discouraged because they were considered complicated and difficult to understand. However, with structured programming, nested IF statements are often required to provide proper control of statement selection. The complexity of nested IF statements is reduced when [1] the programmer thoroughly understands how the ELSE statement groups are paired with IF conditions, [2] proper indentation forms are used when coding the nested IF, and [3] the number of levels of nesting is limited to perhaps three or four."22

This last author has reached a balanced view of nesting which many of the other authors have not.

RULE 15 - When coding nested IF statements, always code both the true and false paths, using the NEXT SENTENCE or ELSE NEXT SENTENCE construct as necessary.

This topic is not addressed by many authors, except syntactically. If it is addressed, it is usually expressed as something like "The clause ELSE NEXT SENTENCE may be omitted if it appears immediately before the period."23

One author does make a positive statement.

"... the ELSE NEXT SENTENCE could be optionally coded. Beginners may prefer to actually code this line, however, and he (or she) should if it helps in reading the program better."24

However, since a later reader of any program may well be a beginner, I would recommend that it be coded to help readers other than the original programmer.
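
A minimal sketch of the difference this makes follows. The program and data names are invented for illustration, and the layout is only one way of writing both paths explicitly.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. BOTHPATH.
      * Hypothetical sketch. Field names and values are invented.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  ACCOUNT-STATUS        PIC X VALUE "A".
       01  BALANCE               PIC S9(5) VALUE -50.
       01  OVERDRAWN-COUNT       PIC 9(3) VALUE 0.
       01  INACTIVE-COUNT        PIC 9(3) VALUE 0.
       PROCEDURE DIVISION.
       0000-MAINLINE.
      * Both paths of the inner IF are written out, so the
      * final ELSE visibly belongs to the outer IF.
           IF ACCOUNT-STATUS = "A"
               IF BALANCE < 0
                   ADD 1 TO OVERDRAWN-COUNT
               ELSE
                   NEXT SENTENCE
           ELSE
               ADD 1 TO INACTIVE-COUNT.
           DISPLAY "OVERDRAWN " OVERDRAWN-COUNT.
           DISPLAY "INACTIVE  " INACTIVE-COUNT.
           STOP RUN.
```

If the inner ELSE NEXT SENTENCE were left out here, the remaining ELSE would pair with the inner IF, and the program would count active accounts with non-negative balances as inactive. Coding both paths every time removes the need to work out, case by case, whether an omission is harmless.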

RULE 16 - Do not code an IF statement when the terminating condition of a PERFORM loop will also include the condition.

This particular construct does not appear to be covered by any of the textbook authors. However, it appears many times in actual programs. It apparently stems from a misconception of how PERFORMs work: because the terminating condition is tested before each execution, a PERFORM ... UNTIL can execute its paragraph zero times.
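
The construct in question can be sketched as follows. The names are invented, and the item count is deliberately zero so that the "zero times" case is the one exercised.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. NOGUARD.
      * Hypothetical sketch. Names and values are invented.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  ITEM-COUNT            PIC 9(2) VALUE 0.
       01  ITEM-IDX              PIC 9(2).
       01  LINES-PRINTED         PIC 9(2) VALUE 0.
       PROCEDURE DIVISION.
       0000-MAINLINE.
      * Redundant form: the IF repeats the loop's own test.
           MOVE 1 TO ITEM-IDX.
           IF ITEM-IDX NOT > ITEM-COUNT
               PERFORM 1000-PRINT-ITEM
                   UNTIL ITEM-IDX > ITEM-COUNT.
      * Equivalent form: the UNTIL test is made before the
      * first execution, so zero items means zero iterations.
           MOVE 1 TO ITEM-IDX.
           PERFORM 1000-PRINT-ITEM
               UNTIL ITEM-IDX > ITEM-COUNT.
           DISPLAY "LINES PRINTED " LINES-PRINTED.
           STOP RUN.
       1000-PRINT-ITEM.
           ADD 1 TO LINES-PRINTED.
           ADD 1 TO ITEM-IDX.
```

Neither form executes 1000-PRINT-ITEM when ITEM-COUNT is zero, so the IF in the first form adds a line to read and maintain without changing the result.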

RULE 17 - Do not use the PERFORM ... THRU construct.

There is much discussion over this particular construct and much disagreement among the various textbook authors. Some recommend always using it.

"The coding of the EXIT paragraph explicitly defines the logical end of the procedure."25

"The above code [PERFORM...THRU] reflects a common practice among many programmers. Paragraphs stand out more vividly as each terminates with an EXIT paragraph. In essence, each paragraph performed has a beginning and a clearly marked end (the EXIT paragraph) to which all sentences converge in the paragraph."26

Others give details about why some programmers use it, but counter these examples with reasons why it should not be used.

"Programmers often use this form of PERFORM-THRU in order to facilitate the use of GO TO statements within the procedure; e.g., to provide for an early exit from the module by passing control to the EXIT paragraph. ... The same logic could be coded without the GO TO statement ... Note that with this arrangement of code, the final EXIT paragraph really isn't necessary. Indeed, this is generally true. If we organize our logic properly, the GO TO statement is superfluous."27

"In the late 1960s and the early 1970s, prior to the adoption of structured coding concepts, use of the PERFORM/THRU statement was very popular. Indeed, the programming standards for many installations recommended that the single-paragraph PERFORM never be used. ... The main disadvantage to multiple-paragraph modules is that the physical placement of paragraphs within the program becomes significant to program execution. Hence program bugs can be introduced. ... With structured code, GO TO statement usage is restricted and hence there is no reason to use dummy paragraphs [those with an EXIT in them] or multiple-paragraph modules. Because of the wide use of dummy paragraphs prior to the structured coding era, many older programs using multiple-paragraph modules are still in existence, however."28

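The point made in these last two quotations can be sketched in a few lines. The example below is hypothetical (the names and the edit rule are invented) and simply places the two forms side by side.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. NOTHRU.
      * Hypothetical sketch. Names and the edit rule are invented.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  QUANTITY              PIC 9(3) VALUE 0.
       01  UNIT-PRICE            PIC 9(3) VALUE 5.
       01  EXTENDED-AMOUNT       PIC 9(6) VALUE 0.
       PROCEDURE DIVISION.
       0000-MAINLINE.
      * THRU form: the early exit needs GO TO and an EXIT
      * paragraph placed directly after the paragraph it ends.
           PERFORM 1000-EXTEND THRU 1000-EXIT.
      * Single-paragraph form: the condition is inverted instead.
           PERFORM 2000-EXTEND.
           DISPLAY "EXTENDED AMOUNT " EXTENDED-AMOUNT.
           STOP RUN.
       1000-EXTEND.
           IF QUANTITY = 0
               GO TO 1000-EXIT.
           MULTIPLY QUANTITY BY UNIT-PRICE GIVING EXTENDED-AMOUNT.
       1000-EXIT.
           EXIT.
       2000-EXTEND.
           IF QUANTITY NOT = 0
               MULTIPLY QUANTITY BY UNIT-PRICE
                   GIVING EXTENDED-AMOUNT.
```

In the THRU form, inserting a new paragraph between 1000-EXTEND and 1000-EXIT would silently make it part of the performed range, which is the placement hazard the second quotation describes. The single-paragraph form has no such dependency.
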
RULE 18 - Always use the READ ... INTO and WRITE ... FROM forms of these verbs and define all record definitions in the WORKING-STORAGE SECTION.

Most of the textbooks make no recommendation on the use of the READ...INTO construct. Some simply note "The INTO clause is optional."29 This is probably because the initial programming examples in most textbooks omit the INTO clause so as not to confuse the student. However, those texts that give lists of recommended coding constructs and/or programming style ideas tend to recommend that the INTO clause be used.

"Use READ INTO and WRITE FROM to do all the processing in the Working-Storage Section. This is suggested for two reasons ..."30

"STANDARD. All work on a file should be done in the WORKING-STORAGE. This means that in the PROCEDURE DIVISION, the programmer should READ INTO and WRITE FROM. If a file is worked on in WORKING-STORAGE alone, then the only fields that should be defined in the FILE SECTION are the ones referenced in the program. This will keep the maintenance programmer from referencing them in the program intentionally or unintentionally. In fact, the only fields from the file description (FD) entry referenced in the program will be the record key and the record name."31

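A minimal sketch of this arrangement follows. The file name, record layout, and data names are invented, and the ASSIGN clause and FD entry are written in a form that is an assumption about the compiler in use; some compilers require additional clauses such as LABEL RECORDS.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. READINTO.
      * Hypothetical sketch. The file name and layout are
      * invented; ASSIGN and FD details vary by compiler.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT CUSTOMER-FILE ASSIGN TO "CUSTOMER.DAT".
       DATA DIVISION.
       FILE SECTION.
      * Only the record name is defined in the FILE SECTION.
       FD  CUSTOMER-FILE.
       01  CUSTOMER-RECORD       PIC X(25).
       WORKING-STORAGE SECTION.
      * The fields the program works on live here.
       01  WS-CUSTOMER.
           05  WS-CUST-NO        PIC 9(5).
           05  WS-CUST-NAME      PIC X(20).
       01  EOF-SW                PIC X VALUE "N".
           88  END-OF-CUSTOMERS  VALUE "Y".
       PROCEDURE DIVISION.
       0000-MAINLINE.
      * Write one record so the sketch has something to read.
           OPEN OUTPUT CUSTOMER-FILE.
           MOVE 10001 TO WS-CUST-NO.
           MOVE "SMITH" TO WS-CUST-NAME.
           WRITE CUSTOMER-RECORD FROM WS-CUSTOMER.
           CLOSE CUSTOMER-FILE.
           OPEN INPUT CUSTOMER-FILE.
           PERFORM 8000-READ-CUSTOMER.
           PERFORM 1000-PROCESS-CUSTOMER
               UNTIL END-OF-CUSTOMERS.
           CLOSE CUSTOMER-FILE.
           STOP RUN.
       1000-PROCESS-CUSTOMER.
           DISPLAY WS-CUST-NO " " WS-CUST-NAME.
           PERFORM 8000-READ-CUSTOMER.
       8000-READ-CUSTOMER.
           READ CUSTOMER-FILE INTO WS-CUSTOMER
               AT END MOVE "Y" TO EOF-SW.
```

The PROCEDURE DIVISION refers to the FILE SECTION only through the record name on the WRITE; every field it examines or changes is in WORKING-STORAGE, as the quoted standard recommends.
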
RULE 19 - Consider counting "records processed" instead of "records read". At least ask yourself, "why am I counting?" and "what am I counting?".

Only three of the authors mention control counts, and only one actually gives an example of doing the counting. Weinberg has a short section on "Counting for Control", but much of the discussion centers on replacing "flags" with counters because "Flags carry information only about zero or nonzero; but control counts give a very precise picture."32

One author states, "Each file should maintain a record count, and this count should be displayed as part of the end-of-job procedure."33

This is a helpful suggestion, but still lacks a concrete example. Interestingly, the only example does not count "records read", but "records processed".34

           READ INPUT-FILE AT END MOVE 1 TO WS-EOF.
           PERFORM 10-PROCESS-READ UNTIL WS-EOF = 1.
           PERFORM 20-END-OF-FILE-CHORES.
           .
           .
       10-PROCESS-READ.
           ADD 1 TO KOUNT.
           PERFORM 30-PROCESS-RECORD.
           READ INPUT-FILE AT END MOVE 1 TO WS-EOF.
           .
           .

If this program were to count reads, then the following statement would have to be included after each of the read statements.

        IF WS-EOF NOT = 1 THEN ADD 1 TO KOUNT.

RULE 20 - Consider adding "PARM" overrides to allow for easy program testing, eliminate the need for "near-clones", etc.

Because most programming texts are oriented toward getting students to learn proper COBOL syntax and solve common problems, this topic is not covered in any of the introductory textbooks examined. However, one of the texts that included suggestions for experienced programmers made some reference to it.

"The flexibility of the program can be greatly increased by removing fixed values from the working storage of the program and providing for these values to be entered via control cards or parameter files during program initialization."35

While this is not as strong a statement as the rule above suggests, it is at least a step in that direction.

CHAPTER 17 ENDNOTES

1 - J. Wayne Spence. COBOL for the 80's. 559-578.

2 - Gerald M. Weinberg, et al. High Level COBOL Programming. 127ff.

3 - Michel Boillot and Mona Boillot. Understanding Structured COBOL. 468.

4 - Computer Partners, Inc. Handbook of COBOL Techniques. 25.

5 - Carl Feingold. Fundamentals of Structured COBOL Programming. 136-137.

6 - Gary S. Popkin. Comprehensive Structured COBOL. 365,414.

7 - Pacifico A. Lim. A Guide to Structured COBOL with Efficiency Techniques and Special Algorithms. 73ff.

8 - Pacifico A. Lim. 55-56.

9 - Computer Partners, Inc. 58-59.

10- Michel Boillot and Mona Boillot. 306.

11- Edward J. Coburn. Advanced Structured COBOL. 53.

12- Tyler Welburn. Advanced Structured COBOL: Batch, On-line, and Data-base Concepts. 446-447.

13- Gerard A. Paquette. Structured COBOL. 465.

14- Carl Feingold. 160-161.

15- Gary B. Shelly, et al. Structured COBOL, Pseudocode Edition. 6.7.

16- Michel Boillot and Mona Boillot. 177.

17- Pacifico A. Lim. 28.

18- Barry K. Nirmal. Programming Standards and Guidelines: COBOL edition. 129.

19- Carma L. McClure. Reducing COBOL Complexity through Structured Programming. 139.

20- Computer Partners, Inc. 38-39.

21- Gary S. Popkin. 135.

22- Tyler Welburn. Structured COBOL: Fundamentals and Style. 355-356.

23- Gary S. Popkin. 136.

24- Pacifico A. Lim. 28.

25- Barry K. Nirmal. 124.

26- Michel Boillot and Mona Boillot. 186-187.

27- Timothy R. Lister and Edward Yourdon. Learning to Program in Structured COBOL, Part 2. 46-47.

28- Tyler Welburn. Structured COBOL: Fundamentals and Style. 273-274.

29- Fritz A. McCameron. COBOL Logic and Programming. 80.

30- Carl Feingold. 745.

31- Barry K. Nirmal. 110.

32- Gerald M. Weinberg, et al. 163.

33- Computer Partners, Inc. 41.

34- Michel Boillot and Mona Boillot. 198.
