ANEW --SPWGIF-- \ Wil Baden 2003-02-22 \ ******************************************************************* \ * * \ * Wil Baden 2000-07-02 * \ * * \ * Structured Programming with GOTOs in Forth * \ * * \ * Donald E. Knuth, "Structured Programming with `goto` * \ * Statements" (1974), reprinted in _Literate Programming_ * \ * (1992), discusses situations where gotos are appropriate. In * \ * this page his examples have been transcribed from pseudo Algol * \ * to ortho Forth. * \ * * \ ******************************************************************* \\ Knuth's examples use arrays extensively. I define an array with the pattern: CREATE Array Size CELLS ALLOT The address of the _i_'th element _&Array[i]_ is: i 'th Array The value of the _i_'th element _Array[i]_ is -- i 'th Array @ `'th` is defined for 32-bit arithmetic as -- : 'th ( i "arrayname" -- addr ) S" 2 LSHIFT " EVALUATE BL WORD COUNT EVALUATE S" + " EVALUATE ; IMMEDIATE For 16-bit arithmetic, replace `2 LSHIFT` with `1 LSHIFT` or `2*`. A macro is used here and in other definitions for efficiency. With minimum peep-hole optimization, I expect `2 'th A` to become `A+8`. The definition of `OFF` hopes to be optimized. : OFF ( addr -- ) S" 0 SWAP ! " EVALUATE ; IMMEDIATE `++` increments the contents of an address. : ++ ( addr -- ) S" 1 SWAP +! " EVALUATE ; IMMEDIATE If you already have efficient versions, use them. A Searching Example Knuth's description Let's suppose that we want to search a table _A[0] ... A[m-1]_ of distinct values, in order to find where a given value _x_ appears; if it is not present in the table, we want to insert it as an additional entry. Let's suppose further that there is another array _B_, where _B[i]_ equals the number of times we have searched for the value _A[i]_. I have changed _A[1] ... A[m]_ to _A[0] ... A[m-1]_ here and similarly with other examples. : Example-1 ( x -- i ) 0 BEGIN ( x i) dup m @ < WHILE 2dup 'th A @ = IFGOTO found 1+ REPEAT \ Not found. m ++ 2dup 'th A ! dup 'th B OFF LABEL found NIP ( i) dup 'th B ++ ; In 1971 Knuth and Robert W. Floyd gave that as an "example of a typical program for which the ordinary capabilities of `while` and `if` statements are inadequate." To remove the goto, Knuth uses "short-circuit and". This and "short-circuit or" are defined here. : ANDIF ( x -- ) S" dup IF DROP " EVALUATE ; IMMEDIATE : ORIF ( x -- ) S" dup 0= IF DROP " EVALUATE ; IMMEDIATE : Example-1a ( x -- i ) 0 BEGIN ( x i) dup m @ < ANDIF 2dup 'th A @ = NOT THEN WHILE 1+ REPEAT dup m @ = IF m ++ 2dup 'th A ! dup 'th B OFF THEN NIP ( i) dup 'th B ++ ; He also introduces a test and normal branch at the end of the function. (The branch takes place except at the first occurrence of each _x_.) In Example-1 for every _A[i]_ that is passed over in the search, there are two tests and one branch. In Example-1a for every _A[i]_ that is passed over, there are two tests, a little extra processing, and a branch. Example-1 was important for stimulating the relaxing of structured programming restraints. In strictly structured programming ("strictured programming") a loop may have a single exit at the beginning or end, and a function may have a single exit at the end. Most of recent profane languages now allow multiple exits. In Classical Forth and Standard Forth a BEGIN-loop may have more than one exit. A goto-free slightly improved equivalent of Example-1 can be written -- : Example-1.2nd ( x -- i ) -1 BEGIN 1+ ( x i) dup m @ < WHILE 2dup 'th A @ = UNTIL \ Found. NIP ( i) dup 'th B ++ EXIT THEN ( x i) \ Not found. m ++ 2dup 'th A ! NIP ( i) 1 over 'th B ! ; Knuth points out that the technique in all versions of Example 1 is almost never a good way to search an array for x. Example 2 beats Example 1 because it makes the inner loop considerably faster. : Example-2 ( x -- i ) dup m @ 'th A ! -1 BEGIN 1+ ( x i) 2dup 'th A @ = UNTIL NIP ( i) dup m @ = IF m ++ dup 'th B OFF THEN dup 'th B ++ ; For every _A[i]_ passed over in the search, Example 1 does two tests and one branch. Example 2 does one test and one branch. In Example-2a, Knuth uses gotos and labels to double up the testing within Example 2. Once again in Forth, the gotos and labels are not needed. In Example-2a, for every two _A[i]_ passed over in the search, there are two tests and one branch. Knuth estimates a 12 percent saving. My tests showed 15 percent faster. : NOT ( x -- flag ) S" 0= " EVALUATE ; IMMEDIATE : Example-2a ( x -- i ) dup m @ 'th A ! -2 BEGIN 2 + ( x i) 2dup 'th A @ = NOT WHILE 2dup 1+ 'th A @ = UNTIL 1+ THEN NIP ( i) dup m @ = IF m ++ dup 'th B OFF THEN dup 'th B ++ ; ... Hash Coding In Example 3, Knuth solves the problem of Example 1 and 2 using hash coding. As in Example 1 he uses a goto and a label. Once again these are not needed with the multiple exit features of Forth. Here _x_ is never zero, and the _A_ array is initially erased. `#Array-Size` is somewhat larger than what the number of elements will be. `x Hash` returns a non-negative number less than `#Array-Size`. Variable _m_ disappears. : Example-3 ( x -- i ) dup Hash ( x i) BEGIN dup 'th A @ WHILE 2dup 'th A @ = NOT WHILE ORIF #Array-Size THEN 1- REPEAT NIP ( i) dup 'th B ++ EXIT THEN ( x i) 2dup 'th A ! NIP ( i) 1 over 'th B ! ; Knuth improves the efficiency of Example-3 by testing for a match first. Forth still does not need goto. : Example-3a ( x -- i ) dup Hash ( x i) BEGIN 2dup 'th A @ = NOT WHILE dup 'th A @ WHILE ORIF #Array-Size THEN 1- REPEAT 2dup 'th A ! dup 'th B OFF THEN NIP ( i) dup 'th B ++ ; Knuth improves the efficiency of Example-3 even more by eliminating the testing of the index for zero when the non-matching element is non-zero. This keeps _A[0]_ equal to 0. : Example-3b ( x -- i ) dup Hash ( x i) BEGIN 2dup 'th A @ = NOT WHILE dup 'th A @ IF 1- ELSE dup 0= IF DROP #Array-Size 1- ELSE 2dup 'th A ! NIP ( i) 1 over 'th B ! EXIT THEN THEN REPEAT ( x i) NIP ( i) dup 'th B ++ ; Example-3b is not easy to read, nor is Example-3. I'll keep them in my garage until I need them. "Structured Programming with `goto` Statements" was first published 26 years ago (1974). I'm sure that it was instrumental in extending the collection of politically correct control structures. It's ironic that this first set of examples can all be done in Forth without goto and as efficiently as the goto versions. In today's languages, these examples are not candidates for GOTO. It may be different in Knuth's next set of examples. Text Scanning Knuth's description -- Suppose we are processing a stream of text; and that we want to read and print the next character from the input; however, if that character is a slash ("/") we want to "tabulate" instead (i.e., to advance in the output to the next tab-stop position on the current line ); however, two consecutive slashes means a "carriage return" (i.e., to advance in the output to the beginning of the next line). After printing a period (".") we also want to insert an additional space in the output. : CASE? ( x y -- true | x false ) S" over = dup IF NIP THEN " EVALUATE ; IMMEDIATE : Example-4 ( -- ) Read-Char ( c) [CHAR] / CASE? IF ( ) Read-Char ( c) [CHAR] / CASE? IF ( ) CR GOTO done THEN ( c) Tabulate THEN dup EMIT [CHAR] . = IF SPACE THEN ( ) LABEL done ; Once again Forth does without goto. : Example-4 ( -- ) Read-Char ( c) [CHAR] / CASE? IF ( ) Read-Char ( c) [CHAR] / CASE? IF ( ) CR EXIT THEN ( c) Tabulate THEN dup EMIT [CHAR] . = IF SPACE THEN ( ) ; Knuth comments In practice we occasionally run into situations where a sequence of decisions is made via nested `if ... then ... else ...`s, and then two or more of the branches merge into one. We can manage such decision-table tasks without `goto`s by copying the common code into each place, or by defining it as a procedure, but this does not seem conceptually simpler than to make such cases `goto` a common part of the program. ] ... Tree Searching Knuth's description This is part of the well-known "tree search and insertion" scheme, where a binary search tree is being represented by three arrays. _A[i]_ denotes the information stored at node number _i_, and _L[i]_, _R[i]_ are the respective node numbers for the roots of that node's left and right subtrees; empty subtrees are represented by zero. The program searches down the tree until finding an empty subtree where is can be inserted. ... For convenience I have assumed in this example that _x_ is not already present in the search tree. : Example-5 ( x -- j ) 0 \ Head of Tree. ( x i) LABEL compare:left LABEL compare:right 2dup 'th A @ > dup 'th L @ IF 'th L @ GOTO compare:left ELSE Next-Node @ over 'th L ! GOTO insert THEN ELSE dup 'th R @ IF 'th R @ GOTO compare:right ELSE Next-Node @ over 'th R ! GOTO insert THEN THEN LABEL insert DROP ( x) Next-Node @ ( x j) 2dup 'th A ! NIP ( j) dup 'th L OFF dup 'th R OFF Next-Node ++ ; Knuth first eliminates `goto` by using a local variable. I'll use the return stack to hold it. This of course takes more processing, but I think it is a typical Forth approach. : Example-5a ( x -- j ) TRUE >R 0 BEGIN ( x i) R@ WHILE 2dup 'th A @ > IF dup 'th L @ IF 'th L @ ELSE Next-Node @ over 'th L ! FALSE R> DROP >R THEN ELSE dup 'th R @ IF 'th R @ ELSE Next-Node @ over 'th R ! FALSE R> DROP >R THEN THEN REPEAT R> DROP DROP ( x) Next-Node @ ( x j) 2dup 'th A ! NIP ( j) dup 'th L OFF dup 'th R OFF Next-Node ++ ; Knuth introduces C. T. Zahn's situation indicator as a new form of control structure. This can be emulated with `GOTO`. Here is an illustration. : Example-5b ( x -- j ) 0 BEGIN ( x i) 2dup 'th A @ > IF dup 'th L @ 0= IFGOTO left-leaf-hit 'th L @ ELSE dup 'th R @ 0= IFGOTO right-leaf-hit 'th R @ THEN AGAIN LABEL left-leaf-hit Next-Node @ over 'th L ! GOTO insert LABEL right-leaf-hit Next-Node @ over 'th R ! LABEL insert DROP ( x) Next-Node @ ( x j) 2dup 'th A ! NIP ( j) dup 'th L OFF dup 'th R OFF Next-Node ++ ; So far Knuth's examples have been concerned with removing gotos. Forth's multiple exits take care of most of his problems. Example-5b is a good candidate for GOTO. This is a good place to stop in Knuth's article. The rest of the examples are not conducive to being transcribed. The challenge to the reader is to give a goto-less version of Example-5b that's as easy to understand. Here is a direct transcription of Example-5b into Forth logic, using function call for `goto`s, and bottom-up logic replacing top-down logic. : Insert-Node ( x i -- j) DROP ( x) Next-Node @ ( x j) 2dup 'th A ! NIP ( j) dup 'th L OFF dup 'th R OFF Next-Node ++ ; : Left-Leaf-Hit ( x i -- same ) Next-Node @ over 'th L !, Insert-Node ; : Right-Leaf-Hit ( x i -- same ) Next-Node @ over 'th R ! Insert-Node ; : Example-5b ( x -- j ) 0 BEGIN ( x i) 2dup 'th A @ > IF dup 'th L @ 0= IF Left-Leaf-Hit EXIT THEN 'th L @ ELSE dup 'th R @ 0= IF Right-Leaf-Hit EXIT THEN 'th R @ THEN AGAIN ; A final note from Knuth. Reduction of Complication There is one remaining use of `goto` for which I have never seen a good replacement, and in fact it's a situation where I still think `goto` is the right idea. This situation typically occurs after a program has made a multiway branch to a rather large number of different but related cases. A little computation often suffices to reduce one case to another; and when we've reduced one problem to a simpler one, the most natural thing is for our program to `goto` the routine that solves the simpler problem. \\ // \\ // \\ // \\ // \\ // \\ // \\ // \\