ANEW --SPWGIF--                               \  Wil Baden  2003-02-22

\  *******************************************************************
\  *                                                                 *
\  *  Wil Baden 2000-07-02                                           *
\  *                                                                 *
\  *        Structured Programming with GOTOs in Forth               *
\  *                                                                 *
\  *  Donald E. Knuth, "Structured Programming with `goto`           *
\  *  Statements" (1974), reprinted in _Literate Programming_        *
\  *  (1992), discusses situations where gotos are appropriate. In   *
\  *  this page his examples have been transcribed from pseudo Algol *
\  *  to ortho Forth.                                                *
\  *                                                                 *
\  *******************************************************************
\\

Knuth's examples use arrays extensively.  I define an array with the 
pattern: 

    CREATE Array   Size CELLS ALLOT

The address of the _i_'th element _&Array[i]_ is:

    i 'th Array

The value of the _i_'th element _Array[i]_ is --

    i 'th Array @

`'th` is defined for 32-bit arithmetic as --

    : 'th  ( i "arrayname" -- addr )
        S" 2 LSHIFT " EVALUATE
        BL WORD COUNT EVALUATE
        S" + " EVALUATE
    ; IMMEDIATE

For 16-bit arithmetic, replace `2 LSHIFT` with `1 LSHIFT` or `2*`. 

A macro is used here and in other definitions for efficiency. With 
minimum peep-hole optimization, I expect `2 'th A` to become `A+8`. 

The definition of `OFF` hopes to be optimized.

    : OFF  ( addr -- )  S" 0 SWAP ! " EVALUATE ; IMMEDIATE

`++` increments the contents of an address.

    : ++  ( addr -- )  S" 1 SWAP +! " EVALUATE ; IMMEDIATE

If you already have efficient versions, use them.

        A Searching Example

Knuth's description

  Let's suppose that we want to search a table _A[0] ... A[m-1]_ of 
  distinct values, in order to find where a given value _x_ appears; 
  if it is not present in the table, we want to insert it as an 
  additional entry.  Let's suppose further that there is another 
  array _B_, where _B[i]_ equals the number of times we have searched 
  for the value _A[i]_. 

I have changed _A[1] ... A[m]_ to _A[0] ... A[m-1]_ here and 
similarly with other examples. 

    : Example-1       ( x -- i )
        0 BEGIN            ( x i)
            dup m @ <
        WHILE
            2dup 'th A @ = IFGOTO found
            1+
        REPEAT

        \  Not found.
        m ++
        2dup 'th A !
        dup 'th B OFF

        LABEL found
        NIP                ( i)
        dup 'th B ++ ;

In 1971 Knuth and Robert W. Floyd gave that as an "example of a 
typical program for which the ordinary capabilities of `while` and 
`if` statements are inadequate." 

To remove the goto, Knuth uses "short-circuit and".  This and 
"short-circuit or" are defined here. 

    : ANDIF  ( x -- )  S" dup IF DROP " EVALUATE ; IMMEDIATE

    : ORIF   ( x -- )  S" dup 0= IF DROP " EVALUATE ; IMMEDIATE

    : Example-1a      ( x -- i )
        0 BEGIN            ( x i)
            dup m @ <
            ANDIF
                2dup 'th A @ = NOT
            THEN
        WHILE  1+  REPEAT
        dup m @ = IF
            m ++
            2dup 'th A !
            dup 'th B OFF
        THEN
        NIP                 ( i)
        dup 'th B ++ ;

He also introduces a test and normal branch at the end of the 
function.  (The branch takes place except at the first occurrence of 
each _x_.) 

In Example-1 for every _A[i]_ that is passed over in the search, 
there are two tests and one branch.  In Example-1a for every _A[i]_ 
that is passed over, there are two tests, a little extra processing, 
and a branch. 

Example-1 was important for stimulating the relaxing of structured 
programming restraints.  In strictly structured programming 
("strictured programming") a loop may have a single exit at the 
beginning or end, and a function may have a single exit at the end.  
Most of recent profane languages now allow multiple exits. 

In Classical Forth and Standard Forth a BEGIN-loop may have more than 
one exit.  A goto-free slightly improved equivalent of Example-1 can 
be written -- 

    : Example-1.2nd    ( x -- i )
        -1 BEGIN 1+         ( x i)
            dup m @ <
        WHILE
            2dup 'th A @ =
        UNTIL
        \  Found.
            NIP             ( i)
            dup 'th B ++
            EXIT
        THEN                ( x i)
        \  Not found.
        m ++
        2dup 'th A !
        NIP                 ( i)
        1 over 'th B ! ;

Knuth points out that the technique in all versions of Example 1 is 
almost never a good way to search an array for x.  Example 2 beats 
Example 1 because it makes the inner loop considerably faster. 

    : Example-2      ( x -- i )
        dup m @ 'th A !

        -1 BEGIN 1+             ( x i)
            2dup 'th A @ =
        UNTIL

        NIP                     ( i)
        dup m @ = IF
            m ++
            dup 'th B OFF
        THEN
        dup 'th B ++ ;

For every _A[i]_ passed over in the search, Example 1 does two tests 
and one branch.  Example 2 does one test and one branch. 

In Example-2a, Knuth uses gotos and labels to double up the testing 
within Example 2.  Once again in Forth, the gotos and labels are not 
needed. 

In Example-2a, for every two _A[i]_ passed over in the search, there 
are two tests and one branch.  Knuth estimates a 12 percent saving. 
My tests showed 15 percent faster. 

    : NOT  ( x -- flag )  S" 0= " EVALUATE ; IMMEDIATE

    : Example-2a       ( x -- i )
        dup m @ 'th A !

        -2 BEGIN 2 +        ( x i)
            2dup 'th A @ = NOT
        WHILE
            2dup 1+ 'th A @ =
        UNTIL
            1+
        THEN

        NIP                 ( i)
        dup m @ = IF
            m ++
            dup 'th B OFF
        THEN
        dup 'th B ++ ;
...
        Hash Coding

In Example 3, Knuth solves the problem of Example 1 and 2 using hash 
coding.  As in Example 1 he uses a goto and a label. Once again these 
are not needed with the multiple exit features of Forth. 

Here _x_ is never zero, and the _A_ array is initially erased. 
`#Array-Size` is somewhat larger than what the number of elements 
will be. `x Hash` returns a non-negative number less than 
`#Array-Size`.  Variable _m_ disappears. 

    : Example-3        ( x -- i )
        dup Hash            ( x i)
        BEGIN
            dup 'th A @
        WHILE
            2dup 'th A @ = NOT
        WHILE
            ORIF  #Array-Size  THEN
            1-
        REPEAT
            NIP             ( i)
            dup 'th B ++
            EXIT
        THEN                ( x i)
        2dup 'th A !
        NIP                 ( i)
        1 over 'th B ! ;

Knuth improves the efficiency of Example-3 by testing for a
match first.  Forth still does not need goto.

    : Example-3a       ( x -- i )
        dup Hash            ( x i)
        BEGIN
            2dup 'th A @ = NOT
        WHILE
            dup 'th A @
        WHILE
            ORIF  #Array-Size  THEN
            1-
        REPEAT
            2dup 'th A !
            dup 'th B OFF
        THEN
        NIP                ( i)
        dup 'th B ++ ;

Knuth improves the efficiency of Example-3 even more by eliminating 
the testing of the index for zero when the non-matching element is 
non-zero.  This keeps _A[0]_ equal to 0. 

    : Example-3b       ( x -- i )
        dup Hash            ( x i)
        BEGIN
            2dup 'th A @ = NOT
        WHILE
            dup 'th A @ IF
                1-
            ELSE
            dup 0= IF DROP
                #Array-Size 1-
            ELSE
                2dup 'th A !
                NIP         ( i)
                1 over 'th B !
                EXIT
            THEN THEN
        REPEAT              ( x i)
        NIP                 ( i)
        dup 'th B ++ ;

Example-3b is not easy to read, nor is Example-3.  I'll keep them in 
my garage until I need them. 

"Structured Programming with `goto` Statements" was first published 
26 years ago (1974).  I'm sure that it was instrumental in extending 
the collection of politically correct control structures. 

It's ironic that this first set of examples can all be done in Forth 
without goto and as efficiently as the goto versions. 

In today's languages, these examples are not candidates for GOTO. 

It may be different in Knuth's next set of examples.

        Text Scanning

Knuth's description --


  Suppose we are processing a stream of text; and that we want to 
  read and print the next character from the input; however, if that 
  character is a slash ("/") we want to "tabulate" instead (i.e., to 
  advance in the output to the next tab-stop position on the current 
  line ); however, two consecutive slashes means a "carriage return" 
  (i.e., to advance in the output to the beginning of the next line). 
  After printing a period (".") we also want to insert an additional 
  space in the output. 

    : CASE?         ( x y -- true | x false )
        S" over = dup IF NIP THEN " EVALUATE ; IMMEDIATE

    : Example-4     ( -- )
        Read-Char                   ( c)
        [CHAR] / CASE? IF           ( )
            Read-Char               ( c)
            [CHAR] / CASE? IF       ( )
                CR
                GOTO done
            THEN                    ( c)
            Tabulate
        THEN
        dup EMIT
        [CHAR] . = IF  SPACE  THEN  ( )
        LABEL done ;

Once again Forth does without goto.

    : Example-4     ( -- )
        Read-Char                   ( c)
        [CHAR] / CASE? IF           ( )
            Read-Char               ( c)
            [CHAR] / CASE? IF       ( )
                CR
                EXIT
            THEN                    ( c)
            Tabulate
        THEN
        dup EMIT
        [CHAR] . = IF  SPACE  THEN  ( )
        ;

Knuth comments


  In practice we occasionally run into situations where a sequence of 
  decisions is made via nested `if ... then ... else ...`s, and then 
  two or more of the branches merge into one. We can manage such 
  decision-table tasks without `goto`s by copying the common code 
  into each place, or by defining it as a procedure, but this does 
  not seem conceptually simpler than to make such cases `goto` a 
  common part of the program. 
]
...

        Tree Searching

Knuth's description


  This is part of the well-known "tree search and insertion" scheme, 
  where a binary search tree is being represented by three arrays. 
  _A[i]_ denotes the information stored at node number _i_, and 
  _L[i]_, _R[i]_ are the respective node numbers for the roots of 
  that node's left and right subtrees; empty subtrees are represented 
  by zero.  The program searches down the tree until finding an empty 
  subtree where is can be inserted. ... For convenience I have 
  assumed in this example that _x_ is not already present in the 
  search tree. 

    : Example-5  ( x -- j )

        0  \  Head of Tree.   ( x i)
        LABEL compare:left  LABEL compare:right
        2dup 'th A @ >
            dup 'th L @ IF
                'th L @
                GOTO compare:left
            ELSE
                Next-Node @ over 'th L !
                GOTO insert
            THEN
        ELSE
            dup 'th R @ IF
                'th R @
                GOTO compare:right
            ELSE
                Next-Node @ over 'th R !
                GOTO insert
            THEN
        THEN

        LABEL insert DROP       ( x)
        Next-Node @             ( x j)
        2dup 'th A !
        NIP                     ( j)
        dup 'th L OFF
        dup 'th R OFF
        Next-Node ++ ;

Knuth first eliminates `goto` by using a local variable.  I'll use 
the return stack to hold it.  This of course takes more processing, 
but I think it is a typical Forth approach. 

    : Example-5a           ( x -- j )
        TRUE >R
        0 BEGIN                 ( x i)
            R@
        WHILE
            2dup 'th A @ > IF
                dup 'th L @ IF
                    'th L @
                ELSE
                    Next-Node @ over 'th L !
                    FALSE R> DROP >R
                THEN
            ELSE
                dup 'th R @ IF
                    'th R @
                ELSE
                    Next-Node @ over 'th R !
                    FALSE R> DROP >R
                THEN
            THEN
        REPEAT R> DROP DROP     ( x)
        Next-Node @             ( x j)
        2dup 'th A !
        NIP                     ( j)
        dup 'th L OFF
        dup 'th R OFF
        Next-Node ++ ;

Knuth introduces C. T. Zahn's situation indicator as a new form of 
control structure.  This can be emulated with `GOTO`. Here is an 
illustration. 

    : Example-5b           ( x -- j )
        0 BEGIN                 ( x i)
            2dup 'th A @ > IF
                dup 'th L @ 0= IFGOTO left-leaf-hit
                'th L @
            ELSE
                dup 'th R @ 0= IFGOTO right-leaf-hit
                'th R @
            THEN
        AGAIN

        LABEL left-leaf-hit
            Next-Node @ over 'th L !
        GOTO insert
        LABEL right-leaf-hit
            Next-Node @ over 'th R !

        LABEL insert DROP       ( x)
        Next-Node @             ( x j)
        2dup 'th A !
        NIP                     ( j)
        dup 'th L OFF
        dup 'th R OFF
        Next-Node ++ ;

So far Knuth's examples have been concerned with removing gotos. 
Forth's multiple exits take care of most of his problems.  Example-5b 
is a good candidate for GOTO. 

This is a good place to stop in Knuth's article.  The rest of the 
examples are not conducive to being transcribed. The challenge to the 
reader is to give a goto-less version of Example-5b that's as easy to 
understand. 

Here is a direct transcription of Example-5b into Forth logic, using 
function call for `goto`s, and bottom-up logic replacing top-down 
logic. 

    : Insert-Node        ( x i -- j)
        DROP                    ( x)
        Next-Node @             ( x j)
        2dup 'th A !
        NIP                     ( j)
        dup 'th L OFF
        dup 'th R OFF
        Next-Node ++ ;

    : Left-Leaf-Hit      ( x i -- same )
        Next-Node @ over 'th L !,
        Insert-Node ;

    : Right-Leaf-Hit     ( x i -- same )
        Next-Node @ over 'th R !
        Insert-Node ;

    : Example-5b         ( x -- j )
        0 BEGIN               ( x i)
            2dup 'th A @ > IF
                dup 'th L @ 0= 
                    IF  Left-Leaf-Hit  EXIT THEN
                'th L @
            ELSE
                dup 'th R @ 0= 
                    IF  Right-Leaf-Hit  EXIT THEN
                'th R @
            THEN
        AGAIN ;

A final note from Knuth.

        Reduction of Complication


  There is one remaining use of `goto` for which I have never seen a 
  good replacement, and in fact it's a situation where I still think 
  `goto` is the right idea.  This situation typically occurs after a 
  program has made a multiway branch to a rather large number of 
  different but related cases. A little computation often suffices to 
  reduce one case to another; and when we've reduced one problem to a 
  simpler one, the most natural thing is for our program to `goto` 
  the routine that solves the simpler problem. 

\\   //   \\   //   \\   //   \\   //   \\   //   \\   //   \\   //   \\