ANEW --SPWGIF-- \ Wil Baden 2003-02-22
\ *******************************************************************
\ * *
\ * Wil Baden 2000-07-02 *
\ * *
\ * Structured Programming with GOTOs in Forth *
\ * *
\ * Donald E. Knuth, "Structured Programming with `goto` *
\ * Statements" (1974), reprinted in _Literate Programming_ *
\ * (1992), discusses situations where gotos are appropriate. In *
\ * this page his examples have been transcribed from pseudo Algol *
\ * to ortho Forth. *
\ * *
\ *******************************************************************
\\
Knuth's examples use arrays extensively. I define an array with the
pattern:
CREATE Array Size CELLS ALLOT
The address of the _i_'th element _&Array[i]_ is:
i 'th Array
The value of the _i_'th element _Array[i]_ is --
i 'th Array @
`'th` is defined for 32-bit arithmetic as --
: 'th ( i "arrayname" -- addr )
S" 2 LSHIFT " EVALUATE
BL WORD COUNT EVALUATE
S" + " EVALUATE
; IMMEDIATE
For 16-bit arithmetic, replace `2 LSHIFT` with `1 LSHIFT` or `2*`.
A macro is used here and in other definitions for efficiency. With
minimum peep-hole optimization, I expect `2 'th A` to become `A+8`.
The definition of `OFF` hopes to be optimized.
: OFF ( addr -- ) S" 0 SWAP ! " EVALUATE ; IMMEDIATE
`++` increments the contents of an address.
: ++ ( addr -- ) S" 1 SWAP +! " EVALUATE ; IMMEDIATE
If you already have efficient versions, use them.
A Searching Example
Knuth's description
Let's suppose that we want to search a table _A[0] ... A[m-1]_ of
distinct values, in order to find where a given value _x_ appears;
if it is not present in the table, we want to insert it as an
additional entry. Let's suppose further that there is another
array _B_, where _B[i]_ equals the number of times we have searched
for the value _A[i]_.
I have changed _A[1] ... A[m]_ to _A[0] ... A[m-1]_ here and
similarly with other examples.
: Example-1 ( x -- i )
0 BEGIN ( x i)
dup m @ <
WHILE
2dup 'th A @ = IFGOTO found
1+
REPEAT
\ Not found.
m ++
2dup 'th A !
dup 'th B OFF
LABEL found
NIP ( i)
dup 'th B ++ ;
In 1971 Knuth and Robert W. Floyd gave that as an "example of a
typical program for which the ordinary capabilities of `while` and
`if` statements are inadequate."
To remove the goto, Knuth uses "short-circuit and". This and
"short-circuit or" are defined here.
: ANDIF ( x -- ) S" dup IF DROP " EVALUATE ; IMMEDIATE
: ORIF ( x -- ) S" dup 0= IF DROP " EVALUATE ; IMMEDIATE
: Example-1a ( x -- i )
0 BEGIN ( x i)
dup m @ <
ANDIF
2dup 'th A @ = NOT
THEN
WHILE 1+ REPEAT
dup m @ = IF
m ++
2dup 'th A !
dup 'th B OFF
THEN
NIP ( i)
dup 'th B ++ ;
He also introduces a test and normal branch at the end of the
function. (The branch takes place except at the first occurrence of
each _x_.)
In Example-1 for every _A[i]_ that is passed over in the search,
there are two tests and one branch. In Example-1a for every _A[i]_
that is passed over, there are two tests, a little extra processing,
and a branch.
Example-1 was important for stimulating the relaxing of structured
programming restraints. In strictly structured programming
("strictured programming") a loop may have a single exit at the
beginning or end, and a function may have a single exit at the end.
Most of recent profane languages now allow multiple exits.
In Classical Forth and Standard Forth a BEGIN-loop may have more than
one exit. A goto-free slightly improved equivalent of Example-1 can
be written --
: Example-1.2nd ( x -- i )
-1 BEGIN 1+ ( x i)
dup m @ <
WHILE
2dup 'th A @ =
UNTIL
\ Found.
NIP ( i)
dup 'th B ++
EXIT
THEN ( x i)
\ Not found.
m ++
2dup 'th A !
NIP ( i)
1 over 'th B ! ;
Knuth points out that the technique in all versions of Example 1 is
almost never a good way to search an array for x. Example 2 beats
Example 1 because it makes the inner loop considerably faster.
: Example-2 ( x -- i )
dup m @ 'th A !
-1 BEGIN 1+ ( x i)
2dup 'th A @ =
UNTIL
NIP ( i)
dup m @ = IF
m ++
dup 'th B OFF
THEN
dup 'th B ++ ;
For every _A[i]_ passed over in the search, Example 1 does two tests
and one branch. Example 2 does one test and one branch.
In Example-2a, Knuth uses gotos and labels to double up the testing
within Example 2. Once again in Forth, the gotos and labels are not
needed.
In Example-2a, for every two _A[i]_ passed over in the search, there
are two tests and one branch. Knuth estimates a 12 percent saving.
My tests showed 15 percent faster.
: NOT ( x -- flag ) S" 0= " EVALUATE ; IMMEDIATE
: Example-2a ( x -- i )
dup m @ 'th A !
-2 BEGIN 2 + ( x i)
2dup 'th A @ = NOT
WHILE
2dup 1+ 'th A @ =
UNTIL
1+
THEN
NIP ( i)
dup m @ = IF
m ++
dup 'th B OFF
THEN
dup 'th B ++ ;
...
Hash Coding
In Example 3, Knuth solves the problem of Example 1 and 2 using hash
coding. As in Example 1 he uses a goto and a label. Once again these
are not needed with the multiple exit features of Forth.
Here _x_ is never zero, and the _A_ array is initially erased.
`#Array-Size` is somewhat larger than what the number of elements
will be. `x Hash` returns a non-negative number less than
`#Array-Size`. Variable _m_ disappears.
: Example-3 ( x -- i )
dup Hash ( x i)
BEGIN
dup 'th A @
WHILE
2dup 'th A @ = NOT
WHILE
ORIF #Array-Size THEN
1-
REPEAT
NIP ( i)
dup 'th B ++
EXIT
THEN ( x i)
2dup 'th A !
NIP ( i)
1 over 'th B ! ;
Knuth improves the efficiency of Example-3 by testing for a
match first. Forth still does not need goto.
: Example-3a ( x -- i )
dup Hash ( x i)
BEGIN
2dup 'th A @ = NOT
WHILE
dup 'th A @
WHILE
ORIF #Array-Size THEN
1-
REPEAT
2dup 'th A !
dup 'th B OFF
THEN
NIP ( i)
dup 'th B ++ ;
Knuth improves the efficiency of Example-3 even more by eliminating
the testing of the index for zero when the non-matching element is
non-zero. This keeps _A[0]_ equal to 0.
: Example-3b ( x -- i )
dup Hash ( x i)
BEGIN
2dup 'th A @ = NOT
WHILE
dup 'th A @ IF
1-
ELSE
dup 0= IF DROP
#Array-Size 1-
ELSE
2dup 'th A !
NIP ( i)
1 over 'th B !
EXIT
THEN THEN
REPEAT ( x i)
NIP ( i)
dup 'th B ++ ;
Example-3b is not easy to read, nor is Example-3. I'll keep them in
my garage until I need them.
"Structured Programming with `goto` Statements" was first published
26 years ago (1974). I'm sure that it was instrumental in extending
the collection of politically correct control structures.
It's ironic that this first set of examples can all be done in Forth
without goto and as efficiently as the goto versions.
In today's languages, these examples are not candidates for GOTO.
It may be different in Knuth's next set of examples.
Text Scanning
Knuth's description --
Suppose we are processing a stream of text; and that we want to
read and print the next character from the input; however, if that
character is a slash ("/") we want to "tabulate" instead (i.e., to
advance in the output to the next tab-stop position on the current
line ); however, two consecutive slashes means a "carriage return"
(i.e., to advance in the output to the beginning of the next line).
After printing a period (".") we also want to insert an additional
space in the output.
: CASE? ( x y -- true | x false )
S" over = dup IF NIP THEN " EVALUATE ; IMMEDIATE
: Example-4 ( -- )
Read-Char ( c)
[CHAR] / CASE? IF ( )
Read-Char ( c)
[CHAR] / CASE? IF ( )
CR
GOTO done
THEN ( c)
Tabulate
THEN
dup EMIT
[CHAR] . = IF SPACE THEN ( )
LABEL done ;
Once again Forth does without goto.
: Example-4 ( -- )
Read-Char ( c)
[CHAR] / CASE? IF ( )
Read-Char ( c)
[CHAR] / CASE? IF ( )
CR
EXIT
THEN ( c)
Tabulate
THEN
dup EMIT
[CHAR] . = IF SPACE THEN ( )
;
Knuth comments
In practice we occasionally run into situations where a sequence of
decisions is made via nested `if ... then ... else ...`s, and then
two or more of the branches merge into one. We can manage such
decision-table tasks without `goto`s by copying the common code
into each place, or by defining it as a procedure, but this does
not seem conceptually simpler than to make such cases `goto` a
common part of the program.
]
...
Tree Searching
Knuth's description
This is part of the well-known "tree search and insertion" scheme,
where a binary search tree is being represented by three arrays.
_A[i]_ denotes the information stored at node number _i_, and
_L[i]_, _R[i]_ are the respective node numbers for the roots of
that node's left and right subtrees; empty subtrees are represented
by zero. The program searches down the tree until finding an empty
subtree where is can be inserted. ... For convenience I have
assumed in this example that _x_ is not already present in the
search tree.
: Example-5 ( x -- j )
0 \ Head of Tree. ( x i)
LABEL compare:left LABEL compare:right
2dup 'th A @ >
dup 'th L @ IF
'th L @
GOTO compare:left
ELSE
Next-Node @ over 'th L !
GOTO insert
THEN
ELSE
dup 'th R @ IF
'th R @
GOTO compare:right
ELSE
Next-Node @ over 'th R !
GOTO insert
THEN
THEN
LABEL insert DROP ( x)
Next-Node @ ( x j)
2dup 'th A !
NIP ( j)
dup 'th L OFF
dup 'th R OFF
Next-Node ++ ;
Knuth first eliminates `goto` by using a local variable. I'll use
the return stack to hold it. This of course takes more processing,
but I think it is a typical Forth approach.
: Example-5a ( x -- j )
TRUE >R
0 BEGIN ( x i)
R@
WHILE
2dup 'th A @ > IF
dup 'th L @ IF
'th L @
ELSE
Next-Node @ over 'th L !
FALSE R> DROP >R
THEN
ELSE
dup 'th R @ IF
'th R @
ELSE
Next-Node @ over 'th R !
FALSE R> DROP >R
THEN
THEN
REPEAT R> DROP DROP ( x)
Next-Node @ ( x j)
2dup 'th A !
NIP ( j)
dup 'th L OFF
dup 'th R OFF
Next-Node ++ ;
Knuth introduces C. T. Zahn's situation indicator as a new form of
control structure. This can be emulated with `GOTO`. Here is an
illustration.
: Example-5b ( x -- j )
0 BEGIN ( x i)
2dup 'th A @ > IF
dup 'th L @ 0= IFGOTO left-leaf-hit
'th L @
ELSE
dup 'th R @ 0= IFGOTO right-leaf-hit
'th R @
THEN
AGAIN
LABEL left-leaf-hit
Next-Node @ over 'th L !
GOTO insert
LABEL right-leaf-hit
Next-Node @ over 'th R !
LABEL insert DROP ( x)
Next-Node @ ( x j)
2dup 'th A !
NIP ( j)
dup 'th L OFF
dup 'th R OFF
Next-Node ++ ;
So far Knuth's examples have been concerned with removing gotos.
Forth's multiple exits take care of most of his problems. Example-5b
is a good candidate for GOTO.
This is a good place to stop in Knuth's article. The rest of the
examples are not conducive to being transcribed. The challenge to the
reader is to give a goto-less version of Example-5b that's as easy to
understand.
Here is a direct transcription of Example-5b into Forth logic, using
function call for `goto`s, and bottom-up logic replacing top-down
logic.
: Insert-Node ( x i -- j)
DROP ( x)
Next-Node @ ( x j)
2dup 'th A !
NIP ( j)
dup 'th L OFF
dup 'th R OFF
Next-Node ++ ;
: Left-Leaf-Hit ( x i -- same )
Next-Node @ over 'th L !,
Insert-Node ;
: Right-Leaf-Hit ( x i -- same )
Next-Node @ over 'th R !
Insert-Node ;
: Example-5b ( x -- j )
0 BEGIN ( x i)
2dup 'th A @ > IF
dup 'th L @ 0=
IF Left-Leaf-Hit EXIT THEN
'th L @
ELSE
dup 'th R @ 0=
IF Right-Leaf-Hit EXIT THEN
'th R @
THEN
AGAIN ;
A final note from Knuth.
Reduction of Complication
There is one remaining use of `goto` for which I have never seen a
good replacement, and in fact it's a situation where I still think
`goto` is the right idea. This situation typically occurs after a
program has made a multiway branch to a rather large number of
different but related cases. A little computation often suffices to
reduce one case to another; and when we've reduced one problem to a
simpler one, the most natural thing is for our program to `goto`
the routine that solves the simpler problem.
\\ // \\ // \\ // \\ // \\ // \\ // \\ // \\