Wil Baden 1999-10-19
If you are like most of us, and need a few random numbers for a game or such, then a single linear congruential sequence (LCS) should be good enough. With a little more effort we can get millions and millions of "random" numbers.
TEXTAccording to Knuth, Chapter 3 of The Art of Computer Programming, LCS was introduced by D. H. Lehmer in 1949.
Quoting Knuth,
We choose four magic integers,_
m, the modulus; 0 < m.
a, the multiplier; 0 <= a < m.
c, the increment; 0 <= c < m.
X[0], the starting value; 0 <= X[0] < m.
The desired sequence of random numbers X[n] is then obtained by setting
X[n+1] = (a*X[n] + c) mod m
Knuth gives the following principles (paraphrased) for selecting those numbers.
i. The "seed" number X[0] may be chosen arbitrarily.
ii. The number m should be large, say at least 2^30. Conveniently it may be the computer's word size, since that makes the computation quite efficient.
16-bit word size cannot satisfy this principle.
In Forth we can write a LCS with m as the word size:
VARIABLE RAND-X : RAND-NEXT ( -- u ) RAND-X @ a * c + DUP RAND-X ! ;
Another approach is to use for m an easily referenced large prime within the word size. For 32-bit arithmetic, 2^31-1, which is 2147483647, is a popular choice. The value of c should be 0.
: RAND-NEXT ( -- u ) RAND-X @ a m */MOD DROP DUP RAND-X ! ;
iii. If m is a power of 2, pick a so that a mod 8 is 5. If m is a power of 10, choose a so that a mod 200 is 21.
This with c as chosen below ensures a cycle of m values that pass a test of "potency".
In Starting Forth, a is 31421.
iv. The multiplier a should preferably be chosen between .01*m and .99*m, and its binary or decimal digits should not have a simple, regular pattern.
31421 for a 32-bit word size fails this principle -- it's too small.
Knuth recommends a "haphazard" constant like 3141592621. I call this "Pi21". It's enough digits of pi, with 21 tacked on, to fill a 32-bit word. I think Starting Forth's 31421 was chosen the same way for 16-bit words.
I remember Pi21 by "Now I want a large container of coffee-21".
v. The value of c is immaterial when a is a good multiplier, except that c must have no factor in common with m.
So 1 and a look like good values for c.
This gives:
: RAND-NEXT ( -- u ) RAND-X @ 3141562621 * 1+ DUP RAND-X ! ;
or
: RAND-NEXT ( -- u ) RAND-X @ 1+ 3141562621 * DUP RAND-X ! ;
vi. When m is the word size the least significant digits of random numbers are not very random, so decisions based on the number should always be influenced primarily by the most significant digits.
In other words, don't use MOD
to select a value. The
Starting Forth function is good when m is the word size.
: RAND-UNIF ( u -- n ) RAND-NEXT UM* NIP ;
Starting Forth calls this CHOOSE
.
vii. The randomness in t dimensions is only one part in the t-th root of m.
Don't use a LCS for simulations requiring high resolution.
viii. At most m/1000 numbers should be generated; otherwise the future will behave more and more like the past.
For a 32-bit word size a new scheme or a new multiplier should chosen at least every two million random numbers.
For a 16-bit word size a new scheme or a new multiplier should chosen every few 64 or 65 random numbers.
From this, you can see that you can't get a good single LCS for 16-bit arithmetic. This can be fixed by using multiple LCSs or other methods. Later we'll give a multiple LCS definition for 16-bit Forth.
Here is an exhibition of LCSs that have been popular. I have assigned names for ease of reference here.
Definitions that do not return 0 should not be initialized with 0.
Some of the tests that have been made on the sequences are named, but details are postponed till later.
PI-RAND
is LCS with the multiplier based on PI. This is
given in the summary of LCSs in the last section of
chapter 3 of Knuth's The Art of Computer Programming,
all editions.
: PI-RAND-NEXT ( -- 0..4294967295 ) RAND-X @ 3141592621 * 1+ DUP RAND-X ! ;
SJ-RAND
was proposed by Lewis, Goodman, and Miller in the
IBM Systems Journal, in 1969.
It was used in APL and IMLS subroutine library. It was also an option in SwiftForth 1. The main reason for continued use is that the square of a is less than modulus m and it can be implemented without arithmetic overflow. However such small multipliers have known defects. (16807 is 7^5.)
: SJ-RAND-NEXT ( -- 1..2147483646 ) RAND-X @ 16807 2147483647 */MOD DROP DUP RAND-NEXT ! ;
EASY-RAND
was nominated by George Marsaglia, 1972, as a
candidate for the best multiplier, perhaps because
69069 is easy to remember.
: EASY-RAND-NEXT ( -- 0..4294967295 ) RAND-X @ 69069 * 1+ DUP RAND-NEXT ! ;
BS-RAND
uses the best spectral primitive root for modulus
2147483647. G. S. Fishman found it by brute force in 1986.
: BS-RAND-NEXT ( -- 1..2147483646 ) RAND-X @ 62089911 2147483647 */MOD DROP DUP RAND-X ! ;
EP-RAND
is an efficiently portable multiplier found by
Fishman in 1988.
: EP-RAND-NEXT ( -- 1..2147483646 ) RAND-X @ 48271 2147483647 */MOD DROP DUP RAND-X ! ;
EP2-RAND
is an efficiently portable multiplier found by
L'Ecuyer in 1988.
: EP-RAND-NEXT ( -- 1..2147483338 ) RAND-X @ 40692 2147483647 248 - */MOD DROP DUP RAND-X ! ;
SF-RAND
is the random number generator from Brodie's
Starting Forth. Of course with 16-bit arithmetic,
65535 AND
may be omitted.
: SF-RAND-NEXT ( -- 0..32767 ) RAND-X @ 31421 * 6927 + 65535 AND DUP RAND-NEXT ! ;
C-RAND
is the default random number generator for the
Standard C Library.
C-RAND-NEXT
is C's rand
.
: C-RAND-NEXT ( -- 0..32767 ) RAND-X @ 1103515245 * 12345 + DUP RAND-X ! 16 RSHIFT 32767 AND ;
RANDU
is the egregious RANDU of the '60s and '70s. It
must be initialized to odd values only. Note that the
multiplier in hex is 10003. For any three successive
values, 9X - 6Y + Z is 0 mod 2147487648.
: RANDU-NEXT ( -- 1..2147483647 ) RAND-X @ 65539 * 2147483647 AND DUP RAND-X ! ; Implementations
In the m-is-word-size definitions, the n low-order bits cycle in a 2^n period, as mentioned in (6) above.
( n is 1 ) MARKER ONCE : GO CR 17 0 DO PI-RAND-NEXT 1 AND . LOOP ; GO ONCE 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 ( n is 2 ) MARKER ONCE : GO CR 17 0 DO PI-RAND-NEXT 3 AND . LOOP ; GO ONCE 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 ( n is 3 ) MARKER ONCE : GO CR 17 0 DO PI-RAND-NEXT 7 AND . LOOP ; GO ONCE 7 4 5 2 3 0 1 6 7 4 5 2 3 0 1 6 7 Etcetera.
In m-is-word-size definitions, the second half of the period is the same as the first half with sign-bit inverted. The quarter-cyles are also the same except for the two top bits.
Thanks to the first edition of The Art of Computer
Programming, I've used the function PI-RAND
(with various
names) in different languages for 30 years. But things
are bigger and faster than they used to be. Typically I
now use random numbers to test something a million times.
So I suggest for a simple, powerful, random number
generator for today, to combine EP-RAND
and EP2-RAND
,
as proposed by Knuth.
VARIABLE RAND-X VARIABLE RAND-Y : RAND-NEXT ( -- 1..2147483647 ) RAND-X @ 48271 2147483647 */MOD DROP DUP RAND-X ! RAND-Y @ 40692 2147483399 */MOD DROP DUP RAND-Y ! - DUP 0> NOT IF 2147483647 + THEN ;
According to Knuth the period is about 74 quadrillion.
RAND-X
and RAND-Y
should be initialized independently to
non-zero values for best results.
RAND-X
does not have the problem with low-order bits.
1 RAND-X ! 1 RAND-Y ! ( n is 1 ) MARKER ONCE : GO CR 17 0 DO RAND-NEXT 1 AND . LOOP ; GO ONCE 1 1 0 1 0 1 1 1 0 1 1 0 0 0 1 1 0 ( n is 2 ) MARKER ONCE : GO CR 17 0 DO RAND-NEXT 3 AND . LOOP ; GO ONCE 3 2 0 3 3 3 0 3 0 0 2 1 2 0 3 2 2 ( n is 3 ) MARKER ONCE : GO CR 17 0 DO RAND-NEXT 7 AND . LOOP ; GO ONCE 5 7 5 3 5 6 0 3 5 2 5 5 3 4 3 2 4
With 16-bit arithmetic Forth, three LCSs should be combined. The following is adapted from a suggestion by L'Ecuyer in 1988.
VARIABLE RAND-X VARIABLE RAND-Y VARIABLE RAND-Z : RAND-NEXT ( -- 1..32363 ) RAND-X @ 157 32363 */MOD DROP DUP RAND-X ! RAND-Y @ 146 31727 */MOD DROP DUP RAND-Y ! - DUP 0> NOT IF 32363 + THEN RAND-Z @ 142 31657 */MOD DROP DUP RAND-Z ! - DUP 0> NOT IF 32363 + THEN ;
The exhibits here were originally presented in profane
languages -- outside of the temple of Forth. Some of those
languages do not have a double accumulator for
multiplication, and may fail with integer overflow. When
a times a is less than m and the remainder m
divided by a is less than the quotient then a m */MOD
DROP
can be replaced by the possibly more efficient:
[ m a / ] LITERAL /MOD [ m a MOD ] LITERAL * >R a * R> - DUP 0< IF m + THEN
All intermediate results will be in the official range of integers.
Many conversions from Fortran or another profane language to
Forth use a precomputed quotient and remainder. For a Forth
definition, */MOD DROP
can be used to simplify.
You have been warned not to use MOD
to select random numbers.
With a small range, the problem is neglible, but the following
shows the danger.
1 RAND-X ! 1 RAND-Y ! CREATE BINS 10 CELLS ALLOT : MIL 1000000 * ; MARKER ONCE : GO CR CR ." Distribution Using MOD " CR BINS 10 CELLS ERASE 20000 0 DO \ 9 Digit Number RAND-NEXT 1000 MIL MOD ( u) \ Into 10 Bins 100 MIL / CELLS BINS + 1 SWAP +! ( ) LOOP ; GO ONCE MARKER ONCE : GO 10 0 DO CR I CELLS BINS + @ DUP 4 .R SPACE 50 / [CHAR] * EMITS LOOP ; GO ONCE Distribution Using MOD 2805 ******************************************************** 2288 ********************************************* 1816 ************************************ 1871 ************************************* 1828 ************************************ 1807 ************************************ 1908 ************************************** 1940 ************************************** 1884 ************************************* 1853 ************************************* : RAND-UNIF ( +n -- u ) RAND-NEXT 2* UM* NIP ; MARKER ONCE : GO CR CR ." Distribution Using Multiply " CR BINS 10 CELLS ERASE 20000 0 DO \ 9 Digit Number 1000 MIL RAND-UNIF ( u) \ Into 10 Bins 100 MIL / CELLS BINS + 1 SWAP +! ( ) LOOP ; GO ONCE MARKER ONCE : GO 10 0 DO CR I CELLS BINS + @ DUP 4 .R SPACE 50 / [CHAR] * EMITS LOOP ; GO ONCE Distribution Using Multiply 1977 *************************************** 2042 **************************************** 1974 *************************************** 2024 **************************************** 2021 **************************************** 2007 **************************************** 1981 *************************************** 1991 *************************************** 2001 **************************************** 1982 ***************************************