Going Forth - part 2


Computing Today February 1982 page 61

D S Peckett GOING FORTH
Part two of our programming series delves into the way
you define new words for your language together with
comparative testing.

Last month we had a first look at FORTH. The language turned out to be very different from BASIC and to have many advantages, such as being much faster in operation. On the other hand, it became clear that you can really only appreciate FORTH if you already know something about computer programming, since its effective use does demand more knowledge of a computer's hardware than some other languages do.
In the first article, we concentrated on finding out just what FORTH is, and did not go into any great depth on how to use the language effectively. All the operations which we performed were in the immediate mode; we did not even try to write any programs. This month we will go on to have a closer look at the use of FORTH by studying how new words are defined. We will also look at the use of variables and constants, the way that conditional operators work and some of the language's looping structures.
If you missed the first part of this series, you should note the following conventions that we will be using:
a. Any FORTH function, be it an operator, 'subroutine' identifier, program control structure, etc is called a 'word'.
b. Normally, FORTH words will be obvious from the context but, whenever there might be any confusion, they will be enclosed in double quotes. The quotes are not part of the word.
c. In the sections which show a dialogue with the computer, the computer's output is underlined.

Creating FORTH Words
One of the most remarkable features of FORTH is its ability to be user-extended by the addition of new words. A typical, newly-delivered, FORTH system contains about 100-150 usable words. Writing programs, however, depends heavily on using these words to define new words, which can be used to define yet more words, etcetera!
Eventually, one hopes, the whole function of the program will be represented by a single word. Enter it, and the FORTH interpreter dips up and down through the dictionary, executing the hierarchy of words that went into defining the program.
So, how do we define a new word? By using the DEFINING WORDS ":" and ";" - the form of a new word definition is:

: <wordname> <action> ;

For example, it might be that we often have to calculate the cube of the number on top of the stack. It is easy to define a new word to do the job:

: CUBE DUP DUP * * ; OK

Having done that, we can calculate and print a cube whenever we want, eg:

4 CUBE . 64 OK

This is identical to:

4 DUP DUP * * . 64 OK

The definition of CUBE above is called, for obvious reasons, a 'colon definition'. As in any FORTH statement, it is essential to leave at least one space between each word, and it is conventional to leave at least three after the word name in program listings in order to make them easier to follow.
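
Since the stack does all the work here, it can help to write down what it holds after each word. The sketch below repeats the CUBE definition with the stack contents shown in bracketed comments (deepest item on the left). The comments are our own annotation and assume that your system accepts "(" comments inside a definition.

```forth
: CUBE   ( n -- n*n*n )
   DUP DUP   ( n n n )
   *         ( n n*n )
   * ;       ( n*n*n )

4 CUBE .     ( prints 64 )
-3 CUBE .    ( prints -27 )
```

Assuming the usual two-byte signed numbers, 31 CUBE (29791) is the largest cube that fits; 32 CUBE would need 32768, one more than the largest positive value.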
Since a word can be any combination of ASCII characters, apart from spaces, it is possible to redefine existing FORTH words via colon definitions - the words take on the meaning of the last definition. If we were that daft, we could redefine, say, the 'PRINT' word "." to have the meaning of CUBE. Some, but not all, FORTH systems warn you if you try to re-define an existing word.
Once a word like CUBE has been defined, it has exactly the same status as any other word, and can be used in new definitions.
 
For example, we might want to regularly compute and print (3x^3 + 2x - 7), where x can be entered from the keyboard. Define another word:

: CUBIC
DUP CUBE 3 * SWAP 2 * + 7 - . ;

and the job is done. Fig. 1 shows the action of CUBIC during:

10 CUBIC 3013 OK

You'll realise, I hope, that we could have defined CUBIC without CUBE by merely inserting the CUBE code into the CUBIC definition. In fact, a colon definition could be up to 1024 characters (ie one screen) long.
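
For readers without the figure to hand, the action of CUBIC can be sketched as an annotated listing. The bracketed comments are our own and show the stack after each step for an input of 10; the definitions themselves are the ones given above.

```forth
: CUBE    DUP DUP * * ;

: CUBIC   ( x -- )      ( stack for x = 10 )
   DUP CUBE             ( 10 1000 )
   3 *                  ( 10 3000 )
   SWAP                 ( 3000 10 )
   2 *                  ( 3000 20 )
   +                    ( 3020 )
   7 -                  ( 3013 )
   . ;                  ( empty, 3013 printed )

10 CUBIC                ( prints 3013 )
```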

Defining Good Practice
It is good FORTH programming practice, however, to use lots of short definitions. A definition should rarely need to be more than two to three lines long. By breaking the code up into lots of small blocks, it is much easier to debug a program, since each new word can, and should, be tested in isolation, using FORTH's immediate mode. In fact, it is this ease of testing compiled program segments that helps to make FORTH program development remarkably quick and painless. The short definition approach has the additional benefit of making it much easier to read a program.
A word can be defined in either the immediate mode, as we have done here, or as part of a program. Either way, it has the same status, and can be used afterwards in either immediate or program modes.
Every time that you define a new word, extra code is added to the FORTH dictionary. Eventually, the dictionary could grow high enough in memory to meet the stack coming down, when the result would be a crashed system.

Fig. 2. Fetching a variable's value with the '@' function.

For this reason, and particularly when you are developing new code, it can be useful to erase all your work from the dictionary, and reset the system pointers. The job is done by:

FORGET <word>

This erases <word>, and all the words defined after it, from the dictionary. The normal convention is to start a FORTH session with the dummy definition:

: TASK ; OK

The work can then be erased by FORGET TASK. FORTH programs often start with a TASK definition, and finish with FORGET TASK to free the system for other work.
You will soon know if you try to use a word that has been FORGOTTEN, because you will get an error message of the form:

CUBE CUBE?
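
Put together, a short session using this convention might run as sketched below. The bracketed comments are ours, and the exact wording of the error report will depend on your system.

```forth
: TASK ;                  ( dummy marker at the start of the session )
: CUBE   DUP DUP * * ;
3 CUBE .                  ( prints 27 )

FORGET TASK               ( removes TASK and everything defined after it )
3 CUBE .                  ( CUBE is no longer known, so an error is reported )
```

Later FORTH standards treat FORGET as obsolescent and offer MARKER for the same job, so it is worth checking your own system's documentation.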

When you write FORTH programs, as opposed to lines of immediate code, the colon definitions and so on are written directly into SCREENs, the basic program storage units. FORTH incorporates a line-oriented editor to make this job easy. MMSFORTH also has a powerful screen-oriented editor for program construction and modification.
Whichever form of editor is used, once a system has been created, or loaded from tape or disc, it can either be edited, or else the LOAD word can be used to compile a screen, or series of screens. The compilation puts all the colon definitions into the dictionary, and immediately executes any non-definition lines. Thus, if the screen(s) contain a program and end with the program's name, the program is compiled and run in one operation.

Variables And Constants
So far, we have only looked at ways of putting numbers onto the stack and manipulating them once they are there. As far as possible, FORTH programs should use only the stack, since it is the best way of exploiting the speed of the language, but a few well-chosen variable names can also help to make the program easier to follow. The use of variables and constants can also help to avoid making the stack too complicated - if you are juggling lots of different numbers on the stack, things can get a little hairy - by providing somewhere to keep information when it is not being manipulated.
As in Pascal, and unlike BASIC, variables and constants must be defined before they are used for the first time; they can, however, be defined at any point in a program. Their definitions use the forms:

12 CONSTANT DOZEN OK
   0 VARIABLE SCORE OK

The number at the beginning of the definition sets the constant value, or the initial value of the variable (the value it has at the moment that it is defined). "CONSTANT" and "VARIABLE" are Defining Words that tell the compiler just what to do while, in the usual back-to-front way of FORTH, the parameter names come last.
You should always choose the name of the constant or variable to have real meaning and to help you to understand the program. There is no need to cut down on characters to save RAM because FORTH is compiled and the name of every word is saved in the same way, no matter how many characters it has. It is worth noting here that the way that the system stores word names can cause problems.
Every FORTH word is identified internally by two items of data - its first three characters and the total number of characters in the name. Thus '1TOTAL' and '2TOTAL' are different, but the system cannot distinguish between 'TOTAL1' and 'TOTAL2' - it will always use the one that was defined last. Alternatively, 'TOTAL1' and 'TOTAL1*' are different (but needlessly confusing). The name of a word can use any character except a space.
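
The effect is easy to check for yourself. The sketch below assumes a system that behaves as just described; the names are the ones from the text, and using a constant's name simply leaves its value on the stack for "." to print.

```forth
1 CONSTANT TOTAL1
2 CONSTANT TOTAL2     ( same first three characters, same length )

TOTAL1 .              ( prints 2 if only 'TOT' and the length 6 are kept )
                      ( prints 1 on a system that stores the whole name )
```

If your system prints 1, it keeps more of the name than the scheme described here and the warning does not apply to it.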
Having digressed, let's return to the subject of constants and variables. How do we use them? Constants are easy; whenever a constant's name is used, its value goes straight on to the top of the stack. Thus:

DOZEN . 12 OK

The action of variables is, however, rather more complicated. Whenever a variable name is used, then FORTH actually puts the variable's address on top of the stack. Remember also that data is normally stored in two-byte words; the address that is put on the stack will thus be that of the first of the pair of bytes. In a Z80-based system like MMSFORTH, that byte will hold the eight least-significant bits of the data, but you do not necessarily have to worry about that sort of detail. Suppose that the system compiled 'SCORE' to lie at addresses 24000 and 24001. Then:

SCORE . 24000 OK

Obviously, the address at which a variable is stored is not a great deal of immediate use and so FORTH provides ways of manipulating the data.
"!" (pronounced 'store') puts the value that is second on the stack (2OS) into the address represented by the value that is on top of the stack (TOS). The two numbers are erased from the stack:

50 SCORE ! OK

leaves SCORE set to 50.
"?" treats the number (two bytes remember) that is TOS as an address and prints the two-byte number held at that address. Thus

SCORE ? 50 OK

The address is erased from the stack, although the value that is held in the variable is not altered.
Finally, "@" (pronounced 'fetch') treats the number on the top of the stack as an address, and replaces it with the number held at that address. For example:

SCORE @ 2 * . 100 OK (see Fig. 2)

To bring all these points together, suppose that the variable SCORE (any variable is also a FORTH word) must be incremented by 1, and the new value both printed and saved:

SCORE DUP @ 1 + DUP . SWAP ! 51 OK

Figure 3 shows what is happening; note that the "." does not end the processing, which carries right on as soon as the number has been printed.
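
In keeping with the short-definition approach, the same sequence can be wrapped up as a word that works on any variable. This is only a sketch: BUMP is a name we have made up, and the "0 VARIABLE" form follows this article's dialect (later standards define a variable with no initial value, as plain VARIABLE SCORE).

```forth
0 VARIABLE SCORE      ( initial value first, as in the text )

: BUMP   ( addr -- )
   DUP @              ( addr n )
   1 +                ( addr n+1 )
   DUP .              ( addr n+1, new value printed )
   SWAP ! ;           ( empty, new value stored )

50 SCORE !
SCORE BUMP            ( prints 51 )
SCORE BUMP            ( prints 52 )
SCORE ?               ( prints 52 )
```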
As another example of the use of variables, take the standard quadratic formula:
ax^2 + bx + c

If a, b and c are held as variables A, B and C, how do we calculate this expression in FORTH? Assume that x is already on the stack; we can then use:

DUP A @ * B @ + * C @ + OK

This actually calculates the expression as ((A*X + B)*X + C) - Fig. 4 shows what happens.
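
As a worked check, the sequence can be turned into a word and tried with known coefficients. QUAD and the values 2, 3 and 4 are our own choices for illustration; with x = 5 the nested form gives (2*5 + 3)*5 + 4 = 69, the same as 2*25 + 3*5 + 4.

```forth
2 VARIABLE A   3 VARIABLE B   4 VARIABLE C

: QUAD   ( x -- ax^2+bx+c )
   DUP A @ *   ( x ax )
   B @ +       ( x ax+b )
   *           ( ax^2+bx )
   C @ + ;     ( ax^2+bx+c )

5 QUAD .       ( prints 69 )
```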
A warning from bitter experience - I must stress that using a variable's name only puts its address on the stack (unlike BASIC, Pascal, FORTRAN et al) - forget that and you will get some very strange results. It is also worth remembering that operators such as "!" and "@" aren't limited to use with variables - they will take whatever is on top of the stack and treat it as an address. This can be useful, since it gives you access to anywhere in memory. For instance, a TRS-80 keeps a pointer to the top of memory at addresses 16561 and 16562; you can adjust it to whatever you want from FORTH by:

<nnnn> 16561 ! OK

where <nnnn> represents whatever number you choose. (Actually, you will probably crash the system by altering its pointers like this, but that's another story.) If you like, you can think of "!" and "@" as being equivalent to a NASCOM's 'DOKE' and 'DEEK' respectively.
Sometimes, of course, you may only wish to move a single byte and not the two-byte words we have met so far. FORTH meets this need with "C!", "C@" and "C?". The first one puts the lower byte of the data that is 2OS into the address at TOS, while the second and third respectively fetch and print the single byte pointed to by the TOS address. If "C@" is used to fetch a byte, it is actually placed on the stack as the lower byte of a two-byte word in which the higher byte is set to zero. These character-oriented words are equivalent to BASIC's 'POKE' and 'PEEK' and could be used, for example, to set a TRS-80's automatic printer Form/Feed to 80 lines/page:

80 16424 C! OK

In practice, though, they are of most use in string handling, where each character is stored in a single byte.
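
A safe way to experiment with the byte operators is to aim them at a variable of your own rather than at system memory. The sketch below assumes a machine that, like the Z80, stores the low byte first; PAIR is a made-up name.

```forth
0 VARIABLE PAIR
258 PAIR !            ( 258 = 1 * 256 + 2 )
PAIR C@ .             ( prints 2, the low byte )
PAIR 1 + C@ .         ( prints 1, the high byte )

65 PAIR C!            ( overwrite the low byte only )
PAIR @ .              ( prints 321, that is 1 * 256 + 65 )
```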

Conditional Operators In FORTH
Like any other computer language, FORTH has a number of conditional operators. They all operate on the top one or two items in the stack, and replace the tested item(s) with a single condition flag. The convention is that if the result of the test is TRUE, then the TOS contains '1'; if it is FALSE, then TOS is zero. In practice, most, if not all, FORTHs will interpret any non-zero number as TRUE.
The standard operators are:

"<". This compares the top two items on the stack and sets TRUE if the TOS is larger than the 2OS. Figure 5 shows the action of:

5 6 7 8 <

">". This is the opposite of "<", setting TRUE if the TOS is smaller than the 2OS. For example, Fig. 6 describes:

5 6 7 8 >

"=". Not surprisingly, this sets TRUE if the top two items are equal.
"0=". This word operates on the TOS only, replacing it with '1' if it is equal to zero and with '0' otherwise. Effectively, it has the colon definition:

: 0= 0 = ;

"0<". This is the last standard conditional operator. It gives the result TRUE if the TOS is negative.
These are the five basic operators, but it is obviously easy to define others if you need them. You might want:

: 0> 0 > ;
or  : <> = 0= ;

If the "<>" definition is unclear, look at Fig. 7. for

5 6 7 8 <>

The "0=" inverts the TRUE/FALSE condition on TOS. Those of you who have met FORTH before will recognize that there are a number of other ways of doing this, but they use words which we have not yet seen.
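
The whole family can be tried out in the immediate mode, as sketched below. The printed flags assume the convention described in this article, with TRUE shown as 1; many later systems use -1 for TRUE and already provide "0>" and "<>", in which case the two definitions are unnecessary.

```forth
: 0>   0 > ;
: <>   = 0= ;

7 8 < .      ( prints 1, since TOS 8 is larger than 2OS 7 )
7 8 > .      ( prints 0 )
8 8 = .      ( prints 1 )
0 0= .       ( prints 1 )
-5 0< .      ( prints 1 )
3 0> .       ( prints 1 )
7 8 <> .     ( prints 1 )
8 8 <> .     ( prints 0 )
```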

Fig. 7. In true comparative form you can also test for inequality between TOS and 2OS.

Branches And Loops In FORTH
Any realistic computer program makes use of conditional branching structures, and of iterative (looping) procedures. In addition, some languages, such as BASIC, force the use of unconditional jumps (GOTOs), leading to poorly-structured programs. FORTH is incapable of providing unconditional jumps, which must be a good thing, but does give the conditional IF...THEN...ELSE structure and three kinds of loop. Together, they make it an excellent language for writing structured software.
Two of the loop structures are conditional, and we will study them in next month's article. Like the IF structure, they rely on the use of conditional operators such as those we have just looked at.

Conditional Branching Structures: BASIC programmers will know the language's

IF <condition> THEN <operation 1>
ELSE <operation 2>

Its exact counterpart in FORTH is:

<condition> IF <operation 1> ELSE
<operation 2> THEN <continue>

You can see that this has the usual back-to-front sequence imposed by RPN.
The TOS is initially set to TRUE/FALSE by <condition>. If TRUE, then <operation 1> is performed, but if the test gave FALSE, then <operation 2> happens. "THEN" defines the end of the conditional sequence and, no matter what the result of the test, the program always resumes at <continue>. As in most languages, the 'ELSE <operation 2>' part is optional.
An important point about this structure is that, like all FORTH looping and branching structures, it can only be used in a colon definition. IF, THEN and ELSE are defining words which must be compiled before they can be used; if you try to use them in the immediate mode, you will get an error message. Obviously, once you have defined a word using them, it can be used immediately.
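
A minimal sketch of the structure with the ELSE part left out is a word that returns the absolute value of a number. MYABS is a made-up name (chosen so as not to clash with any ABS your system may already have), and it negates by subtracting from zero so that it uses only words met so far.

```forth
: MYABS   ( n -- |n| )
   DUP 0<                ( n flag )
   IF 0 SWAP - THEN ;    ( negate only when the flag is TRUE )

-7 MYABS .               ( prints 7 )
7 MYABS .                ( prints 7 )
0 MYABS .                ( prints 0 )
```

The definition is compiled first; only then is the new word used in the immediate mode.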
As an example, let's define "10PRINT" to be a word that prints only numbers that are divisible by 10; if they are not, they are multiplied by 10 and then printed:

: 10PRINT DUP DUP 10 / 10 * = IF .
ELSE 10 * . THEN ;

The block of code before the IF sets up the condition, while preserving a copy of the original number — it relies for its action on FORTH's integer arithmetic.
Having defined the word, we can use it:

18720 10PRINT 18720 OK
537 10PRINT 5370 OK

There are neater ways to define '10PRINT' — how would you improve the definition?
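
One possible answer, offered only as a sketch, is to notice that both branches end by printing, so the "." can be moved after THEN and the ELSE dropped. The name 10PRINT2 is ours, and the word assumes non-negative numbers that fit in two bytes.

```forth
: 10PRINT2   ( n -- )
   DUP DUP 10 / 10 * =    ( n flag, TRUE if n divides by 10 )
   0=                     ( n flag, now TRUE if it does not )
   IF 10 * THEN           ( scale only the awkward numbers )
   . ;                    ( one print serves both cases )

18720 10PRINT2            ( prints 18720 )
537 10PRINT2              ( prints 5370 )
```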
Finite Loops: There are few programs which do not contain a function that has to be repeated a defined number of times. In BASIC we use

100 FOR I=1 TO 10 STEP 1
:
200 NEXT I

The equivalent FORTH construct is:

<upper limit> <start point> DO
<operation> LOOP <continue>

This uses an index, originally set to <start point>, and increments it by 1 on each loop until it is equal to <upper limit>, when it stops and picks up the program at <continue>. On each pass through the loop it performs <operation>, which will be executed at least once but will not occur on the iteration in which the index reaches or passes <upper limit> - it goes up to one less. In this respect it is different from BASIC, so beware!
Inside the loop, it is possible to get a copy of the index by using the word "I". If you are using nested loops, "J" will return the index of the loop immediately outside the one in which the word is used. A warning: the "I" and "J" mechanisms only work if the words are used on the same 'level' as the DO...LOOP. If you are defining extra words to satisfy <operation>, they should not use "I" or "J". Get the index onto the stack before you use the extra words.
Although DO...LOOP must be used in a colon definition, <upper limit> and <start point> can be set up at any time - they are simply the top two items on the stack when DO executes.
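
Both points can be seen in a couple of one-line definitions. SHOW5 and UPTO are made-up names; the first fixes its limits inside the definition, while the second takes its upper value from the stack and adds 1 so that the value itself is included.

```forth
: SHOW5   5 0 DO I . LOOP ;
SHOW5                 ( prints 0 1 2 3 4 )

: UPTO   ( n -- )
   1 + 1 DO I . LOOP ;
3 UPTO                ( prints 1 2 3 )
```

Note that SHOW5 stops at 4, not 5: the upper limit itself is never reached inside the loop.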
For an example, let's define the word SQUAREPRINT which will print the numbers from 1 to any given value, and their squares, one pair to a line:

: SQUAREPRINT 1 + 1 CR DO I I I . *
. CR LOOP ;

Conclusion
Next month, we will really get into the construction of programs, with a closer look at how they are actually put together, and the best way of exploiting the features of the language. We will also see the two conditional loops that FORTH provides, and look in more detail at the significance of the fact that all FORTH systems use two stacks.