unroff Programmer's Manual, section 7.

7. How Troff Input is Processed

To write non-trivial event handling procedures, it helps to look at how troff input is processed, especially since the parser of unroff works somewhat differently from that of ordinary troff. In particular, the parser cannot blindly rescan the result of handlers for escape sequences or special characters, as these handlers will probably generate text in the target language that can no longer be interpreted as troff input. Here is a brief overview of the parsing process.

Each input line is first scanned for references to troff strings and number registers (this scanning pass will later be referred to as the ``expansion phase''). For each `\*' or `\n' sequence found in the input line, unroff checks whether a handler for the string or number register has been defined with defstring or defnumreg; if so, it replaces the reference by the result (or value) of the handler. Otherwise, if a handler for the escape sequence `\*' or `\n' proper has been defined, that handler is called. Otherwise the reference is left untouched and scanning resumes after it[note 1]. Comments are recognized in this phase, too, by calling the handler for the `\"' escape sequence if there is one.
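
As a small illustration of the expansion phase, the sketch below registers a plain string value as the handler for a troff string. The string name ``Pn'' is hypothetical, and the sketch assumes that defstring takes a name and a handler in the same form that defmacro does later in this section; it is not taken from the handlers shipped with unroff.

```scheme
;; Sketch under assumptions: "Pn" is a made-up string name, and the
;; (name handler) calling convention is assumed to match defmacro.
;; The handler here is a plain value rather than a procedure.
(defstring 'Pn "unroff")
```

Under these assumptions, an input line such as ``The \*(Pn manual'' would leave the expansion phase as ``The unroff manual'' before it is checked for a request or macro invocation.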

Next, the parser checks whether the result of the first phase is a request or macro invocation (that is, begins with a period or an apostrophe). If this is the case, the arguments are parsed, mimicking the behavior of ordinary troff. The rules for macro arguments are employed if a handler has been defined with defmacro for the token after the period; otherwise the rules for requests are used. The handler for the macro or request is then used as is or, if it is a procedure, applied to the arguments.

If the input line does not contain a request or macro invocation, it is scanned a second time to take care of escape sequences and special characters (for lack of a better term, we will call this phase ``escape parsing''). Every escape character reference, special character, and inline equation is replaced by the result (or value) of the event handler registered for it, or left in place if there is no handler. Character translations defined by means of defchar are also executed in this phase.

Finally, the result of the escape parsing phase or of the request or macro invocation is checked to see whether it constitutes the end of a sentence, and if so, the handler for this event is called (in the former case, the check is actually applied both before and after escape parsing and must succeed both times). As the final step, the line is output and any handlers for the line event are invoked.

An important thing to note is that the arguments passed to a handler defined for a request or macro are not scanned for escape sequences and special characters. Therefore event procedures must explicitly parse their arguments if desired by calling the Scheme primitive parse (which will be described in the next section). Consider, for example, an event procedure associated with a macro ``IP'':

(defmacro 'IP
  (lambda (IP tag . indent)
    ...))

and a call to the macro with an argument containing a special character:

.IP \(bu

As the argument to the event procedure is only scanned for strings and number registers, the variable tag will be bound to the string ``\(bu''. Applying parse to the argument will turn it into whatever is the target language representation for the special character ``\(bu'' (that is, the result of the event handler for the special character). Whether or not arguments will have to be parsed depends on the particular request or macro; the procedure implementing the request ``.tm'', for instance, will print its ``raw'' argument (a sample event handler for the request ``.tm'' is supplied by unroff).
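
To make the point concrete, here is one way the body of the ``IP'' handler above might be filled in. It is only a sketch: the HTML-like tags are an assumed target language, parse is assumed to accept a single string and return a string, and the optional indent argument is ignored. It is not the handler that unroff itself supplies for this macro.

```scheme
;; Sketch only: the "<dt>"/"<dd>" tags are an assumed target language.
(defmacro 'IP
  (lambda (IP tag . indent)
    ;; tag has already been through the expansion phase (strings and
    ;; number registers), but escape sequences and special characters
    ;; such as \(bu are still in troff form. parse runs the argument
    ;; through escape parsing before it is placed in the output.
    (string-append "<dt>" (parse tag) "<dd>")))
```

With this definition, the call ``.IP \(bu'' would yield the opening tag, followed by whatever the handler registered for the special character ``\(bu'' produces, followed by the second tag. Omitting the call to parse would copy the characters ``\(bu'' to the output unchanged.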

Markup created by unroff 1.0, March 21, 1996, net@informatik.uni-bremen.de