unroff-html-ms - back-end to translate `ms' documents to HTML 2.0


unroff [ -fhtml ] [ -ms ] [ file | option... ]


When called with the -fhtml and -ms options, the troff translator unroff loads the back-end for converting ``ms'' documents to the Hypertext Markup Language (HTML) version 2.0.

Please read unroff(1) first for an overview of the Scheme-based, programmable troff translator and for a description of the generic options that exist in addition to -f and -m. The translation of basic troff requests, special characters, escape sequences, etc., as well as the HTML-specific options, is described in unroff-html(1). For information about extending and programming unroff, see also the Unroff Programmer's Manual.


The -ms extension provides a number of keyword/value options in addition to those listed in unroff(1) and unroff-html(1):
signature (string)
If non-empty, the value of this option together with a <hr> tag is appended to each HTML output file created. The substitute Scheme primitive (as described in the Programmer's Manual) is applied to the value of the option, so that date, time, environment variables, etc. can be interpolated.
split (integer)
This option specifies whether to split the output document into individual files for each major section. If a positive integer level is assigned to the option, a new output file is opened for each numbered header (.NH request) with a level equal to or numerically less than level. Use of this feature requires that the document option has been set, as otherwise the HTML document is sent to standard output. The default value is 0, i.e. all sections will be written to a single file.
toc (boolean)
If true, a table of contents with a hypertext link for each section is generated automatically and inserted after the front matter (title, author information, abstract) and before the first section. Use of this feature requires a non-zero value for the split option. The default is to produce a table of contents if split is non-zero.
toc-header (string)
This option defines the contents of the <h2> header element prepended to an automatically generated table of contents. Its value is subject to a call to substitute. The default is the string ``Table of Contents''.
pp-indent (integer)
The number of non-breakable spaces (as specified by the predefined Scheme variable nbsp) to generate for a paragraph created by the .PP macro. The default is 3. This option, as well as signature, is typically set in the user-preferences file ~/.unroff, or in a document-specific Scheme file or at the beginning of the document proper.
footnotes-header (string)
The contents of the <h2> header element prepended to the footnotes section that is appended to the document if any footnotes were used, and that also appears in the automatically generated table of contents. As with all string options listed in this section, the substitute primitive is applied to the option's value. The default is the string ``Footnotes''.
footnote-reference (string)
This option controls the text generated by each use of the string `\**', which produces a footnote (hypertext) reference. Its value is passed to a call to substitute with the current footnote number as another argument, so that the specifier ``%1%'' can be used to interpolate the footnote number. The default is the string ``[note %1%]''.
footnote-anchor (string)
This option specifies the footnote reference that appears at the beginning of each footnote proper if .FS was called without an argument. The option's value is passed to a call to substitute with the footnote number generated by the last use of `\**' as another argument. The default is ``[%1%]''.
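
The ``%1%'' specifier used by footnote-reference and footnote-anchor can be illustrated with a small Python model. This is only a sketch of the interpolation step: the function name interpolate is hypothetical, and the real substitute primitive is a Scheme procedure that also handles date, time, and environment variables, none of which is modeled here.

```python
def interpolate(template: str, *args: object) -> str:
    """Replace each %N% specifier with the N-th extra argument (1-based).

    Hypothetical stand-in for the numbered-argument part of unroff's
    `substitute' primitive; other specifiers are left untouched.
    """
    result = template
    for index, value in enumerate(args, start=1):
        result = result.replace(f"%{index}%", str(value))
    return result


# Documented defaults of the two footnote options.
FOOTNOTE_REFERENCE = "[note %1%]"
FOOTNOTE_ANCHOR = "[%1%]"

if __name__ == "__main__":
    footnote_number = 3
    print(interpolate(FOOTNOTE_REFERENCE, footnote_number))  # [note 3]
    print(interpolate(FOOTNOTE_ANCHOR, footnote_number))     # [3]
```

With the defaults, the third footnote is referenced in the running text as ``[note 3]'' and introduced in the footnotes section as ``[3]''.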


unroff reads and parses an ``ms'' document composed of one or more input files. As usual, the special file name `-' can be used to interpolate standard input. If no file name is given in the command line, unroff reads from standard input.

The resulting HTML document is sent to standard output, unless a file name prefix is assigned to the document option. In the latter case, the split option controls splitting of the output into separate files at section boundaries as described under OPTIONS above. A number of other features, such as footnotes, also require that the document option is supplied, as separate output files are created for them (regardless of the value of split). In any case, the name of each output file consists of the value of document, followed by an optional suffix, followed by the extension ``.html''.
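
The naming scheme for output files can be sketched in Python as follows. The function name and the example suffix ``-notes'' are hypothetical; the text above does not say which suffixes unroff actually generates, only that the name is the document prefix, an optional suffix, and the extension ``.html''.

```python
def output_file_name(document: str, suffix: str = "") -> str:
    """Compose a name as: document prefix + optional suffix + ".html"."""
    return f"{document}{suffix}.html"


if __name__ == "__main__":
    print(output_file_name("thesis"))            # thesis.html
    # "-notes" is an invented suffix, used only to show the pattern.
    print(output_file_name("thesis", "-notes"))  # thesis-notes.html
```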


To translate an ``ms'' document composed of several input files, unroff could be invoked like this:
unroff -fhtml -ms document=thesis split=2 intro.ms 1.ms 2.ms 3.ms app.ms
The names of all output files will have the prefix ``thesis'', and the resulting HTML document will be split into separate files at each level 1 or level 2 section; headers of level 3 and deeper stay in the file of their enclosing section.
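
The effect of split=2 in this invocation can be modeled with a few lines of Python. This is a sketch of the documented rule only (a new file for each .NH header whose level is less than or equal to a positive split value); the function name and the section titles are invented for illustration.

```python
def opens_new_file(header_level: int, split: int) -> bool:
    """True if a numbered header of this level starts a new output file.

    Models the documented rule: split must be positive, and the header
    level must be equal to or numerically less than split.
    """
    return split > 0 and header_level <= split


if __name__ == "__main__":
    # Hypothetical outline: (level given to .NH, section title).
    outline = [
        (1, "Introduction"),
        (2, "Motivation"),
        (3, "Terminology"),
        (1, "Method"),
    ]
    for level, title in outline:
        action = "new file" if opens_new_file(level, split=2) else "same file"
        print(f"level {level}  {title}: {action}")
```

Under these assumptions the level 3 header is the only one that does not open a new file, and with the default split=0 no header does.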


The following -ms macros are translated (in addition to any user-defined macros):

	.AB	.AE	.AI	.AU	.B	.B1	.B2
	.BD	.BX	.CD	.DE	.DS	.FA	.FE
	.FS	.I	.ID	.IP	.LD	.LG	.LP
	.NH	.PP	.PX	.QP	.R	.RE	.RS
	.RT	.SH	.SM	.TL	.UL	.UX	.XA
	.XE	.XS

These predefined strings and number registers are recognized:

	\*-	\*(DY	\*(MO	\*Q	\*U	\n(PN

In addition, a number of macros are either silently ignored or cause a warning to be printed, because their function either cannot be mapped to HTML 2.0 elements or assumes a page structure:

	.AM	.BT	.CM	.CT	.DA	.EF	.EH
	.HD	.KE	.KF	.KS	.ND	.NL	.OF
	.OH	.P1	.PT	.TM	.MC	.1C	.2C

The font switching macros are based on changes to the fonts `R', `I', and `B', as explained under FONTS in unroff-html(1). Of course, this fails if the fonts (which are mounted on startup) are unmounted by explicit .fp requests.

Upper or lower case letters are accepted as section numbers by .NH when the argument ``S'' is used to set new section numbers. This is useful for appendices and similar constructs.

The translation rule for .IP employs a heuristic to determine whether to generate a definition list or an unordered list: if the first in a sequence of indented paragraph macros is called with a tag consisting of one of the special characters \(bu or \(sq, an unordered list is begun, otherwise a definition list. Since exdented[sic] paragraphs cannot be expressed in HTML 2.0, a warning message is printed when a call to the macro .XP is encountered.

All footnotes are concatenated and placed in a separate output file, and a corresponding section (with a user-defined header) holding the footnotes is appended to the document automatically. Use of the string `\**' generates a hypertext link to the beginning of the footnote created by the next call to .FS and .FE. The actual text generated by using `\**' as well as the footnote reference that appears in the footnote proper are controlled by two options as explained under OPTIONS above. A warning message is printed on termination if `\**' has been used but a corresponding footnote was not seen. As an alternative to `\**', the new request .FA can be used to produce a footnote anchor together with a hypertext link; the anchor is the argument to the macro (however, `\**' itself must not be used in a call to .FA).

Likewise, a hypertext reference is created for each use of the table of contents macros .XS and .XE (optionally accompanied by calls to .XA).


unroff(1), unroff-html(1), troff(1), ms(5 or 7).

Unroff Programmer's Manual.


Berners-Lee, Connolly, et al., HyperText Markup Language Specification--2.0, Internet Draft, Internet Engineering Task Force.


The macro .UL is currently mapped to a call to .I, as underlining is not supported by the HTML back-end of unroff 1.0.

Footnote references and requests such as .sp that cause non-character-level markup to be generated must not be used inside a numbered header.

When creating a hypertext anchor for .XS and .XE, there is nothing to put inside the <a> element; therefore a non-breaking space is used.

Changing the number register format of `NH' to get roman or alphabetic section numbers does not work, obviously.

Markup created by unroff 1.0, March 21, 1996.