This manual describes the docspecs system, as implemented in Arboreal 4.0.

The docspecs system allows Arboreal to be extended to support new XML document types. A special XML file, called a docspecs file, contains information about XML document types. This information specifies how Arboreal handles containers, word tagging, metadata extraction, and tag rendering. Arboreal comes with a built-in docspecs file, which provides support for Archimedes, CDLI, termlist, and Rome (the Arboreal outline format) documents. Minimal support is also provided for XHTML. The built-in docspecs can be accessed from within Arboreal via the "magic" URL:

Arboreal will also read a docspecs file supplied by the user. This file must be called docspecs.xml and placed in the user's arboreal directory.1 Information about all new document types should be put into this single file. Document types defined in the user-supplied docspecs will override those in the built-in docspecs.
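
The override rule can be illustrated with a small Python sketch. It is not Arboreal's implementation, and the real docspecs schema is not shown in this section: the element and attribute names used here (docspecs, doctype, name, containers) are hypothetical, as are the helper functions load_doctypes and merge_docspecs. The sketch also assumes that a user definition replaces a built-in definition of the same name as a whole, rather than being merged with it attribute by attribute.

```python
import xml.etree.ElementTree as ET

# Hypothetical docspecs fragments. The element and attribute names
# (<docspecs>, <doctype>, name, containers) are illustrative assumptions,
# not Arboreal's actual schema.
BUILTIN = """
<docspecs>
  <doctype name="archimedes" containers="div"/>
  <doctype name="xhtml" containers="body"/>
</docspecs>
"""

USER = """
<docspecs>
  <doctype name="xhtml" containers="div p"/>
  <doctype name="mytype" containers="section"/>
</docspecs>
"""


def load_doctypes(xml_text):
    """Map each document type name to a dict of its attributes."""
    root = ET.fromstring(xml_text)
    return {dt.get("name"): dict(dt.attrib) for dt in root.iter("doctype")}


def merge_docspecs(builtin_xml, user_xml):
    """Combine both sets of document types.

    A user-supplied type replaces a built-in type of the same name;
    types that appear in only one file are kept unchanged.
    """
    merged = load_doctypes(builtin_xml)
    merged.update(load_doctypes(user_xml))
    return merged


merged = merge_docspecs(BUILTIN, USER)
```

In this sketch the user-supplied xhtml entry replaces the built-in one, archimedes is kept from the built-in set, and mytype is added as a new document type.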

