Standards for Electronic Publishing: SGML and ANSI Z39.59

By Evan Owens
Journals Division, The University of Chicago Press

Until fairly recently, paper was the universal format for text interchange between authors, editors, publishers, and readers. As we move into an electronic future, however, we find ourselves using a plethora of file formats: typesetting systems (e.g., Penta, XyVision, or TeX), proprietary word processors (e.g., WordPerfect or Microsoft Word), text interchange formats (e.g., ASCII, Rich Text Format, or Navy DIF), and markup languages (e.g., LaTeX or SGML). The choice of a standard format is critical because we expect to use electronic files not just as a new way to produce the paper journal but as the basis for new ways to store and distribute texts electronically. We have five years of the Astrophysical Journal stored on magnetic tape in a proprietary typesetting format; once a standard format is chosen, these tapes can be translated and used as the basis of an electronic archive to which new material can be added as it is published.

A consensus has emerged in the publishing community that the foundation for electronic publishing will be the adoption of Standard Generalized Markup Language (SGML), otherwise known as ISO 8879 (1986). SGML was designed as a language for electronic documents and is well suited for text databases, hypertext, CD-ROM, and electronic books and journals. SGML documents are not dependent on any hardware, software, formatter, or operating system.

The concept of generalized markup is important. Generalized markup consists of codes or tags that describe the content of the text, such as "heading," "title," or "footnote"; it is completely different from procedural markup, which consists of codes that describe the format, such as "10 point Times Roman" or "indent 1 em."
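The distinction between generalized and procedural markup can be illustrated with a small sketch in Python. Everything in it is hypothetical: the tag names, the sample text, and the two style tables are invented for illustration and are not taken from any DTD or typesetting system. The text is tagged once by what each piece is, and the formatting instructions are kept separately, so the same tagged text can be formatted in more than one way or searched by element.

```python
# Hypothetical illustration: the tag names and style tables below are invented
# for this sketch and do not come from any DTD or typesetting system.

# Generalized markup: each piece of text is labeled by what it IS.
document = [
    ("title", "Standards for Electronic Publishing"),
    ("heading", "Introduction"),
    ("footnote", "SGML is also known as ISO 8879 (1986)."),
]

# Procedural instructions live outside the text, one table per output format.
PRINT_STYLE = {
    "title": "14 point Times Roman",
    "heading": "10 point Times Roman",
    "footnote": "8 point Times Roman, indent 1 em",
}
SCREEN_STYLE = {
    "title": "large bold",
    "heading": "bold",
    "footnote": "small",
}


def apply_style(doc, style):
    """Pair every tagged element with the formatting its tag maps to."""
    return [f"[{style[tag]}] {text}" for tag, text in doc]


def select(doc, wanted_tag):
    """Return only the text of elements carrying the given tag."""
    return [text for tag, text in doc if tag == wanted_tag]


if __name__ == "__main__":
    for line in apply_style(document, PRINT_STYLE):
        print(line)
    for line in apply_style(document, SCREEN_STYLE):
        print(line)
    print(select(document, "footnote"))
```

Because the text itself never mentions fonts or indents, changing the appearance means changing only a style table. A file coded procedurally would have to be recoded for each new output format, and it would offer no reliable way to pick out, say, the footnotes.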
SGML provides a way to describe and validate the structure or hierarchy of a document (e.g., second level headings go inside first level headings, front matter comes before the body of the article, and so on) through a document type definition (DTD). This is important for text databases because it makes it possible to search selected portions of the text, for example, the footnotes, display equations, headings, or references.

To use SGML, one must have a suitable DTD. There is an ANSI standard DTD (ANSI Z39.59) for articles and books that resulted from a joint project of the Association of American Publishers (AAP), the Council on Library Resources, and other organizations; it also includes markup specifications for mathematics and tables. We are constructing a superset of the AAP DTDs to use for the Astrophysical Journal and other AAS publications.

With a DTD available, one then creates documents using the names it specifies to tag the various parts of the manuscript. Ideally, the author would create an SGML document directly; however, SGML coding is tedious without special software. There are SGML-aware validating text editors available, some with WYSIWYG equation and table editors, but they are currently very expensive or not suited for general use. The situation should change for the better early next year, however, as new SGML products are expected from WordPerfect, Frame Technology, and others.

Not everyone will want to write using an SGML editor, even when such tools are widely available. Fortunately, it is possible to translate from one format to another. At the University of Chicago Press we now accept files for some of our journals in common word processor formats and then translate, edit, code, and translate again to typesetting systems.
For the Astrophysical Journal, we plan to accept manuscripts in the AAS version of LaTeX and translate them to SGML; this translation works particularly well because of the structured markup present in a properly coded LaTeX manuscript. Plain TeX files cannot be translated in this way because they contain procedural rather than generalized markup codes. We will then supply SGML files to our typesetters and get SGML files back for archival use.
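The following Python sketch suggests why structured LaTeX lends itself to this kind of translation while plain TeX does not. It is only an illustration, not the translator used at the Press. The command-to-tag table is hypothetical, and its tag names are invented rather than drawn from the AAP or AAS DTDs. A LaTeX command such as \section{...} names the role of its argument, so it can be mapped directly onto a tag; a plain TeX font switch such as \bf says nothing about what the text is, so the sketch has nothing to map.

```python
import re

# Hypothetical mapping from LaTeX structural commands to SGML-style tags.
# The tag names are invented for this sketch; they are NOT the AAP or AAS DTD.
COMMAND_TO_TAG = {
    "title": "title",
    "section": "h1",
    "subsection": "h2",
    "footnote": "fn",
}

# Matches \command{argument} where the argument contains no nested braces.
COMMAND_PATTERN = re.compile(r"\\(\w+)\{([^{}]*)\}")


def latex_to_tags(source):
    """Rewrite known structural commands as paired tags; leave the rest alone."""

    def replace(match):
        command, content = match.group(1), match.group(2)
        tag = COMMAND_TO_TAG.get(command)
        if tag is None:
            # Unknown command: keep the original text unchanged.
            return match.group(0)
        return f"<{tag}>{content}</{tag}>"

    return COMMAND_PATTERN.sub(replace, source)


if __name__ == "__main__":
    # Structured LaTeX: the command names the role of the text.
    print(latex_to_tags(r"\section{Observations}"))
    # Plain TeX: a font switch carries no structural information.
    print(latex_to_tags(r"\bf Observations"))
```

Under these assumptions, the first call prints <h1>Observations</h1>, and the second prints its input unchanged. The sketch deliberately ignores nested braces, optional arguments, environments, and mathematics, all of which a practical translator would have to handle.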