This is a set of rules for how to make a Document Type Definition (DTD). HTML is a DTD, and anyone can write one. A DTD is a set of instructions that says, in effect, "when I want to indicate that Xanadu is a place talked about by Coleridge (instead of an obnoxious song, for instance), I will write this: <poemgeography>Xanadu</poemgeography>".
Right now, WordPerfect 8 for Windows creates it (I'm beta-testing WP 9, and it is way cool), a plug-in called i4i works with MS Word, and of course we mustn't forget Adobe's FrameMaker+SGML 5.5 (see links suggested on the main page for a full range of XML/SGML software). There are various gizmos for other software, too. Internet Explorer 4+ reads it (the 5.0 beta lets the bells and whistles work), as does the Panorama browser. Future Microsoft software will include XML among its text output formats.
The value of this is that it makes searching for research more precise (e.g., "Where is the Xanadu of Kubla Khan discussed?"). There are a multitude of other uses. Other advantages of SGML are that it uses the simplest computer coding (it's hard to include a virus), it is not proprietary (no one owns it), and it preserves content for even the simplest of machines (it's also Year 2000 compliant).
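To make the point about precise searching concrete, here is a minimal sketch in Python. It assumes a small made-up document: the <essay> and <p> elements and both function names are hypothetical, and only <poemgeography> comes from the example above. It compares a plain word search with a search that looks only at tagged places.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document. The <essay> and <p> element names are
# assumptions for illustration; only <poemgeography> comes from the text.
SAMPLE = """<essay>
  <p>Coleridge sets the poem in <poemgeography>Xanadu</poemgeography>.</p>
  <p>The song Xanadu was released much later.</p>
</essay>"""


def paragraph_text(para):
    """Flatten a paragraph element to plain text, dropping the tags."""
    return "".join(para.itertext()).strip()


def find_plain(xml_text, name):
    """Return every paragraph that merely contains the word `name`."""
    root = ET.fromstring(xml_text)
    return [paragraph_text(p) for p in root.iter("p")
            if name in paragraph_text(p)]


def find_places(xml_text, name):
    """Return only paragraphs where `name` is tagged as a poem geography."""
    root = ET.fromstring(xml_text)
    hits = []
    for para in root.iter("p"):
        tagged = [(e.text or "").strip() for e in para.iter("poemgeography")]
        if name in tagged:
            hits.append(paragraph_text(para))
    return hits


if __name__ == "__main__":
    print(find_plain(SAMPLE, "Xanadu"))   # matches both paragraphs
    print(find_places(SAMPLE, "Xanadu"))  # matches only the tagged one
```

The word search cannot tell Coleridge's Xanadu from the song, while the tag-aware search returns only the paragraph where the author marked the word as a place in the poem.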
The fact that anyone can write a DTD has been a problem, however. Scholars set up TEI-ML (Text Encoding Initiative Mark-up Language) some years ago, but it's too big: it tries to do everything. There are lots of other "MLs" out there too. It is impossible to write software to cover all the variables. Hence the value of XML.
It's also very hard to print SGML. It only identifies the kind of information in a document, not how it looks. Printing therefore requires a very complicated "style sheet" (in computerese) called a DSSSL (Document Style Semantics and Specification Language) file, which says, in effect, "any time I start a new chapter, center it, put it in bold, and set the font size to such and such."
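The idea of a style sheet, rules keyed to the kind of information rather than to the text itself, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the STYLE table, its fields, the render function, and the <book>, <chapter>, and <p> elements are all hypothetical, and this is not the syntax of any real style-sheet language.

```python
import xml.etree.ElementTree as ET

# Hypothetical style rules keyed by element name. The element names and the
# rule fields are assumptions for illustration, not a real style language.
STYLE = {
    "chapter": {"align": "center", "bold": True, "size": 18},
    "p": {"align": "left", "bold": False, "size": 12},
}


def render(xml_text, width=40):
    """Apply the matching STYLE rule to each child of the root element."""
    root = ET.fromstring(xml_text)
    lines = []
    for elem in root:
        rule = STYLE.get(elem.tag, STYLE["p"])  # unknown tags fall back to <p>
        text = "".join(elem.itertext()).strip()
        if rule["bold"]:
            text = text.upper()  # upper case stands in for bold in plain text
        if rule["align"] == "center":
            text = text.center(width)
        lines.append(f"[{rule['size']}pt] {text}")
    return lines


if __name__ == "__main__":
    doc = "<book><chapter>One</chapter><p>In Xanadu did Kubla Khan.</p></book>"
    for line in render(doc):
        print(line)
```

The document says only that something is a chapter; every decision about centering, weight, and size lives in the separate rule table, so the same text can be printed differently by swapping the table.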