This is a set of rules for how to make a Document Type Definition (DTD). HTML is a DTD, and anyone can write one. A DTD is a set of instructions that says "when I want to indicate that Xanadu is a place talked about by Coleridge--instead of an obnoxious song, for instance--I will write this: '<poemgeography>Xanadu</poemgeography>'". Right now, WordPerfect 8 for Windows creates SGML, and there are various gizmos for other software. Internet Explorer 4+ reads it, as does the Panorama browser. Future Microsoft software text outputs will be in SGML.
The value of this is that it makes searching more precise for research
(e.g., "Where is the Xanadu of Kubla Khan discussed?"). There are a
multitude of other uses. SGML has other advantages too: it uses the
simplest computer coding (it's hard to include a virus), it is not
proprietary (no one owns it), and it preserves content for even the
simplest of machines (it's also Year 2000 compliant).
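To make the searching point concrete, here is a minimal sketch in Python. It assumes the marked-up text happens to be well-formed enough for Python's standard XML parser to read it (real SGML can be looser than that). Only the <poemgeography> tag comes from the example above; <essay>, <p>, and <songtitle> are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document. Only <poemgeography> comes from the text;
# <essay>, <p>, and <songtitle> are invented for this illustration.
SAMPLE = """<essay>
<p>Coleridge sets the pleasure-dome in <poemgeography>Xanadu</poemgeography>.</p>
<p>The radio played <songtitle>Xanadu</songtitle> all summer.</p>
</essay>"""


def find_plain(document, text):
    """Plain word search: every paragraph that contains `text` anywhere."""
    root = ET.fromstring(document)
    paragraphs = ["".join(p.itertext()) for p in root.iter("p")]
    return [para for para in paragraphs if text in para]


def find_tagged(document, tag, text):
    """Tag-aware search: paragraphs where `text` sits inside a `tag` element."""
    root = ET.fromstring(document)
    hits = []
    for para in root.iter("p"):
        for elem in para.iter(tag):
            if elem.text and elem.text.strip() == text:
                hits.append("".join(para.itertext()))
                break
    return hits


if __name__ == "__main__":
    print(find_plain(SAMPLE, "Xanadu"))
    print(find_tagged(SAMPLE, "poemgeography", "Xanadu"))
```

The plain search returns both paragraphs, song and all. The tag-aware search returns only the one about Coleridge's Xanadu.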
Because anyone can write a DTD, many incompatible ones have been
written, and this has been a problem.
Scholars set up TEI-ML (Text Encoding Initiative Mark-up Language) some
years ago, but it's too big--it tries to do everything. VT has made
ETD-ML, which is similar to TEI but omits some things and adds others
(but this means everyone must agree to use it for dissertations). There
are lots of other "MLs" out there too. It is impossible to write software
to cover all the variables. Hence the value of XML.
It's also very hard to print SGML. The markup only identifies the
kind of information in a document, not how it looks. Printing therefore
requires a very complicated "style sheet" (in computerese) written in
DSSSL (Document Style Semantics and Specification Language), usually
kept in a .dsl file, which says--in effect--any time I start a new
chapter, center the title, put it in bold, and set it in font size such
and such.
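The idea of a style sheet can be sketched in a few lines of Python. This is not the syntax of any real SGML style language, only an illustration of the principle: the document says what each piece is, and a separate table says how each kind of piece should look. The element names (<book>, <chaptertitle>, <p>) and the style values are assumptions invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical style sheet: element names and values are invented here.
STYLE_SHEET = {
    "chaptertitle": {"align": "center", "bold": True, "size": 18},
    "p": {"align": "left", "bold": False, "size": 12},
}

# Hypothetical document: it carries no formatting instructions at all.
DOC = """<book>
<chaptertitle>Kubla Khan</chaptertitle>
<p>In Xanadu did Kubla Khan a stately pleasure-dome decree.</p>
</book>"""


def apply_styles(document, style_sheet):
    """Pair each element's text with the look the style sheet assigns to it.

    Elements with no rule in the style sheet (here, <book>) are skipped.
    """
    root = ET.fromstring(document)
    styled = []
    for elem in root.iter():
        rule = style_sheet.get(elem.tag)
        if rule is not None:
            styled.append(("".join(elem.itertext()).strip(), rule))
    return styled


if __name__ == "__main__":
    for text, rule in apply_styles(DOC, STYLE_SHEET):
        print(rule, text)
```

Changing the look of every chapter title means changing one entry in the table; the document itself is never touched.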