In short, look at this sentence:
In Carlo Michelstaedter's Persuasione e rettorica, there is a highly original treatment of modernity and extremes of fin de siècle angst.
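We see three italicized text portions: a title (Persuasione e rettorica), an emphasized phrase (highly original), and a foreign phrase (fin de siècle).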
Conventional word processing has made such ambiguity "standard." Anyone who has had to reformat a document from one publisher--or software--to another knows how frustrating this can be. XML makes the identification of information much more specific. So, in the example above (which is done with standard HTML), the phrases have really only been marked (in what the computer reads) with ambiguous italics:
<i>highly original</i>
This kind of marking assumes every reader knows what italicizing signifies, and that all computers can read it (which, we all know, is not often the case!!). With XML, you would instead give each part a different marking that tells the computer what that part is (in sort of the same way you'd tell it to italicize something).
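As a rough sketch of what that marking could look like, here is the example sentence with each italic span labelled by what it is, plus a few lines of Python (standard library only) that read the labels back. The tag names sentence, title, emphasis, and foreign are assumptions made up for illustration; a real project would use whatever tags its discipline has agreed on.

```python
import xml.etree.ElementTree as ET

# Hypothetical tag names: the point is that each span now says WHAT it is,
# not merely that it should be printed in italics.
SENTENCE = (
    "<sentence>In Carlo Michelstaedter's "
    "<title>Persuasione e rettorica</title>, there is a "
    "<emphasis>highly original</emphasis> treatment of modernity and "
    "extremes of <foreign>fin de siècle</foreign> angst.</sentence>"
)


def labelled_parts(xml_text):
    """Return (tag, text) pairs for each marked-up phrase, in document order."""
    root = ET.fromstring(xml_text)
    return [(child.tag, child.text) for child in root]


for tag, text in labelled_parts(SENTENCE):
    print(f"{tag}: {text}")
```

With plain `<i>` tags all three phrases would look identical to the computer; here it can tell a title from an emphasis from a foreign phrase.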
Aside from the obvious increase in precision, there are other advantages. For instance, anything you ever write using these tags (or ones like them for your discipline) can be reformatted--however many documents you've done this way--simply by telling one file to "make all titles italic, all foreign phrases underlined," etc. You never have to re-format the content itself; you just tell your computer what you want it to do with all "title" parts, or "emphasis" parts, etc. Thus your original content is always "safe" from later re-formatting. You don't have to risk damaging your composition just to change its rendition. Plus, XML documents take up less disk space than most word-processing documents, are Year 2000 safe, and, being plain text, can be read by ANYONE on their computer.
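To make the "tell one file" idea concrete, here is a small sketch in Python. It is not how JADE, XSL, or any particular product works; it only assumes a dictionary standing in for the style file, and the tag names and both style tables are invented for the example. The content string is never edited; only the style table changes.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical content, marked up by meaning rather than appearance.
CONTENT = (
    "<sentence>In <title>Persuasione e rettorica</title> there is a "
    "<emphasis>highly original</emphasis> treatment of "
    "<foreign>fin de siècle</foreign> angst.</sentence>"
)

# Two stand-ins for a "style file": semantic tag -> HTML presentation tag.
HOUSE_STYLE = {"title": "i", "emphasis": "b", "foreign": "u"}
OTHER_STYLE = {"title": "u", "emphasis": "i", "foreign": "i"}


def render(xml_text, style):
    """Render one level of semantic tags as HTML according to a style table."""
    root = ET.fromstring(xml_text)
    out = [escape(root.text or "", quote=False)]
    for child in root:
        html_tag = style.get(child.tag)
        text = escape(child.text or "", quote=False)
        # Tags missing from the style table are rendered as plain text.
        out.append(f"<{html_tag}>{text}</{html_tag}>" if html_tag else text)
        # Text that follows the child element, up to the next tag.
        out.append(escape(child.tail or "", quote=False))
    return "".join(out)


print(render(CONTENT, HOUSE_STYLE))
print(render(CONTENT, OTHER_STYLE))
```

The sketch handles only tags that sit directly inside the sentence (no nesting), which is enough to show that a new look means a new table, not a new document.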
You can also use this for more precise searching. You can select every occurrence of Shakespeare as author, or distinguish between a search for Coleridge's Xanadu and the song by Olivia Newton-John. Check out the links above to learn more.
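Here is a small illustration of that kind of search, again in Python with only the standard library. The catalogue, its tag names (catalogue, work, author, title, performer, song), and the entry by "A. Critic" are all invented for the example; only poemgeography comes from this page.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalogue. "Shakespeare" and "Xanadu" each appear twice as
# plain text, but in different roles.
CATALOGUE = """
<catalogue>
  <work><author>Shakespeare</author><title>Hamlet</title></work>
  <work><author>A. Critic</author><title>Essays on Shakespeare</title></work>
  <work><author>Coleridge</author><title>Kubla Khan</title>
        <poemgeography>Xanadu</poemgeography></work>
  <work><performer>Olivia Newton-John</performer><song>Xanadu</song></work>
</catalogue>
"""

ROOT = ET.fromstring(CATALOGUE)


def works_by(root, author):
    """Titles of works whose <author> is exactly the given name."""
    return [w.findtext("title") for w in root.findall("work")
            if w.findtext("author") == author]


def tagged_as(root, tag, text):
    """Count elements with this tag whose text is exactly the given string."""
    return sum(1 for el in root.iter(tag) if el.text == text)


print(works_by(ROOT, "Shakespeare"))            # Shakespeare as author only
print(tagged_as(ROOT, "poemgeography", "Xanadu"))  # Coleridge's place
print(tagged_as(ROOT, "song", "Xanadu"))           # the song
```

A plain word search would return both "Xanadu" hits and both "Shakespeare" hits with no way to tell them apart; the tags are what make the distinction possible.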
XML differs from SGML in a few ways, so that browsers can read it and individuals can literally make their own tags for labelling new kinds of data or discoveries. XML, like SGML, is a set of rules for how to make a Document Type Definition (DTD). HTML is defined by a DTD, and, generally speaking, anyone can write one. A DTD is a set of instructions that says, in effect: when I want to indicate that Xanadu is a place talked about by Coleridge--instead of an obnoxious song, for instance--I will write this:
<poemgeography>Xanadu</poemgeography>
Right now, WordPerfect 8 for Windows creates XML (I'm beta-testing WP 9, and it is way cool), a plug-in called S45 by i4i works with MSWord, and of course we mustn't forget Adobe's FrameMaker 5.5+SGML (see links suggested on the main page for a full range of XML/SGML software). There are various gizmos for other software, too. Internet Explorer 4+ reads XML (5.0 beta lets the bells and whistles work), as does the Panorama browser.
With new tools, XML can be converted to PDF, RTF, or HTML for various forms of reading. Disciplines such as astronomy, bioinformatics, and mathematics have already made their own sets of XML tags.
Disciplines can make their own specific XML tags--even particular departments can--without need of international agreement. This is because XML, unlike SGML, does not insist on a DTD: any tag may be used as long as it is used consistently, is always closed (with a </tagname> marker), has its attribute values in quote marks, and is consistent in its upper/lowercase use.
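Those rules are easy to check mechanically. The sketch below uses the XML parser that ships with Python to test a few variations on the Xanadu tag; the poet attribute is invented for the example, and the check covers only these basic rules (XML calls this being "well-formed"), not whether a document follows any particular DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# One good example and three that each break a single rule.
EXAMPLES = {
    "closed, quoted, consistent case":
        '<poemgeography poet="Coleridge">Xanadu</poemgeography>',
    "tag never closed":
        "<poemgeography>Xanadu",
    "attribute value not in quotes":
        "<poemgeography poet=Coleridge>Xanadu</poemgeography>",
    "upper/lowercase mismatch":
        "<PoemGeography>Xanadu</poemgeography>",
}

for label, text in EXAMPLES.items():
    verdict = "ok" if is_well_formed(text) else "rejected"
    print(f"{label}: {verdict}")
```

Only the first example passes; no agreement with anyone else about the tag name was needed for it to do so.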
Formatting for printing, PDF, RTF (rich text for most word processors), or HTML can be done with JADE, written by visionary James Clark (who also spearheaded the DSSSL standards), or with XSL style sheets, and there is a free, automated mechanism for this. One can also get HTML with the XMLStyler from Arbortext; see the main page links for points of departure. Of course, Corel, Adobe, and MSWord (with a plug-in) allow multiple print or HTML outputs.