Explanation of DTD (Document Type Definition)


A DTD is a set of rules for marking up text content, written according to either the SGML or the XML standard. It specifies, for example, that every time I mean the word "Xanadu" as it appears in a poem (not the song by Olivia Newton-John), I will write it inside an agreed tag, something like <poem>Xanadu</poem>. Everyone else in my field, say the Humanities, will do the same.
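
As a rough sketch of that idea, the Python snippet below embeds a tiny DTD in a document and pulls out only the poem sense of the word. The element names (notes, line, poem, song) are made up for illustration and do not come from any real DTD. Note also that the standard-library parser used here reads the DTD but does not validate the document against it.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary: the DTD inside the DOCTYPE declares which tags
# exist and what each may contain. None of these names come from a real DTD.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE notes [
  <!ELEMENT notes (line+)>
  <!ELEMENT line (#PCDATA | poem | song)*>
  <!ELEMENT poem (#PCDATA)>
  <!ELEMENT song (#PCDATA)>
]>
<notes>
  <line>In <poem>Xanadu</poem> did Kubla Khan a stately pleasure-dome decree.</line>
  <line>Olivia Newton-John sang <song>Xanadu</song>.</line>
</notes>
"""


def words_tagged(xml_text, tag):
    """Return the text of every element named `tag`, in document order."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.iter(tag)]


# The same word appears twice, but the markup keeps the two meanings apart.
poem_words = words_tagged(DOCUMENT, "poem")
song_words = words_tagged(DOCUMENT, "song")
print("poem:", poem_words)
print("song:", song_words)
```

The point is that a program never has to guess which "Xanadu" I meant; the tag says so.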

DTDs can be made by anyone for any purpose. Getting everyone to agree on one, however, is hard at times; it's like getting everyone to speak Esperanto. New discoveries also can't be tagged until everyone agrees, except with XML, where tags can continually be added to DTDs for individual or group use. In addition, a DTD only says how I want to identify content, not how that content looks. To determine how it looks, I have to write a separate stylesheet, such as a DSSSL (Document Style Semantics and Specification Language) file, which says, in effect: any time I start a new chapter, center the heading, put it in bold, and set it in such-and-such a font size.
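
To make the split between content and appearance concrete, here is a small Python sketch. It is not written in any real stylesheet language; the rule table, the tag names (book, chapter, para), and the style values are all assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Content: says what each piece of text is, and nothing about appearance.
CONTENT = "<book><chapter>Chapter One</chapter><para>It begins.</para></book>"

# Presentation: a hypothetical rule table kept apart from the content.
# A real stylesheet language would express these rules in its own syntax.
STYLE_RULES = {
    "chapter": {"align": "center", "weight": "bold", "size": "18pt"},
    "para": {"align": "left", "weight": "normal", "size": "12pt"},
}


def describe_layout(xml_text, rules):
    """Pair the text of each element that has a rule with that rule."""
    root = ET.fromstring(xml_text)
    return [(el.text, rules[el.tag]) for el in root.iter() if el.tag in rules]


layout = describe_layout(CONTENT, STYLE_RULES)
for text, style in layout:
    print(text, style)
```

Changing how chapters look means editing the rule table only; the marked-up text stays exactly as it was.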

Scholars made a DTD called TEI (Text Encoding Initiative), but, sort of like a Microsoft product, it's huge, it tries to do too much for too many people, and it ultimately fails to be efficient. Archivists have EAD (Encoded Archival Description), and so on. This is all well and good if a bunch of folks agree on a DTD, but agreement still doesn't solve the problem of finding software that can read it. XML solves this: any XML parser can read any well-formed XML document, whatever tags it uses.
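
A quick way to see the software point is to run one generic parser over two unrelated vocabularies. The sketch below uses Python's standard-library XML parser. The two fragments are invented for illustration; they are not real TEI or EAD, which are far richer.

```python
import xml.etree.ElementTree as ET


def element_names(xml_text):
    """List the distinct tag names in a well-formed document, first-seen order."""
    names = []
    for element in ET.fromstring(xml_text).iter():
        if element.tag not in names:
            names.append(element.tag)
    return names


# Two made-up fragments in different vocabularies (hypothetical tag names).
scholarly = "<text><body><p>In <name>Xanadu</name> did Kubla Khan</p></body></text>"
archival = "<finding_aid><collection><title>Letters</title></collection></finding_aid>"

# The same function reads both without knowing either set of tags in advance.
scholarly_tags = element_names(scholarly)
archival_tags = element_names(archival)
print(scholarly_tags)
print(archival_tags)
```

The parser knows nothing about scholarship or archives. It only knows the XML syntax, and that is enough to read both.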