A DTD (Document Type Definition) is a set of markup declarations that defines the legal building blocks of an XML document. It specifies the document structure as a list of legal elements and attributes. Writing a DTD requires prior knowledge of XML. A DTD can be declared inline inside an XML document, or as an external reference. The following is an example of an internal DTD declaration:
<?xml version="1.0"?>
<!DOCTYPE note [
  <!ELEMENT note (to,from,heading,body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT heading (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<note>
  <to>Tove</to>
  <from>Jane</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend</body>
</note>
In the above example, !DOCTYPE note declares that the root element of this document is note. !ELEMENT note declares that the note element must contain exactly four child elements, in this order: “to,from,heading,body”. Each of the remaining declarations (!ELEMENT to, !ELEMENT from, !ELEMENT heading and !ELEMENT body) declares that element to be of type “#PCDATA”, that is, parsed character data.
As an example of an external declaration, where the DTD is stored in a separate file (here note.dtd), consider the following piece of code:
<?xml version="1.0"?>
<!DOCTYPE note SYSTEM "note.dtd">
<note>
  <to>Tove</to>
  <from>Jane</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
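The document above refers to note.dtd but does not show the file itself. As a minimal sketch, and assuming the external file simply carries the same declarations as the internal DTD shown earlier (its real contents are not given here), note.dtd might look like this:

```xml
<!-- note.dtd: hypothetical contents, mirroring the internal DTD example -->
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
```

Note that an external DTD file contains only the declarations. The <!DOCTYPE note [ ... ]> wrapper stays in the XML document that references it.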
With a DTD, each XML file can carry a description of its own format, and any application can use a standard DTD to verify that the data it receives from the outside world is valid.
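To illustrate what "valid" means here, the following sketch shows a document that a validating parser would be expected to reject. It assumes that note.dtd declares note as (to,from,heading,body), the same content model as the internal DTD shown earlier:

```xml
<?xml version="1.0"?>
<!DOCTYPE note SYSTEM "note.dtd">
<note>
  <to>Tove</to>
  <from>Jane</from>
  <!-- heading is missing, so note no longer matches (to,from,heading,body) -->
  <body>Don't forget me this weekend!</body>
</note>
```

The document is still well-formed XML, so a non-validating parser would accept it. Only a parser that actually checks the document against the DTD would report the missing heading element.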
Seen from a DTD point of view, all XML documents (and HTML documents) are made up of five main building blocks: elements, attributes, entities, PCDATA and CDATA. Elements are the main building blocks of both XML and HTML documents, while attributes provide extra information about elements. Entities are named references that the parser expands; they are how special characters such as <, > and & are written (as &lt;, &gt; and &amp;). PCDATA (parsed character data) is text that WILL be parsed by a parser: the text will be examined for entities and markup. Finally, CDATA (character data) is text that will NOT be parsed by a parser: tags inside the text will NOT be treated as markup and entities will not be expanded.
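The five building blocks can be seen together in one small document. This is only an illustrative sketch: the priority attribute and the signature entity are hypothetical names introduced for this example, and the CDATA block is shown here as a CDATA section inside the body element.

```xml
<?xml version="1.0"?>
<!DOCTYPE note [
  <!-- elements -->
  <!ELEMENT note (to,body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
  <!-- attribute: extra information about the note element (hypothetical) -->
  <!ATTLIST note priority CDATA "normal">
  <!-- entity: a named piece of text expanded by the parser (hypothetical) -->
  <!ENTITY signature "Jane">
]>
<note priority="high">
  <to>Tove</to>
  <body>Thanks &amp; regards, &signature; <![CDATA[<b>kept as text</b> &signature;]]></body>
</note>
```

In the body element, the parser expands &amp; and the first &signature; because they appear in PCDATA. Inside the CDATA section, the <b> tags and the second &signature; are passed through as literal text.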
In a future tutorial, we will discuss these building blocks of DTD in a bit more detail and focus on the benefits of using a DTD over plain XML without one.