Last month we saw what eXtensible Markup Language (XML) is and where it can be used. With
a small example of an XML document that laid out the structure of a small video library
collection, I tried to explain how an XML document looks and works. This month, we’ll
put that document to work in an application, which will let us reuse it in different
ways.
To reiterate a few points, XML defines the structure of data in a document. To put it
in perspective, imagine a DOC file to be equivalent to an HTML file, wherein you see the
formatting and the content of the document. An XML equivalent would be the RTF file of the
same document, which defines the structure rather than the formatting. (This is NOT a true
representation of the difference; it’s only meant to quickly put things in
perspective.) XML has a well-defined syntax, but almost no keywords of its own. This is
because it’s more of a "meta-language" that lets you define structures of
your own.
Document-type declarations
To be valid, an XML document must have a DTD (document type definition) associated with
it and must conform to that DTD. The DTD is analogous to declaring variables in a
programming language before using them. DTDs are optional, however. Without one, you can
still create a well-formed XML document, that is, one that follows all the rules of XML
syntax. All valid documents are also well formed.
DTDs have been around for quite some time. HTML also has DTDs. In fact, each released
version of HTML has its own DTD, and some have several: HTML 4, for instance, comes in
strict, transitional (loose), and frameset editions. Typically, browsers don’t need to use
the DTD, as the syntax is hard coded into the HTML engine. What the browsers don’t
know, they ignore.
With XML, however, this approach doesn’t work. After all, one cannot know in
advance what element names an XML document will use or even the order in which they should
appear. DTDs solve this problem. They define the elements that can appear in
the document, their attributes, their child elements, and the kind of data each of them
can carry.
Let’s quickly create a small DTD for a video library.
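A minimal sketch of such a DTD, reconstructed from the description that follows, might
look like this. The file name video.dtd matches the one referenced later in this article.
The parts marked as assumptions (how many "video" and "actor" elements are allowed, what
an "actor" holds, and the type of the "category" attribute) are not spelled out in the
text.

```xml
<!-- video.dtd: a sketch, not a definitive listing -->
<!ELEMENT movies (video+)>                     <!-- assumption: one or more videos -->
<!ELEMENT video (title, date, type, actors)>   <!-- children in this exact order -->
<!ATTLIST video id CDATA #REQUIRED>            <!-- required, character data -->
<!ELEMENT title (#PCDATA)>
<!ELEMENT date (#PCDATA)>
<!ELEMENT type (#PCDATA)>
<!ATTLIST type category CDATA #REQUIRED>       <!-- assumption: CDATA type -->
<!ELEMENT actors (actor+)>                     <!-- assumption: one or more actors -->
<!ELEMENT actor (#PCDATA)>                     <!-- assumption: plain text content -->
```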
This DTD specifies that the (root) element called "movies" has a child element
called "video". It then specifies that this element has further child
elements called "title", "date", "type", and
"actors", in that order. The "video" element also has a required attribute called
"id", which holds character data. Following this reasoning, you can make out that
the "title", "date", and "type" elements all contain parsed
character data, and that "type" also has a required attribute called "category". The
"actors" element in turn holds the details of each individual "actor".
To attach a DTD to an XML document, use the <!DOCTYPE> tag just below the <?xml?>
declaration, like this: <!DOCTYPE movies SYSTEM "video.dtd">.
That’s all there is to it. Now, any XML-compliant application will be able to
understand how to process the document. Real-world DTDs are usually much more complex
than this. Take a look at the HTML 4 DTD to understand how comprehensive DTDs can be.
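As a sketch of how the pieces fit together, here is a small document that points at
video.dtd and follows the structure described above. All element content and attribute
values are made-up sample data, and the sketch assumes that "actors" may hold several
"actor" elements containing plain text.

```xml
<?xml version="1.0"?>
<!DOCTYPE movies SYSTEM "video.dtd">
<movies>
  <!-- All values below are hypothetical sample data -->
  <video id="v001">
    <title>Sample Movie</title>
    <date>1999</date>
    <type category="drama">VHS</type>
    <actors>
      <actor>First Actor</actor>
      <actor>Second Actor</actor>
    </actors>
  </video>
</movies>
```

Swapping the order of "title" and "date", or leaving out the "id" attribute, would leave
the document well formed but no longer valid against the DTD.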
Formatting contents of an XML document
So you’ve got your document laid out in XML. But now what do you do to display it?
If you have IE 5 (the first browser to fully support XML, although IE 4 does have some
support) you can simply open the XML file in it. IE 5 has an XML parser that checks that
the XML is well formed, that is, that it doesn’t contain syntax errors such as missing end
tags or incorrect nesting. If it does find an error, it displays the line that is the
likely cause, along with the reason. If the document is well formed, it displays the
entire document as a collapsible tree. You can expand or collapse items in the tree by
simply clicking the text or the small + or - sign next to each.
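To make those two kinds of error concrete, here is a sketch using made-up sample content.
An XML parser would reject the first fragment and accept the second.

```xml
<!-- Not well formed: <title> is never closed, and <date> and <video> overlap -->
<video id="v001">
  <title>Sample Movie
  <date>1999</video></date>

<!-- Well formed: every element is closed, and elements nest cleanly -->
<video id="v001">
  <title>Sample Movie</title>
  <date>1999</date>
</video>
```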
But this isn’t what you really want to show to your visitors, is it? IE 5’s
rendering is quite Spartan. But with a little ingenuity you can spruce up the document in
any way you please. For this, you’ll need to use a technology called the Extensible
Stylesheet Language (XSL).
XSL is an application of XML. This means that XSL follows all the rules that XML
imposes but has a set of keywords and language constructs of its own as well. XSL is used
with XML in the same way that cascading style sheets (CSS) are used with HTML. Both define
the style or presentation attributes of a document (refer to our August 1998 issue for
more on Cascading Style Sheets).
XSL is very flexible. By defining different XSL files for a single XML document, you
can get a variety of different outputs, like different views of the data in HTML, or even
a set of different XML documents.
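As a small taste of what such a stylesheet looks like, the sketch below turns a video
library document into an HTML list of titles. It is an illustration only: the file name
video-list.xsl is hypothetical, the element names are the ones described in the DTD
section, and the syntax follows the XSLT 1.0 Recommendation (namespace
http://www.w3.org/1999/XSL/Transform), which differs from the working-draft dialect built
into IE 5.

```xml
<?xml version="1.0"?>
<!-- video-list.xsl: hypothetical stylesheet name -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Match the root element and build the page skeleton -->
  <xsl:template match="/movies">
    <html>
      <body>
        <h1>Video library</h1>
        <ul>
          <!-- One list item per video: the title, then the date in brackets -->
          <xsl:for-each select="video">
            <li><xsl:value-of select="title"/> (<xsl:value-of select="date"/>)</li>
          </xsl:for-each>
        </ul>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

A second stylesheet over the same document could just as easily sort the videos by the
"category" attribute or emit another XML vocabulary instead of HTML.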
Now that we’ve seen what XSL is, we’ll use it on an XML document next month.