In discussions of XML (Extensible Markup Language), especially with
programmers, one of the most overlooked areas is the XSL Transformations
Language (XSLT). Part of the problem may be perception: The XSL
Transformations Language lets you create a style sheet that processes an
XML document. Within the XSLT document, or style sheet, you scan an XML
document for elements, attributes, and so on, and process them as they're
encountered. Many programmers feel that the same tasks can be accomplished
using an API such as SAX (Simple API for XML) or the DOM (Document Object
Model).
XSLT and programming APIs like the DOM and SAX have overlapping
features that might let you accomplish similar tasks. XSLT, however, has
benefits programmers may be overlooking. One such benefit goes to the
heart of XML's goal: the separation of data from program logic and
presentation code. Because XSLT doesn't carry the baggage of platform- or
application-specific code, your separated data can be converted to any
format you choose, combined with application code, and redistributed on
any device, including Palm devices, desktops, and Internet-capable phones.
More important, you can simultaneously share different views of the same
data. In fact, clean data can be retargeted for devices not yet
created, and repurposed in ways your boss never imagined.
The whole point of XSLT is to convert data easily to formats such as
HTML, or to convert XML from one vocabulary to another. If you have
several conversions to make, you can define an assortment of style sheets
that transform your data in different ways.
Unfortunately, static style sheets can cause proliferation problems.
That is, even simple applications can sometimes lead to the development of
hundreds of style sheets. As a first step in solving that problem, we'll
show you how to use the DOM to generate XSLT style sheets dynamically from
script. With some additional program logic, you should be able to replace
possibly hundreds of style sheets with a single script.
Why Generate XSLT Dynamically?
The problem with static style sheets is that when you have several
variables, you can wind up with hundreds of style sheets. For example,
consider a Web site that uses XSLT to customize the output of HTML to
different browsers. For each browser type and version, you'd need a static
style sheet that generates HTML for that browser. Thus, you would have one
style sheet that generates HTML customized for Microsoft Internet Explorer
5, another that outputs HTML for Netscape Navigator 4, and so on. You
would need a dozen style sheets just to support the two major browsers (IE
1 through 5.5 and Navigator 1 through 5).
The example above involves only one XML document type, but a Web site
supports many different types of documents (a table of contents page,
database page, article page, form input page, and so on). Assuming we can
limit the number of document types to ten, you now need 120 (10 x 12)
style sheets to support the rendering of all of your documents in these
two browsers.
As an added catch, suppose my site allows visitors to select one of
four different interfaces, or themes. Each interface requires a style
sheet that renders the same document, but with a different look and feel
within the browser. So supporting the two major browsers with ten document
types and four themes requires 480 (10 x 12 x 4) style sheets. This
situation can result in serious headaches when you're making even a simple
change to a document type's structure, which in turn will affect the style
sheets.
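The arithmetic behind these counts is easy to check. The following is a minimal sketch that uses only the figures given above (12 browser versions, 10 document types, 4 themes); the variable names are illustrative, and the exact product comes to 480.

```javascript
// Figures taken from the text: 12 browser versions, 10 document types, 4 themes.
var browserVersions = 12;
var documentTypes = 10;
var themes = 4;

// One static style sheet is needed for every combination.
var perBrowserAndType = browserVersions * documentTypes; // 120
var withThemes = perBrowserAndType * themes;             // 480
```

Each new variable multiplies the total, which is why the count grows so quickly.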
Interestingly, the bulk of each style sheet is nearly identical to the
other style sheets. For example, rendering themes means importing a
different Cascading Style Sheet. So the only line that changes in the XSLT
document is the one that links in the CSS file. A program or script could
combine the techniques presented here with Case or Switch statements to
handle the different document types, browser types, and themes. Each
Switch statement needs to insert only the minor modifications that a given
style sheet requires. Of course, this is a server-side application and is beyond
the scope of this article.
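As a rough illustration of the Switch idea, the sketch below picks the CSS file for a theme. The theme names and file names are hypothetical, invented only for this example; the point is that a single value differs between otherwise identical style sheets.

```javascript
// Hypothetical theme names and CSS file names, for illustration only.
function cssForTheme(theme) {
  switch (theme) {
    case "classic":
      return "classic.css";
    case "dark":
      return "dark.css";
    case "print":
      return "print.css";
    default:
      return "default.css"; // fall back when the theme is unknown
  }
}

// A generator would place this value in the attribute that links the CSS file.
var href = cssForTheme("dark"); // "dark.css"
```

Similar Switch statements could select the fragments that vary by browser or document type.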
Transforming XML
Before you begin writing (or generating) your transformations, it's
important to understand how style sheets are processed. Basically, an XML
parser takes a marked-up document and creates a treelike structure in
memory. The tree contains the elements, attributes, entities, notations,
and processing instructions of the document. At this point, you can walk
the tree, then access and modify these markup components in document order
using an XML API such as the DOM or SAX. You can, however, also invoke an
XSLT processor to take these objects, convert (or "transform") them, and
theoretically output them in almost any manner imaginable. The XSLT
processor takes the tree generated by the XML processor (the source tree)
and, with all of the transformed elements, creates a new tree (the result
tree).
The file Rocket.xml is a sample XML document, and Stylesheet.xsl is an
XSLT style sheet that can transform the document into HTML. As you can see
from the XML declaration, an XSLT style sheet is also a well-formed XML
document, a fact we can use to create new style sheets dynamically (that
is, at runtime). The basic structure of the style sheet consists of a root
template (as noted by <xsl:template match="/">) and a series of additional
<xsl:template> elements. Within the root template, we want to process the
root node and its immediate child nodes. So it uses an XSLT instruction
with a select attribute to grab the headline, deck, byline, and aBody
elements and place them in the HTML transformation. The additional
templates are used to process descendants of the aBody element.
By the way, I'm using a little trick in the style sheet to render XML
elements. In the transformation, I've included an HTML <LINK> tag, which
associates a CSS style sheet with our HTML output. I wrap the headline,
deck, byline, and so on in <DIV> tags to render these transformed elements.
A <DIV> tag's CLASS attribute corresponds to a style that I've created in
the CSS style sheet. So when the XML document is processed, the style sheet
transforms the XML elements into HTML, and style rules from CSS are used
to apply formatting to the resulting HTML. If this is done from a server
transformation, the browser sees only HTML, never realizing that an XML
document is being served. This, in a nutshell, is how you can use CSS to
render any XML document.
Dynamic Style Sheets
Since a style sheet is a well-formed XML document, you can load a
style sheet into the DOM just like any other XML document. We'll use this
feature to load a basic style sheet into a DOM object, then use DOM
methods to begin constructing a style-sheet instance. Specifically, our
goal is to create a program that automatically generates the style sheet.
Once the style sheet is created, you can save it out to disk or
immediately apply it to an XML document.
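Both outcomes can be sketched with MSXML's own methods. This is a sketch under the assumption that both arguments are MSXML DOM objects; save() and transformNode() are MSXML extensions rather than DOM Level 1, and the function names are mine.

```javascript
// Write the generated style sheet out to disk (MSXML-specific save()).
function saveStylesheet(xslDocument, path) {
  xslDocument.save(path);
}

// Apply the style sheet to a source document held in memory.
// MSXML's transformNode() returns the result of the transformation as a string.
function applyStylesheet(xmlDocument, xslDocument) {
  return xmlDocument.transformNode(xslDocument);
}

// Example calls, assuming both DOM objects have already been created and loaded:
// saveStylesheet(xslDocument, "Stylesheet.xsl");
// var html = applyStylesheet(xmlDocument, xslDocument);
```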
So that you can easily test this example, I've written it as
JavaScript within an HTML document. With some minor tweaks, it can be ported to
an Active Server Page. Even better, the same techniques (with some minor
exceptions) can be used with any XML parser and programming language if
you are using the DOM.
To begin working with the DOM, you must create a new DOM object, then
load an XML document into that object. From that point, you can use DOM
methods to walk the document tree, access nodes, query properties, modify
nodes, create new ones, and so on. Unfortunately, the means for creating
and loading a DOM object are specific to the parser you are using. In this
example, I'm using the MSXML parser, so this part of the code will
necessarily be Microsoft-specific. The rest of the code presented,
however, should work with any DOM Level 1-compliant parser.
In addition, the code for creating a DOM object under the Microsoft
parser depends on whether you're working on the client or server side. For
example, to create a new DOM object in a client application, you would use
a new ActiveXObject() and assign the result to a variable. This is the
technique shown in our example in the file Genxsl.hta. If you are creating
a DOM object on the server side (from an Active Server Page, for
example), then you must use Server.CreateObject() instead.
Genxsl is an HTML document that includes a JavaScript function called
Parse(). Notice that the <BODY> tag of the HTML includes ONLOAD="Parse()". This
invokes the Parse() function as soon as the HTML document is loaded into
the browser. Within Parse(), the first step is to create a new DOM object
as outlined above. But you can see that I have included three different
calls to create a DOM object, two of which have been commented out:
// Create a Document object and report the results.
// The following instantiate different versions of the MSXML parser
// Version 1 ProgID: Microsoft.XMLDOM
// Version 2 ProgID: MSXML2.DOMDocument
// Version 3 ProgID: MSXML2.DOMDocument.3.0
// var xslDocument = new ActiveXObject("Microsoft.XMLDOM");
// var xslDocument = new ActiveXObject("MSXML2.DOMDocument");
var xslDocument = new ActiveXObject("MSXML2.DOMDocument.3.0");
As you can see in the comments, each of these calls will invoke a
different version of the MSXML parser. As presented, Genxsl is running
Version 3 of the MSXML parser, assuming you have Version 3 installed on
your machine. (See http://msdn.microsoft.com/xml for the latest release of
the MSXML parser.) If you want to run MSXML Version 1, simply comment out
the reference to the Version 3 parser and uncomment the code that
instantiates the Version 1 parser.
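As an alternative to commenting lines in and out, a script could try each ProgID in turn and keep the first one that works. This is only a sketch, not what Genxsl does: the createObject parameter is a hypothetical hook of my own so that the same logic can serve the client side (new ActiveXObject) or an Active Server Page (Server.CreateObject).

```javascript
// ProgIDs from the comments above, newest version first.
var PROG_IDS = [
  "MSXML2.DOMDocument.3.0",
  "MSXML2.DOMDocument",
  "Microsoft.XMLDOM"
];

// createObject is a caller-supplied function that takes a ProgID and
// returns a DOM object, or throws if that parser version is unavailable.
function createDomDocument(createObject) {
  for (var i = 0; i < PROG_IDS.length; i++) {
    try {
      return createObject(PROG_IDS[i]);
    } catch (e) {
      // This version is not installed; fall through to the next ProgID.
    }
  }
  return null; // no MSXML parser could be created
}

// Client-side use (Windows script hosts only):
// var xslDocument = createDomDocument(function (id) {
//   return new ActiveXObject(id);
// });
```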
Either way, the result of this operation is assigned to the xslDocument
variable. So the next step is to populate xslDocument with the skeleton
XSLT file Template.xsl. This is done using the load() method. As with DOM
object creation, the method for populating a DOM object is specific to the
parser you are using, so this line will vary from parser to parser.
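With MSXML, the loading step might look like the sketch below. The async property, load(), and parseError are MSXML extensions; the function name and the error handling are my own additions, not taken from Genxsl.

```javascript
// Populate an MSXML DOM object from a file such as Template.xsl.
function loadSkeleton(xslDocument, url) {
  xslDocument.async = false;        // wait until the file has been parsed
  var ok = xslDocument.load(url);   // MSXML's load() returns true on success
  if (!ok) {
    throw new Error("Could not load " + url + ": " +
                    xslDocument.parseError.reason);
  }
  return xslDocument;
}

// Example call, assuming xslDocument was created as shown earlier:
// loadSkeleton(xslDocument, "Template.xsl");
```

Setting async to false keeps the rest of the script from running against a document that is still loading.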
Building the Template Rules
With the xslDocument populated, we can now use DOM methods to build our
style sheet. Because the skeleton XSLT document contains only an empty
<xsl:stylesheet> element, the first task is to create the root template.
Genxsl uses the DOM's createElement() method to create a new element
called xsl:template. Creating this new element doesn't automatically
insert it in the document tree, so I call appendChild() to insert
xsl:template as the last child of <xsl:stylesheet>. Once the element is in
the tree, Genxsl uses setAttribute() to insert the match="/"
attribute-value pair into the <xsl:template> element, which identifies it
as the template for the root node.
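These three calls can be sketched as a small function. It uses only DOM Level 1 methods; the function name is hypothetical, and it assumes xslDocument already holds the skeleton style sheet.

```javascript
// Create the root template and attach it to the style sheet's document element.
function addRootTemplate(xslDocument) {
  // createElement() only creates the node; it is not yet part of the tree.
  var template = xslDocument.createElement("xsl:template");

  // appendChild() inserts it as the last child of the document element.
  xslDocument.documentElement.appendChild(template);

  // match="/" makes this the template for the root node.
  template.setAttribute("match", "/");
  return template;
}
```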
At this point, we want to begin adding the HTML code that will be used
in our transformation. Originally, I tried to insert HTML tags as text.
But when the document is processed, markup characters are replaced with
predefined entities. For example, a tag such as <HTML> is rewritten as &lt;HTML&gt;. Then I
realized that I could simply insert HTML tags as if they were XML markup.
The processor does not care what the markup is; it believes everything is
XML. Thus, Genxsl simply uses createElement() again to create the HTML,
BODY, and LINK tags.
The challenge this time is inserting these HTML elements into the
document tree. For example, <BODY> is a subelement of <HTML>, which in
turn is a subelement (or more appropriately, the content) of
<xsl:template>. When inserting the <HTML> element, Genxsl walks from the
root node (documentElement) to the last child (lastChild) and appends the
newly created element to this node's list of subnodes. Appending <BODY> as
a subelement of <HTML> is a bit easier: Genxsl simply calls appendChild().
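The walk from documentElement to lastChild might be sketched as follows. The function name is hypothetical, and the sketch assumes that BODY nests inside HTML and that the root template is currently the last child of the document element.

```javascript
// Insert the HTML and BODY elements into the root template.
function addHtmlShell(xslDocument) {
  var html = xslDocument.createElement("HTML");
  var body = xslDocument.createElement("BODY");

  // documentElement is the style sheet element; its lastChild is assumed
  // to be the root template that was appended most recently.
  xslDocument.documentElement.lastChild.appendChild(html);

  // BODY goes directly under the HTML element we still hold a reference to.
  html.appendChild(body);
  return body;
}
```

Returning the BODY node saves a second walk through the tree when later elements are added.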
Next, the LINK tag is generated, and DIV tags are inserted in the
document. As mentioned above, the LINK tag brings in a Cascading Style
Sheet, and the <DIV> tags make use of its styles to format the generated HTML.
The last step, at least for this simple case, is to insert our XML
content into the <DIV> tag. In XSLT, this is done using an instruction
element that carries a select attribute. Genxsl again calls createElement()
to create that element for each piece of data we wish to return, sets the
select attribute to correspond with the XML data we want to present, and
inserts the element into the document tree at the point where the <DIV>
element occurs.
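One way to sketch this step is shown below. It assumes xsl:value-of as the instruction (xsl:apply-templates also takes a select attribute), and the function name, the parent parameter, and the "article/headline" path are hypothetical.

```javascript
// Add a DIV whose content is pulled from the source document at transform time.
function addField(xslDocument, parent, className, selectPath) {
  var div = xslDocument.createElement("DIV");
  div.setAttribute("CLASS", className);     // ties the DIV to a CSS style

  // Assumption: xsl:value-of is the instruction that fetches the data.
  var valueOf = xslDocument.createElement("xsl:value-of");
  valueOf.setAttribute("select", selectPath);

  div.appendChild(valueOf);                 // instruction sits inside the DIV
  parent.appendChild(div);                  // DIV goes into the HTML output
  return div;
}

// Example call (the path assumes a root element named "article"):
// addField(xslDocument, body, "headline", "article/headline");
```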
Finally, our script in Genxsl writes the resulting document tree out to
disk, which lets you view the results quickly and easily. To test this
example, simply link the style sheet to your XML document using an
xml-stylesheet processing instruction, then launch the XML document in
Internet Explorer to view the resulting transformation.
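In a file, the link is a single line near the top of the XML document: <?xml-stylesheet type="text/xsl" href="Stylesheet.xsl"?>. The same link could also be added through the DOM, as in the sketch below. It uses the DOM Level 1 methods createProcessingInstruction() and insertBefore(); the function name is hypothetical, and this is not something Genxsl itself does.

```javascript
// Add an xml-stylesheet processing instruction ahead of the document element.
function linkStylesheet(xmlDocument, href) {
  var pi = xmlDocument.createProcessingInstruction(
    "xml-stylesheet",
    'type="text/xsl" href="' + href + '"'
  );
  // The instruction must come before the document element.
  xmlDocument.insertBefore(pi, xmlDocument.documentElement);
  return pi;
}

// Example call, assuming xmlDocument holds the source document:
// linkStylesheet(xmlDocument, "Stylesheet.xsl");
```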
Running the Example
To run this example, copy all of the files you downloaded into a
working directory. Don't worry about Stylesheet.xsl, since Genxsl.hta will
automatically generate it. Note that we must use an .hta file extension
(rather than .html), because this script writes Stylesheet.xsl out to
disk. Because of browser security, you'll receive a JavaScript error if
you give your file an .html file extension.
With this setup, you should be able to launch Genxsl.hta. In a few
moments, Stylesheet.xsl will appear. Now you can launch Rocket.xml. The
document contains a style sheet declaration that links Stylesheet.xsl, so
Rocket.xml should appear in your browser, as shown in the screenshot.
Conclusion
The approach presented here can replace the large number of static style
sheets you would otherwise write, along with the maintenance nightmare they
introduce, with a single script. Another application of
dynamic style sheet generation might be to convert older style sheets to
conform to the newer XSL standards. Even if you're not developing style
sheets, the techniques presented here will be instructive for anyone
creating complex XML documents using the DOM.
Michael Floyd is the author of Building Web Sites with XML, from
Prentice Hall, and teaches Beyond HTML's training course Dynamic XML. He
is also the architect of the Rocket XML framework. He can be reached at
mfloyd@lifestylesSantaCruz.com.