XSL 101
Alright, let's dive right in since getting hands dirty with code is the only way to really learn the digital ropes of anything.
Suppose that we have a list of books that a person has read during the course of a year. It is represented as XML and looks something like this:
<?xml version="1.0" encoding="UTF-8"?>
<Books>
  <Book>
    <Name>In Search Of Memory</Name>
    <Author>Eric Kandel</Author>
  </Book>
  <Book>
    <Name>Lord of the Flies</Name>
    <Author>William Golding</Author>
  </Book>
</Books>
Not an exceptionally prolific reader but at least the choice of books is excellent!
Since examining raw XML files is far from enjoyable, we would like to generate a more visually pleasing representation in the form of a PDF. There should be a simple header, page numbers in the footer, each book on an individual page, and the document shall end with a cheery "That's all folks!" proclamation. It turns out that XSL allows us to achieve this in no time. (No wonder, since it has been designed specifically for the purpose of manipulating XML input...)
Even though the following code is rather simple, it enables us to explore some of the core concepts of designing XSL style sheets. Brace yourself, here it comes:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
    version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <!--
    - Main template.
    -->
  <xsl:template match="/">
    <fo:root font-family="Georgia" font-size="10">
      <fo:layout-master-set>
        <!-- page master -->
        <fo:simple-page-master
            master-name="page"
            margin="25"
            page-height="842"
            page-width="595">
          <fo:region-before
              region-name="header"
              extent="50"
              display-align="after"/>
          <fo:region-body
              region-name="body"
              margin-top="60"
              margin-bottom="100"/>
          <fo:region-after
              region-name="footer"
              extent="100"/>
        </fo:simple-page-master>
      </fo:layout-master-set>
      <!-- page sequence typically defines a run of pages (section, chapter, etc.) -->
      <fo:page-sequence master-reference="page">
        <!-- header -->
        <fo:static-content flow-name="header">
          <fo:block
              border-bottom="1 solid black"
              text-align="center">Books header</fo:block>
        </fo:static-content>
        <!-- footer -->
        <fo:static-content flow-name="footer">
          <fo:block
              border-top="1 dotted black"
              text-align="right">A beautiful footer</fo:block>
        </fo:static-content>
        <!-- body -->
        <fo:flow flow-name="body">
          <!-- iterate over the books -->
          <xsl:apply-templates select="/Books/Book"/>
          <!-- last page -->
          <fo:block>That's all folks!</fo:block>
        </fo:flow>
      </fo:page-sequence>
    </fo:root>
  </xsl:template>
  <!--
    - Individual book template.
    -->
  <xsl:template match="/Books/Book">
    <!-- book block, page break inserted after each one -->
    <fo:block page-break-after="always">
      <fo:block
          font-weight="bold">Book #<xsl:value-of select="position()" /></fo:block>
      <fo:block>Name: <xsl:value-of select="./Name" /></fo:block>
      <fo:block>Author: <xsl:value-of select="./Author" /></fo:block>
    </fo:block>
  </xsl:template>
</xsl:stylesheet>
If you feel a little overwhelmed right now, don't worry, we will go through all the individual parts in detail.
Setup & Layout
- It starts with an XML declaration. The declaration is optional in XML 1.0, but it is good practice to start every XML file, which by definition includes every XSL file, with it, since it states the XML version and the character encoding.
- The <xsl:stylesheet> element is the root tag of the style sheet.
- The xmlns attribute prefix denotes an XML Namespace declaration. In this style sheet, we are only using two namespaces: the xsl namespace itself and the fo namespace, whose main purpose is visual feats such as layout definition and element styling. These are, by far, the most common namespaces to occur in style sheets.
- <xsl:template match="/"> marks the beginning of the main template of the style sheet.
- The layout master set, declared by the <fo:layout-master-set> tag, describes the masters used in the document. For all intents and purposes, masters are basically page templates, but let's stick to the name masters so as not to confuse them with actual XSL templates.
- Individual masters are defined inside the layout master set, typically using the <fo:simple-page-master> element. It is here where you can specify page height, width, margins, etc. The name of the master, useful for later reference, is specified in the master-name attribute. As you can see, each master can also contain various <fo:region-...> elements, assigning specific roles to certain parts of the page.
- A page sequence is a run of pages with similar properties. In our case, it is the entire document. Typically, a page sequence references a specific master via the master-reference attribute, inheriting its settings. In our case, the master's name is, simply, page.
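To make the link between master-name and master-reference concrete, here is a minimal sketch of a stand-alone XSL-FO document with a second, landscape master. The name page-landscape and the sample text are hypothetical and not part of the style sheet above. Lengths are written without units to match the style sheet; the XSL-FO specification defines lengths with an explicit unit (for example 842pt), so check what your processor accepts.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <!-- hypothetical landscape master: height and width of the portrait master swapped -->
    <fo:simple-page-master
        master-name="page-landscape"
        margin="25"
        page-height="595"
        page-width="842">
      <fo:region-body region-name="body" margin-top="60" margin-bottom="100"/>
      <fo:region-before region-name="header" extent="50" display-align="after"/>
      <fo:region-after region-name="footer" extent="100"/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <!-- master-reference must equal the master-name declared above -->
  <fo:page-sequence master-reference="page-landscape">
    <!-- flow-name must equal the region-name of the region that receives the content -->
    <fo:flow flow-name="body">
      <fo:block>A wide page</fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
```

A document may declare several masters in one layout master set and let each page sequence pick the one it needs.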
Content
Give me some content finally, I hear you screaming! Alright, alright, calm down, here it comes. Just look at the contents of the <fo:page-sequence> element:
- The first <fo:static-content> is linked to the fo:region-before via its flow-name attribute. This implies that it is featured as the header of every page.
- The second <fo:static-content>, again via its flow-name attribute, references the fo:region-after region, which means that we got ourselves a footer.
- <fo:flow flow-name="body"> is the body of the document. It consists of flow objects arranged on its pages, the real content.
- As you can see, it only contains an enigmatic <xsl:apply-templates> element and then the final incantation alluding to Road Runners.
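The introduction asks for page numbers in the footer, while the footer in the style sheet prints fixed text. One possible way to get them, sketched here as a drop-in replacement for the footer's <fo:static-content> element, is the standard <fo:page-number> formatting object. The "Page" label is an assumption made for illustration.

```xml
<!-- footer: fragment meant to sit inside the <fo:page-sequence> of the style sheet -->
<fo:static-content flow-name="footer">
  <fo:block
      border-top="1 dotted black"
      text-align="right">Page <fo:page-number/></fo:block>
</fo:static-content>
```

Because static content is repeated on every page of the sequence, the processor fills in the current page number each time the footer is rendered.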
Templates
To understand what's going on here, we need to take a short detour from our dissection of the style sheet code.
The <xsl:apply-templates> element does exactly what its name suggests: it applies a template to the element(s) supplied via the select attribute. (Basically, it is a for-each loop.) What template? Glad you asked! The template that matches the provided selection. In our case, that happens to be the one specified at the bottom of our style sheet.
<xsl:template match="/Books/Book"> declares a template that matches every Book element directly under the root Books element. This template will be executed for every single book, i.e. twice in our case.
- The first <fo:block> is just a wrapper for the content. Thanks to its page-break-after="always" attribute, a page break is inserted after every individual book listing.
- As you can see, the position() function provides us with the position, often also referred to as an index, in the "for loop" template application. In other words, it tells us what iteration we are on in the current context (1-indexed).
- The last two <fo:block> elements are the only parts of the style sheet that actually use the provided XML data. Using the <xsl:value-of> element, they obtain and print the name and author of each book.
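To illustrate the "for each loop" comparison, the same body could be sketched with an explicit <xsl:for-each> instead of a separate template. This fragment is an alternative to the <fo:flow> element of the style sheet, not the form the style sheet actually uses, and it should produce the same formatting objects for the sample input.

```xml
<!-- fragment meant to replace the <fo:flow> of the style sheet -->
<fo:flow flow-name="body">
  <!-- inline loop instead of xsl:apply-templates plus a matching template -->
  <xsl:for-each select="/Books/Book">
    <fo:block page-break-after="always">
      <!-- position() is 1 for the first selected Book, 2 for the second -->
      <fo:block font-weight="bold">Book #<xsl:value-of select="position()"/></fo:block>
      <!-- Name and Author are resolved relative to the current Book -->
      <fo:block>Name: <xsl:value-of select="Name"/></fo:block>
      <fo:block>Author: <xsl:value-of select="Author"/></fo:block>
    </fo:block>
  </xsl:for-each>
  <fo:block>That's all folks!</fo:block>
</fo:flow>
```

The template-based version tends to scale better: the book layout lives in one named place, and other element types can later get templates of their own without making the main template grow.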
More advanced concepts
Now that you are familiar with the very basics of XSL style sheet design, you might want to explore some of the more advanced concepts. Check out the left menu for the parts this documentation covers. It is highly recommended that you read them in the supplied order.