Normalization (Tag Grouping)

Suppose you have the following data set:

<Books>
  <!-- book #1 -->
  <Published>June</Published>

  <!-- book #2 -->
  <Published>June</Published>
  <Book>Book 2</Book>

  <!-- book #3 -->
  <Author>Book 3 - Author</Author>
  <Published>June</Published>
  <Book>Book 3</Book>

  <!-- book #4 -->
  <Pages>127</Pages>
  <Published>June</Published>

  <!-- book #5 -->
  <Author>Book 5 - Author</Author>
  <Pages>124</Pages>
  <Published>June</Published>
  <Book>Book 5</Book>
</Books>

You would like to iterate over the individual books, even though nothing in the markup wraps the tags of one book together.

Now, as a human, you have no trouble distinguishing one book from another. After all, helpful comments and line breaks have been inserted. Even without them, you would be able to determine which tag belongs to which book. Eventually, anyway.

For an XSL engine, this is a bit trickier. In the past, we have used combinations of following-sibling and preceding-sibling selectors, but there is a catch: when you use them for filtering with the = operator, the comparison is made on the values of the nodes, not on the nodes themselves. In the example above, the only tag present for all books is Published, and it has an identical value for every book, which seriously messes up the results.
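The difference between comparing values and comparing nodes can be seen in a minimal XSLT 2.0 sketch run against the data set above. The variable names are made up for illustration, and the `is` operator is only shown for contrast; it is not part of the solution described in this article.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" />

  <xsl:template match="/Books">
    <!-- Published of book #1 and Published of book #5: two different nodes -->
    <xsl:variable name="first" select="Published[1]" />
    <xsl:variable name="last" select="Published[last()]" />

    <!-- `=` compares the values: "June" = "June" -->
    <xsl:value-of select="$first = $last" />
    <xsl:text>&#10;</xsl:text>
    <!-- `is` (XPath 2.0) compares node identity -->
    <xsl:value-of select="$first is $last" />
  </xsl:template>
</xsl:stylesheet>
```

The first line of the output is `true` and the second is `false`: by value, the two Published tags are indistinguishable, even though they belong to different books.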

We stumbled upon this issue recently and found one clean(ish) way to solve it without pre-parsing the data. The main requirement is that the order of the tags in each item-to-be is fixed. Any number of tags can be missing, but those that are present need to appear in the fixed order so that item boundaries can be determined. One caveat: a boundary is only detected where the tag position fails to increase, so an item that ends early followed by one that starts late (say, a lone Author followed by a lone Book) would be read as a single item. Here it comes, including abundant comments and all the necessary helper functions. It might seem rather long, but it is actually fairly simple to understand.
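Before the full listing, the boundary rule itself can be made concrete with a short sketch. This is an illustration only, not the implementation used in this article: it prints each tag with its position in the fixed order and starts a new output line wherever that position fails to increase. It assumes XSLT 2.0 and the same tag order that is declared in the usage section of this article.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" />

  <!-- fixed tag order, same as the one used later in this article -->
  <xsl:variable name="tagOrder">
    <Author id="1" />
    <Pages id="2" />
    <Published id="3" />
    <Book id="4" />
  </xsl:variable>

  <xsl:template match="/Books">
    <xsl:for-each select="*">
      <xsl:variable name="name" select="name()" />
      <xsl:variable name="pos"
        select="number($tagOrder/*[name() = $name]/@id)" />
      <!-- position of the tag right before this one (NaN for the first tag) -->
      <xsl:variable name="prevName" select="name(preceding-sibling::*[1])" />
      <xsl:variable name="prev"
        select="number($tagOrder/*[name() = $prevName]/@id)" />

      <!-- first tag, or position did not increase: a new item starts here -->
      <xsl:if test="position() = 1 or $pos &lt;= $prev">
        <xsl:text>&#10;</xsl:text>
      </xsl:if>
      <xsl:value-of select="concat($name, '(', $pos, ') ')" />
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Run against the data set above, this should print one line per book, for example `Author(1) Published(3) Book(4)` for book #3. It only marks where items start; collecting the values of each item is what the listing below takes care of.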

Utility functions:

<!--
 - More complicated expressions in an `if ($cond) then ... else ...`
 - sometimes do not get evaluated properly when in `xsl:value-of`.
 - This function makes sure you can write e.g.
 - `<xsl:value-of select="utils:ifElse(1, 'one', 'other')" />`.
-->

<xsl:function name="utils:ifElse">
  <xsl:param name="condition" />
  <xsl:param name="true" />
  <xsl:param name="false" />

  <xsl:value-of select="if ($condition) then $true else $false" />
</xsl:function>

<!--
 - Get a maximum from 2 values.
-->

<xsl:function name="utils:max">
  <xsl:param name="a" />
  <xsl:param name="b" />

  <xsl:value-of
    select="utils:ifElse(number($a) &gt;= number($b), number($a), number($b))" />
</xsl:function>
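The snippets in this article rely on the utils:, local: and fo: prefixes and on XSLT 2.0 features such as xsl:function, so they need to live in a stylesheet that declares them. A minimal wrapper might look like the following. The two example.com namespace URIs are placeholders chosen for illustration, not necessarily the ones used originally; any URI works as long as the prefix is bound, because function names must be in a namespace.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format"
    xmlns:utils="http://example.com/xsl/utils"
    xmlns:local="http://example.com/xsl/local"
    exclude-result-prefixes="utils local">

  <!-- utility functions, the "algorithm" and the usage snippets go here -->

</xsl:stylesheet>
```

The exclude-result-prefixes attribute keeps the two helper namespaces from being declared in the generated output.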

"Algorithm":

<!--
 - Get position of a tag in an item.
-->
<xsl:function name="local:getTagPosition">
  <xsl:param name="element" />
  <xsl:value-of select="number($tagOrder/*[name() = name($element)]/@id)" />
</xsl:function>

<!--
 - Determine whether the given tag (position) still pertains
 - to the currently processed item. The result is the text "0"
 - or "1", so convert it with `number()` before using it in a
 - conditional; a node tests as true regardless of its content.
-->
<xsl:template name="local:belongsToCurrentItem">
  <xsl:param name="tagPosition" />
  <xsl:param name="data" />
  <xsl:param name="max" select="0" />

  <xsl:choose>
    <!-- still some data left -> continue -->
    <xsl:when test="count($data)">
      <xsl:variable name="next" select="local:getTagPosition($data[1])" />
      <xsl:variable name="nextMax" select="utils:max($max, $next)" />

      <xsl:choose>
        <!-- non-increasing sequence -> this is the next item already -->
        <!-- compare as numbers: two untyped values would otherwise be compared as strings -->
        <xsl:when test="number($next) &lt;= number($max)">0</xsl:when>
        <!-- checked tag's position not higher than the max -> this is the next item already -->
        <xsl:when test="number($tagPosition) &lt;= number($nextMax)">0</xsl:when>
        <!-- continue the search -->
        <xsl:otherwise>
          <xsl:call-template name="local:belongsToCurrentItem">
            <xsl:with-param name="tagPosition" select="$tagPosition" />
            <xsl:with-param name="data" select="$data[position() > 1]" />
            <xsl:with-param name="max" select="$nextMax" />
          </xsl:call-template>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:when>
    <!-- no data left -> return true as this is the max -->
    <xsl:otherwise>1</xsl:otherwise>
  </xsl:choose>
</xsl:template>

<xsl:function name="local:belongsToCurrentItem">
  <xsl:param name="tagPosition" />
  <xsl:param name="data" />

  <xsl:call-template name="local:belongsToCurrentItem">
    <xsl:with-param name="tagPosition" select="$tagPosition" />
    <xsl:with-param name="data" select="$data" />
  </xsl:call-template>
</xsl:function>

And finally, the usage:

<!--
 - Specify the tag order in any given item.
 -
 - You need to list all the possible tags but naturally,
 - not all tags need to be present in every item to be parsed.
-->
<xsl:variable name="tagOrder">
  <Author id="1" />
  <Pages id="2" />
  <Published id="3" />
  <Book id="4" />
</xsl:variable>

<!--
 - Collect values for a single item, print them as a table row,
 - then recurse on the remaining tags.
-->

<xsl:template name="books">
  <xsl:param name="data" />

  <xsl:if test="count($data)">
    <!-- get values for the next item -->
    <xsl:variable name="item">
      <Item>
        <xsl:for-each select="$data">
          <xsl:variable name="current" select="." />
          <xsl:variable name="position" select="position()" />

          <!-- add this only if it still pertains to the current item -->
          <xsl:if test="number(local:belongsToCurrentItem(
            local:getTagPosition($current),
            $data[position() &lt; $position])
          )">
            <xsl:element name="{name($current)}">
              <xsl:value-of select="$current" />
            </xsl:element>
          </xsl:if>
        </xsl:for-each>
      </Item>
    </xsl:variable>

    <!-- print it -->
    <fo:table-row height="60">
      <fo:table-cell>
        <fo:block><xsl:value-of select="$item/Item/Book" /></fo:block>
      </fo:table-cell>

      <fo:table-cell>
        <fo:block><xsl:value-of select="$item/Item/Pages" /></fo:block>
      </fo:table-cell>

      <fo:table-cell>
        <fo:block><xsl:value-of select="$item/Item/Author" /></fo:block>
      </fo:table-cell>

      <fo:table-cell>
        <fo:block><xsl:value-of select="$item/Item/Published" /></fo:block>
      </fo:table-cell>
    </fo:table-row>

    <!-- call recursively -->
    <xsl:call-template name="books">
      <xsl:with-param name="data" select="$data[position() > count($item/Item/*)]" />
    </xsl:call-template>
  </xsl:if>
</xsl:template>
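To start the recursion, the books template has to be called once with all the sibling tags. A minimal sketch, assuming that Books is the document element and that the rows go into an fo:table placed inside an existing page flow (the surrounding fo:root, fo:page-sequence and fo:flow are omitted here):

```xml
<xsl:template match="/Books">
  <fo:table>
    <fo:table-body>
      <!-- pass every child element; comments are not selected by * -->
      <xsl:call-template name="books">
        <xsl:with-param name="data" select="*" />
      </xsl:call-template>
    </fo:table-body>
  </fo:table>
</xsl:template>
```

With the data set from the beginning of this article, this should produce five table rows, one per book, with empty cells wherever a tag is missing.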
