XPath

XPath uses path expressions to select nodes or node-sets in an XML document. As you will see, these expressions closely resemble the paths you use when working with a traditional file system.

A quick note on terminology: in this chapter, child nodes are the direct children of their parent node, i.e. with no intermediate nodes between the two. Descendants are both (direct) children and indirect descendants of their parent, i.e. with any number of intervening nodes in between.

Syntax

Here are the basics of XPath syntax:

Expression	Description
`nodename`	Select all child nodes of the current node whose name is equal to `nodename`.
`/root/nodename`	A leading `/` denotes an absolute path. Select all nodes matched by the given path.
`parent//descendant`	Select all nodes named `descendant` that are descendants of `parent` at any level, i.e. both children and indirect descendants.
`.`	Select the current node. Applicable mostly in templates, `for-each` loops, etc.
`..`	Select the parent of the current node. Applicable mostly in templates, `for-each` loops, etc.
`nodename/@attribute`	Select the `attribute` of the node specified by the preceding expression.

As you can see, a single slash / at the beginning of an expression denotes an absolute path, while a single slash / anywhere else selects the direct children of the result of the preceding expression. A double slash // selects descendants at any depth, i.e. both children and indirect descendants.

These are the basics. Now let's look at some filtering predicates:
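
To make the difference between / and // concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree` module. It rests on some assumptions: the tiny `library` document is hypothetical and exists only for this illustration, and ElementTree implements just a subset of XPath. In particular, its queries are always relative to an element, so the leading-slash form is not shown, and attribute steps such as `nodename/@attribute` are not supported; the sketch reads the attribute with `.get()` instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, used only for this illustration.
library = ET.fromstring(
    "<library>"
    "<shelf><book id='a'/><box><book id='b'/></box></shelf>"
    "</library>"
)

# `nodename`: the child of the current node named "shelf".
shelf = library.find("shelf")

# Single slash semantics: only direct children named "book".
direct = [b.get("id") for b in shelf.findall("book")]

# Double slash semantics: "book" descendants at any depth.
anywhere = [b.get("id") for b in shelf.findall(".//book")]

# `.` selects the current node, `..` the parent of the preceding step.
current = shelf.find(".")
parent = shelf.find("box/..")

# Stand-in for `book/@id`, which ElementTree cannot express as a path.
book_id = shelf.find("book").get("id")

print(direct, anywhere, book_id)
```

Here `direct` holds only the book directly on the shelf, while `anywhere` also picks up the one nested inside `box`.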

Expression	Description
`nodename[1]`	Select the first child element of the current node whose name equals `nodename`.
`nodename[last() - 1]`	Select the penultimate child element of the current node whose name is `nodename`.
`nodename[position()<4]`	Select the first three child elements of the current node whose name is equal to `nodename`.
`nodename[@attribute='value']`	Select all child elements of the current node whose name is equal to `nodename` and whose `attribute` value is equal to `value`.

As you can see, you can use positional and conditional predicates in square brackets to filter the results of the preceding expression.
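
The predicates above can be tried out with a short sketch, again assuming Python's standard `xml.etree.ElementTree` module and a hypothetical `list` document invented for this illustration. ElementTree supports numeric positions, `last()`, `last()-1` (written without spaces) and attribute tests, but not the `position()` function, so the last line uses list slicing as a stand-in for `item[position()<4]`.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, used only for this illustration.
root = ET.fromstring(
    "<list>"
    "<item kind='x'>one</item>"
    "<item kind='y'>two</item>"
    "<item kind='x'>three</item>"
    "<item kind='y'>four</item>"
    "</list>"
)

# Positional predicates are 1-based.
first = root.find("item[1]").text
penultimate = root.find("item[last()-1]").text

# Conditional predicate on an attribute value.
kind_x = [i.text for i in root.findall("item[@kind='x']")]

# Stand-in for item[position()<4], which ElementTree does not support.
first_three = [i.text for i in root.findall("item")[:3]]

print(first, penultimate, kind_x, first_three)
```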

Learn by example

Enough obscure theory, right? Nothing beats learning XPath by actually seeing the results of your queries, so let's use the following sample XML data:

<books>
  <book>
    <author>Brian Christian</author>
    <title data-type="string">The Most Human Human</title>
    <pages data-type="number">303</pages>
  </book>
  <book>
    <author>Aldous Huxley</author>
    <title data-type="string">Brave New World</title>
    <pages data-type="number">268</pages>
  </book>
  <book>
    <author>Ian Goodfellow</author>
    <title data-type="string">Deep Learning</title>
    <pages data-type="number">787</pages>
  </book>
</books>

The following table shows the results of some typical XPath queries you might use (or, in some cases, want to avoid) with the data above:

Expression	Result	Comment
`/books/book[1]/author`	"Brian Christian"	Author of the first book, using an absolute path.
`books/book[2]/author`	"Aldous Huxley"	Author of the second book, using a path relative to the current node, in this case the document itself.
`/book[3]/author`	"" (empty)	Empty selection as no such absolute path exists in the XML document: the root element is `books`, not `book`.
`book[1]/author`	"" (empty)	Empty selection as current node is the document which has no children of type `book`: its only child is `books`.
`//book[3]/author`	"Ian Goodfellow"	Author of the 3rd `book` node anywhere in the document.
`//book[3]/author[1]`	"Ian Goodfellow"	1st `author` of the 3rd `book` anywhere in the document which, incidentally, is the only one.
`//book[3]/author[2]`	"" (empty)	Empty selection as the 3rd `book` in the document has no 2nd `author` node.
`/books/book[1]/*[@data-type='string']`	"The Most Human Human"	All child elements of the 1st `book` whose attribute `data-type` has the value "string". That happens to be the book's `title`.
`/books/book[1]/title/@data-type`	"string"	Attribute `data-type` of the `title` of the 1st `book`.
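
Most of these queries can be reproduced with a short sketch in Python's standard `xml.etree.ElementTree` module, under a few stated assumptions. ElementTree has no document node and rejects paths that start with a slash, so the sketch hangs the sample data under a placeholder element (named `document` here, purely for illustration) and writes every query relative to it. The helper `text_of` is likewise hypothetical, and the attribute is read with `.get()` because ElementTree does not support `@attribute` as a path step.

```python
import xml.etree.ElementTree as ET

XML = """
<books>
  <book>
    <author>Brian Christian</author>
    <title data-type="string">The Most Human Human</title>
    <pages data-type="number">303</pages>
  </book>
  <book>
    <author>Aldous Huxley</author>
    <title data-type="string">Brave New World</title>
    <pages data-type="number">268</pages>
  </book>
  <book>
    <author>Ian Goodfellow</author>
    <title data-type="string">Deep Learning</title>
    <pages data-type="number">787</pages>
  </book>
</books>
"""

# Placeholder parent standing in for the XML document node (an assumption).
document = ET.Element("document")
document.append(ET.fromstring(XML))


def text_of(path):
    """Return the text of the first match, or "" when nothing matches."""
    node = document.find(path)
    return node.text if node is not None else ""


first_author = text_of("books/book[1]/author")
second_author = text_of("books/book[2]/author")
no_such_path = text_of("book[1]/author")          # the document's only child is books
third_author = text_of(".//book[3]/author")
no_second_author = text_of(".//book[3]/author[2]")

# All children of the 1st book whose data-type attribute equals "string".
string_typed = [
    node.text
    for node in document.findall("books/book[1]/*[@data-type='string']")
]

# Stand-in for /books/book[1]/title/@data-type.
title_type = document.find("books/book[1]/title").get("data-type")

print(first_author, second_author, third_author, string_typed, title_type)
```

A full XPath 1.0 engine would accept the expressions from the table verbatim, including the leading slash and the attribute step.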

Relative vs. Absolute: Which One to Use?

Absolute XPaths are more brittle: even a slight change in the DOM may render them invalid or make them refer to the wrong element. On the other hand, they are unambiguous by definition and perform better on larger data sets.

As a rule of thumb, we recommend using absolute paths. They are clearer, and should the data structure change, style sheets need to be revised regardless of whether they mostly use relative or absolute paths.
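
The trade-off can be illustrated with a small sketch, assuming Python's standard `xml.etree.ElementTree` module. Everything in it is hypothetical: the two documents, the extra `library` level added in the second one, the placeholder `document` element that stands in for the document node, and both helper functions. It is meant only to show how a full path reacts to a structural change compared with a path evaluated relative to each `book`.

```python
import xml.etree.ElementTree as ET

# The same data before and after a hypothetical structural change.
before = ET.fromstring(
    "<books><book><title>Brave New World</title></book></books>"
)
after = ET.fromstring(
    "<library><books><book><title>Brave New World</title></book></books></library>"
)


def titles_by_full_path(root):
    # Full path from a placeholder document node down to each title.
    document = ET.Element("document")
    document.append(root)
    return [t.text for t in document.findall("books/book/title")]


def titles_by_relative_path(root):
    # Path relative to each book element, wherever that element sits.
    return [b.find("title").text for b in root.iter("book")]


print(titles_by_full_path(before), titles_by_full_path(after))
print(titles_by_relative_path(before), titles_by_relative_path(after))
```

The full path stops matching once the extra level appears, while the relative lookup keeps working but no longer says where in the document each `book` lives.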
