Chapter 11. Querying documents

Table of Contents

Running XPath expressions
What is XPath
<oXygen/>'s XPath console

Running XPath expressions

What is XPath

XPath is a language for addressing specific parts of an XML document. XPath, like the Document Object Model (DOM), models an XML document as a tree of nodes. An XPath expression is a mechanism for navigating through and selecting nodes from the XML document. An XPath expression is in a way analogous to a Structured Query Language (SQL) query used to select records from a database.

The nodes in this tree come in several types, including element nodes, attribute nodes and text nodes. XPath defines a way to compute a string-value for each type of node.

XPath defines a library of standard functions for working with strings, numbers and Boolean expressions.

Examples:

child::* Selects all element children of the context node.

.//name Selects all elements named "name" that are descendants of the current node.

/catalog/cd[price>10.80] Selects all the cd elements that have a price element with a value greater than 10.80.
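
The three expressions above can also be tried outside the editor. The following is a minimal sketch in Python, assuming the standard library module xml.etree.ElementTree, which supports only a subset of XPath 1.0. The sample catalog document and its values are invented for illustration. Because ElementTree cannot evaluate the numeric comparison price>10.80 inside a predicate, that condition is applied in Python instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
SAMPLE = """
<catalog>
  <cd><name>Empire Burlesque</name><price>10.90</price></cd>
  <cd><name>Hide Your Heart</name><price>9.90</price></cd>
  <cd><name>Greatest Hits</name><price>12.50</price></cd>
</catalog>
"""

root = ET.fromstring(SAMPLE)

# child::*  (abbreviated "*"): element children of the context node.
children = root.findall("*")

# .//name : all "name" elements that are descendants of the context node.
names = [n.text for n in root.findall(".//name")]

# /catalog/cd[price>10.80] : ElementTree has no numeric comparison in
# predicates, so select the cd elements first and filter in Python.
expensive = [
    cd.findtext("name")
    for cd in root.findall("cd")
    if float(cd.findtext("price")) > 10.80
]

print(len(children))
print(names)
print(expensive)
```

A full XPath engine, such as the one used by the XPath console, evaluates the third expression directly; the Python filter is only a workaround for the limited subset assumed here.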

To find out more about XPath, the following URL is recommended: http://www.w3.org/TR/xpath

<oXygen/>'s XPath console

Using XPath effectively requires at least an understanding of the XPath Core Function Library. With this knowledge you can use the <oXygen/> XPath expression field, which is part of the current editor toolbar, to aid you in XML document development.

In <oXygen/> an XPath 1.0 or XPath 2.0 expression is typed and executed on the current document from the menu XML > XPath (Ctrl+Shift+X (Cmd+Shift+X on Mac OS)) or from the XPath toolbar button. Both XPath 2.0 basic and XPath 2.0 schema aware expressions can be executed in the XPath console. XPath 2.0 schema aware also takes into account the Saxon EE XML Schema version option.

The content completion assistant that helps in entering XPath expressions in attributes of XSLT stylesheet elements is also available in the XPath console. Its proposals always depend on the current context of the cursor inside the edited document. The set of XPath functions proposed by the assistant depends on the XPath version selected from the drop-down menu of the XPath button (1.0 or 2.0).

In the following example the cursor is on a person element and the content completion assistant offers all the child elements of the person element and all XPath 2.0 functions:

Figure 11.1. Content Completion in the XPath console

When evaluating an XPath expression, <oXygen/> tries to resolve the locations of the documents referred to in the expression through the XML catalogs configured in Preferences and the current XInclude preferences, for example when evaluating the collection(URIofCollection) function (XPath 2.0). If you also need the references inside the files returned by the collection() function to be resolved with an XML catalog set up in the <oXygen/> preferences, you have to specify, in the query that is the parameter of the collection() function, the class name of the XML catalog enabled parser used for parsing these collection files. The class name is ro.sync.xml.parser.CatalogEnabledXMLReader and you specify it like this:

let $docs := collection(iri-to-uri(
    "file:///D:/temp/test/XQuery-catalog/mydocsdir?recurse=yes;select=*.xml;
    parser=ro.sync.xml.parser.CatalogEnabledXMLReader"))
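
As a rough illustration of what the recurse=yes and select=*.xml parameters ask for, the sketch below gathers every *.xml file under a directory tree and parses it, using Python and its standard library. It is an analogy only: it does not use Saxon, XML catalogs, or the CatalogEnabledXMLReader class, and the function name load_collection is hypothetical.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def load_collection(directory, pattern="*.xml"):
    """Parse every file under `directory` (recursively) matching `pattern`.

    Hypothetical helper: mimics recurse=yes;select=*.xml, nothing more.
    Returns a dict mapping each relative file path to its root element.
    """
    base = Path(directory)
    docs = {}
    for path in sorted(base.rglob(pattern)):
        key = path.relative_to(base).as_posix()
        docs[key] = ET.parse(str(path)).getroot()
    return docs


# Build a small throwaway directory tree to run the helper against.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "sub").mkdir()
    (base / "a.xml").write_text("<doc id='a'/>", encoding="utf-8")
    (base / "sub" / "b.xml").write_text("<doc id='b'/>", encoding="utf-8")
    (base / "notes.txt").write_text("not XML", encoding="utf-8")

    docs = load_collection(base)
    ids = sorted(root.get("id") for root in docs.values())
    print(sorted(docs), ids)
```

The text file is skipped because it does not match the pattern, and the file in the subdirectory is picked up because the search is recursive.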

The results of an XPath query are returned in the Message Panel. Clicking a record in the result list highlights the matching nodes in the text editor panel with character-level precision. Results are returned in a format that is a valid XPath expression:

- [FileName.xml] /node[value]/node[value]/node[value] -

Figure 11.2. XPath results highlighted in editor panel with character precision

When using the grid editor, clicking a result record will highlight the entire node.

Figure 11.3. XPath results highlighted in the Grid Editor

Note

XPath 2.0 basic queries are executed using the Saxon 9 PE engine. XPath 2.0 schema aware queries are executed using the Saxon EE engine.

The popup menu of the history list of the XPath dialog contains the action Remove for removing the selected expression from the history list.

Example 11.1. XPath Utilization with DocBook DTD

The example is taken from a DocBook book based on the DocBook XML DTD. The book contains a number of chapters. DocBook defines a chapter as having a <chapter> start tag and a matching </chapter> end tag to close the element. To return all the chapter nodes of the book, enter //chapter into the XPath expression field, then press Enter. This returns all the chapter nodes of the DocBook book in the Message Panel. If your book has six chapters, there will be six records in the result list. Clicking a record locates and highlights the chapter and all the nodes contained between its start and end tags.
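
The //chapter query can be reproduced on a toy document. The sketch below uses Python's xml.etree.ElementTree, which handles this simple location path; the six-chapter book is invented and stands in for a real DocBook file. ElementTree evaluates paths relative to an element, so //chapter is written .//chapter.

```python
import xml.etree.ElementTree as ET

# Invented stand-in for a DocBook book with six chapters.
chapters = "".join(
    f"<chapter><title>Chapter {i}</title><para>Text</para></chapter>"
    for i in range(1, 7)
)
book = ET.fromstring(f"<book><title>Sample</title>{chapters}</book>")

# //chapter in the XPath console; ElementTree needs the leading "." form.
results = book.findall(".//chapter")

# One result per chapter element, in document order.
print(len(results))
print([c.findtext("title") for c in results])
```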

To query for all example nodes contained in the section 2 nodes of a DocBook XML document, use the XPath expression //chapter/sect1/sect2/example. If an example node is found in any section 2 node, a result is returned to the Message Panel. For each occurrence of the element node a record is created in the result list.

In the example an XPath query on the file oxygen.xml determined that:

- [oxygen.xml] /chapter[1]/sect1[3]/sect2[7]/example[1]

Which means:

In the file oxygen.xml, first chapter, third section level 1, seventh section level 2, the example node found is the first in the section.
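
To make the positional notation concrete, the sketch below computes a path of the same shape for each example node found in a small invented chapter. It is written in Python with xml.etree.ElementTree; the helper positional_path is hypothetical and is not how <oXygen/> builds its result records. Because the document element here is the chapter itself, the query is written ./sect1/sect2/example.

```python
import xml.etree.ElementTree as ET


def positional_path(root, target):
    """Return a path such as /chapter[1]/sect1[2]/example[1] for `target`.

    Hypothetical helper; indexes count same-named siblings, starting at 1.
    """
    # ElementTree elements have no parent pointer, so build a lookup table.
    parent_of = {child: parent for parent in root.iter() for child in parent}

    steps = []
    node = target
    while node is not None:
        parent = parent_of.get(node)
        if parent is None:
            index = 1  # the document element has no siblings
        else:
            same_name = [c for c in parent if c.tag == node.tag]
            index = same_name.index(node) + 1
        steps.append(f"{node.tag}[{index}]")
        node = parent
    return "/" + "/".join(reversed(steps))


# Invented sample: the only example sits in the second sect2 of the second sect1.
SAMPLE = """
<chapter>
  <sect1><title>One</title></sect1>
  <sect1>
    <sect2><para>No example here.</para></sect2>
    <sect2><example><title>Found</title></example></sect2>
  </sect1>
</chapter>
"""
root = ET.fromstring(SAMPLE)
paths = [positional_path(root, e) for e in root.findall("./sect1/sect2/example")]
print(paths)  # ['/chapter[1]/sect1[2]/sect2[2]/example[1]']
```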


Note

If your project consists of a main file with ENTITY references to other files, you can use XPath to return all the name elements of a certain type by querying the main file. The query covers all the referenced files, so the result list includes matches from them.

Important

If the document defines a default namespace then <oXygen/> will bind this namespace to the first free prefix from the list: default, default1, default2, etc. For example, if the document defines the default namespace xmlns="something" and the prefix default is not associated with a namespace, then you can match tags without a prefix in an XPath expression typed in the XPath console by using the prefix default. For example, to find all the level elements when the root element defines a default namespace, execute the following expression in the XPath console:

//default:level
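
The need for a prefix is not specific to <oXygen/>: in XPath 1.0 an unprefixed name never matches an element in a default namespace. The sketch below shows the same effect in Python with xml.etree.ElementTree, where the prefix-to-URI mapping is supplied by the caller rather than chosen automatically. The sample document is invented, and the namespace URI "something" is taken from the text above.

```python
import xml.etree.ElementTree as ET

# Invented document whose root element declares a default namespace.
SAMPLE = """
<levels xmlns="something">
  <level>1</level>
  <group><level>2</level></group>
</levels>
"""
root = ET.fromstring(SAMPLE)

# Prefix-to-URI mapping chosen by the caller; "default" mirrors the text.
namespaces = {"default": "something"}

# Without a prefix the name is looked up in no namespace and finds nothing.
unprefixed = root.findall(".//level")

# With the bound prefix both level elements are found.
prefixed = root.findall(".//default:level", namespaces)

print(len(unprefixed), [e.text for e in prefixed])
```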

To define default mappings between prefixes that can be used in the XPath console and namespace URIs, go to the XPath Options user preferences panel and enter the mappings in the Default prefix-namespace mappings table. The same preferences panel also lets you configure the default namespace used in XPath 2.0 expressions entered into the XPath toolbar and the creation of different results panels for XPath queries executed on different XML documents.

To apply an XPath expression relative to the element on which the caret is positioned, use the action XML editor contextual menu > XML Document > Copy XPath (Ctrl+Shift+.) (also available on the context menu of the main editor panel) to copy the XPath expression of the current element or attribute to the clipboard. Then use the Paste action of the contextual menu of the XPath console to paste this expression in the console, add your relative expression, and execute the resulting complete expression.

The popup menu available on right click in the Expression panel of the XPath expressions dialog offers the usual edit actions (Cut, Copy, Paste, Select All).

On Windows the context menu can be displayed with the mouse on a right click or with the keyboard by pressing the special context menu key available on Windows keyboards.