Validating Documents

A valid XML document is a well-formed XML document that also conforms to the rules of a schema, which defines the legal elements of the document. The schema can be an XML Schema, a Relax NG schema (full or compact syntax), a Schematron schema, a Document Type Definition (DTD) or a Namespace Routing Language (NRL) schema.

The purpose of the schema is to define the legal building blocks of an XML document. It defines the document structure with a list of legal elements.
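The difference between well-formed and valid can be illustrated with a minimal Python sketch. It is not how <oXygen/> validates: the "schema" here is only a hypothetical set of legal element names (LEGAL_ELEMENTS), and both function names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy "schema": nothing more than a set of legal element names.
LEGAL_ELEMENTS = {"note", "to", "body"}

def is_well_formed(text):
    """Return True if the text parses as XML at all."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

def illegal_elements(text, legal=LEGAL_ELEMENTS):
    """Return the names of elements not listed in the toy schema, in document order."""
    root = ET.fromstring(text)
    return [el.tag for el in root.iter() if el.tag not in legal]
```

A document can pass the first check and still fail the second: it is well-formed, but it uses an element the schema does not allow. A real schema also constrains nesting, order and attributes, which this sketch ignores.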

The <oXygen/> Validate document function checks that your document complies with the rules defined by an associated DTD, XML Schema, Relax NG or Schematron schema. An XML Schema or Relax NG schema can embed Schematron rules. For Schematron, you can select the validation phase.

A line with a validation error is underlined in red in the editor panel, and a red marker on the right side ruler of the editor panel shows the position of that line in the document. Validation warnings are marked in the same way, in yellow instead of red.

The ruler on the right of the document is designed to display the errors found during the validation process and also to help the user to locate them more easily. The ruler contains the following areas:

As long as you do not change the active editor or switch to another application, the schema associated with the current document is parsed and cached at the first Validate document action and is reused by subsequent Validate document actions without being parsed again. This speeds up validation from the second execution onward when the schema is large or is located on a remote server on the Web. To reset the cache and reparse the schema, use the Reset cache and validate action.
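The parse-once, reuse-until-reset behaviour can be sketched in Python as follows. This is only an illustration of the caching idea, not the <oXygen/> implementation; the class name SchemaCache, the load_text callable and the parse_count counter are all hypothetical.

```python
import xml.etree.ElementTree as ET

class SchemaCache:
    """Parse each schema once and reuse the parsed tree until reset."""

    def __init__(self, load_text):
        self._load_text = load_text  # hypothetical callable: location -> schema text
        self._parsed = {}            # location -> parsed schema tree
        self.parse_count = 0         # kept only to make the caching visible

    def get(self, location):
        # Parse on the first request only; later requests reuse the tree.
        if location not in self._parsed:
            self._parsed[location] = ET.fromstring(self._load_text(location))
            self.parse_count += 1
        return self._parsed[location]

    def reset(self):
        # Forget every parsed schema so the next get() parses again.
        self._parsed.clear()
```

Calling get() twice for the same location parses the schema once; after reset(), the next get() parses it again, which mirrors what the Reset cache and validate action is described as doing.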

Use one of the actions for validating the current document:

The Validation options button on the Validate toolbar gives quick access to the validation options of the built-in validator in the <oXygen/> user preferences.

You can also select several files in the Project panel and validate them with one click, using the Validate selection or Validate selection with ... action from the contextual menu of the Project view.

If there are too many validation errors and the validation process takes a long time, you can limit the maximum number of reported errors.

When an XML document is validated against an XML Schema containing a type definition whose minOccurs or maxOccurs attribute has a value larger than 256, the validator limits the value to 256 and issues a warning about this restriction in the Message panel at the bottom of the <oXygen/> window. Without this limit, large values of the minOccurs and maxOccurs attributes make the validator fail with an OutOfMemory error, which makes <oXygen/> practically unusable until the entire application is restarted.
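The effect of this limit can be illustrated with a small Python sketch that caps large minOccurs and maxOccurs values in a schema and collects one warning per change. It is an assumption-laden illustration, not the validator's actual code: the function name clamp_occurs and the wording of the warnings are hypothetical.

```python
import xml.etree.ElementTree as ET

OCCURS_LIMIT = 256  # the limit stated in the text above

def clamp_occurs(xsd_text, limit=OCCURS_LIMIT):
    """Cap numeric minOccurs/maxOccurs values at `limit`.

    Returns the modified schema tree and a list of warning strings.
    Non-numeric values such as "unbounded" are left untouched.
    """
    root = ET.fromstring(xsd_text)
    warnings = []
    for el in root.iter():
        for attr in ("minOccurs", "maxOccurs"):
            value = el.get(attr)
            if value is not None and value.isdigit() and int(value) > limit:
                warnings.append(f"{attr}={value} limited to {limit}")
                el.set(attr, str(limit))
    return root, warnings
```

For a schema that declares an element with maxOccurs="100000", the sketch returns a tree in which the value is 256 together with a single warning.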

Status messages from every validation action are logged into the Information view.

 Validate as you type

<oXygen/> can be configured to mark validation errors in the edited document as you modify it using the keyboard. If you enable the Validate as you type option, validation errors and warnings are highlighted automatically in the editor panel once the configured delay has passed since the last key typed. They are shown with underline markers in the editor panel and small rectangles on the right side ruler of the editor panel, in the same way as for manual validation invoked by the user.

 

Figure 4.28. Automatic validation of the edited document

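The "configured delay from the last key typed" rule can be sketched in Python as a simple debounce. This is a hedged illustration of the timing logic only, not the editor's implementation: the class TypingDebounce and its methods are hypothetical, and times are passed in explicitly (in milliseconds) so that the logic is easy to follow.

```python
class TypingDebounce:
    """Decide when to validate: only after `delay_ms` without a keystroke."""

    def __init__(self, delay_ms):
        self.delay_ms = delay_ms
        self._last_key = None  # time of the most recent keystroke
        self._dirty = False    # True while edits are waiting for validation

    def key_typed(self, now_ms):
        # Every keystroke restarts the waiting period.
        self._last_key = now_ms
        self._dirty = True

    def should_validate(self, now_ms):
        # Validate once per burst of typing, after the delay has elapsed.
        if self._dirty and now_ms - self._last_key >= self.delay_ms:
            self._dirty = False
            return True
        return False
```

With a delay of 500 ms and keystrokes at 1000 ms and 1200 ms, a check at 1500 ms does nothing, a check at 1700 ms triggers validation, and later checks stay quiet until the next keystroke.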

If the error message is long and is not displayed completely in the error line at the bottom of the editing area, double-click the error icon at the left of the error line, or the error line itself, to display an information dialog with the full error message. The arrow buttons of the dialog navigate to the other errors reported by the Validate as you type feature.

 

Figure 4.29. Full error message for validate as you type errors


<oXygen/> logs status messages of the validation action into the Information view.

 

Example 4.2. Validate document error message

This example uses the case where a Docbook listitem element does not match the rules of docbookx.dtd. In this case, running Validate document returns the following error.

E The content of element type "listitem" must match
"(calloutlist|glosslist|itemizedlist|orderedlist|segmentedlist|
simplelist|variablelist|caution|important|note|tip|warning|
literallayout|programlisting|programlistingco|screen|
screenco|screenshot|synopsis|cmdsynopsis|
funcsynopsis|classsynopsis|fieldsynopsis|constructorsynopsis|
destructorsynopsis|methodsynopsis|formalpara|para|simpara|
address|blockquote|graphic|graphicco|mediaobject|
mediaobjectco|informalequation|informalexample|
informalfigure|informaltable|equation|example|
figure|table|msgset|procedure|sidebar|qandaset|anchor|
bridgehead|remark|highlights|abstract|authorblurb|epigraph|
indexterm|beginpage)+".

As you can see, this error message is not easy to understand, so some knowledge of the syntax or processing rules for the Docbook XML DTD's "listitem" element is required. However, the error message does give a clue as to the source of the problem, by indicating that the content of element type "listitem" must match the listed content model.

Luckily, most standards-based DTDs, XML Schemas and Relax NG schemas are supplied with reference documentation, which lets us look up the element and read about it. In this case we would want to learn about the child elements of "listitem" and their nesting rules. Once we have correctly inserted the required child element and nested it in accordance with the XML rules, the document will become valid on the next validation test.
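To make the content model in the error message more concrete, the following Python sketch checks "listitem" elements against it by hand. The model has the form (a|b|...)+, which means one or more child elements, each taken from the list, and no text directly inside "listitem". The sketch is an illustration under those assumptions, not a DTD validator; the function name check_listitems and the wording of its messages are hypothetical.

```python
import xml.etree.ElementTree as ET

# Child elements allowed inside "listitem", copied from the error message above.
ALLOWED_IN_LISTITEM = set("""
calloutlist glosslist itemizedlist orderedlist segmentedlist simplelist
variablelist caution important note tip warning literallayout programlisting
programlistingco screen screenco screenshot synopsis cmdsynopsis funcsynopsis
classsynopsis fieldsynopsis constructorsynopsis destructorsynopsis
methodsynopsis formalpara para simpara address blockquote graphic graphicco
mediaobject mediaobjectco informalequation informalexample informalfigure
informaltable equation example figure table msgset procedure sidebar qandaset
anchor bridgehead remark highlights abstract authorblurb epigraph indexterm
beginpage
""".split())

def check_listitems(xml_text):
    """Return one problem description per "listitem" that breaks the model."""
    problems = []
    for item in ET.fromstring(xml_text).iter("listitem"):
        children = list(item)
        bad = [c.tag for c in children if c.tag not in ALLOWED_IN_LISTITEM]
        # Text directly inside listitem (before, between or after children).
        stray_text = bool((item.text or "").strip()) or any(
            (c.tail or "").strip() for c in children
        )
        if not children:
            problems.append("listitem has no child element")
        elif bad:
            problems.append("listitem contains disallowed element(s): " + ", ".join(bad))
        elif stray_text:
            problems.append("listitem contains text outside a child element")
    return problems
```

For instance, a "listitem" that holds plain text directly is reported as a problem, while the same text wrapped in a "para" child passes the check.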