<oXygen/> XML Editor User Guide

Editing Documents

Editing a document is a simple procedure, but there are some points you should be aware of that can make your editing more productive.

Working with Unicode

Unicode provides a unique number for every character, no matter what the platform, no matter what the program, no matter what the language. Unicode is an internationally recognized standard, adopted by industry leaders. It is required by modern standards such as XML, Java, ECMAScript (JavaScript), LDAP, CORBA 3.0 and WML, and is the official way to implement ISO/IEC 10646.

It is supported in many operating systems, all modern browsers, and many other products. The emergence of the Unicode Standard, and the availability of tools supporting it, are among the most significant recent global software technology trends. Incorporating Unicode into client-server or multi-tiered applications and websites offers significant cost savings over the use of legacy character sets.

As a modern XML Editor, <oXygen/> supports the Unicode standard, enabling your XML application to target multiple platforms, languages and countries without re-engineering. Internally, the <oXygen/> Editor uses 16-bit characters covering the Unicode character set.

When loading XML, XSL, XSD and DTD documents, <oXygen/> reads the document prolog to determine the specified encoding. This encoding is then used to instruct the Java Encoder to load and save the document using the specified code chart. If the encoding cannot be determined, <oXygen/> prompts you with the "Available Java Encodings" dialog, which lists all encodings supported by the Java platform.
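The prolog lookup described above can be sketched in Python. This is only an illustration, not how <oXygen/> is implemented: the function name declared_encoding is hypothetical, and the sketch assumes an ASCII-compatible encoding with the XML declaration at the very start of the data (no byte order mark handling).

```python
import re

# Matches the encoding pseudo-attribute of an XML declaration, e.g.
# <?xml version="1.0" encoding="UTF-8"?>
_DECL = re.compile(rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']')


def declared_encoding(data: bytes):
    """Return the encoding named in the XML declaration, or None if absent."""
    match = _DECL.match(data)
    return match.group(1).decode("ascii") if match else None


doc = b'<?xml version="1.0" encoding="ISO-8859-1"?><book/>'
print(declared_encoding(doc))         # ISO-8859-1
print(declared_encoding(b"<book/>"))  # None
```

A None result corresponds to the case in which the encoding cannot be determined and the user has to choose one.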

Figure 4.11. Available Java Encodings Dialog

While in most cases you will use UTF-8, simply changing the encoding name will cause the file to be saved using the new encoding. The appendix Unicode Character Encoding provides a matrix that matches common names with Java names. It also explains what you should type in the XML prolog to cause the document to be saved in the required encoding.

To edit documents written in Japanese or Chinese, you will need to change the font to one that supports the specific characters (a Unicode font). For the Windows platform, use of "Arial Unicode MS" or "MS Gothic" is recommended. Do not expect Wordpad or Notepad to handle these encodings. Use Explorer or Word instead if you need to examine such XML documents.

Note

The naming convention used under Java does not always correspond to the common names used by the Unicode standard. For instance, while in XML you will use encoding="UTF-8", in Java the same encoding has the name "UTF8".

Streamline with Tag-Insight

<oXygen/>'s intelligent Tag-Insight feature is a content assistant that enables rapid, in-line identification and insertion of structured language elements, attributes and, in some cases, their parameter options.

The Tag-Insight assistant is displayed automatically whenever the < character is entered into a document, or when CTRL+Space is pressed on a partial element or attribute name. Moving the focus to highlight an element and pressing the Enter key or the Tab key inserts both the start and end tags of the highlighted element into the document. If the Add Element Content option of Tag-Insight is enabled, all the elements that the new element must contain, as specified in the DTD or XML Schema, are inserted automatically in the document. The Tag-Insight assistant can also add optional content and the first choice particle, as specified in the DTD or XML Schema, if the two corresponding options are enabled.

After insertion, the cursor is positioned directly before the > character of the start tag if the element has attributes, to enable rapid insertion of any attribute supported by the element, or after the > character of the start tag if the element has no attributes. Pressing the space bar directly after element insertion displays the assistant again, this time listing the attributes supported by that element. If an attribute supports a fixed set of parameters, the assistant displays the list of valid parameters. If the parameter setting is user defined and therefore variable, the assistant closes to enable manual insertion. The values of the attributes can be learned from the same elements in the current document.

If the XSD or DTD for the document contains annotations for elements, attributes or attribute values, these are presented when the content completion window is displayed, provided the corresponding option is enabled. In an XSD, annotations are placed in an <xs:annotation> element:

<xs:annotation>
    <xs:documentation>Description of the element.</xs:documentation>
</xs:annotation>

For DTDs, <oXygen/> defines a mechanism for annotations using comments:

<!--doc:Description of the element. -->

The content assistant can be invoked at any time by pressing CTRL+Space. The context-sensitive list of proposals is shown at any caret position in the edited document where inserting an element, attribute or attribute value makes sense. Such positions are:

  • anywhere within a tag name, or at the beginning of a tag name, in an XML document, XML Schema, DTD or Relax NG (full or compact syntax) schema;

  • anywhere within an attribute name, or at the beginning of an attribute name, in any XML document with an associated grammar;

  • within attribute values, or at the beginning of attribute values, in XML documents where lists of possible values have been defined for that element in the grammar associated with the document.

Figure 4.12. Tag-Insight Assistant

The content of the Tag-Insight assistant depends on the element structure specified in the DTD, XML Schema, Relax NG (full or compact syntax) schema or NRL schema.

The number and type of elements displayed by the assistant depend on the current position of the cursor in the structured document. The child elements displayed within a given element are defined by the structure of the specified DTD, XML Schema, Relax NG (full or compact syntax) schema or NRL schema. All elements that cannot be child elements of the current element according to the specified schema are filtered out.

If the schema for the edited document defines attributes of type ID and IDREF, the content assistant displays, for IDREF attributes, a list of all the ID values already present in the document, so that a valid ID value can be inserted easily at the cursor position. This is available for documents that use a DTD, XML Schema or Relax NG schema.

For documents that use an XML Schema or Relax NG schema, the content assistant offers proposals for attribute and element values whose type is an enumeration of tokens.

The DTD, XML Schema, Relax NG schema or NRL schema used to populate the Tag-Insight assistant is determined by the following methods, in order of precedence:

  • From the file specified in the external subset of the document prolog. In this case <oXygen/> reads the prolog and resolves the location of the DTD, XML Schema, Relax NG schema or NRL schema.

  • From the file specified in the <oXygen/> Tag-Insight dialog. <oXygen/> will read the Tag-Insight settings when the prolog fails to provide or resolve the location of a DTD, XML Schema or Relax NG schema.

Creating DTDs

When working with documents that do not specify a DTD, or for which the DTD is not known or does not exist, <oXygen/> is able to learn the document structure and translate it to a DTD, which can then be saved to a file. This is useful for quickly creating a DTD that can serve as an initialization source for the Tag-Insight assistant. The feature can also be used to produce DTDs for documents containing personal or custom element types.

Procedure 4.8. To create a DTD:

  1. Open the structured document from which a DTD will be created.

  2. Select Document->Learn Structure (Ctrl+Shift+L). <oXygen/> learns the document structure and, when finished, displays the words "Learn Complete" in the Message Pane of the Editor Status bar.

  3. Select Document->Save Structure (Ctrl+Shift+S) to save the DTD currently stored in memory to file.

Note

The resulting DTD is only valid for documents containing the elements and structures defined by the document used as the input for creating the DTD. If new element types or structures are defined in a document, they must be added to the DTD for validation to succeed.
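To make the idea of learning a structure concrete, here is a rough Python sketch that derives a very permissive DTD from a single document. It is not the algorithm used by <oXygen/>: the function name learn_dtd is hypothetical, every content model is emitted as a repeatable choice, every attribute is declared as optional CDATA, and documents with namespaces are not handled.

```python
import xml.etree.ElementTree as ET


def learn_dtd(xml_text: str) -> str:
    """Derive a permissive DTD from the elements seen in one document."""
    children, attributes, has_text = {}, {}, {}
    for elem in ET.fromstring(xml_text).iter():
        kids = children.setdefault(elem.tag, [])
        attrs = attributes.setdefault(elem.tag, [])
        # Character data may appear before the first child or after any child.
        texts = [elem.text] + [child.tail for child in elem]
        has_text[elem.tag] = has_text.get(elem.tag, False) or any(
            t and t.strip() for t in texts
        )
        for child in elem:
            if child.tag not in kids:
                kids.append(child.tag)
        for name in elem.attrib:
            if name not in attrs:
                attrs.append(name)

    lines = []
    for tag, kids in children.items():
        if has_text[tag]:
            model = "(" + " | ".join(["#PCDATA"] + kids) + ")*"
        elif kids:
            model = "(" + " | ".join(kids) + ")*"
        else:
            model = "EMPTY"
        lines.append(f"<!ELEMENT {tag} {model}>")
        for name in attributes[tag]:
            lines.append(f"<!ATTLIST {tag} {name} CDATA #IMPLIED>")
    return "\n".join(lines)


print(learn_dtd('<book id="b1"><title>Guide</title><chapter/><chapter/></book>'))
```

As the note above says, a DTD learned this way only describes what the input document happened to contain; here, for instance, chapter would be declared EMPTY because the sample chapters have no content.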

Working with XML Catalogs

When Internet access is not available, one or more XML catalogs can be added to the list in the dialog below, and the local copies of the DTD, XML Schema, Relax NG schema and/or NRL schema files will be used. When you add an XML catalog to, or delete one from, the list of XML catalogs in the Options -> Preferences -> XML Catalog pane, you must restart the application for the changes to take effect.

If the "Use default catalog" option is checked, <oXygen/> will use the built-in catalogs for DocBook, TEI and XHTML documents located in the frameworks subdirectory of the installation directory. Otherwise <oXygen/> will use the catalogs specified in the list.

The Prefer option specifies whether <oXygen/> tries to resolve the PUBLIC or the SYSTEM reference first using the specified XML catalogs. If a PUBLIC reference is not mapped in any of the catalogs, the SYSTEM reference is looked up. The verbosity level specifies the types of output messages displayed and can have one of the values debug, warn, info, error and fatal.

If the user has added no XML catalogs to this list then <oXygen/> will add by default the built-in catalogs for DocBook and TEI documents located in the frameworks/docbook and frameworks/tei subdirectories of the installation directory.

Formatting and Indenting Documents (Pretty Print)

In structured markup languages, the whitespace between elements that is created with the Space bar, the Tab key or repeated use of the Enter key is not recognized by the parsing tools. As a result, when structured markup documents are opened they are often arranged as one long, unbroken line that looks like a single paragraph.

While this is perfectly acceptable practice, it makes editing difficult and increases the likelihood of errors being introduced. It also makes the identification of exact error positions difficult. Formatting and Indenting, also called "Pretty Print", enables such documents to be neatly arranged, in a manner that is consistent and promotes easier reading on screen and in print output.

Pretty print is in no way associated with the layout or formatting that will be used in the transformed document. This layout and formatting is supplied by the XSL style sheet specified at time of transformation.

Procedure 4.9. To format and indent a document:

  1. Open or focus on the document that is to be formatted and indented.

  2. Select Document->Format and Indent (Ctrl+Shift+P). While in progress, the Status Panel will indicate "Pretty print in progress". On completion, this will read "Pretty print successful" and the document will be arranged.

Note

Pretty Print can format empty elements as a self-closing tag (e.g. <a/>) or as a regular tag pair (e.g. <a></a>). It can preserve the order of attributes or order them alphabetically. You can also specify a list of elements for which white space is preserved exactly as it was before Pretty Print, and a list of elements for which white space is stripped. These options can be configured from Options-> Preferences -> Editor -> Format.

Pretty Print requires that the structured document is "Well Formed". If the document is not "Well Formed", an error message is displayed. The message will usually indicate that a problem has been found in the form and will hint at the problem type. It will not highlight the position of the error; to do this, run the "Well Formed" function by selecting Document->Check document form (Ctrl+Shift+W).
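The dependency between pretty printing and well-formedness can be illustrated with Python's standard library. This is a minimal sketch under stated assumptions, not the formatter built into <oXygen/>: the helper name pretty_print is hypothetical, and the indentation rules are those of xml.dom.minidom.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def pretty_print(xml_text: str, indent: str = "  ") -> str:
    """Re-indent a well-formed document; raise ValueError otherwise."""
    try:
        dom = minidom.parseString(xml_text)
    except ExpatError as err:
        # The parser reports where well-formedness broke down.
        raise ValueError(
            f"not well-formed: line {err.lineno}, column {err.offset}"
        ) from err
    return dom.documentElement.toprettyxml(indent=indent)


# A document stored as one unbroken line is spread over indented lines.
print(pretty_print("<a><b>text</b><c/></a>"))
```

Passing a document such as "<a><b></a>" raises the ValueError instead of producing output, which mirrors the requirement that the check for well-formedness must pass before formatting.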

Using XPath Expressions

XPath is a language for addressing specific parts of an XML document. XPath, like the Document Object Model (DOM), models an XML document as a tree of nodes. An XPath expression is a mechanism for navigating through and selecting nodes from the XML document. An XPath expression is in a way analogous to a Structured Query Language (SQL) query used to select records from a database.

There are different types of nodes, including element nodes, attribute nodes and text nodes. XPath defines a way to compute a string-value for each type of node.

XPath defines a library of standard functions for working with strings, numbers and Boolean expressions.

Examples:

child::* Selects all element children of the context node (the children of the root node when the root is the context).

.//name Selects all elements having the name "name" that are descendants of the current node.

/catalog/cd[price>10.80] Selects all the cd elements that have a price element with a value greater than 10.80.
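For readers who want to try expressions like these outside the editor, the following Python sketch evaluates equivalents with xml.etree.ElementTree. The catalog data is invented for the illustration, and because ElementTree supports only a subset of XPath, the numeric predicate is applied in Python rather than inside the expression.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, used only for this illustration.
CATALOG = """
<catalog>
  <cd><title>First</title><price>9.90</price></cd>
  <cd><title>Second</title><price>12.50</price></cd>
  <cd><title>Third</title><price>10.80</price></cd>
</catalog>
"""

root = ET.fromstring(CATALOG)

# child::* relative to the root element: all of its element children.
children = root.findall("*")

# .//title: all title elements that are descendants of the current node.
titles = [t.text for t in root.findall(".//title")]

# /catalog/cd[price>10.80]: ElementTree cannot evaluate the numeric
# predicate, so the comparison is done in Python instead.
expensive = [
    cd.findtext("title")
    for cd in root.findall("cd")
    if float(cd.findtext("price")) > 10.80
]

print(len(children))  # 3
print(titles)         # ['First', 'Second', 'Third']
print(expensive)      # ['Second']
```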

Using XPath effectively requires at least an understanding of the XPath Core Function Library. Once you have this knowledge, the <oXygen/> XPath expression field, which is part of the Editor toolbar, can be used to aid you in XML document development.

If the edited document defines a default namespace then <oXygen/> will bind this namespace to the first free prefix from the list: default, default1, default2, etc. For example if the document defines the default namespace xmlns="something" and the prefix default is not associated with a namespace then you can match unprefixed tags in an XPath expression by using the prefix default.
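The effect of binding a default namespace to a prefix can be reproduced with Python's xml.etree.ElementTree. In this sketch the prefix-to-namespace mapping is supplied by hand, whereas the paragraph above describes <oXygen/> choosing the prefix automatically; the sample document is invented.

```python
import xml.etree.ElementTree as ET

DOC = '<book xmlns="something"><chapter/><chapter/></book>'
root = ET.fromstring(DOC)

# Without a prefix the expression looks for elements in no namespace,
# so nothing matches.
unprefixed = root.findall("chapter")

# Binding the document's default namespace to the prefix "default"
# makes the same elements addressable.
namespaces = {"default": "something"}
prefixed = root.findall("default:chapter", namespaces)

print(len(unprefixed))  # 0
print(len(prefixed))    # 2
```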

To find out more about XPath, the following URL is recommended: http://www.w3.org/TR/xpath

In <oXygen/> the results of an XPath query are returned in the Message Panel. Clicking a record in the result list highlights the nodes within the editing panel.

Results are returned in a format that itself is a valid XPath expression:

- [FileName.xml] /node[value]/node[value]/node[value] -

Example 4.3. XPath Utilization with DocBook DTD

Our example is taken from a DocBook book based on the DocBook XML DTD. The book contains a number of chapters. DocBook defines chapters as having a <chapter> start tag and a matching </chapter> end tag to close the element. To return all the chapter nodes of the book, enter //chapter into the XPath expression field, then press Enter. This returns all the chapter nodes of the DocBook book in the Message Panel. If your book has six chapters, there will be six records in the result list. Each record, when clicked, will locate and highlight the chapter and all nodes contained between the start and end tags of the chapter.

To query for all example nodes contained in the sect2 nodes of a DocBook XML document, we would use the XPath expression //chapter/sect1/sect2/example. If an example node is found in any sect2 node, a result is returned to the Message Panel. For each occurrence of the element node, a record is created in the result list.

In our example an XPath query on the file oxygen.xml determined that:

- [oxygen.xml] /chapter[1]/sect1[3]/sect2[7]/example[1]

Which means:

In the file oxygen.xml, first chapter, third section level 1, seventh section level 2, the example node found is the first in the section.

Note

If your project consists of a main file with ENTITY references to other files, you can use XPath to return all the elements of a certain type by querying the main file. The query will cover all referenced files.

Using Check Spelling

The Check Spelling option enables you to check the spelling of the current document:

Figure 4.13. Check Spelling Dialog

Complete the dialog as follows:

Unrecognized Word

Contains the word that cannot be found in the selected dictionary. The word is also highlighted in the XML document.

Replace with

The character string which is suggested to replace the unrecognized word.

Guess

Displays a list of words suggested to replace the unknown word. Double clicking a word in this list automatically inserts it in the document and continues the spell checking process.

Dictionary

Displays a list with the available dictionaries.

Replace

Replaces the currently highlighted word in the XML document, with the selected word in the "Replace with" field.

Replace All

Replaces all occurrences of the currently highlighted word in the XML document, with the selected word in the "Replace with" field.

Ignore

Allows you to continue checking the document while ignoring the first occurrence of the unknown word. The same word will be flagged again if it appears in the document.

Ignore all

Ignores all instances of the unknown word in the whole document.

Learn

Includes the unrecognized word in the list of valid words so that the spell checker will no longer consider it for correction.

Options

Sets the configuration options of the Spell Checker.

Begin at caret position

When checked, the spell checker begins checking from the current cursor position.

OK

Closes the Spell Checker dialog.

Figure 4.14. Options Dialog

The Options dialog contains the global check spelling options:

Case sensitive

When checked, operations ignore capitalization errors.

Ignore mixed case words

When checked, operations do not check words containing case mixing (e.g. "SpellChecker").

Ignore word with digits

When checked, the Spell Checker does not check words containing digits (e.g. "b2b").

Ignore Duplicates

When checked, the Spell Checker does not signal two successive identical words as an error.

Ignore URL

When checked, ignores words that look like URLs or file names (e.g. "www.oxygenxml.com" or "c:\boot.ini").

Check punctuation

When checked, punctuation checking is enabled: misplaced white space and wrong sequences, like a dot following a comma, are detected.

Enable auto replace

Enables the "Replace Always" feature.

Allow compounds words

When checked, all words formed by concatenating two legal words with a hyphen are accepted. If the language allows it, two words concatenated without a hyphen are also accepted.

Allow general prefixes

When checked, a word formed by concatenating a registered prefix and a legal word is accepted. For example, if "mini-" is a registered prefix, "mini-computer" is accepted.

Allow file extensions

When checked, accepts any word ending with registered file extensions (e.g. "myfile.txt", "index.html" etc.).

Suggestion

This option indicates the type of spell checker accuracy, which may be: "Favour speed over quality", "Normal" and "Favour quality over speed".

<oXygen/> provides dictionaries only for the languages English (EN, GB, CA, US), French (FR, BE, CA, CH) and German in the form of .dar files located in the directory [oXygen-install-dir]/dicts. A pre-built dictionary can be added by copying the corresponding .dar archive to the same directory and restarting <oXygen/>. A dictionary can be built with the tool available at http://www.xmlmind.com/spellchecker/dictbuilder.shtml.

Learned words are stored in a persistent learned-words dictionary with the .tdi extension, located in the [user-home-dir]/spell directory ([user-home-dir]/.spell directory on Mac OS X). There is one dictionary for each language-country variant combination. If the Learn button is pressed by mistake, the only way to delete the learned word is to edit the learned-words dictionary manually and restart <oXygen/>, because the spell-check component does not allow the dictionary to be edited through the user interface.

Note

The Czech check spelling dictionary may be downloaded from http://www.kosek.cz/sw/xxe/cs.dar

Using Search and Replace

The Search and Replace option enables you to perform the following operations on the current document:

  • find occurrences of a word or string of characters including white spaces and highlight the position in the editor.

  • replace occurrences of target defined in the "Find" field with a word or string of characters, including white spaces, defined in the "Replace" field.

  • find all occurrences of a word or string of characters including white spaces and return a result list to the Message Panel.

  • replace all occurrences of a word or string of characters including white spaces.

Figure 4.15. Find/Replace Dialog

Complete the dialog as follows:

Text to find

The target character string to search for.

Replace with

The character string with which to replace the target. It may contain '{$NEWLINE}', which inserts a newline character when the replacement is made.

Find

Executes a find operation for the next occurrence of the target and stops.

Replace

Executes a replace operation for the target and stops.

Find all

Executes a find operation and returns all results to the Message Panel.

Case sensitive

When checked, operations are case sensitive.

Whole words only

When checked, only whole occurrences of a word are included in the operation.

Find in tags

When checked, the operation includes the content of the start and end tags of the XML elements.

Regular Expression

When checked, allows the use of any regular expression in Perl syntax.

Search from file start

Starts the operation from start of file, position 0:0.

Wrap around

Continues the find from the start of the document after reaching the end.
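The way these options combine can be sketched with Python's re module. This is an illustration only: the function names build_pattern and replace_all are hypothetical, Python's regular expression syntax is close to but not identical to Perl's, and the "Find in tags" option is not modelled.

```python
import re


def build_pattern(text, case_sensitive=False, whole_words=False, regex=False):
    """Translate the dialog's find options into a compiled pattern."""
    source = text if regex else re.escape(text)
    if whole_words:
        source = rf"\b(?:{source})\b"
    flags = 0 if case_sensitive else re.IGNORECASE
    return re.compile(source, flags)


def replace_all(document, find, replacement, **options):
    """Replace every match; '{$NEWLINE}' in the replacement becomes a line break."""
    replacement = replacement.replace("{$NEWLINE}", "\n")
    # A function replacement inserts the text literally (no backreferences).
    return build_pattern(find, **options).sub(lambda _m: replacement, document)


text = "<para>cat concatenate Cat</para>"
print(replace_all(text, "cat", "dog", whole_words=True))
# <para>dog concatenate dog</para>
print(replace_all(text, "cat", "dog", whole_words=True, case_sensitive=True))
# <para>dog concatenate Cat</para>
```

With whole_words left off, the "cat" inside "concatenate" would also be replaced, which is the kind of unintended match the "Whole words only" option guards against.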

Using Search and Replace in Files Dialog

The Search and Replace in Files option enables you to perform the same operations as the Search/Replace option on any number of files located in a given path.

Figure 4.16. Search/Replace in Files

Complete the dialog as follows:

Text to Find

The target character string to search for.

Case Sensitive

When checked, operations are case sensitive.

Whole words only

When checked, only whole occurrences of a word are included in the operation.

Find in tags

When checked, the operation includes the content of the start and end tags of the XML elements.

Regular Expression

When checked, allows the use of any regular expression in Perl syntax.

Replace with

The character string with which to replace the target. It may contain '{$NEWLINE}', which inserts a newline character when the replacement is made.

Make Backups with extension

During the replace process, <oXygen/> makes backup copies of the modified files. The default extension is *bak, but you can change the extension as you prefer.

Specified Path

Choose the search path.

Path of current file

Use the path of the current file.

Project Files (File Filter)

Search the files from the current project using the specified file filter.

Selected project files

Search only in the selected files of the currently opened project.

Note

The search is performed only on local files. If you have added remote files from an FTP or WebDAV server to the project, these will be skipped by the search.

Find All

Executes a find operation and returns the result list to the Message Pane.

Replace All

Replaces all occurrences of the target contained in the specified files.

Warning: Use this option with caution.

Global search and replace across all project files does not open the files containing the targets, nor does it prompt on a per occurrence basis, to confirm that a replace operation must be performed. As the operation simply matches the string defined in the find field, this may result in replacement of matching strings that were not originally intended to be replaced.
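The behaviour described in this section, including the backup files and the unconditional replacement, can be approximated with a short Python sketch. It is not the implementation used by <oXygen/>: the function name replace_in_files and its parameters are hypothetical, the ".bak" suffix is an assumed form of the backup extension, and the files are assumed to be UTF-8 encoded.

```python
from pathlib import Path
import shutil


def replace_in_files(folder, pattern, find, replacement, backup_ext=".bak"):
    """Replace `find` in every file under `folder` matching `pattern`.

    Each modified file is first copied to <name><backup_ext>. Returns the
    list of modified paths. As in the warning above, every literal match is
    replaced without confirmation.
    """
    modified = []
    for path in sorted(Path(folder).rglob(pattern)):
        if not path.is_file():
            continue
        original = path.read_text(encoding="utf-8")
        if find not in original:
            continue
        # Keep a copy of the untouched file before overwriting it.
        shutil.copy2(path, path.with_name(path.name + backup_ext))
        path.write_text(original.replace(find, replacement), encoding="utf-8")
        modified.append(path)
    return modified
```

A call such as replace_in_files("docs", "*.xml", "colour", "color") would rewrite every matching file below the (hypothetical) docs directory, so the backup copies are the only way to review what changed afterwards.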

Using Go To Dialog

The Go to ... option enables you to go to a precise location in the current edited file specified by line and column or by offset relative to the beginning of the file.

Figure 4.17. Go to

Complete the dialog as follows:

Line

The destination line in the current document.

Column

The destination column in the current document.

Offset

The destination offset relative to the beginning of document.
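The relationship between a line and column pair and an offset can be made explicit with a small Python sketch. The numbering conventions here are assumptions, since the text above does not state them: lines and columns are taken to start at 1, the offset at 0, and each line break is counted as a single character. The function name offset_of is hypothetical.

```python
def offset_of(text: str, line: int, column: int) -> int:
    """Convert a 1-based line and column to a 0-based character offset."""
    rows = text.split("\n")
    if not 1 <= line <= len(rows):
        raise ValueError(f"line {line} is out of range")
    if not 1 <= column <= len(rows[line - 1]) + 1:
        raise ValueError(f"column {column} is out of range")
    # Every earlier line contributes its characters plus one line break.
    return sum(len(row) + 1 for row in rows[: line - 1]) + column - 1


sample = "<a>\n  <b/>\n</a>"
print(offset_of(sample, 2, 3))  # 6
```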

Working with Large Documents

The problem

Consider the case of documenting a large project. Several people are likely to be involved, and the resulting document can be a few megabytes in size. How can this amount of data be handled in such a way that people can still work in parallel?

Fortunately, XML provides a solution. You can create a master document that references other documents containing the document sections. The users can edit the sections individually, then apply FOP or XSLT to the master to obtain the result files, for example PDF or HTML.

  • The master should declare the DTD to be used and the external entities - the sections. A sample document is:

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE book SYSTEM "../xml/docbookx.dtd" [
      <!ENTITY testing SYSTEM "testing.xml">
    ]>
    <book>
    <chapter> ...

    At the point in the master document where the section belongs, insert a reference to the "testing.xml" entity:

    ... &testing; ...

  • The document containing the section must not declare the DTD again.

    <section> ... here comes the section content ... </section>

    Note

    The indicated DTD and the element names ( "section", "chapter" ) are used here only for illustrating the inclusion mechanism. You can use any DTD and element names you need.

Using the project panel

When you have a large number of files to edit and to organize, you may use the project panel.

Note

The operations can be accessed using the toolbar buttons.

Creating a project

Choose File / New Project to create a new project. Make sure the project panel is visible by checking the View / Show Project item. (A check mark should be displayed in the menu.)

Creating project folders

We can organize the project as a collection of folders. These are logical folders; they do not have any connection with directories on the disk. Right click on the icon of the project in the project panel. A popup menu will be shown.

Figure 4.18. Project panel popup menu

Choose the first option, "New folder". Enter a name for the folder.

Figure 4.19. Project panel new folder dialog

Adding files to a project

To add one or more files to the newly created folder, right click on it, and choose "Add file".

A shortcut for adding the edited file to the selected folder is to press the right-most button on the project panel toolbar.

Figure 4.20. Project panel toolbar

Removing files or project folders

Right click on the item you want to remove. Choose the remove option.

Setting a schema for the Tag-Insight

If you are editing document fragments, for instance the chapters of a book with each one in a separate file, you can activate code completion for these fragments in two ways:

Setting a default DTD

As explained above, when splitting a large document, only one file will contain the Document Type Definition (the DTD) and will include the others. The included sections cannot declare the DTD again, because the main document would then not be valid.

Important

The editor creates the Tag-Insight lists by analysing the specified DTD and the current context (the position in the editor). If you change the DTD, you can observe that the list of tags to be inserted changes.

Figure 4.21. Tag-Insight driven by a Docbook DTD

To offer Tag-Insight on the included files, you can specify a DTD or XML Schema to be used when the documents do not specify one.

Changing the default DTD

From the Options menu, Preferences dialog, choose Tag-Insight/Default.

The displayed panel has two radio buttons: one for DTD and the other for XML Schema.

If you are creating documentation with DocBook, then the docbookx.dtd file is a good choice.

Figure 4.22. Tag-Insight configuration dialog

Setting a Processing Instruction

The same effect is obtained by configuring a processing instruction that specifies the DTD to be used. The advantage of this method is that you can configure Tag-Insight for each file. The processing instruction must be added at the beginning of the document, just after the XML prolog: <?oxygen DTDSystemID="system" DTDPublicID="public"?>

Note

The system and public values must be the same as for a DOCTYPE declaration.

Creating an included file - a section

Select File / New. Choose the XML type, but with no DTD.

Make sure that you have chosen the correct DTD in the Tag-Insight options. Now you can type the root element of your section in the edited document. For example, if you are using DocBook it can be "<chapter></chapter>" or "<section></section>". If you now move the cursor between the tags and press "<", you will see the list of insertable element names.

Figure 4.23. Tag-Insight list over a document with no DTD

Note

Validation will not work on an included file, as no DTD is set. Validation can be done only from the master file. On the included file itself you can only check that the document is well-formed.

Quick Document Browsing Using Bookmarks

The concept of a bookmark is the same as in other IDEs: you can mark a position in an edited document so that you can quickly return to it after further editing and browsing through one or more documents opened at the same time. Up to nine distinct bookmarks can be placed in any opened document. Configurable shortcut keystrokes are available for placing bookmarks and for quickly returning to any of the marked positions.

Figure 4.24. Editor Bookmarks

The key strokes can be configured from Options-> Preferences->Menu shortcut keys.

A bookmark can be placed from Edit-> Bookmarks->Create, from Edit-> Bookmarks->Bookmarks quick creation, or by clicking in the margin of the editing area reserved for bookmarks, to the left of the line number area.

Quickly switching to a position marked by a bookmark can be done by Edit-> Bookmarks->Go to.

Folding of the XML Elements

XML documents are organized as a tree of elements. When working on a large document you can collapse some elements leaving in the focus only the ones you need to edit. Expanding and collapsing works on individual elements: expanding an element leaves the child elements unchanged.

Figure 4.25. Folding of the XML Elements

A unique feature of <oXygen/> is that the folds are persistent: the next time you open the document, the folds are restored to their last state, so you won't have to collapse the uninteresting parts again.

You can use folding by clicking on the special marks displayed in the left part of the document editor, from the context menu or from the Document->Folding menu.