In Text mode, you can decide how the XML file is formatted and indented. In the other modes, and when you switch between modes, Oxygen XML Author Eclipse plugin automatically formats and indents the XML.
You can trigger a format and indent operation for your XML document (in Text mode) using one of the following actions:
Format and Indent toolbar button - Formats and indents the current document.
Various settings affect how Oxygen XML Author Eclipse plugin formats and indents XML. Many of these settings have to do with how whitespace is handled.
XML documents are text files that describe complex documents. Some of the whitespace (spaces, tabs, line feeds, etc.) in the XML document belongs to the document it describes (such as the space between words in a paragraph) and some of it belongs to the XML document itself (such as a line break between two XML elements). Whitespace belonging to the XML file is called insignificant whitespace. The meaning of the XML would be the same if the insignificant whitespace were removed. Whitespace belonging to the document being described is called significant whitespace.
Knowing when whitespace is significant or insignificant is not always easy. For instance, a paragraph in an XML document might be laid out like this:
<p>NO Free man shall be taken or imprisoned, or be stripped of his Freedom,
or Liberties, or free Customs, or be outlawed, or exiled, or any otherwise
destroyed; nor will we not pass upon him, nor condemn him, but by lawful
judgment of his Peers, or by the <xref
href="http://en.wikipedia.org/wiki/Law_of_the_land" format="html"
scope="external">Law of the land</xref>.
We will sell to no man, we will not deny to any man either Justice or Right.</p>
By default, XML considers a single whitespace between words to be significant, and all other whitespace to be insignificant. The paragraph above could have been written on one line, because each run of consecutive whitespace characters is treated as a single whitespace and the result is read as exactly the same paragraph. Removing the insignificant space in markup like this is called normalizing space.
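As a rough illustration of normalizing space, the following Python sketch collapses every run of whitespace into a single space. The function name normalize_space is hypothetical and is not part of Oxygen XML Author Eclipse plugin; it only mirrors the rule described above.

```python
import re


def normalize_space(text: str) -> str:
    """Collapse each run of whitespace into one space and trim both ends."""
    return re.sub(r"\s+", " ", text).strip()


# The same sentence, once wrapped and indented, once on a single line.
wrapped = """NO Free man shall be taken or imprisoned,
    or be stripped of his Freedom"""
single_line = "NO Free man shall be taken or imprisoned, or be stripped of his Freedom"

# Both layouts normalize to the same string.
print(normalize_space(wrapped) == single_line)
```

Under this rule, the line breaks and indentation in the wrapped version carry no meaning, which is why a formatter is free to rewrap such a paragraph.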
In some cases, all the spaces inside an element should be treated as significant. For example, in a code sample:
<codeblock>
class HelloWorld
{
    public static void main(String args[])
    {
        System.out.println("Hello World");
    }
}
</codeblock>
Here every whitespace character between the <codeblock> tags should be treated
as significant.
When Oxygen XML Author Eclipse plugin formats and indents an XML document, it introduces or removes insignificant whitespace to produce a layout with reasonable line lengths and elements indented to show their place in the hierarchy of the document. To correctly format and indent the XML source, Oxygen XML Author Eclipse plugin needs to know when to treat whitespace as significant and when to treat it as insignificant. However, it is not always possible to tell this from the XML source file alone. To determine what whitespace is significant, Oxygen XML Author Eclipse plugin assigns each element in the document to one of four categories:
In the ignore space category, all whitespace is considered insignificant. This generally applies to content that consists only of elements nested inside other elements, with no text content.
In the default space category, a single whitespace character between character strings is considered significant and all other spaces are considered insignificant. Therefore, all consecutive whitespaces will be replaced with a single space. This generally applies to elements that contain text content only.
In the mixed content category, a single whitespace between text and element tags is considered significant and all other spaces are considered insignificant. This generally applies to elements that contain a mix of text and other elements. For example:
<p>The file is located in <i>HOME</i>/<i>USER</i>/hello.
This is a <strong>big</strong>
<emphasis>deal</emphasis>.
</p>
In this example, whitespace should not be introduced around the <i> tags, as it would introduce extra significant whitespace into the document. The space between the end <strong> tag and the beginning <emphasis> tag should be normalized to a single space, not zero spaces.
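The handling of the example above can be sketched in Python with the standard xml.etree.ElementTree module. This is only an illustration of the rule, not how Oxygen XML Author Eclipse plugin is implemented; the helper name collapse_mixed is hypothetical. It collapses existing runs of whitespace to a single space and never adds whitespace where there was none.

```python
import re
import xml.etree.ElementTree as ET


def collapse_mixed(elem):
    """Collapse whitespace runs to one space inside a mixed content element.

    Whitespace is only ever reduced, never introduced, so text that touches
    a tag (such as "</i>/<i>") stays attached to it.
    """
    if elem.text:
        elem.text = re.sub(r"\s+", " ", elem.text)
    for child in elem:
        collapse_mixed(child)
        if child.tail:
            child.tail = re.sub(r"\s+", " ", child.tail)


source = (
    "<p>The file is located in <i>HOME</i>/<i>USER</i>/hello.\n"
    "    This is a <strong>big</strong>\n"
    "    <emphasis>deal</emphasis>.\n"
    "</p>"
)
p = ET.fromstring(source)
collapse_mixed(p)
result = ET.tostring(p, encoding="unicode")
print(result)
```

In the printed result, no space appears around the i elements, and the line break between the strong and emphasis elements becomes exactly one space.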
In the preserve space category, all whitespace in the element is regarded as significant. No changes are made to the spaces in elements in this category. However, child elements may be in another category, and may be treated differently.
Attribute values are always in the preserve space category. The spaces between attributes in an element tag are always in the default space category.
Oxygen XML Author Eclipse plugin evaluates several pieces of information to assign an element to one
of these categories. An element is always assigned to the most restrictive category (from
Ignore to Preserve) that it is assigned to by any of the sources Oxygen XML Author Eclipse plugin
consults. For instance, if the element is named on the Default
elements list (as described below) but it has an
@xml:space="preserve" attribute in the source file, it will be assigned to
the preserve space category. If an element has the @xml:space="default"
attribute in the source, but is listed on the Mixed content elements
list, it will be assigned to the mixed content category.
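The "most restrictive category wins" rule can be sketched as taking the maximum over an ordered set of categories. The enum and function names below are hypothetical, the ordering (ignore, default, mixed content, preserve) is inferred from the two examples above, and falling back to the ignore category when no source assigns anything is an assumption made only for this sketch.

```python
from enum import IntEnum


class SpaceCategory(IntEnum):
    """Categories ordered from least to most restrictive (assumed order)."""
    IGNORE = 0
    DEFAULT = 1
    MIXED = 2
    PRESERVE = 3


def resolve_category(assigned):
    """Return the most restrictive category among those assigned by the sources."""
    # Assumption: with no assignment at all, fall back to IGNORE.
    return max(assigned, default=SpaceCategory.IGNORE)


# Named on the Default elements list, but xml:space="preserve" in the source.
print(resolve_category([SpaceCategory.DEFAULT, SpaceCategory.PRESERVE]).name)
# xml:space="default" in the source, but on the Mixed content elements list.
print(resolve_category([SpaceCategory.DEFAULT, SpaceCategory.MIXED]).name)
```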
To assign elements to these categories, Oxygen XML Author Eclipse plugin consults information from the following sources:
If an element has an @xml:space attribute, the element is promoted to the appropriate category based on the value of the attribute.
If the associated CSS applies a white-space: pre setting to an element, it is promoted to the preserve space category.
[...] display property set to inline, then the node is promoted to the mixed content category.
[...] display property set to inline, then the node is promoted to the mixed content category.
[...] display property set to table, then the node is assigned to the ignore space category.
If the schema declares an element to be of type xs:string, the element will be promoted to the preserve space category because the string built-in type has the whiteSpace facet with the value preserve.
If an element is listed in the Preserve space tab of the Element Spacing list in the XML formatting preferences, it is promoted to the preserve space category.
If an element is listed in the Default space tab of the Element Spacing list in the XML formatting preferences, it is promoted to the default space category.
If an element is listed in the Mixed content tab of the Element Spacing list in the XML formatting preferences, it is promoted to the mixed content category.
If an element contains mixed content, that is, a mix of text and other elements, it is promoted to the mixed content category. (Note that, in accordance with these rules, this happens even if the schema declares the element to have element only content.)
If an element contains text content, it is promoted to the default space category.
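The rules in this list that depend only on the XML source (the @xml:space attribute and the actual content of the element) can be sketched as follows. This is a simplified illustration, not the product's implementation: the function name category_from_content is hypothetical, and the CSS, schema, and preference-list sources are deliberately left out.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml: prefix using the fixed XML namespace URI.
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"


def category_from_content(elem):
    """Guess a category from the element's own attribute and content only."""
    if elem.get(XML_SPACE) == "preserve":
        return "preserve"
    # Text directly inside the element: its leading text or any child's tail.
    has_text = bool((elem.text or "").strip()) or any(
        (child.tail or "").strip() for child in elem
    )
    has_children = len(elem) > 0
    if has_text and has_children:
        return "mixed"
    if has_text:
        return "default"
    return "ignore"


doc = ET.fromstring(
    "<section><title>Intro</title>"
    "<p>This is a <b>big</b> deal.</p>"
    '<codeblock xml:space="preserve">  x = 1</codeblock></section>'
)
for elem in doc.iter():
    print(elem.tag, category_from_content(elem))
```

Here the section element holds only child elements, the title holds only text, the p element mixes both, and the codeblock carries an explicit attribute, so each lands in a different category.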
In general, an element can only be promoted to a more restrictive category (one that treats more whitespace as significant). However, there is one exception. In Author mode, if an element is marked as mixed content in the schema, but the actual element contains no text content, it can be demoted to the ignore space category if all of its child elements are displayed as blocks by the associated CSS (that is, they have a CSS property of display: block). For example, in some schemas, a section or a table entry can be defined as having mixed content, but in many cases they contain only block elements. In these cases, any whitespace they contain cannot be significant and they can be treated as ignore space elements. This exception can be turned on or off using the Schema-Aware Editing option in the Schema-Aware preferences page.
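The demotion exception amounts to a conjunction of conditions, which can be written as a small predicate. The function and parameter names below are hypothetical; in particular, schema_aware stands in for the Schema-Aware Editing option, and the CSS display values are assumed to be supplied by the caller as plain strings.

```python
def can_demote_to_ignore(schema_says_mixed, has_text, child_displays, schema_aware=True):
    """Return True when a schema-mixed element may be treated as ignore space.

    child_displays lists the CSS display value of each child element.
    Note: an empty list also passes the all(...) check; how an element with
    no children is handled is not stated in the text above.
    """
    return (
        schema_aware
        and schema_says_mixed
        and not has_text
        and all(display == "block" for display in child_displays)
    )


# A section declared as mixed content that only contains block-level children.
print(can_demote_to_ignore(True, False, ["block", "block"]))
# The same section with one inline child keeps its mixed content category.
print(can_demote_to_ignore(True, False, ["block", "inline"]))
```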
You can control how Oxygen XML Author Eclipse plugin formats and indents XML documents. This can be particularly important if you store your XML document in a version control system, as it allows you to limit the number of trivial changes in spacing between versions of an XML document. The following preference pages include options that control how XML documents are formatted:
Oxygen XML Author Eclipse plugin formats and indents a document, or part of it, on the following occasions: