com.xmlmind.util
Class XMLText

java.lang.Object
  extended by com.xmlmind.util.XMLText

public final class XMLText
extends java.lang.Object

A collection of utility functions (static methods) related to XML characters and XML text.


Method Summary
static boolean checkText(java.lang.String text)
          Returns false if specified text contains non-XML characters.
static java.lang.String collapseWhiteSpace(java.lang.String value)
          Replaces successive XML space characters by a single space character (' ') then removes leading and trailing space characters if any.
static java.lang.String compressWhiteSpace(java.lang.String value)
          Replaces successive XML space characters ('\t', '\r', '\n', ' ') by a single space character (' ').
static void escapeXML(char[] chars, int offset, int length, java.lang.StringBuilder escaped)
          Escapes specified character array (that is, '<' is replaced by "&#60;", '&' is replaced by "&#38;", etc).
static void escapeXML(char[] chars, int offset, int length, java.lang.StringBuilder escaped, int maxCode)
          Escapes specified character array (that is, '<' is replaced by "&#60;", '&' is replaced by "&#38;", etc).
static java.lang.String escapeXML(java.lang.String string)
          Escapes specified string (that is, '<' is replaced by "&#60;", '&' is replaced by "&#38;", etc).
static void escapeXML(java.lang.String string, java.lang.StringBuilder escaped)
          Escapes specified string (that is, '<' is replaced by "&#60;", '&' is replaced by "&#38;", etc).
static java.lang.String filterText(java.lang.String text)
          Returns a copy of specified text after removing all non-XML characters (if any).
static boolean isName(java.lang.String s)
          Tests if specified string is a lexically correct Name.
static boolean isNameChar(char c)
          Tests if specified character can be used in a Name at a position other than the first one.
static boolean isNameOtherChar(char c)
          Tests if specified character, even if not authorized as the first character of a Name, can be one of the other characters of a Name.
static boolean isNameStartChar(char c)
          Tests if specified character can be used as the start of a Name.
static boolean isNCName(java.lang.String s)
          Tests if specified string is a lexically correct NCName.
static boolean isNCNameChar(char c)
          Tests if specified character can be used in an NCName at a position other than the first one.
static boolean isNCNameOtherChar(char c)
          Tests if specified character, even if not authorized as the first character of an NCName, can be one of the other characters of an NCName.
static boolean isNCNameStartChar(char c)
          Tests if specified character can be used as the start of an NCName.
static boolean isNmtoken(java.lang.String s)
          Tests if specified string is a lexically correct NMTOKEN.
static boolean isPITarget(java.lang.String s)
          Tests if specified string is a lexically correct target for a processing instruction.
static boolean isXMLChar(char c)
          Tests if specified character is a character which can be contained in an XML document.
static boolean isXMLSpace(char c)
          Tests if specified character is an XML space ('\t', '\r', '\n', ' ').
static boolean isXMLSpace(java.lang.CharSequence chars)
          Tests whether specified character sequence contains only XML space characters ('\t', '\r', '\n', ' ').
static java.lang.String quoteXML(java.lang.String string)
          Escapes specified string (that is, '<' is replaced by "&#60;", '&' is replaced by "&#38;", etc) then puts the escaped string between quotes (").
static void quoteXML(java.lang.String string, java.lang.StringBuilder quoted)
          Escapes specified string (that is, '<' is replaced by "&#60;", '&' is replaced by "&#38;", etc) then puts the escaped string between quotes (").
static java.lang.String replaceWhiteSpace(java.lang.String value)
          Replaces sequence "\r\n" and characters '\t', '\r', '\n' by a single space character ' '.
static java.lang.String[] splitList(java.lang.String s)
          Splits specified string at XML whitespace character boundaries ('\t', '\r', '\n', ' ').
static java.lang.String unescapeXML(java.lang.String text)
          Unescapes specified string.
static void unescapeXML(java.lang.String text, int offset, int length, java.lang.StringBuilder unescaped)
          Unescapes specified string.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

isXMLSpace

public static boolean isXMLSpace(char c)
Tests if specified character is an XML space ('\t', '\r', '\n', ' ').

Parameters:
c - character to be tested
Returns:
true if test is successful; false otherwise

isXMLSpace

public static boolean isXMLSpace(java.lang.CharSequence chars)
Tests whether specified character sequence contains only XML space characters ('\t', '\r', '\n', ' ').

Parameters:
chars - character sequence to be tested
Returns:
true if chars is empty or only contains XML space; false otherwise
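
The two isXMLSpace overloads above can be illustrated with a minimal sketch. The class XmlSpaceSketch and its methods are hypothetical stand-ins written for this example; they only mirror the documented behavior and are not the library's implementation.

```java
final class XmlSpaceSketch {
    // Hypothetical stand-in for isXMLSpace(char): accepts only the four
    // characters listed in the documentation above.
    static boolean isXmlSpace(char c) {
        return c == '\t' || c == '\r' || c == '\n' || c == ' ';
    }

    // Hypothetical stand-in for isXMLSpace(CharSequence): an empty sequence
    // is accepted, as stated in the "Returns" clause above.
    static boolean isXmlSpace(CharSequence chars) {
        for (int i = 0; i < chars.length(); i++) {
            if (!isXmlSpace(chars.charAt(i))) {
                return false;
            }
        }
        return true;
    }
}
```

Note that other Unicode white space, such as the no-break space U+00A0, is not an XML space and is rejected by this sketch.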

isXMLChar

public static boolean isXMLChar(char c)
Tests if specified character is a character which can be contained in an XML document.

Parameters:
c - character to be tested
Returns:
true if test is successful; false otherwise

checkText

public static boolean checkText(java.lang.String text)
Returns false if specified text contains non-XML characters. Otherwise, returns true.


filterText

public static java.lang.String filterText(java.lang.String text)
Returns a copy of specified text after removing all non-XML characters (if any). In addition, this method always replaces '\r' and "\r\n" with '\n'.

Parameters:
text - text to be filtered
Returns:
filtered text
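
The checking and filtering described above can be sketched as follows. The class XmlCharSketch is hypothetical. It applies the Char production of XML 1.0 to Unicode code points, which is an assumption: how the library treats surrogate pairs, and in which order it filters and normalizes line endings, is not stated here.

```java
final class XmlCharSketch {
    // Char production of XML 1.0, applied to a Unicode code point.
    static boolean isXmlCodePoint(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Hypothetical stand-in for checkText: false as soon as one
    // non-XML character (or unpaired surrogate) is found.
    static boolean checkText(String text) {
        return text.codePoints().allMatch(XmlCharSketch::isXmlCodePoint);
    }

    // Hypothetical stand-in for filterText: drops non-XML characters and
    // turns both "\r\n" and a lone '\r' into a single '\n'.
    static String filterText(String text) {
        StringBuilder out = new StringBuilder(text.length());
        int i = 0;
        while (i < text.length()) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (cp == '\r') {
                if (i < text.length() && text.charAt(i) == '\n') {
                    i++; // consume the '\n' of a "\r\n" pair
                }
                out.append('\n');
            } else if (isXmlCodePoint(cp)) {
                out.appendCodePoint(cp);
            }
        }
        return out.toString();
    }
}
```

A typical use is to call checkText first and fall back to filterText only when it returns false, so that clean text is not copied needlessly.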

isNCNameStartChar

public static boolean isNCNameStartChar(char c)
Tests if specified character can be used as the start of an NCName.

Corresponds to: Letter | '_'.

See Also:
isNCNameOtherChar(char), isNCNameChar(char)

isNCNameOtherChar

public static boolean isNCNameOtherChar(char c)
Tests if specified character, even if not authorized as the first character of an NCName, can be one of the other characters of an NCName.

Corresponds to: Digit | '.' | '-' | CombiningChar | Extender.

See Also:
isNCNameStartChar(char), isNCNameChar(char)

isNCNameChar

public static boolean isNCNameChar(char c)
Tests if specified character can be used in an NCName at a position other than the first one.

Corresponds to: Letter | Digit | '.' | '-' | '_' | CombiningChar | Extender.

See Also:
isNCNameStartChar(char), isNCNameOtherChar(char)

isNCName

public static boolean isNCName(java.lang.String s)
Tests if specified string is a lexically correct NCName.

Parameters:
s - string to be tested
Returns:
true if test is successful; false otherwise
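
The way the three character tests above combine into the string test can be sketched as follows. The class NCNameSketch is hypothetical, and its character classes are approximations built on java.lang.Character: the XML 1.0 grammar enumerates exact ranges for Letter, Digit, CombiningChar and Extender, which this sketch does not reproduce (only U+00B7 is treated as an Extender here).

```java
final class NCNameSketch {
    // Approximates: Letter | '_'
    static boolean isStartChar(char c) {
        return Character.isLetter(c) || c == '_';
    }

    // Approximates: Digit | '.' | '-' | CombiningChar | Extender
    static boolean isOtherChar(char c) {
        return Character.isDigit(c) || c == '.' || c == '-'
            || isCombiningMark(c) || c == '\u00B7';
    }

    // Rough substitute for CombiningChar, based on Unicode general categories.
    static boolean isCombiningMark(char c) {
        int type = Character.getType(c);
        return type == Character.NON_SPACING_MARK
            || type == Character.COMBINING_SPACING_MARK
            || type == Character.ENCLOSING_MARK;
    }

    // Union of the two classes above.
    static boolean isNameChar(char c) {
        return isStartChar(c) || isOtherChar(c);
    }

    // An NCName is one start character followed by any number of name characters.
    static boolean isNCName(String s) {
        if (s.isEmpty() || !isStartChar(s.charAt(0))) {
            return false;
        }
        for (int i = 1; i < s.length(); i++) {
            if (!isNameChar(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }
}
```

Because ':' belongs to neither class, a qualified name such as "xs:element" is not an NCName, while each of its two halves is.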

isNameStartChar

public static boolean isNameStartChar(char c)
Tests if specified character can be used as the start of a Name.

Corresponds to: Letter | '_' | ':'.

See Also:
isNameOtherChar(char), isNameChar(char)

isNameOtherChar

public static boolean isNameOtherChar(char c)
Tests if specified character, even if not authorized as the first character of a Name, can be one of the other characters of a Name.

Corresponds to: Digit | '.' | '-' | ':' | CombiningChar | Extender.

See Also:
isNameStartChar(char), isNameChar(char)

isNameChar

public static boolean isNameChar(char c)
Tests if specified character can be used in a Name at a position other than the first one.

Corresponds to: Letter | Digit | '.' | '-' | '_' | ':' | CombiningChar | Extender.

See Also:
isNameStartChar(char), isNameOtherChar(char)

isName

public static boolean isName(java.lang.String s)
Tests if specified string is a lexically correct Name.

Parameters:
s - string to be tested
Returns:
true if test is successful; false otherwise

isNmtoken

public static boolean isNmtoken(java.lang.String s)
Tests if specified string is a lexically correct NMTOKEN.

Parameters:
s - string to be tested
Returns:
true if test is successful; false otherwise

isPITarget

public static boolean isPITarget(java.lang.String s)
Tests if specified string is a lexically correct target for a processing instruction.

Note that Names starting with "xml" (case-insensitive) are rejected.

Parameters:
s - string to be tested
Returns:
true if test is successful; false otherwise
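
The difference between the three string tests above (isName, isNmtoken, isPITarget) can be sketched as follows. The class NameTokenSketch is hypothetical and, to stay short, recognizes ASCII letters and digits only; the real character classes are much wider. The "xml" prefix rule follows the note above as written.

```java
final class NameTokenSketch {
    // ASCII-only simplification of: Letter | '_' | ':'
    static boolean isNameStartChar(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
            || c == '_' || c == ':';
    }

    // ASCII-only simplification of the full name character class.
    static boolean isNameChar(char c) {
        return isNameStartChar(c) || (c >= '0' && c <= '9')
            || c == '.' || c == '-';
    }

    private static boolean allNameChars(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (!isNameChar(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    // A Name must begin with a name start character.
    static boolean isName(String s) {
        return !s.isEmpty() && isNameStartChar(s.charAt(0)) && allNameChars(s);
    }

    // An NMTOKEN has no constraint on its first character.
    static boolean isNmtoken(String s) {
        return !s.isEmpty() && allNameChars(s);
    }

    // A PI target is a Name that does not start with "xml" in any letter case.
    static boolean isPITarget(String s) {
        return isName(s) && !s.regionMatches(true, 0, "xml", 0, 3);
    }
}
```

With this sketch, "2nd" is an NMTOKEN but not a Name, and "XMLfoo" is a Name but not a PI target.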

collapseWhiteSpace

public static java.lang.String collapseWhiteSpace(java.lang.String value)
Replaces successive XML space characters by a single space character (' ') then removes leading and trailing space characters if any.

Parameters:
value - string to be processed
Returns:
processed string

compressWhiteSpace

public static java.lang.String compressWhiteSpace(java.lang.String value)
Replaces successive XML space characters ('\t', '\r', '\n', ' ') by a single space character (' ').

Parameters:
value - string to be processed
Returns:
processed string

replaceWhiteSpace

public static java.lang.String replaceWhiteSpace(java.lang.String value)
Replaces sequence "\r\n" and characters '\t', '\r', '\n' by a single space character ' '.

Parameters:
value - string to be processed
Returns:
processed string
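
The three white space operations above differ only in how much they normalize, which a side-by-side sketch makes easier to see. The class WhiteSpaceSketch is hypothetical and only mirrors the documented behavior; it is not the library's code.

```java
final class WhiteSpaceSketch {
    static boolean isXmlSpace(char c) {
        return c == '\t' || c == '\r' || c == '\n' || c == ' ';
    }

    // Each '\t', '\r', '\n' becomes one space; a "\r\n" pair counts as one unit.
    // Runs of spaces are kept as they are.
    static String replaceWhiteSpace(String value) {
        StringBuilder out = new StringBuilder(value.length());
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == '\r' && i + 1 < value.length() && value.charAt(i + 1) == '\n') {
                i++; // skip the '\n' of the pair
                out.append(' ');
            } else if (c == '\t' || c == '\r' || c == '\n') {
                out.append(' ');
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Every run of XML space characters becomes a single space.
    static String compressWhiteSpace(String value) {
        StringBuilder out = new StringBuilder(value.length());
        boolean inRun = false;
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (isXmlSpace(c)) {
                if (!inRun) {
                    out.append(' ');
                }
                inRun = true;
            } else {
                out.append(c);
                inRun = false;
            }
        }
        return out.toString();
    }

    // Compress, then drop the single leading and trailing space that may remain.
    static String collapseWhiteSpace(String value) {
        String compressed = compressWhiteSpace(value);
        int start = 0;
        int end = compressed.length();
        if (start < end && compressed.charAt(start) == ' ') {
            start++;
        }
        if (start < end && compressed.charAt(end - 1) == ' ') {
            end--;
        }
        return compressed.substring(start, end);
    }
}
```

For the input " a\t\tb\r\nc ", the sketch gives " a  b c " for replace (two spaces between a and b), " a b c " for compress and "a b c" for collapse.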

splitList

public static java.lang.String[] splitList(java.lang.String s)
Splits specified string at XML whitespace character boundaries ('\t', '\r', '\n', ' '). Returns list of parts.

Parameters:
s - string to be split
Returns:
list of parts
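
Splitting at XML white space can be sketched as follows. The class SplitListSketch is hypothetical. It assumes that leading, trailing and repeated white space never produce empty parts; the documentation above does not say how the library handles these cases.

```java
import java.util.ArrayList;
import java.util.List;

final class SplitListSketch {
    static boolean isXmlSpace(char c) {
        return c == '\t' || c == '\r' || c == '\n' || c == ' ';
    }

    // Hypothetical stand-in for splitList: returns the maximal runs of
    // characters that are not XML space.
    static String[] splitList(String s) {
        List<String> parts = new ArrayList<>();
        int start = -1; // index where the current part begins, or -1 if none
        for (int i = 0; i <= s.length(); i++) {
            boolean atBoundary = i == s.length() || isXmlSpace(s.charAt(i));
            if (atBoundary) {
                if (start >= 0) {
                    parts.add(s.substring(start, i));
                    start = -1;
                }
            } else if (start < 0) {
                start = i;
            }
        }
        return parts.toArray(new String[0]);
    }
}
```

This is the usual way to read a list-valued attribute such as a space-separated list of identifiers.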

quoteXML

public static java.lang.String quoteXML(java.lang.String string)
Escapes specified string (that is, '<' is replaced by "&#60;", '&' is replaced by "&#38;", etc) then puts the escaped string between quotes (").

Parameters:
string - string to be escaped and quoted
Returns:
escaped and quoted string

quoteXML

public static void quoteXML(java.lang.String string,
                            java.lang.StringBuilder quoted)
Escapes specified string (that is, '<' is replaced by "&#60;", '&' is replaced by "&#38;", etc) then puts the escaped string between quotes (").

Parameters:
string - string to be escaped and quoted
quoted - buffer used to store escaped and quoted string (characters are appended to this buffer)
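
Escaping followed by quoting, and the append-to-buffer convention shared by both overloads, can be sketched as follows. The class QuoteXmlSketch is hypothetical. Only the replacements of '<' and '&' are confirmed by the text above; escaping '>' and '"' in the same numeric style is an assumption made for this example.

```java
final class QuoteXmlSketch {
    // Appends the escaped form of string to the buffer. '<' and '&' follow the
    // documentation; '>' and '"' are handled the same way by assumption.
    static void escapeXml(String string, StringBuilder escaped) {
        for (int i = 0; i < string.length(); i++) {
            char c = string.charAt(i);
            switch (c) {
                case '<':
                case '&':
                case '>':
                case '"':
                    escaped.append("&#").append((int) c).append(';');
                    break;
                default:
                    escaped.append(c);
            }
        }
    }

    // Appends the quoted form to the buffer; existing content is preserved.
    static void quoteXml(String string, StringBuilder quoted) {
        quoted.append('"');
        escapeXml(string, quoted);
        quoted.append('"');
    }

    // Convenience overload returning a new string.
    static String quoteXml(String string) {
        StringBuilder quoted = new StringBuilder(string.length() + 2);
        quoteXml(string, quoted);
        return quoted.toString();
    }
}
```

The buffer overload is convenient when serializing an attribute: append the attribute name and '=' to the buffer, then let the quoting method append the value.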

escapeXML

public static java.lang.String escapeXML(java.lang.String string)
Escapes specified string (that is, '<' is replaced by "&#60;", '&' is replaced by "&#38;", etc).

Parameters:
string - string to be escaped
Returns:
escaped string

escapeXML

public static void escapeXML(java.lang.String string,
                             java.lang.StringBuilder escaped)
Escapes specified string (that is, '<' is replaced by "&#60;", '&' is replaced by "&#38;", etc).

Parameters:
string - string to be escaped
escaped - buffer used to store escaped string (characters are appended to this buffer)

escapeXML

public static void escapeXML(char[] chars,
                             int offset,
                             int length,
                             java.lang.StringBuilder escaped)
Escapes specified character array (that is, '<' is replaced by "&#60;", '&' is replaced by "&#38;", etc).

Parameters:
chars - character array to be escaped
offset - specifies first character in array to be escaped
length - number of characters in array to be escaped
escaped - buffer used to store escaped string (characters are appended to this buffer)

escapeXML

public static void escapeXML(char[] chars,
                             int offset,
                             int length,
                             java.lang.StringBuilder escaped,
                             int maxCode)
Escapes specified character array (that is, '<' is replaced by "&#60;", '&' is replaced by "&#38;", etc).

Parameters:
chars - character array to be escaped
offset - specifies first character in array to be escaped
length - number of characters in array to be escaped
escaped - buffer used to store escaped string (characters are appended to this buffer)
maxCode - characters with code > maxCode are escaped as &#code;. Pass 127 for US-ASCII, 255 for ISO-8859-1, otherwise pass Integer.MAX_VALUE.
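
The effect of maxCode can be illustrated with a minimal sketch. The class EscapeMaxCodeSketch is hypothetical. It escapes only '<', '&' and characters above maxCode, and it works on code points so that a surrogate pair yields one character reference; whether the library does the same is not stated above.

```java
final class EscapeMaxCodeSketch {
    // Appends the escaped form of chars[offset .. offset+length-1] to the buffer.
    static void escapeXml(char[] chars, int offset, int length,
                          StringBuilder escaped, int maxCode) {
        int end = offset + length;
        int i = offset;
        while (i < end) {
            int cp = Character.codePointAt(chars, i, end);
            i += Character.charCount(cp);
            if (cp == '<' || cp == '&' || cp > maxCode) {
                // Numeric character reference, for example &#233; for U+00E9.
                escaped.append("&#").append(cp).append(';');
            } else {
                escaped.appendCodePoint(cp);
            }
        }
    }
}
```

With the text "caf\u00E9 <b>", passing 127 gives "caf&#233; &#60;b>", which is safe to write in a US-ASCII document, while passing 255 leaves the U+00E9 character as it is for an ISO-8859-1 document.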

unescapeXML

public static java.lang.String unescapeXML(java.lang.String text)
Unescapes specified string. Inverse operation of escapeXML(java.lang.String).

Parameters:
text - string to be unescaped
Returns:
unescaped string

unescapeXML

public static void unescapeXML(java.lang.String text,
                               int offset,
                               int length,
                               java.lang.StringBuilder unescaped)
Unescapes specified string. Inverse operation of escapeXML(java.lang.String).

Parameters:
text - string to be unescaped
offset - specifies first character in string to be unescaped
length - number of characters in string to be unescaped
unescaped - buffer used to store unescaped string (characters are appended to this buffer)
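
The inverse operation can be sketched as follows. The class UnescapeXmlSketch is hypothetical. It decodes decimal and hexadecimal character references plus the five predefined XML entities, and it copies anything it cannot decode unchanged; which references the library accepts and how it treats malformed ones is not stated above, so both choices are assumptions.

```java
final class UnescapeXmlSketch {
    static String unescapeXml(String text) {
        StringBuilder out = new StringBuilder(text.length());
        unescapeXml(text, 0, text.length(), out);
        return out.toString();
    }

    // Appends the unescaped form of text[offset .. offset+length-1] to the buffer.
    static void unescapeXml(String text, int offset, int length,
                            StringBuilder unescaped) {
        int end = offset + length;
        int i = offset;
        while (i < end) {
            char c = text.charAt(i);
            int semi = (c == '&') ? text.indexOf(';', i + 1) : -1;
            int cp = (semi > i && semi < end)
                ? decode(text.substring(i + 1, semi)) : -1;
            if (cp >= 0) {
                unescaped.appendCodePoint(cp);
                i = semi + 1; // resume after the ';'
            } else {
                unescaped.append(c); // not a reference: copy as is
                i++;
            }
        }
    }

    // Decodes a reference body such as "#60", "#x3C" or "lt"; returns -1 on failure.
    private static int decode(String ref) {
        switch (ref) {
            case "lt": return '<';
            case "gt": return '>';
            case "amp": return '&';
            case "quot": return '"';
            case "apos": return '\'';
            default: break;
        }
        if (ref.length() < 2 || ref.charAt(0) != '#') {
            return -1;
        }
        try {
            boolean hex = ref.charAt(1) == 'x';
            int cp = Integer.parseInt(ref.substring(hex ? 2 : 1), hex ? 16 : 10);
            return Character.isValidCodePoint(cp) ? cp : -1;
        } catch (NumberFormatException e) {
            return -1;
        }
    }
}
```

Decoded characters are never scanned again, so "&#38;amp;" yields the five characters "&amp;" rather than a single '&'.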