Class NumericCharacterReference
- All Implemented Interfaces:
CharSequence, Comparable<Segment>
A numeric character reference can be one of two types:
- Decimal Character Reference
- A numeric character reference specifying the Unicode code point in decimal notation.
This is signified by the absence of an 'x' character after the '#' (e.g. "&#62;").
- Hexadecimal Character Reference
- A numeric character reference specifying the Unicode code point in hexadecimal notation.
This is signified by the presence of an 'x' character after the '#' (e.g. "&#x3e;").
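The distinction between the two types can be illustrated with a minimal sketch in plain Java. It does not call this library; the class name NumericReferenceSketch and both method names are hypothetical, and the sketch assumes a well-formed, terminated reference with a lowercase 'x'.

```java
final class NumericReferenceSketch {
    /** Returns true if an 'x' follows the '#', i.e. the reference uses hexadecimal notation. */
    public static boolean isHexadecimal(String ref) {
        return ref.length() > 2 && ref.charAt(2) == 'x';
    }

    /** Parses a terminated reference such as "&#62;" or "&#x3e;" into its code point. */
    public static int codePointOf(String ref) {
        if (!ref.startsWith("&#") || !ref.endsWith(";")) {
            throw new IllegalArgumentException("not a terminated numeric character reference: " + ref);
        }
        boolean hex = isHexadecimal(ref);
        // Skip "&#" (and the 'x' for hexadecimal), drop the trailing ';'.
        String digits = ref.substring(hex ? 3 : 2, ref.length() - 1);
        return Integer.parseInt(digits, hex ? 16 : 10);
    }

    public static void main(String[] args) {
        System.out.println(codePointOf("&#62;"));   // prints 62
        System.out.println(codePointOf("&#x3e;"));  // prints 62
    }
}
```

Both references denote the same code point (62, the '>' character); only the radix of the digits differs.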
Static methods to encode and decode strings
and single characters can be found in the CharacterReference superclass.
NumericCharacterReference instances are obtained using one of the following methods:
Field Summary
Fields inherited from class CharacterReference
INVALID_CODE_POINT
Method Summary
Modifier and Type, Method, Description
- static String encode(CharSequence unencodedText)
Encodes the specified text, escaping special characters into numeric character references.
- static String encodeDecimal(CharSequence unencodedText)
Encodes the specified text, escaping special characters into decimal character references.
- static String encodeHexadecimal(CharSequence unencodedText)
Encodes the specified text, escaping special characters into hexadecimal character references.
- String getCharacterReferenceString()
Returns the correct encoded form of this numeric character reference.
- static String getCharacterReferenceString(int codePoint)
Returns the numeric character reference encoded form of the specified Unicode code point.
- String getDebugInfo()
Returns a string representation of this object useful for debugging purposes.
- boolean isDecimal()
Indicates whether this numeric character reference specifies the Unicode code point in decimal format.
- boolean isHexadecimal()
Indicates whether this numeric character reference specifies the Unicode code point in hexadecimal format.
Methods inherited from class CharacterReference
appendCharTo, decode, decode, decodeCollapseWhiteSpace, encode, encodeWithWhiteSpaceFormatting, getChar, getCodePoint, getCodePointFromCharacterReferenceString, getDecimalCharacterReferenceString, getDecimalCharacterReferenceString, getEncodingFilterWriter, getHexadecimalCharacterReferenceString, getHexadecimalCharacterReferenceString, getUnicodeText, getUnicodeText, isTerminated, parse, reencode, requiresEncoding
Methods inherited from class Segment
charAt, compareTo, encloses, encloses, equals, getAllCharacterReferences, getAllElements, getAllElements, getAllElements, getAllElements, getAllElements, getAllElementsByClass, getAllStartTags, getAllStartTags, getAllStartTags, getAllStartTags, getAllStartTags, getAllStartTagsByClass, getAllTags, getAllTags, getBegin, getChildElements, getEnd, getFirstElement, getFirstElement, getFirstElement, getFirstElement, getFirstElementByClass, getFirstStartTag, getFirstStartTag, getFirstStartTag, getFirstStartTag, getFirstStartTag, getFirstStartTagByClass, getFormControls, getFormFields, getMaxDepthIndicator, getNodeIterator, getRenderer, getRowColumnVector, getSource, getStyleURISegments, getTextExtractor, getURIAttributes, hashCode, ignoreWhenParsing, isWhiteSpace, isWhiteSpace, length, parseAttributes, subSequence, toString
Methods inherited from interface CharSequence
chars, codePoints, getChars, isEmpty
Method Details
isDecimal
public boolean isDecimal()
Indicates whether this numeric character reference specifies the Unicode code point in decimal format.
A numeric character reference in decimal format is referred to in this library as a decimal character reference.
- Returns:
true if this numeric character reference specifies the Unicode code point in decimal format, otherwise false.
isHexadecimal
public boolean isHexadecimal()
Indicates whether this numeric character reference specifies the Unicode code point in hexadecimal format.
A numeric character reference in hexadecimal format is referred to in this library as a hexadecimal character reference.
- Returns:
true if this numeric character reference specifies the Unicode code point in hexadecimal format, otherwise false.
encode
public static String encode(CharSequence unencodedText)
Encodes the specified text, escaping special characters into numeric character references.
Each character is encoded only if the requiresEncoding(char) method would return true for that character.
This method encodes all character references in decimal format, and is exactly the same as calling encodeDecimal(CharSequence).
To encode text using both character entity references and numeric character references, use the CharacterReference.encode(CharSequence) method instead.
To encode text using hexadecimal character references only, use the encodeHexadecimal(CharSequence) method instead.
- Parameters:
unencodedText - the text to encode.
- Returns:
the encoded string.
encodeDecimal
public static String encodeDecimal(CharSequence unencodedText)
Encodes the specified text, escaping special characters into decimal character references.
Each character is encoded only if the requiresEncoding(char) method would return true for that character.
To encode text using both character entity references and numeric character references, use the CharacterReference.encode(CharSequence) method instead.
To encode text using hexadecimal character references only, use the encodeHexadecimal(CharSequence) method instead.
- Parameters:
unencodedText - the text to encode.
- Returns:
the encoded string.
encodeHexadecimal
public static String encodeHexadecimal(CharSequence unencodedText)
Encodes the specified text, escaping special characters into hexadecimal character references.
Each character is encoded only if the requiresEncoding(char) method would return true for that character.
To encode text using both character entity references and numeric character references, use the CharacterReference.encode(CharSequence) method instead.
To encode text using decimal character references only, use the encodeDecimal(CharSequence) method instead.
- Parameters:
unencodedText - the text to encode.
- Returns:
the encoded string.
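The difference between decimal and hexadecimal encoding can be sketched in plain Java as follows. This is not the library's implementation: the class EncodeSketch and its methods are hypothetical, and needsEncoding is only an assumed stand-in for requiresEncoding(char), whose exact rule is not stated in this section. The sketch also works on code points rather than individual char values.

```java
final class EncodeSketch {
    /** Assumed rule: encode non-ASCII characters and a few markup-significant ones. */
    public static boolean needsEncoding(int codePoint) {
        return codePoint > 127 || codePoint == '<' || codePoint == '>'
                || codePoint == '&' || codePoint == '"';
    }

    /** Replaces each character that needs encoding with a numeric character reference. */
    public static String encode(CharSequence text, boolean hexadecimal) {
        StringBuilder out = new StringBuilder(text.length() * 2);
        text.codePoints().forEach(cp -> {
            if (needsEncoding(cp)) {
                out.append(hexadecimal ? "&#x" : "&#")
                   .append(hexadecimal ? Integer.toHexString(cp) : Integer.toString(cp))
                   .append(';');
            } else {
                out.appendCodePoint(cp);  // pass unchanged characters through
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("a<b", false)); // prints a&#60;b
        System.out.println(encode("a<b", true));  // prints a&#x3c;b
    }
}
```

Under these assumptions, the decimal call corresponds in spirit to encode and encodeDecimal, and the hexadecimal call to encodeHexadecimal.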
getCharacterReferenceString
public String getCharacterReferenceString()
Returns the correct encoded form of this numeric character reference.
The returned string uses the same radix as the original character reference in the source document, i.e. decimal format if isDecimal() is true, and hexadecimal format if isHexadecimal() is true.
Note that the returned string is not necessarily the same as the original source text used to create this object. This library recognises certain invalid forms of character references, as detailed in the decode(CharSequence) method.
To retrieve the original source text, use the toString() method instead.
- Example:
CharacterReference.parse("&#62;").getCharacterReferenceString() returns "&#62;"
- Specified by:
getCharacterReferenceString in class CharacterReference
- Returns:
the correct encoded form of this numeric character reference.
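The idea of a "correct encoded form" that keeps the original radix can be sketched in plain Java. The class CanonicalFormSketch and its method are hypothetical and do not reproduce the library's logic. The invalid forms the library actually recognises are documented under decode(CharSequence) and are not listed here, so this sketch assumes only one such form (a missing terminating semicolon) and assumes lowercase hexadecimal digits in the output.

```java
final class CanonicalFormSketch {
    /** Re-encodes source text such as "&#62" or "&#x3E;" in the same radix, always terminated. */
    public static String canonicalForm(String sourceText) {
        if (!sourceText.startsWith("&#")) {
            throw new IllegalArgumentException("not a numeric character reference: " + sourceText);
        }
        String body = sourceText.substring(2);
        if (body.endsWith(";")) {
            body = body.substring(0, body.length() - 1);
        }
        boolean hex = body.startsWith("x");
        int codePoint = Integer.parseInt(hex ? body.substring(1) : body, hex ? 16 : 10);
        // Keep the radix of the source text; only the spelling is normalised.
        return hex ? "&#x" + Integer.toHexString(codePoint) + ";"
                   : "&#" + codePoint + ";";
    }

    public static void main(String[] args) {
        System.out.println(canonicalForm("&#62"));   // prints &#62;
        System.out.println(canonicalForm("&#x3E"));  // prints &#x3e;
    }
}
```

In this sketch the input plays the role of toString() (the original source text) and the result plays the role of getCharacterReferenceString().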
getCharacterReferenceString
public static String getCharacterReferenceString(int codePoint)
Returns the numeric character reference encoded form of the specified Unicode code point.
This method returns the character reference in decimal format, and is exactly the same as calling CharacterReference.getDecimalCharacterReferenceString(int codePoint).
To get either the character entity reference or numeric character reference, use the CharacterReference.getCharacterReferenceString(int codePoint) method instead.
To get the character reference in hexadecimal format, use the CharacterReference.getHexadecimalCharacterReferenceString(int codePoint) method instead.
- Examples:
NumericCharacterReference.getCharacterReferenceString(62) returns "&#62;"
NumericCharacterReference.getCharacterReferenceString('>') returns "&#62;"
- Returns:
the numeric character reference encoded form of the specified Unicode code point.
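Formatting a single code point as a reference reduces to simple string building, as the following plain-Java sketch suggests. The class CodePointReferenceSketch and its method names are hypothetical and the library is not called; how the library treats invalid code points is not covered here.

```java
final class CodePointReferenceSketch {
    /** Formats a code point as a decimal character reference, e.g. 62 gives "&#62;". */
    public static String decimalReference(int codePoint) {
        return "&#" + codePoint + ";";
    }

    /** Formats a code point as a hexadecimal character reference, e.g. 62 gives "&#x3e;". */
    public static String hexadecimalReference(int codePoint) {
        return "&#x" + Integer.toHexString(codePoint) + ";";
    }

    public static void main(String[] args) {
        System.out.println(decimalReference(62));       // prints &#62;
        System.out.println(decimalReference('>'));      // prints &#62; (char widens to int)
        System.out.println(hexadecimalReference(62));   // prints &#x3e;
        System.out.println(decimalReference(0x1F600));  // prints &#128512;
    }
}
```

Passing a char literal works because Java widens char to int, which is why the two examples above give the same result.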
getDebugInfo
public String getDebugInfo()
Description copied from class: Segment
Returns a string representation of this object useful for debugging purposes.
- Overrides:
getDebugInfo in class Segment
- Returns:
a string representation of this object useful for debugging purposes.