Class UCharacterProperty
- java.lang.Object
-
- com.ibm.icu.impl.UCharacterProperty
-
public final class UCharacterProperty extends Object
Internal class used for Unicode character property database.
This class stores binary data read from uprops.icu. It cannot parse the data into higher-level information; it only returns bytes of information when required.
Due to the form most commonly used for retrieval, an array of char is used to store the binary data.
UCharacterPropertyDB also contains information on accessing indexes to significant points in the binary data.
Responsibility for molding the binary data into a more meaningful form lies with UCharacter.
- Since:
- release 2.1, February 1st, 2002
-
-
Field Summary
Fields (Modifier and Type, Field: Description)
- static UCharacterProperty INSTANCE
- static char LATIN_CAPITAL_LETTER_I_WITH_DOT_ABOVE_: Latin capital letter i with dot above
- static char LATIN_SMALL_LETTER_DOTLESS_I_: Latin small letter dotless i
- static char LATIN_SMALL_LETTER_I_: Latin lowercase i
- char[] m_scriptExtensions_: Script_Extensions data
- Trie2_16 m_trie_: Trie data
- VersionInfo m_unicodeVersion_: Unicode version
- static int SCRIPT_MASK_: Integer properties mask and shift values for scripts.
- static int SCRIPT_X_MASK: Script_Extensions: mask includes Script
- static int SCRIPT_X_WITH_COMMON
- static int SCRIPT_X_WITH_INHERITED
- static int SCRIPT_X_WITH_OTHER
- static int SRC_BIDI: From ubidi_props.c/ubidi.icu
- static int SRC_CASE: From ucase.c/ucase.icu
- static int SRC_CASE_AND_NORM: From ucase.c/ucase.icu as well as unorm.cpp/unorm.icu
- static int SRC_CHAR: From uchar.c/uprops.icu main trie
- static int SRC_CHAR_AND_PROPSVEC: From uchar.c/uprops.icu main trie as well as properties vectors trie
- static int SRC_COUNT: One more than the highest UPropertySource (SRC_) constant.
- static int SRC_NAMES: From unames.c/unames.icu
- static int SRC_NFC: From normalizer2impl.cpp/nfc.nrm
- static int SRC_NFC_CANON_ITER: From normalizer2impl.cpp/nfc.nrm canonical iterator data
- static int SRC_NFKC: From normalizer2impl.cpp/nfkc.nrm
- static int SRC_NFKC_CF: From normalizer2impl.cpp/nfkc_cf.nrm
- static int SRC_NONE: No source, not a supported property.
- static int SRC_PROPSVEC: From uchar.c/uprops.icu properties vectors trie
- static int TYPE_MASK: Character type mask
-
Method Summary
Methods (Modifier and Type, Method: Description)
- UnicodeSet addPropertyStarts(UnicodeSet set)
- int digit(int c)
- int getAdditional(int codepoint, int column): Gets the Unicode additional properties.
- VersionInfo getAge(int codepoint): Get the "age" of the code point.
- static int getEuropeanDigit(int ch): Returns the digit values of characters like 'A' - 'Z', normal, half-width and full-width.
- int getIntPropertyMaxValue(int which)
- int getIntPropertyValue(int c, int which)
- static int getMask(int type): Gets the type mask
- int getMaxValues(int column): Get the maximum values for some enum/int properties.
- int getNumericValue(int c)
- int getProperty(int ch): Gets the main property value for code point ch.
- int getSource(int which)
- int getType(int c)
- double getUnicodeNumericValue(int c)
- boolean hasBinaryProperty(int c, int which)
- void upropsvec_addPropertyStarts(UnicodeSet set)
-
-
-
Field Detail
-
INSTANCE
public static final UCharacterProperty INSTANCE
-
m_trie_
public Trie2_16 m_trie_
Trie data
-
m_unicodeVersion_
public VersionInfo m_unicodeVersion_
Unicode version
-
LATIN_CAPITAL_LETTER_I_WITH_DOT_ABOVE_
public static final char LATIN_CAPITAL_LETTER_I_WITH_DOT_ABOVE_
Latin capital letter i with dot above
- See Also:
- Constant Field Values
-
LATIN_SMALL_LETTER_DOTLESS_I_
public static final char LATIN_SMALL_LETTER_DOTLESS_I_
Latin small letter dotless i
- See Also:
- Constant Field Values
-
LATIN_SMALL_LETTER_I_
public static final char LATIN_SMALL_LETTER_I_
Latin lowercase i
- See Also:
- Constant Field Values
-
TYPE_MASK
public static final int TYPE_MASK
Character type mask
- See Also:
- Constant Field Values
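A mask like this is typically applied to a packed property word to isolate the character type. The sketch below is only an illustration: it assumes the mask selects the low five bits (0x1F; check Constant Field Values for the real value), and the `pack` helper and the layout of the remaining bits are hypothetical, not taken from uprops.icu.

```java
final class PropertyWordSketch {
    // Assumption: the type occupies the low five bits of the property word.
    static final int TYPE_MASK = 0x1F;

    /** Hypothetical helper: combines a type with unrelated upper bits. */
    static int pack(int type, int upperBits) {
        return (upperBits << 5) | (type & TYPE_MASK);
    }

    /** Isolates the type field by masking off everything above it. */
    static int typeOf(int props) {
        return props & TYPE_MASK;
    }

    public static void main(String[] args) {
        int props = pack(Character.DECIMAL_DIGIT_NUMBER, 123);
        // The upper bits do not affect the extracted type.
        System.out.println(typeOf(props) == Character.DECIMAL_DIGIT_NUMBER);
    }
}
```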
-
SRC_NONE
public static final int SRC_NONE
No source, not a supported property.
- See Also:
- Constant Field Values
-
SRC_CHAR
public static final int SRC_CHAR
From uchar.c/uprops.icu main trie
- See Also:
- Constant Field Values
-
SRC_PROPSVEC
public static final int SRC_PROPSVEC
From uchar.c/uprops.icu properties vectors trie
- See Also:
- Constant Field Values
-
SRC_NAMES
public static final int SRC_NAMES
From unames.c/unames.icu
- See Also:
- Constant Field Values
-
SRC_CASE
public static final int SRC_CASE
From ucase.c/ucase.icu
- See Also:
- Constant Field Values
-
SRC_BIDI
public static final int SRC_BIDI
From ubidi_props.c/ubidi.icu
- See Also:
- Constant Field Values
-
SRC_CHAR_AND_PROPSVEC
public static final int SRC_CHAR_AND_PROPSVEC
From uchar.c/uprops.icu main trie as well as properties vectors trie
- See Also:
- Constant Field Values
-
SRC_CASE_AND_NORM
public static final int SRC_CASE_AND_NORM
From ucase.c/ucase.icu as well as unorm.cpp/unorm.icu
- See Also:
- Constant Field Values
-
SRC_NFC
public static final int SRC_NFC
From normalizer2impl.cpp/nfc.nrm
- See Also:
- Constant Field Values
-
SRC_NFKC
public static final int SRC_NFKC
From normalizer2impl.cpp/nfkc.nrm
- See Also:
- Constant Field Values
-
SRC_NFKC_CF
public static final int SRC_NFKC_CF
From normalizer2impl.cpp/nfkc_cf.nrm
- See Also:
- Constant Field Values
-
SRC_NFC_CANON_ITER
public static final int SRC_NFC_CANON_ITER
From normalizer2impl.cpp/nfc.nrm canonical iterator data
- See Also:
- Constant Field Values
-
SRC_COUNT
public static final int SRC_COUNT
One more than the highest UPropertySource (SRC_) constant.
- See Also:
- Constant Field Values
-
m_scriptExtensions_
public char[] m_scriptExtensions_
Script_Extensions data
-
SCRIPT_X_MASK
public static final int SCRIPT_X_MASK
Script_Extensions: mask includes Script
- See Also:
- Constant Field Values
-
SCRIPT_MASK_
public static final int SCRIPT_MASK_
Integer properties mask and shift values for scripts. Equivalent to icu4c UPROPS_SHIFT_MASK
- See Also:
- Constant Field Values
-
SCRIPT_X_WITH_COMMON
public static final int SCRIPT_X_WITH_COMMON
- See Also:
- Constant Field Values
-
SCRIPT_X_WITH_INHERITED
public static final int SCRIPT_X_WITH_INHERITED
- See Also:
- Constant Field Values
-
SCRIPT_X_WITH_OTHER
public static final int SCRIPT_X_WITH_OTHER
- See Also:
- Constant Field Values
-
-
Method Detail
-
getProperty
public final int getProperty(int ch)
Gets the main property value for code point ch.
- Parameters:
ch - code point whose property value is to be retrieved
- Returns:
- property value of code point
-
getAdditional
public int getAdditional(int codepoint, int column)
Gets the Unicode additional properties. Java version of C u_getUnicodeProperties().
- Parameters:
codepoint - code point whose additional properties are to be retrieved
column - The column index.
- Returns:
- Unicode properties
-
getAge
public VersionInfo getAge(int codepoint)
Get the "age" of the code point.
The "age" is the Unicode version when the code point was first designated (as a non-character or for Private Use) or assigned a character.
This can be useful to avoid emitting code points to receiving processes that do not accept newer characters.
The data is from the UCD file DerivedAge.txt.
This API does not check the validity of the codepoint.
- Parameters:
codepoint - The code point.
- Returns:
- the Unicode version number
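The filtering use case described above can be sketched without the ICU data files. In this illustration the age lookup is passed in as a function, versions are plain `{major, minor}` arrays instead of VersionInfo, and `toyAge` is a hypothetical three-entry stand-in for DerivedAge.txt; none of these names belong to this class.

```java
import java.util.function.IntFunction;

final class AgeFilterSketch {
    /** Returns true if version a is not newer than version b ({major, minor}). */
    static boolean notNewer(int[] a, int[] b) {
        return a[0] < b[0] || (a[0] == b[0] && a[1] <= b[1]);
    }

    /** Keeps only the code points whose age does not exceed maxVersion. */
    static String keepSupported(String text, IntFunction<int[]> ageOf, int[] maxVersion) {
        StringBuilder out = new StringBuilder();
        text.codePoints()
            .filter(cp -> notNewer(ageOf.apply(cp), maxVersion))
            .forEach(out::appendCodePoint);
        return out.toString();
    }

    /** Hypothetical stand-in for a real age lookup; covers three cases only. */
    static int[] toyAge(int cp) {
        if (cp == 0x1F600) return new int[] {6, 1}; // GRINNING FACE
        if (cp == 0x20AC) return new int[] {2, 1};  // EURO SIGN
        return new int[] {1, 1};                    // simplification for the demo
    }

    public static void main(String[] args) {
        // A receiver limited to Unicode 3.0 would not get the emoji.
        String sent = keepSupported("A\u20AC\uD83D\uDE00", AgeFilterSketch::toyAge, new int[] {3, 0});
        System.out.println(sent.codePointCount(0, sent.length()));
    }
}
```

In real code the lookup function would presumably wrap getAge(int codepoint) and compare VersionInfo values instead of arrays.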
-
hasBinaryProperty
public boolean hasBinaryProperty(int c, int which)
-
getType
public int getType(int c)
-
getIntPropertyValue
public int getIntPropertyValue(int c, int which)
-
getIntPropertyMaxValue
public int getIntPropertyMaxValue(int which)
-
getSource
public final int getSource(int which)
-
getMaxValues
public int getMaxValues(int column)
Get the maximum values for some enum/int properties.
- Returns:
- maximum values for the integer properties.
-
getMask
public static final int getMask(int type)
Gets the type mask
- Parameters:
type - character type
- Returns:
- mask
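A per-type mask is useful for testing a character against several types in one operation. The sketch below assumes the mask is a single bit at position `type` (that is, `1 << type`), which this page does not state, and uses the java.lang.Character category constants as a stand-in for the ICU type values.

```java
final class TypeMaskSketch {
    /** Assumption: one bit per character type. */
    static int mask(int type) {
        return 1 << type;
    }

    // A set of types is the bitwise OR of their single-bit masks.
    static final int CASED_LETTERS =
        mask(Character.UPPERCASE_LETTER)
        | mask(Character.LOWERCASE_LETTER)
        | mask(Character.TITLECASE_LETTER);

    /** Tests membership in the set with a single AND. */
    static boolean isCasedLetter(int codePoint) {
        return (mask(Character.getType(codePoint)) & CASED_LETTERS) != 0;
    }

    public static void main(String[] args) {
        System.out.println(isCasedLetter('A'));
        System.out.println(isCasedLetter('1'));
    }
}
```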
-
getEuropeanDigit
public static int getEuropeanDigit(int ch)
Returns the digit values of characters like 'A' - 'Z', normal, half-width and full-width. This method assumes that the other digit characters are checked by the calling method.
- Parameters:
ch - character to test
- Returns:
- -1 if ch is not a character of the form 'A' - 'Z', otherwise its corresponding digit will be returned.
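The contract above can be illustrated with a standalone re-implementation. This is a sketch, not the ICU source: it assumes that letters map to the radix digits 10 through 35, that lowercase letters are accepted alongside uppercase, and that "full-width" refers to the U+FF21..U+FF3A and U+FF41..U+FF5A ranges.

```java
final class EuropeanDigitSketch {
    /** Returns 10..35 for an ASCII or full-width Latin letter, otherwise -1. */
    static int europeanDigit(int ch) {
        if (ch >= 'A' && ch <= 'Z') return ch - 'A' + 10;
        if (ch >= 'a' && ch <= 'z') return ch - 'a' + 10;
        if (ch >= 0xFF21 && ch <= 0xFF3A) return ch - 0xFF21 + 10; // full-width A..Z
        if (ch >= 0xFF41 && ch <= 0xFF5A) return ch - 0xFF41 + 10; // full-width a..z
        return -1; // includes '0'..'9', which the caller is expected to handle
    }

    public static void main(String[] args) {
        System.out.println(europeanDigit('F')); // prints 15
        System.out.println(europeanDigit('7')); // prints -1
    }
}
```

Returning -1 for '0' through '9' mirrors the note that other digit characters are checked by the calling method.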
-
digit
public int digit(int c)
-
getNumericValue
public int getNumericValue(int c)
-
getUnicodeNumericValue
public double getUnicodeNumericValue(int c)
-
addPropertyStarts
public UnicodeSet addPropertyStarts(UnicodeSet set)
-
upropsvec_addPropertyStarts
public void upropsvec_addPropertyStarts(UnicodeSet set)
-
-