Package com.ibm.icu.impl
Class Normalizer2Impl
- java.lang.Object
-
- com.ibm.icu.impl.Normalizer2Impl
-
public final class Normalizer2Impl extends Object
-
-
Nested Class Summary
Nested Classes
Modifier and Type   Class   Description
static class   Normalizer2Impl.Hangul
static class   Normalizer2Impl.ReorderingBuffer
    Writable buffer that takes care of canonical ordering.
static class   Normalizer2Impl.UTF16Plus
-
Field Summary
Fields
Modifier and Type   Field
static int   COMP_1_LAST_TUPLE
static int   COMP_1_TRAIL_LIMIT
static int   COMP_1_TRAIL_MASK
static int   COMP_1_TRAIL_SHIFT
static int   COMP_1_TRIPLE
static int   COMP_2_TRAIL_MASK
static int   COMP_2_TRAIL_SHIFT
static int   IX_COUNT
static int   IX_EXTRA_DATA_OFFSET
static int   IX_LIMIT_NO_NO
static int   IX_MIN_COMP_NO_MAYBE_CP
static int   IX_MIN_DECOMP_NO_CP
static int   IX_MIN_MAYBE_YES
static int   IX_MIN_NO_NO
static int   IX_MIN_YES_NO
static int   IX_MIN_YES_NO_MAPPINGS_ONLY
static int   IX_NORM_TRIE_OFFSET
static int   IX_RESERVED3_OFFSET
static int   IX_SMALL_FCD_OFFSET
static int   IX_TOTAL_SIZE
static int   JAMO_L
static int   JAMO_VT
static int   MAPPING_HAS_CCC_LCCC_WORD
static int   MAPPING_HAS_RAW_MAPPING
static int   MAPPING_LENGTH_MASK
static int   MAPPING_NO_COMP_BOUNDARY_AFTER
static int   MAX_DELTA
static int   MIN_CCC_LCCC_CP
static int   MIN_NORMAL_MAYBE_YES
static int   MIN_YES_YES_WITH_CC
-
Constructor Summary
Constructors
Constructor   Description
Normalizer2Impl()
-
Method Summary
Methods
Modifier and Type   Method   Description
void   addCanonIterPropertyStarts(UnicodeSet set)
void   addLcccChars(UnicodeSet set)
void   addPropertyStarts(UnicodeSet set)
boolean   compose(CharSequence s, int src, int limit, boolean onlyContiguous, boolean doCompose, Normalizer2Impl.ReorderingBuffer buffer)
void   composeAndAppend(CharSequence s, boolean doCompose, boolean onlyContiguous, Normalizer2Impl.ReorderingBuffer buffer)
int   composePair(int a, int b)
int   composeQuickCheck(CharSequence s, int src, int limit, boolean onlyContiguous, boolean doSpan)
    Very similar to compose(): Make the same changes in both places if relevant.
int   decompose(CharSequence s, int src, int limit, Normalizer2Impl.ReorderingBuffer buffer)
void   decompose(CharSequence s, int src, int limit, StringBuilder dest, int destLengthEstimate)
    Decomposes s[src, limit[ and writes the result to dest.
Appendable   decompose(CharSequence s, StringBuilder dest)
void   decomposeAndAppend(CharSequence s, boolean doDecompose, Normalizer2Impl.ReorderingBuffer buffer)
void   decomposeShort(CharSequence s, int src, int limit, Normalizer2Impl.ReorderingBuffer buffer)
Normalizer2Impl   ensureCanonIterData()
    Builds the canonical-iterator data for this instance.
boolean   getCanonStartSet(int c, UnicodeSet set)
    Returns true if there are characters whose decomposition starts with c.
int   getCC(int norm16)
static int   getCCFromYesOrMaybe(int norm16)
int   getCompQuickCheck(int norm16)
String   getDecomposition(int c)
    Gets the decomposition for one code point.
int   getFCD16(int c)
    Returns the FCD data for code point c.
int   getFCD16FromBelow180(int c)
    Returns the FCD data for U+0000<=c<U+0180.
int   getFCD16FromNormData(int c)
    Gets the FCD value from the regular normalization data.
int   getNorm16(int c)
Trie2_16   getNormTrie()
String   getRawDecomposition(int c)
    Gets the raw decomposition for one code point.
boolean   hasCompBoundaryAfter(int c, boolean onlyContiguous, boolean testInert)
boolean   hasCompBoundaryBefore(int c)
boolean   hasDecompBoundary(int c, boolean before)
boolean   hasFCDBoundaryAfter(int c)
boolean   hasFCDBoundaryBefore(int c)
boolean   isAlgorithmicNoNo(int norm16)
boolean   isCanonSegmentStarter(int c)
    Returns true if code point c starts a canonical-iterator string segment.
boolean   isCompNo(int norm16)
boolean   isDecompInert(int c)
boolean   isDecompYes(int norm16)
boolean   isFCDInert(int c)
Normalizer2Impl   load(String name)
Normalizer2Impl   load(ByteBuffer bytes)
int   makeFCD(CharSequence s, int src, int limit, Normalizer2Impl.ReorderingBuffer buffer)
void   makeFCDAndAppend(CharSequence s, boolean doMakeFCD, Normalizer2Impl.ReorderingBuffer buffer)
boolean   singleLeadMightHaveNonZeroFCD16(int lead)
    Returns true if the single-or-lead code unit lead might have non-zero FCD data.
-
-
-
Field Detail
-
MIN_CCC_LCCC_CP
public static final int MIN_CCC_LCCC_CP
- See Also:
- Constant Field Values
-
MIN_YES_YES_WITH_CC
public static final int MIN_YES_YES_WITH_CC
- See Also:
- Constant Field Values
-
JAMO_VT
public static final int JAMO_VT
- See Also:
- Constant Field Values
-
MIN_NORMAL_MAYBE_YES
public static final int MIN_NORMAL_MAYBE_YES
- See Also:
- Constant Field Values
-
JAMO_L
public static final int JAMO_L
- See Also:
- Constant Field Values
-
MAX_DELTA
public static final int MAX_DELTA
- See Also:
- Constant Field Values
-
IX_NORM_TRIE_OFFSET
public static final int IX_NORM_TRIE_OFFSET
- See Also:
- Constant Field Values
-
IX_EXTRA_DATA_OFFSET
public static final int IX_EXTRA_DATA_OFFSET
- See Also:
- Constant Field Values
-
IX_SMALL_FCD_OFFSET
public static final int IX_SMALL_FCD_OFFSET
- See Also:
- Constant Field Values
-
IX_RESERVED3_OFFSET
public static final int IX_RESERVED3_OFFSET
- See Also:
- Constant Field Values
-
IX_TOTAL_SIZE
public static final int IX_TOTAL_SIZE
- See Also:
- Constant Field Values
-
IX_MIN_DECOMP_NO_CP
public static final int IX_MIN_DECOMP_NO_CP
- See Also:
- Constant Field Values
-
IX_MIN_COMP_NO_MAYBE_CP
public static final int IX_MIN_COMP_NO_MAYBE_CP
- See Also:
- Constant Field Values
-
IX_MIN_YES_NO
public static final int IX_MIN_YES_NO
- See Also:
- Constant Field Values
-
IX_MIN_NO_NO
public static final int IX_MIN_NO_NO
- See Also:
- Constant Field Values
-
IX_LIMIT_NO_NO
public static final int IX_LIMIT_NO_NO
- See Also:
- Constant Field Values
-
IX_MIN_MAYBE_YES
public static final int IX_MIN_MAYBE_YES
- See Also:
- Constant Field Values
-
IX_MIN_YES_NO_MAPPINGS_ONLY
public static final int IX_MIN_YES_NO_MAPPINGS_ONLY
- See Also:
- Constant Field Values
-
IX_COUNT
public static final int IX_COUNT
- See Also:
- Constant Field Values
-
MAPPING_HAS_CCC_LCCC_WORD
public static final int MAPPING_HAS_CCC_LCCC_WORD
- See Also:
- Constant Field Values
-
MAPPING_HAS_RAW_MAPPING
public static final int MAPPING_HAS_RAW_MAPPING
- See Also:
- Constant Field Values
-
MAPPING_NO_COMP_BOUNDARY_AFTER
public static final int MAPPING_NO_COMP_BOUNDARY_AFTER
- See Also:
- Constant Field Values
-
MAPPING_LENGTH_MASK
public static final int MAPPING_LENGTH_MASK
- See Also:
- Constant Field Values
-
COMP_1_LAST_TUPLE
public static final int COMP_1_LAST_TUPLE
- See Also:
- Constant Field Values
-
COMP_1_TRIPLE
public static final int COMP_1_TRIPLE
- See Also:
- Constant Field Values
-
COMP_1_TRAIL_LIMIT
public static final int COMP_1_TRAIL_LIMIT
- See Also:
- Constant Field Values
-
COMP_1_TRAIL_MASK
public static final int COMP_1_TRAIL_MASK
- See Also:
- Constant Field Values
-
COMP_1_TRAIL_SHIFT
public static final int COMP_1_TRAIL_SHIFT
- See Also:
- Constant Field Values
-
COMP_2_TRAIL_SHIFT
public static final int COMP_2_TRAIL_SHIFT
- See Also:
- Constant Field Values
-
COMP_2_TRAIL_MASK
public static final int COMP_2_TRAIL_MASK
- See Also:
- Constant Field Values
-
-
Method Detail
-
load
public Normalizer2Impl load(ByteBuffer bytes)
-
load
public Normalizer2Impl load(String name)
-
addLcccChars
public void addLcccChars(UnicodeSet set)
-
addPropertyStarts
public void addPropertyStarts(UnicodeSet set)
-
addCanonIterPropertyStarts
public void addCanonIterPropertyStarts(UnicodeSet set)
-
getNormTrie
public Trie2_16 getNormTrie()
-
ensureCanonIterData
public Normalizer2Impl ensureCanonIterData()
Builds the canonical-iterator data for this instance. This must be called before isCanonSegmentStarter(int) or getCanonStartSet(int, UnicodeSet); otherwise those methods crash.
- Returns:
- this
-
getNorm16
public int getNorm16(int c)
-
getCompQuickCheck
public int getCompQuickCheck(int norm16)
-
isAlgorithmicNoNo
public boolean isAlgorithmicNoNo(int norm16)
-
isCompNo
public boolean isCompNo(int norm16)
-
isDecompYes
public boolean isDecompYes(int norm16)
-
getCC
public int getCC(int norm16)
-
getCCFromYesOrMaybe
public static int getCCFromYesOrMaybe(int norm16)
-
getFCD16
public int getFCD16(int c)
Returns the FCD data for code point c.
- Parameters:
- c - A Unicode code point.
- Returns:
- The lccc(c) in bits 15..8 and tccc(c) in bits 7..0.
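The bit layout described above can be unpacked with plain shifts and masks. The following is a minimal sketch, not code from Normalizer2Impl: the class name Fcd16Bits and its helper methods are hypothetical and exist only to illustrate how a caller might split the returned value into its lead and trail combining classes.

```java
// Sketch only: Fcd16Bits and its methods are hypothetical helpers,
// not part of Normalizer2Impl.
final class Fcd16Bits {
    private Fcd16Bits() {}

    /** Lead combining class: bits 15..8 of an FCD16 value. */
    static int lccc(int fcd16) {
        return (fcd16 >> 8) & 0xFF;
    }

    /** Trail combining class: bits 7..0 of an FCD16 value. */
    static int tccc(int fcd16) {
        return fcd16 & 0xFF;
    }

    /** Packs two combining classes (0..255 each) into one FCD16-style value. */
    static int pack(int lccc, int tccc) {
        return ((lccc & 0xFF) << 8) | (tccc & 0xFF);
    }

    public static void main(String[] args) {
        int fcd16 = pack(230, 220);
        System.out.println("lccc=" + lccc(fcd16) + " tccc=" + tccc(fcd16));
    }
}
```

A value of 0 means both combining classes are zero.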
-
getFCD16FromBelow180
public int getFCD16FromBelow180(int c)
Returns the FCD data for U+0000<=c<U+0180.
-
singleLeadMightHaveNonZeroFCD16
public boolean singleLeadMightHaveNonZeroFCD16(int lead)
Returns true if the single-or-lead code unit lead might have non-zero FCD data.
-
getFCD16FromNormData
public int getFCD16FromNormData(int c)
Gets the FCD value from the regular normalization data.
-
getDecomposition
public String getDecomposition(int c)
Gets the decomposition for one code point.
- Parameters:
- c - code point
- Returns:
- c's decomposition, if it has one; returns null if it does not have a decomposition
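The "null if there is no decomposition" contract can be illustrated without ICU4J on the classpath. The sketch below is an assumption-laden stand-in: it uses the JDK's java.text.Normalizer with canonical decomposition (NFD), whereas the result of getDecomposition depends on the normalization data the instance was loaded with. The class DecompositionSketch and the method decompositionOrNull are hypothetical names.

```java
import java.text.Normalizer;

// Sketch only: mimics the return contract with the JDK normalizer.
// DecompositionSketch and decompositionOrNull are hypothetical names.
final class DecompositionSketch {
    private DecompositionSketch() {}

    /** Returns the NFD form of code point c, or null if c is unchanged by NFD. */
    static String decompositionOrNull(int c) {
        String single = new String(Character.toChars(c));
        String nfd = Normalizer.normalize(single, Normalizer.Form.NFD);
        return nfd.equals(single) ? null : nfd;
    }

    public static void main(String[] args) {
        // U+00E9 (e with acute) decomposes; U+0061 (a) does not.
        System.out.println(decompositionOrNull(0x00E9) != null);
        System.out.println(decompositionOrNull(0x0061) == null);
    }
}
```

Callers should test the result for null before using it.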
-
getRawDecomposition
public String getRawDecomposition(int c)
Gets the raw decomposition for one code point.
- Parameters:
- c - code point
- Returns:
- c's raw decomposition, if it has one; returns null if it does not have a decomposition
-
isCanonSegmentStarter
public boolean isCanonSegmentStarter(int c)
Returns true if code point c starts a canonical-iterator string segment. ensureCanonIterData() must have been called before this method, or else this method will crash.
- Parameters:
- c - A Unicode code point.
- Returns:
- true if c starts a canonical-iterator string segment.
-
getCanonStartSet
public boolean getCanonStartSet(int c, UnicodeSet set)
Returns true if there are characters whose decomposition starts with c. If so, then the set is cleared and then filled with those characters. ensureCanonIterData() must have been called before this method, or else this method will crash.
- Parameters:
- c - A Unicode code point.
- set - A UnicodeSet to receive the characters whose decompositions start with c, if there are any.
- Returns:
- true if there are characters whose decomposition starts with c.
-
decompose
public Appendable decompose(CharSequence s, StringBuilder dest)
-
decompose
public void decompose(CharSequence s, int src, int limit, StringBuilder dest, int destLengthEstimate)
Decomposes s[src, limit[ and writes the result to dest. limit can be NULL if src is NUL-terminated. destLengthEstimate is the initial dest buffer capacity and can be -1.
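The half-open range s[src, limit[ and the destLengthEstimate parameter can be sketched with the JDK alone. This is not the ICU4J implementation: it substitutes java.text.Normalizer (NFD) for the ICU data-driven decomposition, and it assumes that dest is cleared before writing and that a negative estimate falls back to limit - src. The names RangeDecomposeSketch and decomposeRange are hypothetical.

```java
import java.text.Normalizer;

// Sketch only: JDK-based stand-in for decomposing a half-open range.
// RangeDecomposeSketch and decomposeRange are hypothetical names.
final class RangeDecomposeSketch {
    private RangeDecomposeSketch() {}

    static void decomposeRange(CharSequence s, int src, int limit,
                               StringBuilder dest, int destLengthEstimate) {
        if (destLengthEstimate < 0) {
            // Assumption: with no estimate, use the input range length.
            destLengthEstimate = limit - src;
        }
        // Assumption: the result replaces any previous contents of dest.
        dest.setLength(0);
        dest.ensureCapacity(destLengthEstimate);
        // src is inclusive, limit is exclusive.
        dest.append(Normalizer.normalize(s.subSequence(src, limit), Normalizer.Form.NFD));
    }

    public static void main(String[] args) {
        StringBuilder dest = new StringBuilder();
        decomposeRange("x\u00E9y", 1, 2, dest, -1);
        System.out.println(dest.length()); // e + combining acute accent
    }
}
```

Normalizing a sub-range in isolation ignores any combining marks adjacent to the range boundaries, so the range should start and end at normalization boundaries.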
-
decompose
public int decompose(CharSequence s, int src, int limit, Normalizer2Impl.ReorderingBuffer buffer)
-
decomposeAndAppend
public void decomposeAndAppend(CharSequence s, boolean doDecompose, Normalizer2Impl.ReorderingBuffer buffer)
-
compose
public boolean compose(CharSequence s, int src, int limit, boolean onlyContiguous, boolean doCompose, Normalizer2Impl.ReorderingBuffer buffer)
-
composeQuickCheck
public int composeQuickCheck(CharSequence s, int src, int limit, boolean onlyContiguous, boolean doSpan)
Very similar to compose(): Make the same changes in both places if relevant.
doSpan: spanQuickCheckYes (ignore bit 0 of the return value)
!doSpan: quickCheck
- Returns:
- bits 31..1: spanQuickCheckYes (==s.length() if "yes") and bit 0: set if "maybe"; otherwise, if the span length<s.length() then the quick check result is "no"
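The packed return value described above can be decoded as follows. This is a minimal sketch based only on the bit layout stated in the Returns text; the class QuickCheckDecoder, its Result enum, and its methods are hypothetical and not part of Normalizer2Impl.

```java
// Sketch only: decodes the packed composeQuickCheck() result.
// QuickCheckDecoder and Result are hypothetical names.
final class QuickCheckDecoder {
    enum Result { YES, NO, MAYBE }

    private QuickCheckDecoder() {}

    /** Bits 31..1: length of the leading span that is quick-check "yes". */
    static int spanLength(int packed) {
        return packed >>> 1;
    }

    /** Interprets the packed value for an input of the given length. */
    static Result quickCheck(int packed, int length) {
        if ((packed & 1) != 0) {
            return Result.MAYBE;   // bit 0 set
        }
        // Bit 0 clear: "yes" only if the span covers the whole input.
        return spanLength(packed) == length ? Result.YES : Result.NO;
    }

    public static void main(String[] args) {
        System.out.println(quickCheck(5 << 1, 5));
        System.out.println(quickCheck(3 << 1, 5));
        System.out.println(quickCheck((3 << 1) | 1, 5));
    }
}
```

When doSpan is true, only spanLength applies and bit 0 should be ignored.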
-
composeAndAppend
public void composeAndAppend(CharSequence s, boolean doCompose, boolean onlyContiguous, Normalizer2Impl.ReorderingBuffer buffer)
-
makeFCD
public int makeFCD(CharSequence s, int src, int limit, Normalizer2Impl.ReorderingBuffer buffer)
-
makeFCDAndAppend
public void makeFCDAndAppend(CharSequence s, boolean doMakeFCD, Normalizer2Impl.ReorderingBuffer buffer)
-
hasDecompBoundary
public boolean hasDecompBoundary(int c, boolean before)
-
isDecompInert
public boolean isDecompInert(int c)
-
hasCompBoundaryBefore
public boolean hasCompBoundaryBefore(int c)
-
hasCompBoundaryAfter
public boolean hasCompBoundaryAfter(int c, boolean onlyContiguous, boolean testInert)
-
hasFCDBoundaryBefore
public boolean hasFCDBoundaryBefore(int c)
-
hasFCDBoundaryAfter
public boolean hasFCDBoundaryAfter(int c)
-
isFCDInert
public boolean isFCDInert(int c)
-
decomposeShort
public void decomposeShort(CharSequence s, int src, int limit, Normalizer2Impl.ReorderingBuffer buffer)
-
composePair
public int composePair(int a, int b)
-
-
-