Package com.ibm.icu.impl
Class TrieBuilder
- java.lang.Object
  - com.ibm.icu.impl.TrieBuilder
- Direct Known Subclasses:
IntTrieBuilder
public class TrieBuilder extends Object
Builder class to manipulate and generate a trie. This is useful for ICU data in primitive types. It provides a compact way to store information that is indexed by Unicode values, such as character properties, types, keyboard values, etc. This is very useful when you have a block of Unicode data that contains significant values while the rest of the Unicode data is unused in the application, or when you have a lot of redundancy, such as where all 21,000 Han ideographs have the same value. Despite this compactness, lookup is much faster than in a hash table. A trie of any primitive data type serves two purposes:
- Fast access of the indexed values.
- Smaller memory footprint.
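To make the two purposes concrete, here is a minimal sketch of a two-stage lookup: a stage 1 index array selects a block, and a stage 2 data array holds the values. This is not ICU code. The class name `TwoStageLookupSketch`, the shift value of 5, and the array layout are assumptions chosen for illustration, and the sketch covers BMP code points only.

```java
import java.util.Arrays;

class TwoStageLookupSketch {
    // Assumed for illustration: 32 values per stage 2 block.
    static final int SHIFT = 5;
    static final int BLOCK_LENGTH = 1 << SHIFT;
    static final int MASK = BLOCK_LENGTH - 1;

    final int[] index; // stage 1: one data offset per block of code points
    final int[] data;  // stage 2: value blocks; block 0 holds the default value

    TwoStageLookupSketch(int[] index, int[] data) {
        this.index = index;
        this.data = data;
    }

    // Two array reads and no hashing: pick the block, then the slot inside it.
    int get(int codePoint) {
        return data[index[codePoint >> SHIFT] + (codePoint & MASK)];
    }

    // Hypothetical example data: every BMP code point maps to 0,
    // except U+4E00..U+4E1F, which map to 7.
    static TwoStageLookupSketch example() {
        int[] index = new int[0x10000 >> SHIFT]; // all entries point at block 0
        int[] data = new int[2 * BLOCK_LENGTH];  // block 0: zeros, block 1: sevens
        Arrays.fill(data, BLOCK_LENGTH, 2 * BLOCK_LENGTH, 7);
        index[0x4E00 >> SHIFT] = BLOCK_LENGTH;
        return new TwoStageLookupSketch(index, data);
    }

    public static void main(String[] args) {
        TwoStageLookupSketch trie = example();
        System.out.println(trie.get(0x4E00));
        System.out.println(trie.get(0x0041));
        System.out.println(trie.index.length + trie.data.length);
    }
}
```

In this sketch all blocks that hold only the default value share data block 0, so the two arrays together are far smaller than a flat array with one entry per code point.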
-
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
static interface | TrieBuilder.DataManipulate | Character data in com.ibm.icu.impl.Trie has a different user-specified format for different purposes.
-
Field Summary
Fields
Modifier and Type | Field | Description
protected static int | BMP_INDEX_LENGTH_ | Length of the BMP portion of the index (stage 1) array.
static int | DATA_BLOCK_LENGTH | Number of data values in a stage 2 (data array) block. 2, 4, 8, .., 0x200
protected static int | DATA_GRANULARITY_ | The alignment size of a stage 2 data block.
protected static int | INDEX_SHIFT_ | Shift size for shifting left the index array values.
protected int | m_dataCapacity_ |
protected int | m_dataLength_ |
protected int[] | m_index_ | Index values at build-time are 32 bits wide for easier processing.
protected int | m_indexLength_ |
protected boolean | m_isCompacted_ |
protected boolean | m_isLatin1Linear_ |
protected int[] | m_map_ | Map of adjusted indexes, used in utrie_compact().
protected static int | MASK_ | Mask for getting the lower bits from the input index.
protected static int | MAX_DATA_LENGTH_ | Maximum length of the runtime data (stage 2) array.
protected static int | MAX_INDEX_LENGTH_ | Length of the index (stage 1) array before folding.
protected static int | OPTIONS_DATA_IS_32_BIT_ | If set, then the data (stage 2) array is 32 bits wide.
protected static int | OPTIONS_INDEX_SHIFT_ | Shifting to position the index value in options
protected static int | OPTIONS_LATIN1_IS_LINEAR_ | If set, then Latin-1 data (for U+0000..U+00ff) is stored in the data (stage 2) array as a simple, linear array at data + DATA_BLOCK_LENGTH.
protected static int | SHIFT_ | Shift size for shifting right the input index. 1..9
protected static int | SURROGATE_BLOCK_COUNT_ | Number of index (stage 1) entries per lead surrogate.
-
Constructor Summary
Constructors
Modifier | Constructor | Description
protected | TrieBuilder() |
protected | TrieBuilder(TrieBuilder table) |
-
Method Summary
Methods
Modifier and Type | Method | Description
protected static boolean | equal_int(int[] array, int start1, int start2, int length) | Compare two sections of an array for equality.
protected static int | findSameIndexBlock(int[] index, int indexLength, int otherBlock) | Finds the same index block as the otherBlock
protected void | findUnusedBlocks() | Set a value in the trie index map to indicate which data block is referenced and which one is not.
boolean | isInZeroBlock(int ch) | Checks if the character belongs to a zero block in the trie
-
-
-
Field Detail
-
DATA_BLOCK_LENGTH
public static final int DATA_BLOCK_LENGTH
Number of data values in a stage 2 (data array) block. Possible values: 2, 4, 8, .., 0x200.
- See Also:
- Constant Field Values
-
m_index_
protected int[] m_index_
Index values at build-time are 32 bits wide for easier processing. Bit 31 is set if the data block is used by multiple index values (from setRange()).
-
m_indexLength_
protected int m_indexLength_
-
m_dataCapacity_
protected int m_dataCapacity_
-
m_dataLength_
protected int m_dataLength_
-
m_isLatin1Linear_
protected boolean m_isLatin1Linear_
-
m_isCompacted_
protected boolean m_isCompacted_
-
m_map_
protected int[] m_map_
Map of adjusted indexes, used in utrie_compact(). Maps from original indexes to new ones.
-
SHIFT_
protected static final int SHIFT_
Shift size for shifting right the input index. Range: 1..9.
- See Also:
- Constant Field Values
-
MAX_INDEX_LENGTH_
protected static final int MAX_INDEX_LENGTH_
Length of the index (stage 1) array before folding. Maximum number of Unicode code points (0x110000) shifted right by SHIFT.
- See Also:
- Constant Field Values
-
BMP_INDEX_LENGTH_
protected static final int BMP_INDEX_LENGTH_
Length of the BMP portion of the index (stage 1) array.
- See Also:
- Constant Field Values
-
SURROGATE_BLOCK_COUNT_
protected static final int SURROGATE_BLOCK_COUNT_
Number of index (stage 1) entries per lead surrogate. Same as the number of index entries for 1024 trail surrogates, == 0x400 >> UTRIE_SHIFT. 10 - SHIFT == number of bits of a trail surrogate that are used in index table lookups.
- See Also:
- Constant Field Values
-
MASK_
protected static final int MASK_
Mask for getting the lower bits from the input index. Equal to DATA_BLOCK_LENGTH - 1.
- See Also:
- Constant Field Values
-
INDEX_SHIFT_
protected static final int INDEX_SHIFT_
Shift size for shifting left the index array values. Increases possible data size with 16-bit index values at the cost of compactability. This requires blocks of stage 2 data to be aligned by UTRIE_DATA_GRANULARITY. Range: 0..UTRIE_SHIFT.
- See Also:
- Constant Field Values
-
MAX_DATA_LENGTH_
protected static final int MAX_DATA_LENGTH_
Maximum length of the runtime data (stage 2) array. Limited by 16-bit index values that are left-shifted by INDEX_SHIFT_.
- See Also:
- Constant Field Values
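The size constants described so far all follow from two shift amounts. The sketch below recomputes them from the relationships given in the field descriptions. The concrete values `SHIFT = 5` and `INDEX_SHIFT = 2` are assumptions chosen for illustration and are not taken from this page; the real values are listed under Constant Field Values. The formulas for `DATA_BLOCK_LENGTH`, `BMP_INDEX_LENGTH`, and `MAX_DATA_LENGTH` are likewise assumptions that are consistent with the descriptions but not spelled out in them.

```java
class TrieConstantsSketch {
    // Assumed values for illustration only.
    static final int SHIFT = 5;
    static final int INDEX_SHIFT = 2;

    // Assumption: a data block holds one value per code point in a 2^SHIFT range.
    static final int DATA_BLOCK_LENGTH = 1 << SHIFT;
    // Stated in the description: DATA_BLOCK_LENGTH - 1.
    static final int MASK = DATA_BLOCK_LENGTH - 1;
    // Stated in the description: 0x110000 code points shifted right by SHIFT.
    static final int MAX_INDEX_LENGTH = 0x110000 >> SHIFT;
    // Assumption: the BMP has 0x10000 code points.
    static final int BMP_INDEX_LENGTH = 0x10000 >> SHIFT;
    // Stated in the description: 0x400 >> SHIFT, which equals 1 << (10 - SHIFT).
    static final int SURROGATE_BLOCK_COUNT = 1 << (10 - SHIFT);
    // Assumption: a 16-bit index value, left-shifted by INDEX_SHIFT, bounds the data length.
    static final int MAX_DATA_LENGTH = 0x10000 << INDEX_SHIFT;

    public static void main(String[] args) {
        System.out.println("DATA_BLOCK_LENGTH     = " + DATA_BLOCK_LENGTH);
        System.out.println("MASK                  = " + MASK);
        System.out.println("MAX_INDEX_LENGTH      = 0x" + Integer.toHexString(MAX_INDEX_LENGTH));
        System.out.println("BMP_INDEX_LENGTH      = 0x" + Integer.toHexString(BMP_INDEX_LENGTH));
        System.out.println("SURROGATE_BLOCK_COUNT = " + SURROGATE_BLOCK_COUNT);
        System.out.println("MAX_DATA_LENGTH       = 0x" + Integer.toHexString(MAX_DATA_LENGTH));
    }
}
```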
-
OPTIONS_INDEX_SHIFT_
protected static final int OPTIONS_INDEX_SHIFT_
Shifting to position the index value in options.
- See Also:
- Constant Field Values
-
OPTIONS_DATA_IS_32_BIT_
protected static final int OPTIONS_DATA_IS_32_BIT_
If set, then the data (stage 2) array is 32 bits wide.
- See Also:
- Constant Field Values
-
OPTIONS_LATIN1_IS_LINEAR_
protected static final int OPTIONS_LATIN1_IS_LINEAR_
If set, then Latin-1 data (for U+0000..U+00ff) is stored in the data (stage 2) array as a simple, linear array at data + DATA_BLOCK_LENGTH.
- See Also:
- Constant Field Values
-
DATA_GRANULARITY_
protected static final int DATA_GRANULARITY_
The alignment size of a stage 2 data block. Also the granularity for compaction.
- See Also:
- Constant Field Values
-
-
Constructor Detail
-
TrieBuilder
protected TrieBuilder()
-
TrieBuilder
protected TrieBuilder(TrieBuilder table)
-
-
Method Detail
-
isInZeroBlock
public boolean isInZeroBlock(int ch)
Checks if the character belongs to a zero block in the trie.
- Parameters:
ch - code point whose data is to be retrieved
- Returns:
- true if ch is in the zero block
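One plausible reading of this check can be sketched as follows. The sketch assumes that a "zero block" is one whose stage 1 index entry is 0, that is, it still points at the initial data block. The class name `ZeroBlockSketch`, the shift value of 5, and the choice to report out-of-range input as being in the zero block are all assumptions for illustration, not the documented behavior of TrieBuilder.

```java
class ZeroBlockSketch {
    static final int SHIFT = 5; // assumed for illustration

    final int[] index; // stage 1 index: data offset per block of code points

    ZeroBlockSketch(int[] index) {
        this.index = index;
    }

    boolean isInZeroBlock(int ch) {
        // Assumption: code points outside U+0000..U+10FFFF carry no data.
        if (ch < 0 || ch > 0x10FFFF) {
            return true;
        }
        // Assumption: index entry 0 means the block was never given its own data.
        return index[ch >> SHIFT] == 0;
    }

    public static void main(String[] args) {
        int[] index = new int[0x110000 >> SHIFT];
        index[0x41 >> SHIFT] = 1 << SHIFT; // block containing U+0041 has its own data
        ZeroBlockSketch sketch = new ZeroBlockSketch(index);
        System.out.println(sketch.isInZeroBlock(0x41));
        System.out.println(sketch.isInZeroBlock(0x20));
    }
}
```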
-
equal_int
protected static final boolean equal_int(int[] array, int start1, int start2, int length)
Compare two sections of an array for equality.
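A comparison of this kind can be sketched in a few lines. The class name `EqualIntSketch` and the method name `equalInt` are hypothetical, and the sketch assumes both sections lie within the array bounds; it is an illustration of the idea, not the actual implementation.

```java
class EqualIntSketch {
    // Returns true if array[start1 .. start1+length) equals array[start2 .. start2+length).
    static boolean equalInt(int[] array, int start1, int start2, int length) {
        for (int i = 0; i < length; i++) {
            if (array[start1 + i] != array[start2 + i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[] values = {1, 2, 3, 1, 2, 3, 9};
        System.out.println(equalInt(values, 0, 3, 3));
        System.out.println(equalInt(values, 1, 4, 3));
    }
}
```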
-
findUnusedBlocks
protected void findUnusedBlocks()
Sets a value in the trie index map to indicate which data blocks are referenced and which are not. utrie_compact() will remove data blocks that are not used at all. The value set is:
- 0 if the block is used
- -1 if the block is not used
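The marking step can be sketched as below. This is an illustration under stated assumptions, not the actual method body: it takes the arrays as parameters instead of using the protected fields, assumes a shift of 5, and assumes every index entry is a plain non-negative data offset (it ignores the bit 31 flag described for m_index_). The names `UnusedBlocksSketch` and `dataLength` are hypothetical.

```java
import java.util.Arrays;

class UnusedBlocksSketch {
    static final int SHIFT = 5; // assumed: 32 values per data block

    // Returns a map with one entry per data block: 0 if referenced, -1 if not.
    static int[] findUnusedBlocks(int[] index, int indexLength, int dataLength) {
        int[] map = new int[dataLength >> SHIFT];
        Arrays.fill(map, -1);               // start with every block marked "not used"
        for (int i = 0; i < indexLength; i++) {
            map[index[i] >> SHIFT] = 0;     // mark the block this index entry points at
        }
        return map;
    }

    public static void main(String[] args) {
        // Four data blocks (offsets 0, 32, 64, 96); only offsets 0 and 64 are referenced.
        int[] index = {0, 64, 0, 64};
        System.out.println(Arrays.toString(findUnusedBlocks(index, index.length, 128)));
    }
}
```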
-
findSameIndexBlock
protected static final int findSameIndexBlock(int[] index, int indexLength, int otherBlock)
Finds the same index block as the otherBlock.
- Parameters:
index - array
indexLength - size of index
otherBlock -
- Returns:
- same index block
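A search for an identical index block could look like the sketch below. It is a guess at the general technique, not the documented behavior: the extra parameters `firstBlock` and `blockLength`, the class and method names, and the convention of returning `indexLength` when no match exists are all assumptions made for illustration.

```java
class SameIndexBlockSketch {
    // Compares two sections of the same array element by element.
    static boolean equalInt(int[] array, int start1, int start2, int length) {
        for (int i = 0; i < length; i++) {
            if (array[start1 + i] != array[start2 + i]) {
                return false;
            }
        }
        return true;
    }

    // Scans whole blocks in index[firstBlock .. indexLength) for one that matches
    // the block starting at otherBlock. Returns the start of the first match,
    // or indexLength if there is none (assumed convention).
    static int findSameIndexBlock(int[] index, int indexLength, int otherBlock,
                                  int firstBlock, int blockLength) {
        for (int block = firstBlock; block + blockLength <= indexLength; block += blockLength) {
            if (equalInt(index, block, otherBlock, blockLength)) {
                return block;
            }
        }
        return indexLength;
    }

    public static void main(String[] args) {
        int[] index = {1, 2, 3, 4, 1, 2, 5, 6};
        // The candidate block at offset 4 duplicates the block at offset 0.
        System.out.println(findSameIndexBlock(index, 4, 4, 0, 2));
        // The candidate block at offset 6 has no duplicate among the first 6 entries.
        System.out.println(findSameIndexBlock(index, 6, 6, 0, 2));
    }
}
```

Under these assumptions a caller can reuse the returned block when it is smaller than `indexLength`, and append the candidate block otherwise.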
-
-