Class OffsetSimultaneousEncoder
java.lang.Object
io.confluent.parallelconsumer.offsets.OffsetSimultaneousEncoder
public class OffsetSimultaneousEncoder extends Object
Encodes offset data with multiple strategies at the same time.
The results are kept in an accessible structure, so the encoding with the highest compression can be selected easily.
- See Also:
invoke()
-
Field Summary
Fields
Modifier and Type | Field | Description
static String | COMPRESSION_FORCED_RESOURCE_LOCK | Used to prevent tests that depend on setting static state in this class from running in parallel.
static boolean | compressionForced | Force the encoder to also add the compressed versions.
static int | LARGE_INPUT_MAP_SIZE_THRESHOLD | Size threshold in bytes beyond which compressed encodings are also compared, as the extra compression step typically seems worth it once the source array exceeds this size.
-
Constructor Summary
Constructors
Constructor | Description
OffsetSimultaneousEncoder(long lowWaterMark, long highestSucceededOffset, Set<Long> incompleteOffsets) |
-
Method Summary
Modifier and Type | Method | Description
Map<OffsetEncoding,byte[]> | getEncodingMap() | Map of different encoding types for the same offset data, used for retrieving the data for the encoding type
Set<Long> | getIncompleteOffsets() | The offsets which have not yet been fully completed and can't have their offset committed
SortedSet<EncodedOffsetPair> | getSortedEncodings() | Ordered set of the different encodings, used to quickly retrieve the most compressed encoding
OffsetSimultaneousEncoder | invoke() | Highwater mark already encoded in string - OffsetMapCodecManager.makeOffsetMetadataPayload(long, io.confluent.parallelconsumer.state.PartitionState<K, V>) - so encoding BitSet run length may not be needed, or could be swapped
byte[] | packSmallest() | Select the smallest encoding, and pack it.
-
Field Details
-
LARGE_INPUT_MAP_SIZE_THRESHOLD
public static final int LARGE_INPUT_MAP_SIZE_THRESHOLD
Size threshold in bytes beyond which compressed encodings are also compared, as the extra compression step typically seems worth it once the source array exceeds this size.
- See Also:
- Constant Field Values
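The threshold idea can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: the class name, the threshold value of 200, the `forceCompression` flag, and the use of GZIP as the codec are all assumptions made for illustration only.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPOutputStream;

public class ThresholdCompressionSketch {

    // Hypothetical threshold in bytes; the library's actual constant value is not shown on this page.
    static final int SIZE_THRESHOLD = 200;

    // Hypothetical switch mirroring the idea of forcing compressed variants in tests.
    static boolean forceCompression = false;

    /** GZIP-compresses the input. GZIP is a stand-in codec chosen for this sketch. */
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed at this point, so the GZIP trailer has been written.
        return out.toByteArray();
    }

    /** Returns the candidates: always the raw bytes, plus a compressed variant when beyond the threshold or forced. */
    static List<byte[]> candidates(byte[] raw) {
        List<byte[]> result = new ArrayList<>();
        result.add(raw);
        if (forceCompression || raw.length > SIZE_THRESHOLD) {
            result.add(gzip(raw));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(candidates(new byte[50]).size());    // small input: raw only
        System.out.println(candidates(new byte[5_000]).size()); // large input: raw and compressed
    }
}
```

Skipping compression for small inputs avoids paying a codec's fixed overhead where it is unlikely to produce a smaller result.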
-
compressionForced
public static boolean compressionForced
Force the encoder to also add the compressed versions. Useful for testing. Visible for testing.
-
COMPRESSION_FORCED_RESOURCE_LOCK
Used to prevent tests that depend on setting static state in this class from running in parallel. Manipulation of static state in tests needs to be removed so that this isn't necessary.
- See Also:
- Constant Field Values
-
-
Constructor Details
-
Method Details
-
invoke
The high-water mark is already encoded in the string (see OffsetMapCodecManager.makeOffsetMetadataPayload(long, io.confluent.parallelconsumer.state.PartitionState<K, V>)), so encoding BitSet run length may not be needed, or could be swapped.
Simultaneously encodes with multiple strategies, and conditionally encodes compression variants.
Currently commented out is OffsetEncoding.ByteArray, as there doesn't seem to be an advantage over BitSet encoding.
TODO: optimisation - inline this into the partition iteration loop in WorkManager
TODO: optimisation - could double the run-length range from Short.MAX_VALUE (32,767) to Short.MAX_VALUE * 2 (65,534) by using unsigned shorts instead (the highest representable relative offset is Short.MAX_VALUE because each run-length entry is a Short)
TODO: very large offset ranges are slow (Integer.MAX_VALUE) - encoding scans could be avoided by passing in a map of incompletes, which should already be known
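To make the BitSet and run-length ideas concrete, here is a self-contained sketch of both over the same offset range. It does not reproduce the library's wire format. The class and method names, the choice that a set bit means "complete", and the convention that runs alternate starting with incompletes are assumptions made for this illustration. `packRuns` also shows why a signed `short` entry caps a run at `Short.MAX_VALUE`, and how treating the same 16 bits as unsigned raises the cap to 65,535.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.Set;

public class OffsetEncodingSketch {

    /** Bit i is set when offset (lowWaterMark + i) is complete, i.e. not in incompleteOffsets. */
    static BitSet toBitSet(long lowWaterMark, long highestSucceededOffset, Set<Long> incompleteOffsets) {
        int length = Math.toIntExact(highestSucceededOffset - lowWaterMark + 1);
        BitSet bits = new BitSet(length);
        for (int i = 0; i < length; i++) {
            if (!incompleteOffsets.contains(lowWaterMark + i)) {
                bits.set(i);
            }
        }
        return bits;
    }

    /** Alternating run lengths over the same range, starting with a run of incompletes (which may be 0 long). */
    static List<Integer> toRunLengths(long lowWaterMark, long highestSucceededOffset, Set<Long> incompleteOffsets) {
        List<Integer> runs = new ArrayList<>();
        boolean currentIsComplete = false; // by this sketch's convention the first run counts incompletes
        int run = 0;
        for (long offset = lowWaterMark; offset <= highestSucceededOffset; offset++) {
            boolean complete = !incompleteOffsets.contains(offset);
            if (complete != currentIsComplete) {
                runs.add(run); // close the current run and start the opposite kind
                run = 0;
                currentIsComplete = complete;
            }
            run++;
        }
        runs.add(run);
        return runs;
    }

    /** Serialises runs as 16-bit entries and rejects any run that does not fit the chosen range. */
    static byte[] packRuns(List<Integer> runs, boolean unsigned) {
        int max = unsigned ? 0xFFFF : Short.MAX_VALUE;
        ByteBuffer buffer = ByteBuffer.allocate(runs.size() * Short.BYTES);
        for (int r : runs) {
            if (r > max) {
                throw new IllegalArgumentException("run too long for a 16-bit entry: " + r);
            }
            buffer.putShort((short) r); // an unsigned reader recovers the value with Short.toUnsignedInt
        }
        return buffer.array();
    }
}
```

With a low-water mark of 10, a highest succeeded offset of 15 and incompletes {11, 12}, `toRunLengths` returns [0, 1, 2, 3]: zero leading incompletes, one complete (10), two incompletes (11, 12), three completes (13 to 15).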
-
packSmallest
Select the smallest encoding, and pack it.
- Throws:
NoEncodingPossibleException
- See Also:
packEncoding(EncodedOffsetPair)
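A minimal sketch of the select-then-pack step follows. It is an illustration under assumptions, not the library's code: the `Encoding` enum, its one-byte tags, the "tag byte followed by payload" layout, and the use of `NoSuchElementException` in place of `NoEncodingPossibleException` are all invented here.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.NoSuchElementException;

public class PackSmallestSketch {

    /** Hypothetical encoding identifiers, each with a one-byte tag invented for this sketch. */
    enum Encoding {
        BIT_SET((byte) 'b'), RUN_LENGTH((byte) 'r');

        final byte tag;

        Encoding(byte tag) {
            this.tag = tag;
        }
    }

    /** Prefixes the payload with the tag so a decoder can tell which strategy produced it. */
    static byte[] pack(Encoding encoding, byte[] payload) {
        byte[] packed = new byte[payload.length + 1];
        packed[0] = encoding.tag;
        System.arraycopy(payload, 0, packed, 1, payload.length);
        return packed;
    }

    /** Picks the entry with the fewest bytes and packs it; fails when there is nothing to choose from. */
    static byte[] packSmallest(Map<Encoding, byte[]> encodings) {
        Map.Entry<Encoding, byte[]> smallest = encodings.entrySet().stream()
                .min(Comparator.comparingInt((Map.Entry<Encoding, byte[]> e) -> e.getValue().length))
                .orElseThrow(() -> new NoSuchElementException("no encoding possible"));
        return pack(smallest.getKey(), smallest.getValue());
    }
}
```

Tagging the payload is what lets the reader of the committed metadata decode it without knowing in advance which strategy won.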
-
getIncompleteOffsets
The offsets which have not yet been fully completed and can't have their offset committed
-
getEncodingMap
Map of different encoding types for the same offset data, used for retrieving the data for the encoding type
-
getSortedEncodings
Ordered set of the different encodings, used to quickly retrieve the most compressed encoding
- See Also:
packSmallest()
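One way to keep encodings ordered by size is a `TreeSet` with a size-based comparator, sketched below. The `EncodedPair` class and the encoding names are hypothetical stand-ins, and nothing here is claimed about how `EncodedOffsetPair` actually compares itself. The sketch does highlight one general pitfall: a `TreeSet` treats elements that compare as equal as duplicates, so a tie-breaker is needed to keep two encodings of the same length.

```java
import java.util.Comparator;
import java.util.SortedSet;
import java.util.TreeSet;

public class SortedEncodingsSketch {

    /** Hypothetical pair of an encoding name and its encoded bytes. */
    static final class EncodedPair {
        final String encodingName;
        final byte[] bytes;

        EncodedPair(String encodingName, byte[] bytes) {
            this.encodingName = encodingName;
            this.bytes = bytes;
        }
    }

    /** Orders by encoded size first; the name breaks ties so same-sized encodings are both retained. */
    static final Comparator<EncodedPair> BY_SIZE_THEN_NAME =
            Comparator.comparingInt((EncodedPair p) -> p.bytes.length)
                    .thenComparing((EncodedPair p) -> p.encodingName);

    /** Builds the ordered set; first() is then the smallest encoding. */
    static SortedSet<EncodedPair> sorted(EncodedPair... pairs) {
        SortedSet<EncodedPair> set = new TreeSet<>(BY_SIZE_THEN_NAME);
        for (EncodedPair p : pairs) {
            set.add(p);
        }
        return set;
    }
}
```

With the set built this way, retrieving the most compressed candidate is a single `first()` call rather than a scan.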
-