Class NLTokenUnit


  • public final class NLTokenUnit
    extends java.lang.Object
    NLTokenUnit specifies the size of the units in a string to which tokenization or tagging applies: word, sentence, paragraph, or document. It is used with NLTokenizer, a class that automatically segments natural-language text. An NLTokenizer instance is created with a specific unit and assigned a string to tokenize; clients can then obtain the ranges of the tokens in that string for the given unit.
    • Field Summary

      Fields
      Modifier and Type    Field        Description
      static long          Document     Token unit is the entire string
      static long          Paragraph    Token units are at paragraph level
      static long          Sentence     Token units are at sentence level
      static long          Word         Token units are at word or equivalent level
    • Method Summary

      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • Word

        public static final long Word
        Token units are at word or equivalent level
        See Also:
        Constant Field Values
      • Sentence

        public static final long Sentence
        Token units are at sentence level
        See Also:
        Constant Field Values
      • Paragraph

        public static final long Paragraph
        Token units are at paragraph level
        See Also:
        Constant Field Values
      • Document

        public static final long Document
        Token unit is the entire string
        See Also:
        Constant Field Values
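
    Example (illustrative sketch):
    The four units can be pictured as different granularities of range over the same string. The sketch below does not call NLTokenizer; it uses the JDK's java.text.BreakIterator as a stand-in for word and sentence segmentation and a simple newline split for paragraphs. The class TokenUnitSketch, the method tokenRanges, and the constants WORD, SENTENCE, PARAGRAPH, and DOCUMENT are hypothetical names introduced here, and their numeric values are placeholders rather than the documented constant field values.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class TokenUnitSketch {
    // Hypothetical stand-ins for the NLTokenUnit fields. The values are
    // placeholders for this sketch, not the real constant field values.
    static final long WORD = 0L;
    static final long SENTENCE = 1L;
    static final long PARAGRAPH = 2L;
    static final long DOCUMENT = 3L;

    /** Returns {start, end} offsets (end exclusive) of each token for the given unit. */
    static List<int[]> tokenRanges(String text, long unit) {
        List<int[]> ranges = new ArrayList<>();
        if (text.isEmpty()) {
            return ranges;
        }
        if (unit == DOCUMENT) {
            // The entire string is a single token.
            ranges.add(new int[] {0, text.length()});
            return ranges;
        }
        if (unit == PARAGRAPH) {
            // Assumption: paragraphs are separated by '\n'; empty lines are skipped.
            int start = 0;
            for (int i = 0; i < text.length(); i++) {
                if (text.charAt(i) == '\n') {
                    if (i > start) {
                        ranges.add(new int[] {start, i});
                    }
                    start = i + 1;
                }
            }
            if (start < text.length()) {
                ranges.add(new int[] {start, text.length()});
            }
            return ranges;
        }
        BreakIterator it;
        if (unit == WORD) {
            it = BreakIterator.getWordInstance(Locale.ENGLISH);
        } else if (unit == SENTENCE) {
            it = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        } else {
            throw new IllegalArgumentException("unknown unit: " + unit);
        }
        it.setText(text);
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            // For words, drop segments that are only whitespace or punctuation.
            if (unit == SENTENCE || containsLetterOrDigit(text, start, end)) {
                ranges.add(new int[] {start, end});
            }
        }
        return ranges;
    }

    private static boolean containsLetterOrDigit(String text, int start, int end) {
        for (int i = start; i < end; i++) {
            if (Character.isLetterOrDigit(text.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String text = "Hello world. How are you?\nSecond paragraph.";
        for (int[] r : tokenRanges(text, WORD)) {
            System.out.println(r[0] + "-" + r[1] + ": " + text.substring(r[0], r[1]));
        }
    }
}
```

    The same string yields one range at the document level and progressively more ranges at the paragraph, sentence, and word levels. Sentence ranges produced by BreakIterator include trailing whitespace, which may differ from what NLTokenizer reports.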