A C D F G H I L M N O P R S T U W 

A

add(String) - Method in class org.apache.tika.langdetect.tika.LanguageProfile
Adds a single occurrence of the given ngram to this profile.
add(StringBuffer) - Method in class org.apache.tika.langdetect.tika.LanguageProfilerBuilder
Adds ngrams from a single word to this profile
add(String, long) - Method in class org.apache.tika.langdetect.tika.LanguageProfile
Adds multiple occurrences of the given ngram to this profile.
addProfile(String, LanguageProfile) - Static method in class org.apache.tika.langdetect.tika.LanguageIdentifier
Adds a single language profile
addText(char[], int, int) - Method in class org.apache.tika.langdetect.tika.TikaLanguageDetector
 
analyze(StringBuilder) - Method in class org.apache.tika.langdetect.tika.LanguageProfilerBuilder
Analyzes a piece of text
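
The ngram handling described by add(StringBuffer) and analyze(StringBuilder) can be illustrated with a minimal, self-contained sketch. It is not the LanguageProfilerBuilder implementation: the class name NgramExtractionSketch and the method ngrams are hypothetical, and the sliding-window approach (without any word-boundary padding) is an assumption. The minLen and maxLen parameters mirror the minlen/maxlen values mentioned for the LanguageProfilerBuilder constructors.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of Tika: extracts character ngrams from one word. */
class NgramExtractionSketch {

    /**
     * Returns every substring of the word whose length is between
     * minLen and maxLen (inclusive), shortest lengths first.
     */
    static List<String> ngrams(String word, int minLen, int maxLen) {
        List<String> result = new ArrayList<>();
        for (int len = minLen; len <= maxLen; len++) {
            // Slide a window of the current length across the word.
            for (int start = 0; start + len <= word.length(); start++) {
                result.add(word.substring(start, start + len));
            }
        }
        return result;
    }
}
```

With minLen and maxLen both set to 3, the sketch yields only trigrams; a word shorter than minLen yields no ngrams at all.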

C

clearProfiles() - Static method in class org.apache.tika.langdetect.tika.LanguageIdentifier
Clears the current map of language profiles
close() - Method in class org.apache.tika.langdetect.tika.ProfilingWriter
 
create(String, InputStream, String) - Static method in class org.apache.tika.langdetect.tika.LanguageProfilerBuilder
Creates a new language profile from a text file (preferably a large one, 5-10k lines).

D

DEFAULT_NGRAM_LENGTH - Static variable in class org.apache.tika.langdetect.tika.LanguageProfile
 
detectAll() - Method in class org.apache.tika.langdetect.tika.TikaLanguageDetector
 
distance(LanguageProfile) - Method in class org.apache.tika.langdetect.tika.LanguageProfile
Calculates the geometric distance between this and the given other language profile.
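
As a rough illustration of a count-based profile and the distance calculation, here is a minimal sketch. It is not the LanguageProfile source: the class NgramProfileSketch is hypothetical, and it assumes that "geometric distance" means the Euclidean distance between the two profiles' relative ngram frequencies.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for an ngram-count profile; not Tika's class. */
class NgramProfileSketch {
    private final Map<String, Long> counts = new HashMap<>();
    private long total = 0;

    /** Adds a single occurrence of the given ngram. */
    void add(String ngram) {
        add(ngram, 1L);
    }

    /** Adds multiple occurrences of the given ngram. */
    void add(String ngram, long count) {
        counts.merge(ngram, count, Long::sum);
        total += count;
    }

    /** Total number of ngram occurrences added so far. */
    long getCount() {
        return total;
    }

    /** Number of occurrences of one ngram (0 if never added). */
    long getCount(String ngram) {
        return counts.getOrDefault(ngram, 0L);
    }

    /** Assumed definition: Euclidean distance between relative frequencies. */
    double distance(NgramProfileSketch that) {
        // Guard against division by zero for empty profiles.
        double thisTotal = Math.max(this.total, 1L);
        double thatTotal = Math.max(that.total, 1L);
        Set<String> all = new HashSet<>(this.counts.keySet());
        all.addAll(that.counts.keySet());
        double sumOfSquares = 0.0;
        for (String ngram : all) {
            double diff = this.getCount(ngram) / thisTotal
                    - that.getCount(ngram) / thatTotal;
            sumOfSquares += diff * diff;
        }
        return Math.sqrt(sumOfSquares);
    }
}
```

Under this assumption, two profiles with identical relative frequencies have distance 0, and the distance grows as their ngram distributions diverge.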

F

flush() - Method in class org.apache.tika.langdetect.tika.ProfilingWriter
Ignored.

G

getCount() - Method in class org.apache.tika.langdetect.tika.LanguageProfile
 
getCount(String) - Method in class org.apache.tika.langdetect.tika.LanguageProfile
 
getErrors() - Static method in class org.apache.tika.langdetect.tika.LanguageIdentifier
Returns a string of error messages related to initializing language profiles
getLanguage() - Method in class org.apache.tika.langdetect.tika.LanguageIdentifier
Gets the identified language
getLanguage() - Method in class org.apache.tika.langdetect.tika.ProfilingWriter
Returns the language that best matches the current state of the language profile.
getName() - Method in class org.apache.tika.langdetect.tika.LanguageProfilerBuilder
 
getProfile() - Method in class org.apache.tika.langdetect.tika.ProfilingWriter
Returns the language profile being built by this writer.
getRawScore() - Method in class org.apache.tika.langdetect.tika.LanguageIdentifier
Returns 1 minus the vector distance between the language model and the content.
getSimilarity(LanguageProfilerBuilder) - Method in class org.apache.tika.langdetect.tika.LanguageProfilerBuilder
Calculates a score indicating how well two NGramProfiles match each other.
getSorted() - Method in class org.apache.tika.langdetect.tika.LanguageProfilerBuilder
Returns a sorted list of ngrams (sorted first by frequency, then by sequence).
getSupportedLanguages() - Static method in class org.apache.tika.langdetect.tika.LanguageIdentifier
Returns what languages are supported for language identification
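
The relationship between a profile distance and the raw score described for getRawScore() can be sketched as follows. This is an illustration only: the class RawScoreSketch and both methods are hypothetical, the input map of per-language distances is assumed to have been computed elsewhere, and the "unknown" fallback for an empty map is an assumption rather than documented behavior.

```java
import java.util.Map;

/** Hypothetical helper, not part of Tika. */
class RawScoreSketch {

    /** Raw score as described for getRawScore(): 1 minus the distance. */
    static double rawScore(double distance) {
        return 1.0 - distance;
    }

    /**
     * Returns the language whose distance is smallest.
     * Falls back to "unknown" (an assumed placeholder) when the map is empty.
     */
    static String closestLanguage(Map<String, Double> distances) {
        String best = "unknown";
        double minDistance = Double.POSITIVE_INFINITY;
        for (Map.Entry<String, Double> entry : distances.entrySet()) {
            if (entry.getValue() < minDistance) {
                minDistance = entry.getValue();
                best = entry.getKey();
            }
        }
        return best;
    }
}
```

A smaller distance gives a raw score closer to 1, so the closest language is also the one with the highest raw score.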

H

hasErrors() - Static method in class org.apache.tika.langdetect.tika.LanguageIdentifier
Tests whether there were errors initializing language config
hasModel(String) - Method in class org.apache.tika.langdetect.tika.TikaLanguageDetector
 

I

initProfiles() - Static method in class org.apache.tika.langdetect.tika.LanguageIdentifier
Builds the language profiles.
initProfiles(Map<String, LanguageProfile>) - Static method in class org.apache.tika.langdetect.tika.LanguageIdentifier
Initializes the language profiles from a user supplied initialized Map.
isReasonablyCertain() - Method in class org.apache.tika.langdetect.tika.LanguageIdentifier
Tries to judge whether the identification is certain enough to be trusted.

L

LanguageIdentifier - Class in org.apache.tika.langdetect.tika
Identifier of the language that best matches a given content profile.
LanguageIdentifier(String) - Constructor for class org.apache.tika.langdetect.tika.LanguageIdentifier
Constructs a language identifier based on a String of text content
LanguageIdentifier(LanguageProfile) - Constructor for class org.apache.tika.langdetect.tika.LanguageIdentifier
Constructs a language identifier based on a LanguageProfile
LanguageProfile - Class in org.apache.tika.langdetect.tika
Language profile based on ngram counts.
LanguageProfile() - Constructor for class org.apache.tika.langdetect.tika.LanguageProfile
 
LanguageProfile(int) - Constructor for class org.apache.tika.langdetect.tika.LanguageProfile
 
LanguageProfile(String) - Constructor for class org.apache.tika.langdetect.tika.LanguageProfile
 
LanguageProfile(String, int) - Constructor for class org.apache.tika.langdetect.tika.LanguageProfile
 
LanguageProfilerBuilder - Class in org.apache.tika.langdetect.tika
This class runs an ngram analysis over submitted text; the results can be used for automatic language identification.
LanguageProfilerBuilder(String) - Constructor for class org.apache.tika.langdetect.tika.LanguageProfilerBuilder
Constructs a new ngram profile where minlen=3, maxlen=3
LanguageProfilerBuilder(String, int, int) - Constructor for class org.apache.tika.langdetect.tika.LanguageProfilerBuilder
Constructs a new ngram profile
load(InputStream) - Method in class org.apache.tika.langdetect.tika.LanguageProfilerBuilder
Loads an ngram profile from an InputStream (assumes UTF-8 encoded content)
loadModels() - Method in class org.apache.tika.langdetect.tika.TikaLanguageDetector
 
loadModels(Set<String>) - Method in class org.apache.tika.langdetect.tika.TikaLanguageDetector
 

M

main(String[]) - Static method in class org.apache.tika.langdetect.tika.LanguageProfilerBuilder
Main method, used for testing only.

N

normalize() - Method in class org.apache.tika.langdetect.tika.LanguageProfilerBuilder
Normalizes the profile (calculates the ngram frequencies)

O

org.apache.tika.langdetect.tika - package org.apache.tika.langdetect.tika
 

P

ProfilingWriter - Class in org.apache.tika.langdetect.tika
Writer that builds a language profile based on all the written content.
ProfilingWriter() - Constructor for class org.apache.tika.langdetect.tika.ProfilingWriter
 
ProfilingWriter(LanguageProfile) - Constructor for class org.apache.tika.langdetect.tika.ProfilingWriter
 

R

reset() - Method in class org.apache.tika.langdetect.tika.TikaLanguageDetector
 

S

save(OutputStream) - Method in class org.apache.tika.langdetect.tika.LanguageProfilerBuilder
Writes NGramProfile content to an OutputStream using UTF-8 encoding.
setPriors(Map<String, Float>) - Method in class org.apache.tika.langdetect.tika.TikaLanguageDetector
Not supported.

T

TikaLanguageDetector - Class in org.apache.tika.langdetect.tika
This is Tika's original, legacy, homegrown language detector.
TikaLanguageDetector() - Constructor for class org.apache.tika.langdetect.tika.TikaLanguageDetector
 
toString() - Method in class org.apache.tika.langdetect.tika.LanguageIdentifier
 
toString() - Method in class org.apache.tika.langdetect.tika.LanguageProfile
 
toString() - Method in class org.apache.tika.langdetect.tika.LanguageProfilerBuilder
 

U

useInterleaved - Static variable in class org.apache.tika.langdetect.tika.LanguageProfile
 

W

write(char[], int, int) - Method in class org.apache.tika.langdetect.tika.ProfilingWriter
 