All Classes
| Class |
Description |
| AbstractEncoder |
Base class for payload encoders.
|
| Among |
|
| ArabicAnalyzer |
Analyzer for Arabic.
|
| ArabicLetterTokenizer |
Deprecated.
|
| ArabicNormalizationFilter |
|
| ArabicNormalizer |
Normalizer for Arabic.
|
| ArabicStemFilter |
|
| ArabicStemmer |
Stemmer for Arabic.
|
| ArmenianAnalyzer |
Analyzer for Armenian.
|
| ArmenianStemmer |
This class was automatically generated by a Snowball to Java compiler.
It implements the stemming algorithm defined by a snowball script.
|
| BasqueAnalyzer |
Analyzer for Basque.
|
| BasqueStemmer |
This class was automatically generated by a Snowball to Java compiler.
It implements the stemming algorithm defined by a snowball script.
|
| BrazilianAnalyzer |
Analyzer for Brazilian Portuguese language.
|
| BrazilianStemFilter |
|
| BrazilianStemmer |
A stemmer for Brazilian Portuguese words.
|
| BulgarianAnalyzer |
Analyzer for Bulgarian.
|
| BulgarianStemFilter |
|
| BulgarianStemmer |
Light Stemmer for Bulgarian.
|
| ByteVector |
This class implements a simple byte vector with access to the underlying
array.
|
| CatalanAnalyzer |
Analyzer for Catalan.
|
| CatalanStemmer |
This class was automatically generated by a Snowball to Java compiler.
It implements the stemming algorithm defined by a snowball script.
|
| CharArrayIterator |
|
| CharVector |
This class implements a simple char vector with access to the underlying
array.
|
| ChineseAnalyzer |
Deprecated.
|
| ChineseFilter |
Deprecated.
|
| ChineseTokenizer |
Deprecated.
|
| CJKAnalyzer |
An Analyzer that tokenizes text with StandardTokenizer,
normalizes content with CJKWidthFilter, folds case with
LowerCaseFilter, forms bigrams of CJK with CJKBigramFilter,
and filters stopwords with StopFilter
|
| CJKBigramFilter |
Forms bigrams of CJK terms that are generated from StandardTokenizer
or ICUTokenizer.
|
| CJKTokenizer |
Deprecated.
|
| CJKWidthFilter |
A TokenFilter that normalizes CJK width differences:
folds fullwidth ASCII variants into the equivalent Basic Latin, and
folds halfwidth Katakana variants into the equivalent kana.
|
| CompoundWordTokenFilterBase |
Base class for decomposition token filters.
|
| CzechAnalyzer |
Analyzer for Czech language.
|
| CzechStemFilter |
A TokenFilter that applies CzechStemmer to stem Czech words.
|
| CzechStemmer |
Light Stemmer for Czech.
|
| DanishAnalyzer |
Analyzer for Danish.
|
| DanishStemmer |
Generated class implementing code defined by a snowball script.
|
| DateRecognizerSinkFilter |
Attempts to parse the CharTermAttributeImpl.termBuffer() as a Date using a DateFormat.
|
| DelimitedPayloadTokenFilter |
Characters before the delimiter are the "token"; those after are the payload.
|
| DictionaryCompoundWordTokenFilter |
A TokenFilter that decomposes compound words found in many Germanic languages.
|
| DutchAnalyzer |
Analyzer for Dutch language.
|
| DutchStemFilter |
Deprecated.
|
| DutchStemmer |
Deprecated.
|
| DutchStemmer |
Generated class implementing code defined by a snowball script.
|
| EdgeNGramTokenFilter |
Tokenizes the given token into n-grams of given size(s).
|
| EdgeNGramTokenFilter.Side |
Specifies which side of the input the n-gram should be generated from
|
| EdgeNGramTokenizer |
Tokenizes the input from an edge into n-grams of given size(s).
|
| EdgeNGramTokenizer.Side |
Specifies which side of the input the n-gram should be generated from
|
| ElisionFilter |
Removes elisions from a TokenStream.
|
| EmptyTokenStream |
An always exhausted token stream.
|
| EnglishAnalyzer |
Analyzer for English.
|
| EnglishMinimalStemFilter |
|
| EnglishMinimalStemmer |
Minimal plural stemmer for English.
|
| EnglishPossessiveFilter |
TokenFilter that removes possessives (trailing 's) from words.
|
| EnglishStemmer |
Generated class implementing code defined by a snowball script.
|
| FinnishAnalyzer |
Analyzer for Finnish.
|
| FinnishLightStemFilter |
|
| FinnishLightStemmer |
Light Stemmer for Finnish.
|
| FinnishStemmer |
Generated class implementing code defined by a snowball script.
|
| FloatEncoder |
Encode a character array Float as a Payload.
|
| FrenchAnalyzer |
Analyzer for French language.
|
| FrenchLightStemFilter |
|
| FrenchLightStemmer |
Light Stemmer for French.
|
| FrenchMinimalStemFilter |
|
| FrenchMinimalStemmer |
Minimal Stemmer for French.
|
| FrenchStemFilter |
Deprecated.
|
| FrenchStemmer |
Deprecated.
|
| FrenchStemmer |
Generated class implementing code defined by a snowball script.
|
| GalicianAnalyzer |
Analyzer for Galician.
|
| GalicianMinimalStemFilter |
|
| GalicianMinimalStemmer |
Minimal Stemmer for Galician
|
| GalicianStemFilter |
|
| GalicianStemmer |
Galician stemmer implementing "Regras do lematizador para o galego".
|
| German2Stemmer |
Generated class implementing code defined by a snowball script.
|
| GermanAnalyzer |
Analyzer for German language.
|
| GermanLightStemFilter |
|
| GermanLightStemmer |
Light Stemmer for German.
|
| GermanMinimalStemFilter |
|
| GermanMinimalStemmer |
Minimal Stemmer for German.
|
| GermanNormalizationFilter |
|
| GermanStemFilter |
A TokenFilter that stems German words.
|
| GermanStemmer |
A stemmer for German words.
|
| GermanStemmer |
Generated class implementing code defined by a snowball script.
|
| GreekAnalyzer |
Analyzer for the Greek language.
|
| GreekLowerCaseFilter |
Normalizes token text to lower case, removes some Greek diacritics,
and standardizes final sigma to sigma.
|
| GreekStemFilter |
A TokenFilter that applies GreekStemmer to stem Greek
words.
|
| GreekStemmer |
A stemmer for Greek words, according to "Development of a Stemmer for the
Greek Language" by Georgios Ntais.
|
| HindiAnalyzer |
Analyzer for Hindi.
|
| HindiNormalizationFilter |
|
| HindiNormalizer |
Normalizer for Hindi.
|
| HindiStemFilter |
A TokenFilter that applies HindiStemmer to stem Hindi words.
|
| HindiStemmer |
Light Stemmer for Hindi.
|
| HTMLStripCharFilter |
A CharFilter that wraps another Reader and attempts to strip out HTML constructs.
|
| HungarianAnalyzer |
Analyzer for Hungarian.
|
| HungarianLightStemFilter |
|
| HungarianLightStemmer |
Light Stemmer for Hungarian.
|
| HungarianStemmer |
Generated class implementing code defined by a snowball script.
|
| HunspellAffix |
Wrapper class representing a hunspell affix
|
| HunspellDictionary |
In-memory structure for the dictionary (.dic) and affix (.aff)
data of a hunspell dictionary.
|
| HunspellStemFilter |
TokenFilter that uses hunspell affix rules and words to stem tokens.
|
| HunspellStemmer |
HunspellStemmer uses the affix rules declared in the HunspellDictionary to generate one or more stems for a word.
|
| HunspellStemmer.Stem |
Stem represents all information known about a stem of a word.
|
| HunspellWord |
A dictionary (.dic) entry with its associated flags.
|
| Hyphen |
This class represents a hyphen.
|
| Hyphenation |
This class represents a hyphenated word.
|
| HyphenationCompoundWordTokenFilter |
A TokenFilter that decomposes compound words found in many Germanic languages.
|
| HyphenationException |
This class has been taken from the Apache FOP project (http://xmlgraphics.apache.org/fop/).
|
| HyphenationTree |
This tree structure stores the hyphenation patterns in an efficient way for
fast lookup.
|
| IdentityEncoder |
Does nothing other than convert the char array to a byte array using the specified encoding.
|
| IndicNormalizationFilter |
A TokenFilter that applies IndicNormalizer to normalize text
in Indian Languages.
|
| IndicNormalizer |
Normalizes the Unicode representation of text in Indian languages.
|
| IndicTokenizer |
Deprecated.
|
| IndonesianAnalyzer |
Analyzer for Indonesian (Bahasa)
|
| IndonesianStemFilter |
|
| IndonesianStemmer |
Stemmer for Indonesian.
|
| IntegerEncoder |
Encode a character array Integer as a Payload.
|
| IrishAnalyzer |
Analyzer for Irish.
|
| IrishLowerCaseFilter |
Normalizes token text to lower case, handling t-prothesis
and n-eclipsis (e.g., 'nAthair' should become 'n-athair').
|
| IrishStemmer |
This class was automatically generated by a Snowball to Java compiler.
It implements the stemming algorithm defined by a snowball script.
|
| ItalianAnalyzer |
Analyzer for Italian.
|
| ItalianLightStemFilter |
|
| ItalianLightStemmer |
Light Stemmer for Italian.
|
| ItalianStemmer |
Generated class implementing code defined by a snowball script.
|
| KpStemmer |
Generated class implementing code defined by a snowball script.
|
| KStemFilter |
A high-performance kstem filter for English.
|
| KStemmer |
This class implements the Kstem algorithm
|
| LatvianAnalyzer |
Analyzer for Latvian.
|
| LatvianStemFilter |
|
| LatvianStemmer |
Light stemmer for Latvian.
|
| LovinsStemmer |
Generated class implementing code defined by a snowball script.
|
| NGramTokenFilter |
Tokenizes the input into n-grams of the given size(s).
|
| NGramTokenizer |
Tokenizes the input into n-grams of the given size(s).
|
| NorwegianAnalyzer |
Analyzer for Norwegian.
|
| NorwegianLightStemFilter |
|
| NorwegianLightStemmer |
Light Stemmer for Norwegian.
|
| NorwegianMinimalStemFilter |
|
| NorwegianMinimalStemmer |
Minimal Stemmer for Norwegian bokmål (no-nb)
|
| NorwegianStemmer |
Generated class implementing code defined by a snowball script.
|
| NumericPayloadTokenFilter |
Assigns a payload to a token based on the Token.type()
|
| OpenStringBuilder |
A StringBuilder that allows one to access the array.
|
| PathHierarchyTokenizer |
Tokenizer for path-like hierarchies.
|
| PatternAnalyzer |
Efficient Lucene analyzer/tokenizer that preferably operates on a String rather than a
Reader, that can flexibly separate text into terms via a regular expression Pattern
(with behaviour identical to String.split(String)),
and that combines the functionality of LetterTokenizer, LowerCaseTokenizer,
WhitespaceTokenizer, and StopFilter into a single efficient multi-purpose class.
|
| PatternConsumer |
This interface is used to connect the XML pattern file parser to the
hyphenation tree.
|
| PatternParser |
A SAX document handler to read and parse hyphenation patterns from an XML
file.
|
| PayloadEncoder |
Mainly for use with the DelimitedPayloadTokenFilter, converts char buffers to Payload.
|
| PayloadHelper |
Utility methods for encoding payloads.
|
| PersianAnalyzer |
Analyzer for Persian.
|
| PersianCharFilter |
CharFilter that replaces instances of Zero-width non-joiner with an
ordinary space.
|
| PersianNormalizationFilter |
|
| PersianNormalizer |
Normalizer for Persian.
|
| PorterStemmer |
Generated class implementing code defined by a snowball script.
|
| PortugueseAnalyzer |
Analyzer for Portuguese.
|
| PortugueseLightStemFilter |
|
| PortugueseLightStemmer |
Light Stemmer for Portuguese
|
| PortugueseMinimalStemFilter |
|
| PortugueseMinimalStemmer |
Minimal Stemmer for Portuguese
|
| PortugueseStemFilter |
|
| PortugueseStemmer |
Portuguese stemmer implementing the RSLP (Removedor de Sufixos da Lingua Portuguesa)
algorithm.
|
| PortugueseStemmer |
Generated class implementing code defined by a snowball script.
|
| PositionFilter |
Sets the positionIncrement of all tokens to the configured "positionIncrement",
except the first returned token, which retains its original positionIncrement value.
|
| PrefixAndSuffixAwareTokenFilter |
|
| PrefixAwareTokenFilter |
Joins two token streams and leaves the last token of the first stream available
to be used when updating the token values in the second stream based on that token.
|
| QueryAutoStopWordAnalyzer |
An Analyzer used primarily at query time to wrap another analyzer and provide a layer of protection
which prevents very common words from being passed into queries.
|
| ReversePathHierarchyTokenizer |
Tokenizer for domain-like hierarchies.
|
| ReverseStringFilter |
Reverse token string, for example "country" => "yrtnuoc".
|
| RomanianAnalyzer |
Analyzer for Romanian.
|
| RomanianStemmer |
Generated class implementing code defined by a snowball script.
|
| RSLPStemmerBase |
Base class for stemmers that use a set of RSLP-like stemming steps.
|
| RSLPStemmerBase.Rule |
A basic rule, with no exceptions.
|
| RSLPStemmerBase.RuleWithSetExceptions |
A rule with a set of whole-word exceptions.
|
| RSLPStemmerBase.RuleWithSuffixExceptions |
A rule with a set of exceptional suffixes.
|
| RSLPStemmerBase.Step |
A step containing a list of rules.
|
| RussianAnalyzer |
Analyzer for Russian language.
|
| RussianLetterTokenizer |
Deprecated.
|
| RussianLightStemFilter |
|
| RussianLightStemmer |
Light Stemmer for Russian.
|
| RussianLowerCaseFilter |
Deprecated.
|
| RussianStemFilter |
Deprecated.
|
| RussianStemmer |
Generated class implementing code defined by a snowball script.
|
| ShingleAnalyzerWrapper |
A ShingleAnalyzerWrapper wraps a ShingleFilter around another Analyzer.
|
| ShingleFilter |
A ShingleFilter constructs shingles (token n-grams) from a token stream.
|
| ShingleMatrixFilter |
Deprecated.
|
| ShingleMatrixFilter.Matrix |
A column focused matrix in three dimensions:
|
| ShingleMatrixFilter.OneDimensionalNonWeightedTokenSettingsCodec |
|
| ShingleMatrixFilter.SimpleThreeDimensionalTokenSettingsCodec |
A full featured codec not to be used for something serious.
|
| ShingleMatrixFilter.TokenPositioner |
|
| ShingleMatrixFilter.TokenSettingsCodec |
Strategy used to code and decode metadata of the tokens from the input stream
regarding how to position the tokens in the matrix, set and retrieve weight, etc.
|
| ShingleMatrixFilter.TwoDimensionalNonWeightedSynonymTokenSettingsCodec |
A codec that creates a two dimensional matrix
by treating tokens from the input stream with 0 position increment
as new rows to the current column.
|
| SingleTokenTokenStream |
A TokenStream containing a single token.
|
| SnowballAnalyzer |
Deprecated.
|
| SnowballFilter |
A filter that stems words using a Snowball-generated stemmer.
|
| SnowballProgram |
This is rev 502 of the Snowball SVN trunk,
modified as follows:
made abstract, and introduced an abstract method stem to avoid expensive reflection in the filter class.
|
| SolrSynonymParser |
Parser for the Solr synonyms format.
|
| SpanishAnalyzer |
Analyzer for Spanish.
|
| SpanishLightStemFilter |
|
| SpanishLightStemmer |
Light Stemmer for Spanish
|
| SpanishStemmer |
Generated class implementing code defined by a snowball script.
|
| StemmerOverrideFilter |
Provides the ability to override any KeywordAttribute aware stemmer
with custom dictionary-based stemming.
|
| StemmerUtil |
Some commonly-used stemming functions
|
| SwedishAnalyzer |
Analyzer for Swedish.
|
| SwedishLightStemFilter |
|
| SwedishLightStemmer |
Light Stemmer for Swedish.
|
| SwedishStemmer |
Generated class implementing code defined by a snowball script.
|
| SynonymFilter |
Matches single or multi word synonyms in a token stream.
|
| SynonymMap |
A map of synonyms; keys and values are phrases.
|
| SynonymMap.Builder |
Builds an FSTSynonymMap.
|
| TernaryTree |
Ternary Search Tree.
|
| TestApp |
|
| ThaiAnalyzer |
Analyzer for Thai language.
|
| ThaiWordFilter |
TokenFilter that uses BreakIterator to break each
Token that is Thai into separate Token(s) for each Thai word.
|
| TokenOffsetPayloadTokenFilter |
Adds the Token.setStartOffset(int)
and Token.setEndOffset(int) values as a payload.
First 4 bytes are the start
|
| TokenRangeSinkFilter |
Counts the tokens as they go by and saves to the internal list those in the range from lower to upper, exclusive of upper.
|
| TokenTypeSinkFilter |
Adds a token to the sink if it has a specific type.
|
| TurkishAnalyzer |
Analyzer for Turkish.
|
| TurkishLowerCaseFilter |
Normalizes Turkish token text to lower case.
|
| TurkishStemmer |
Generated class implementing code defined by a snowball script.
|
| TypeAsPayloadTokenFilter |
Makes the Token.type() a payload.
|
| WikipediaTokenizer |
Extension of StandardTokenizer that is aware of Wikipedia syntax.
|
| WordnetSynonymParser |
Parser for the WordNet Prolog format.
|
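
Several entries in the table above describe simple token transformations: edge n-grams (EdgeNGramTokenizer), shingles, i.e. token n-grams (ShingleFilter), and token reversal (ReverseStringFilter). The following is a minimal plain-Java sketch of those three ideas only. It does not use or reproduce the Lucene classes; the class name `AnalysisSketch` and its method names are hypothetical, and the real filters have further options (for example, which side the n-grams come from, or whether unigrams are emitted alongside shingles) that this sketch leaves out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: hypothetical helpers, not part of the Lucene API.
class AnalysisSketch {

    // Edge n-grams taken from the front of the input, for sizes minGram..maxGram.
    static List<String> frontEdgeNGrams(String input, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        for (int n = minGram; n <= maxGram && n <= input.length(); n++) {
            grams.add(input.substring(0, n));
        }
        return grams;
    }

    // Shingles: every run of exactly `size` adjacent tokens, joined by a space.
    static List<String> shingles(List<String> tokens, int size) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + size <= tokens.size(); i++) {
            out.add(String.join(" ", tokens.subList(i, i + size)));
        }
        return out;
    }

    // Reverses the token text, as in "country" => "yrtnuoc".
    static String reverse(String token) {
        return new StringBuilder(token).reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(frontEdgeNGrams("lucene", 1, 3));
        System.out.println(shingles(Arrays.asList("please", "divide", "this", "sentence"), 2));
        System.out.println(reverse("country"));
    }
}
```

Running `main` prints the front edge n-grams of "lucene" for sizes 1 to 3, the two-token shingles of a four-token input, and the reversed form of "country". In Lucene itself these transformations run over a TokenStream rather than over strings and lists.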