Uses of Class
org.apache.lucene.analysis.TokenFilter
Packages that use TokenFilter
Package | Description
org.apache.lucene.analysis | API and code to convert text into indexable/searchable tokens.
org.apache.lucene.analysis.ar | Analyzer for Arabic.
org.apache.lucene.analysis.bg | Analyzer for Bulgarian.
org.apache.lucene.analysis.br | Analyzer for Brazilian Portuguese.
org.apache.lucene.analysis.cjk | Analyzer for Chinese, Japanese, and Korean, which indexes bigrams (overlapping groups of two adjacent Han characters).
org.apache.lucene.analysis.cn | Analyzer for Chinese, which indexes unigrams (individual Chinese characters).
org.apache.lucene.analysis.cn.smart | Analyzer for Simplified Chinese, which indexes words.
org.apache.lucene.analysis.compound | A filter that decomposes compound words you find in many Germanic languages into the word parts.
org.apache.lucene.analysis.cz | Analyzer for Czech.
org.apache.lucene.analysis.de | Analyzer for German.
org.apache.lucene.analysis.el | Analyzer for Greek.
org.apache.lucene.analysis.en | Analyzer for English.
org.apache.lucene.analysis.es | Analyzer for Spanish.
org.apache.lucene.analysis.fa | Analyzer for Persian.
org.apache.lucene.analysis.fi | Analyzer for Finnish.
org.apache.lucene.analysis.fr | Analyzer for French.
org.apache.lucene.analysis.ga | Analysis for Irish.
org.apache.lucene.analysis.gl | Analyzer for Galician.
org.apache.lucene.analysis.hi | Analyzer for Hindi.
org.apache.lucene.analysis.hu | Analyzer for Hungarian.
org.apache.lucene.analysis.hunspell | Stemming TokenFilter using a Java implementation of the Hunspell stemming algorithm.
org.apache.lucene.analysis.icu | Analysis components based on ICU.
org.apache.lucene.analysis.id | Analyzer for Indonesian.
org.apache.lucene.analysis.in | Analysis components for Indian languages.
org.apache.lucene.analysis.it | Analyzer for Italian.
org.apache.lucene.analysis.ja | Analyzer for Japanese.
org.apache.lucene.analysis.lv | Analyzer for Latvian.
org.apache.lucene.analysis.miscellaneous | Miscellaneous TokenStreams.
org.apache.lucene.analysis.ngram | Character n-gram tokenizers and filters.
org.apache.lucene.analysis.nl | Analyzer for Dutch.
org.apache.lucene.analysis.no | Analyzer for Norwegian.
org.apache.lucene.analysis.payloads | Provides various convenience classes for creating payloads on Tokens.
org.apache.lucene.analysis.phonetic | Analysis components for phonetic search.
org.apache.lucene.analysis.position | Filter for assigning position increments.
org.apache.lucene.analysis.pt | Analyzer for Portuguese.
org.apache.lucene.analysis.reverse | Filter to reverse token text.
org.apache.lucene.analysis.ru | Analyzer for Russian.
org.apache.lucene.analysis.shingle | Word n-gram filters.
org.apache.lucene.analysis.snowball | TokenFilter and Analyzer implementations that use Snowball stemmers.
org.apache.lucene.analysis.standard | Standards-based analyzers implemented with JFlex.
org.apache.lucene.analysis.stempel | Stempel: Algorithmic Stemmer.
org.apache.lucene.analysis.sv | Analyzer for Swedish.
org.apache.lucene.analysis.synonym | Analysis components for Synonyms.
org.apache.lucene.analysis.th | Analyzer for Thai.
org.apache.lucene.analysis.tr | Analyzer for Turkish.
org.apache.lucene.collation | CollationKeyFilter converts each token into its binary CollationKey using the provided Collator, and then encodes the CollationKey as a String using IndexableBinaryStringTools, to allow it to be stored as an index term.
org.apache.lucene.facet.enhancements | Enhanced category features.
org.apache.lucene.facet.enhancements.association | Association category enhancements.
org.apache.lucene.facet.index.streaming | Expert: attributes streaming definition for indexing facets.
org.apache.lucene.queryParser | A simple query parser implemented with JavaCC.
org.apache.lucene.search.highlight | The highlight package contains classes to provide "keyword in context" features typically used to highlight search terms in the text of results pages.
Uses of TokenFilter in org.apache.lucene.analysis
Subclasses of TokenFilter in org.apache.lucene.analysis
Modifier and Type | Class | Description
class | ASCIIFoldingFilter | This class converts alphabetic, numeric, and symbolic Unicode characters which are not in the first 127 ASCII characters (the "Basic Latin" Unicode block) into their ASCII equivalents, if one exists.
class | CachingTokenFilter | This class can be used if the token attributes of a TokenStream are intended to be consumed more than once.
class | FilteringTokenFilter | Abstract base class for TokenFilters that may remove tokens.
class | ISOLatin1AccentFilter | Deprecated. If you build a new index, use ASCIIFoldingFilter which covers a superset of Latin 1.
class | KeywordMarkerFilter | Marks terms as keywords via the KeywordAttribute.
class | LengthFilter | Removes words that are too long or too short from the stream.
class | LimitTokenCountFilter | This TokenFilter limits the number of tokens while indexing.
class | LookaheadTokenFilter<T extends LookaheadTokenFilter.Position> | An abstract TokenFilter to make it easier to build graph token filters requiring some lookahead.
class | LowerCaseFilter | Normalizes token text to lower case.
class | MockFixedLengthPayloadFilter | TokenFilter that adds random fixed-length payloads.
class | MockGraphTokenFilter | Randomly inserts overlapped (posInc=0) tokens with posLength sometimes > 1.
class | MockHoleInjectingTokenFilter |
class | MockRandomLookaheadTokenFilter | Uses LookaheadTokenFilter to randomly peek at future tokens.
class | MockVariableLengthPayloadFilter | TokenFilter that adds random variable-length payloads.
class | PorterStemFilter | Transforms the token stream as per the Porter stemming algorithm.
class | StopFilter | Removes stop words from a token stream.
class | TeeSinkTokenFilter | This TokenFilter provides the ability to set aside attribute states that have already been analyzed.
class | TypeTokenFilter | Removes tokens whose types appear in a set of blocked types from a token stream.
class | ValidatingTokenFilter | A TokenFilter that checks consistency of the tokens (e.g. offsets are consistent with one another).
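
Several of the filters listed above (LowerCaseFilter, LengthFilter, StopFilter) are typically stacked, each one wrapping the stream produced by the previous one. The following is a minimal plain-Java sketch of that wrapping idea only. It does not use the Lucene API: `SketchFilter`, `LowerCaseSketch`, `LengthSketch`, `StopSketch`, and `FilterChainSketch` are hypothetical names invented for illustration, and tokens are modeled as plain strings rather than attribute-based token streams.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;
import java.util.NoSuchElementException;
import java.util.Set;

// Hypothetical stand-in for a filter: wraps another token source and
// either rewrites each term or drops it (by returning null).
abstract class SketchFilter implements Iterator<String> {
    private final Iterator<String> input;
    private String pending;

    SketchFilter(Iterator<String> input) { this.input = input; }

    // Return the rewritten term, or null to remove it from the stream.
    abstract String transform(String term);

    @Override public boolean hasNext() {
        // Pull from the wrapped source until a term survives or input ends.
        while (pending == null && input.hasNext()) {
            pending = transform(input.next());
        }
        return pending != null;
    }

    @Override public String next() {
        if (!hasNext()) throw new NoSuchElementException();
        String result = pending;
        pending = null;
        return result;
    }
}

// Lower-cases every term.
class LowerCaseSketch extends SketchFilter {
    LowerCaseSketch(Iterator<String> input) { super(input); }
    @Override String transform(String term) { return term.toLowerCase(Locale.ROOT); }
}

// Drops terms shorter than min or longer than max.
class LengthSketch extends SketchFilter {
    private final int min;
    private final int max;
    LengthSketch(Iterator<String> input, int min, int max) {
        super(input);
        this.min = min;
        this.max = max;
    }
    @Override String transform(String term) {
        return (term.length() >= min && term.length() <= max) ? term : null;
    }
}

// Drops terms that appear in the stop set.
class StopSketch extends SketchFilter {
    private final Set<String> stopWords;
    StopSketch(Iterator<String> input, Set<String> stopWords) {
        super(input);
        this.stopWords = stopWords;
    }
    @Override String transform(String term) {
        return stopWords.contains(term) ? null : term;
    }
}

class FilterChainSketch {
    // Chain: lower-case, then length limit 2..20, then stop-word removal.
    static List<String> run(List<String> tokens) {
        Set<String> stops = new HashSet<>(Arrays.asList("the", "of"));
        Iterator<String> chain =
            new StopSketch(new LengthSketch(new LowerCaseSketch(tokens.iterator()), 2, 20), stops);
        List<String> out = new ArrayList<>();
        while (chain.hasNext()) out.add(chain.next());
        return out;
    }
}
```

In this sketch the order of wrapping matters: lower-casing runs first so that the stop set only needs lower-case entries.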
Uses of TokenFilter in org.apache.lucene.analysis.ar
Subclasses of TokenFilter in org.apache.lucene.analysis.ar
Modifier and Type | Class | Description
class | ArabicNormalizationFilter | A TokenFilter that applies ArabicNormalizer to normalize the orthography.
class | ArabicStemFilter | A TokenFilter that applies ArabicStemmer to stem Arabic words.
Uses of TokenFilter in org.apache.lucene.analysis.bg
Subclasses of TokenFilter in org.apache.lucene.analysis.bg
Modifier and Type | Class | Description
class | BulgarianStemFilter | A TokenFilter that applies BulgarianStemmer to stem Bulgarian words.
Uses of TokenFilter in org.apache.lucene.analysis.br
Subclasses of TokenFilter in org.apache.lucene.analysis.br
Modifier and Type | Class | Description
class | BrazilianStemFilter | A TokenFilter that applies BrazilianStemmer.
Uses of TokenFilter in org.apache.lucene.analysis.cjk
Subclasses of TokenFilter in org.apache.lucene.analysis.cjk
Modifier and Type | Class | Description
class | CJKBigramFilter | Forms bigrams of CJK terms that are generated from StandardTokenizer or ICUTokenizer.
class | CJKWidthFilter | A TokenFilter that normalizes CJK width differences: folds fullwidth ASCII variants into the equivalent basic latin; folds halfwidth Katakana variants into the equivalent kana.
Uses of TokenFilter in org.apache.lucene.analysis.cn
Subclasses of TokenFilter in org.apache.lucene.analysis.cn
Modifier and Type | Class | Description
class | ChineseFilter | Deprecated. Use StopFilter instead, which has the same functionality.
Uses of TokenFilter in org.apache.lucene.analysis.cn.smart
Subclasses of TokenFilter in org.apache.lucene.analysis.cn.smart
Modifier and Type | Class | Description
class | WordTokenFilter | A TokenFilter that breaks sentences into words.
Uses of TokenFilter in org.apache.lucene.analysis.compound
Subclasses of TokenFilter in org.apache.lucene.analysis.compound
Modifier and Type | Class | Description
class | CompoundWordTokenFilterBase | Base class for decomposition token filters.
class | DictionaryCompoundWordTokenFilter | A TokenFilter that decomposes compound words found in many Germanic languages.
class | HyphenationCompoundWordTokenFilter | A TokenFilter that decomposes compound words found in many Germanic languages.
Uses of TokenFilter in org.apache.lucene.analysis.cz
Subclasses of TokenFilter in org.apache.lucene.analysis.cz
Modifier and Type | Class | Description
class | CzechStemFilter | A TokenFilter that applies CzechStemmer to stem Czech words.
Uses of TokenFilter in org.apache.lucene.analysis.de
Subclasses of TokenFilter in org.apache.lucene.analysis.de
Modifier and Type | Class | Description
class | GermanLightStemFilter | A TokenFilter that applies GermanLightStemmer to stem German words.
class | GermanMinimalStemFilter | A TokenFilter that applies GermanMinimalStemmer to stem German words.
class | GermanNormalizationFilter | Normalizes German characters according to the heuristics of the German2 snowball algorithm.
class | GermanStemFilter | A TokenFilter that stems German words.
Uses of TokenFilter in org.apache.lucene.analysis.el
Subclasses of TokenFilter in org.apache.lucene.analysis.el
Modifier and Type | Class | Description
class | GreekLowerCaseFilter | Normalizes token text to lower case, removes some Greek diacritics, and standardizes final sigma to sigma.
class | GreekStemFilter | A TokenFilter that applies GreekStemmer to stem Greek words.
Uses of TokenFilter in org.apache.lucene.analysis.en
Subclasses of TokenFilter in org.apache.lucene.analysis.en
Modifier and Type | Class | Description
class | EnglishMinimalStemFilter | A TokenFilter that applies EnglishMinimalStemmer to stem English words.
class | EnglishPossessiveFilter | TokenFilter that removes possessives (trailing 's) from words.
class | KStemFilter | A high-performance kstem filter for English.
Uses of TokenFilter in org.apache.lucene.analysis.es
Subclasses of TokenFilter in org.apache.lucene.analysis.es
Modifier and Type | Class | Description
class | SpanishLightStemFilter | A TokenFilter that applies SpanishLightStemmer to stem Spanish words.
Uses of TokenFilter in org.apache.lucene.analysis.fa
Subclasses of TokenFilter in org.apache.lucene.analysis.fa
Modifier and Type | Class | Description
class | PersianNormalizationFilter | A TokenFilter that applies PersianNormalizer to normalize the orthography.
Uses of TokenFilter in org.apache.lucene.analysis.fi
Subclasses of TokenFilter in org.apache.lucene.analysis.fi
Modifier and Type | Class | Description
class | FinnishLightStemFilter | A TokenFilter that applies FinnishLightStemmer to stem Finnish words.
Uses of TokenFilter in org.apache.lucene.analysis.fr
Subclasses of TokenFilter in org.apache.lucene.analysis.fr
Modifier and Type | Class | Description
class | ElisionFilter | Removes elisions from a TokenStream.
class | FrenchLightStemFilter | A TokenFilter that applies FrenchLightStemmer to stem French words.
class | FrenchMinimalStemFilter | A TokenFilter that applies FrenchMinimalStemmer to stem French words.
class | FrenchStemFilter | Deprecated. Use SnowballFilter with FrenchStemmer instead, which has the same functionality.
Uses of TokenFilter in org.apache.lucene.analysis.ga
Subclasses of TokenFilter in org.apache.lucene.analysis.ga
Modifier and Type | Class | Description
class | IrishLowerCaseFilter | Normalises token text to lower case, handling t-prothesis and n-eclipsis (i.e., that 'nAthair' should become 'n-athair').
Uses of TokenFilter in org.apache.lucene.analysis.gl
Subclasses of TokenFilter in org.apache.lucene.analysis.gl
Modifier and Type | Class | Description
class | GalicianMinimalStemFilter | A TokenFilter that applies GalicianMinimalStemmer to stem Galician words.
class | GalicianStemFilter | A TokenFilter that applies GalicianStemmer to stem Galician words.
Uses of TokenFilter in org.apache.lucene.analysis.hi
Subclasses of TokenFilter in org.apache.lucene.analysis.hi
Modifier and Type | Class | Description
class | HindiNormalizationFilter | A TokenFilter that applies HindiNormalizer to normalize the orthography.
class | HindiStemFilter | A TokenFilter that applies HindiStemmer to stem Hindi words.
Uses of TokenFilter in org.apache.lucene.analysis.hu
Subclasses of TokenFilter in org.apache.lucene.analysis.hu
Modifier and Type | Class | Description
class | HungarianLightStemFilter | A TokenFilter that applies HungarianLightStemmer to stem Hungarian words.
Uses of TokenFilter in org.apache.lucene.analysis.hunspell
Subclasses of TokenFilter in org.apache.lucene.analysis.hunspell
Modifier and Type | Class | Description
class | HunspellStemFilter | TokenFilter that uses hunspell affix rules and words to stem tokens.
Uses of TokenFilter in org.apache.lucene.analysis.icu
Subclasses of TokenFilter in org.apache.lucene.analysis.icu
Modifier and Type | Class | Description
class | ICUFoldingFilter | A TokenFilter that applies search term folding to Unicode text, applying foldings from UTR#30 Character Foldings.
class | ICUNormalizer2Filter | Normalize token text with ICU's Normalizer2.
class | ICUTransformFilter | A TokenFilter that transforms text with ICU.
Uses of TokenFilter in org.apache.lucene.analysis.id
Subclasses of TokenFilter in org.apache.lucene.analysis.id
Modifier and Type | Class | Description
class | IndonesianStemFilter | A TokenFilter that applies IndonesianStemmer to stem Indonesian words.
Uses of TokenFilter in org.apache.lucene.analysis.in
Subclasses of TokenFilter in org.apache.lucene.analysis.in
Modifier and Type | Class | Description
class | IndicNormalizationFilter | A TokenFilter that applies IndicNormalizer to normalize text in Indian Languages.
Uses of TokenFilter in org.apache.lucene.analysis.it
Subclasses of TokenFilter in org.apache.lucene.analysis.it
Modifier and Type | Class | Description
class | ItalianLightStemFilter | A TokenFilter that applies ItalianLightStemmer to stem Italian words.
Uses of TokenFilter in org.apache.lucene.analysis.ja
Subclasses of TokenFilter in org.apache.lucene.analysis.ja
Modifier and Type | Class | Description
class | JapaneseBaseFormFilter | Replaces term text with the BaseFormAttribute.
class | JapaneseKatakanaStemFilter | A TokenFilter that normalizes common katakana spelling variations ending in a long sound character by removing this character (U+30FC).
class | JapanesePartOfSpeechStopFilter | Removes tokens that match a set of part-of-speech tags.
class | JapaneseReadingFormFilter | A TokenFilter that replaces the term attribute with the reading of a token in either katakana or romaji form.
Uses of TokenFilter in org.apache.lucene.analysis.lv
Subclasses of TokenFilter in org.apache.lucene.analysis.lv
Modifier and Type | Class | Description
class | LatvianStemFilter | A TokenFilter that applies LatvianStemmer to stem Latvian words.
Uses of TokenFilter in org.apache.lucene.analysis.miscellaneous
Subclasses of TokenFilter in org.apache.lucene.analysis.miscellaneous
Modifier and Type | Class | Description
class | StemmerOverrideFilter | Provides the ability to override any KeywordAttribute aware stemmer with custom dictionary-based stemming.
Uses of TokenFilter in org.apache.lucene.analysis.ngram
Subclasses of TokenFilter in org.apache.lucene.analysis.ngram
Modifier and Type | Class | Description
class | EdgeNGramTokenFilter | Tokenizes the given token into n-grams of given size(s).
class | NGramTokenFilter | Tokenizes the input into n-grams of the given size(s).
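
To make "n-grams of the given size(s)" concrete, here is a small plain-Java sketch of character n-gram generation for a single term. It is an illustration of the idea, not the Lucene implementation: `NGramSketch` and its method names are hypothetical, the output ordering (all grams of one size before the next size) is an assumption, and the edge variant here only takes grams anchored at the front of the term.

```java
import java.util.ArrayList;
import java.util.List;

class NGramSketch {
    // All character n-grams of the term for every size in [minGram, maxGram].
    static List<String> ngrams(String term, int minGram, int maxGram) {
        List<String> out = new ArrayList<>();
        for (int n = minGram; n <= maxGram; n++) {
            for (int start = 0; start + n <= term.length(); start++) {
                out.add(term.substring(start, start + n));
            }
        }
        return out;
    }

    // Only the n-grams that begin at the first character (front edge).
    static List<String> frontEdgeNgrams(String term, int minGram, int maxGram) {
        List<String> out = new ArrayList<>();
        for (int n = minGram; n <= maxGram && n <= term.length(); n++) {
            out.add(term.substring(0, n));
        }
        return out;
    }
}
```

For the term "abcd" with sizes 2 to 3, `ngrams` yields ab, bc, cd, abc, bcd, while `frontEdgeNgrams` with sizes 1 to 3 yields a, ab, abc.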
Uses of TokenFilter in org.apache.lucene.analysis.nl
Subclasses of TokenFilter in org.apache.lucene.analysis.nl
Modifier and Type | Class | Description
class | DutchStemFilter | Deprecated. Use SnowballFilter with DutchStemmer instead, which has the same functionality.
Uses of TokenFilter in org.apache.lucene.analysis.no
Subclasses of TokenFilter in org.apache.lucene.analysis.no
Modifier and Type | Class | Description
class | NorwegianLightStemFilter | A TokenFilter that applies NorwegianLightStemmer to stem Norwegian words.
class | NorwegianMinimalStemFilter | A TokenFilter that applies NorwegianMinimalStemmer to stem Norwegian words.
Uses of TokenFilter in org.apache.lucene.analysis.payloads
Subclasses of TokenFilter in org.apache.lucene.analysis.payloads
Modifier and Type | Class | Description
class | DelimitedPayloadTokenFilter | Characters before the delimiter are the "token", those after are the payload.
class | NumericPayloadTokenFilter | Assigns a payload to a token based on the Token.type().
class | TokenOffsetPayloadTokenFilter | Adds the Token.setStartOffset(int) and Token.setEndOffset(int) First 4 bytes are the start
class | TypeAsPayloadTokenFilter | Makes the Token.type() a payload.
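
The "token before the delimiter, payload after it" rule can be sketched in a few lines of plain Java. This is an assumption-laden illustration, not the Lucene class: `DelimitedPayloadSketch` is a hypothetical name, the payload is kept as a raw string rather than encoded bytes, and splitting at the first occurrence of the delimiter is an assumption.

```java
class DelimitedPayloadSketch {
    // Returns {term, payload}; payload is null when no delimiter is present.
    static String[] split(String raw, char delimiter) {
        int idx = raw.indexOf(delimiter); // assumption: split at the first delimiter
        if (idx < 0) {
            return new String[] { raw, null };
        }
        return new String[] { raw.substring(0, idx), raw.substring(idx + 1) };
    }
}
```

With `'|'` as the delimiter, the input `foo|1.4` would be split into the term `foo` and the payload text `1.4`, while `bar` would pass through with no payload.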
Uses of TokenFilter in org.apache.lucene.analysis.phonetic
Subclasses of TokenFilter in org.apache.lucene.analysis.phonetic
Modifier and Type | Class | Description
class | BeiderMorseFilter | TokenFilter for Beider-Morse phonetic encoding.
class | DoubleMetaphoneFilter | Filter for DoubleMetaphone (supporting secondary codes).
class | PhoneticFilter | Create tokens for phonetic matches.
Uses of TokenFilter in org.apache.lucene.analysis.position
Subclasses of TokenFilter in org.apache.lucene.analysis.position
Modifier and Type | Class | Description
class | PositionFilter | Set the positionIncrement of all tokens to the "positionIncrement", except the first return token which retains its original positionIncrement value.
Uses of TokenFilter in org.apache.lucene.analysis.pt
Subclasses of TokenFilter in org.apache.lucene.analysis.pt
Modifier and Type | Class | Description
class | PortugueseLightStemFilter | A TokenFilter that applies PortugueseLightStemmer to stem Portuguese words.
class | PortugueseMinimalStemFilter | A TokenFilter that applies PortugueseMinimalStemmer to stem Portuguese words.
class | PortugueseStemFilter | A TokenFilter that applies PortugueseStemmer to stem Portuguese words.
Uses of TokenFilter in org.apache.lucene.analysis.reverse
Subclasses of TokenFilter in org.apache.lucene.analysis.reverse
Modifier and Type | Class | Description
class | ReverseStringFilter | Reverse token string, for example "country" => "yrtnuoc".
Uses of TokenFilter in org.apache.lucene.analysis.ru
Subclasses of TokenFilter in org.apache.lucene.analysis.ru
Modifier and Type | Class | Description
class | RussianLightStemFilter | A TokenFilter that applies RussianLightStemmer to stem Russian words.
class | RussianLowerCaseFilter | Deprecated. Use LowerCaseFilter instead, which has the same functionality.
class | RussianStemFilter | Deprecated. Use SnowballFilter with RussianStemmer instead, which has the same functionality.
Uses of TokenFilter in org.apache.lucene.analysis.shingle
Subclasses of TokenFilter in org.apache.lucene.analysis.shingle
Modifier and Type | Class | Description
class | ShingleFilter | A ShingleFilter constructs shingles (token n-grams) from a token stream.
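
A shingle is an n-gram over whole tokens rather than characters. The sketch below shows the core idea in plain Java for a single fixed shingle size. It is not the Lucene ShingleFilter: `ShingleSketch` is a hypothetical name, joining with a single space is an assumption, and any further options of the real filter (such as position handling) are ignored here.

```java
import java.util.ArrayList;
import java.util.List;

class ShingleSketch {
    // Joins each run of `size` adjacent tokens into one shingle.
    static List<String> shingles(List<String> tokens, int size) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + size <= tokens.size(); i++) {
            out.add(String.join(" ", tokens.subList(i, i + size)));
        }
        return out;
    }
}
```

For the tokens please, divide, this and a size of 2, the sketch produces "please divide" and "divide this".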
Uses of TokenFilter in org.apache.lucene.analysis.snowball
Subclasses of TokenFilter in org.apache.lucene.analysis.snowball
Modifier and Type | Class | Description
class | SnowballFilter | A filter that stems words using a Snowball-generated stemmer.
Uses of TokenFilter in org.apache.lucene.analysis.standard
Subclasses of TokenFilter in org.apache.lucene.analysis.standard
Modifier and Type | Class | Description
class | ClassicFilter | Normalizes tokens extracted with ClassicTokenizer.
class | StandardFilter | Normalizes tokens extracted with StandardTokenizer.
Uses of TokenFilter in org.apache.lucene.analysis.stempel
Subclasses of TokenFilter in org.apache.lucene.analysis.stempel
Modifier and Type | Class | Description
class | StempelFilter | Transforms the token stream as per the stemming algorithm.
Uses of TokenFilter in org.apache.lucene.analysis.sv
Subclasses of TokenFilter in org.apache.lucene.analysis.sv
Modifier and Type | Class | Description
class | SwedishLightStemFilter | A TokenFilter that applies SwedishLightStemmer to stem Swedish words.
Uses of TokenFilter in org.apache.lucene.analysis.synonym
Subclasses of TokenFilter in org.apache.lucene.analysis.synonym
Modifier and Type | Class | Description
class | SynonymFilter | Matches single or multi word synonyms in a token stream.
Uses of TokenFilter in org.apache.lucene.analysis.th
Subclasses of TokenFilter in org.apache.lucene.analysis.th
Modifier and Type | Class | Description
class | ThaiWordFilter | TokenFilter that uses BreakIterator to break each Token that is Thai into separate Token(s) for each Thai word.
Uses of TokenFilter in org.apache.lucene.analysis.tr
Subclasses of TokenFilter in org.apache.lucene.analysis.tr
Modifier and Type | Class | Description
class | TurkishLowerCaseFilter | Normalizes Turkish token text to lower case.
Uses of TokenFilter in org.apache.lucene.collation
Subclasses of TokenFilter in org.apache.lucene.collation
Modifier and Type | Class | Description
class | CollationKeyFilter | Converts each token into its CollationKey, and then encodes the CollationKey with IndexableBinaryStringTools, to allow it to be stored as an index term.
class | ICUCollationKeyFilter | Converts each token into its CollationKey, and then encodes the CollationKey with IndexableBinaryStringTools, to allow it to be stored as an index term.
Uses of TokenFilter in org.apache.lucene.facet.enhancements
Subclasses of TokenFilter in org.apache.lucene.facet.enhancements
Modifier and Type | Class | Description
class | EnhancementsCategoryTokenizer | A tokenizer which adds to each category token payload according to the CategoryEnhancements defined in the given EnhancementsIndexingParams.
Uses of TokenFilter in org.apache.lucene.facet.enhancements.association
Subclasses of TokenFilter in org.apache.lucene.facet.enhancements.association
Modifier and Type | Class | Description
class | AssociationListTokenizer | Tokenizer for associations of a category.
Uses of TokenFilter in org.apache.lucene.facet.index.streaming
Subclasses of TokenFilter in org.apache.lucene.facet.index.streaming
Modifier and Type | Class | Description
class | CategoryListTokenizer | A base class for category list tokenizers, which add category list tokens to category streams.
class | CategoryParentsStream | This class adds parents to a CategoryAttributesStream.
class | CategoryTokenizer | Basic class for setting the CharTermAttributes and PayloadAttributes of category tokens.
class | CategoryTokenizerBase | A base class for all token filters which add term and payload attributes to tokens and are to be used in CategoryDocumentBuilder.
class | CountingListTokenizer | CategoryListTokenizer for facet counting.
Uses of TokenFilter in org.apache.lucene.queryParser
Subclasses of TokenFilter in org.apache.lucene.queryParser
Modifier and Type | Class | Description
static class | QueryParserTestBase.QPTestFilter | Filter which discards the token 'stop' and which expands the token 'phrase' into 'phrase1 phrase2'.
Uses of TokenFilter in org.apache.lucene.search.highlight
Subclasses of TokenFilter in org.apache.lucene.search.highlight
Modifier and Type | Class | Description
class | OffsetLimitTokenFilter | This TokenFilter limits the number of tokens while indexing by adding up the current offset.