| Package | Description |
|---|---|
| org.apache.lucene.analysis | API and code to convert text into indexable/searchable tokens. |
| org.apache.lucene.analysis.standard | Standards-based analyzers implemented with JFlex. |
| Modifier and Type | Class | Description |
|---|---|---|
| class | CharTokenizer | An abstract base class for simple, character-oriented tokenizers. |
| class | KeywordTokenizer | Emits the entire input as a single token. |
| class | LetterTokenizer | A tokenizer that divides text at non-letters. |
| class | LowerCaseTokenizer | Performs the function of LetterTokenizer and LowerCaseFilter together. |
| class | WhitespaceTokenizer | A tokenizer that divides text at whitespace. |
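The character-oriented tokenizers above all follow the same pattern: a CharTokenizer subclass supplies a predicate deciding which characters belong inside a token, and the base class splits the input at every character that fails the test. The following is a simplified, self-contained sketch of that pattern (not the Lucene API itself, which streams tokens incrementally rather than returning a list):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Simplified model of the CharTokenizer pattern: the predicate plays the
// role of a subclass's isTokenChar() override.
public class CharTokenizerSketch {
    static List<String> tokenize(String text, IntPredicate isTokenChar) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (isTokenChar.test(c)) {
                current.append(c);              // accumulate token characters
            } else if (current.length() > 0) {
                tokens.add(current.toString()); // emit token at a boundary
                current.setLength(0);
            }
        }
        if (current.length() > 0) tokens.add(current.toString());
        return tokens;
    }

    public static void main(String[] args) {
        String text = "Lucene 4 ever!";
        // WhitespaceTokenizer-like: token characters are non-whitespace
        System.out.println(tokenize(text, c -> !Character.isWhitespace(c)));
        // prints [Lucene, 4, ever!]
        // LetterTokenizer-like: token characters are letters
        System.out.println(tokenize(text, Character::isLetter));
        // prints [Lucene, ever]
    }
}
```

KeywordTokenizer corresponds to the degenerate case where every character passes the test, so the whole input comes back as one token.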
| Modifier and Type | Field | Description |
|---|---|---|
| protected Tokenizer | ReusableAnalyzerBase.TokenStreamComponents.source | |
| Constructor | Description |
|---|---|
| TokenStreamComponents(Tokenizer source) | Creates a new ReusableAnalyzerBase.TokenStreamComponents instance. |
| TokenStreamComponents(Tokenizer source, TokenStream result) | Creates a new ReusableAnalyzerBase.TokenStreamComponents instance. |
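The two constructors reflect the shape of an analysis chain: the holder keeps both the Tokenizer at the start of the chain (the `source`, which is fed the input) and the TokenStream at the end (the `result`, which the consumer reads); with no filters the source is also the result. A hedged sketch of that pattern, using hypothetical stand-in types rather than the real Lucene classes:

```java
// Stand-in types (not the Lucene classes) illustrating the pattern behind
// ReusableAnalyzerBase.TokenStreamComponents.
interface TokenStream { }

class Tokenizer implements TokenStream { }

class TokenFilter implements TokenStream {
    final TokenStream input;                 // the stream this filter wraps
    TokenFilter(TokenStream input) { this.input = input; }
}

public class TokenStreamComponents {
    final Tokenizer source;    // start of the chain; fed with the input
    final TokenStream result;  // end of the chain; read by the consumer

    // One-argument form: no filters, so the source is also the result.
    TokenStreamComponents(Tokenizer source) {
        this(source, source);
    }

    // Two-argument form: the source wrapped by one or more filters.
    TokenStreamComponents(Tokenizer source, TokenStream result) {
        this.source = source;
        this.result = result;
    }

    public static void main(String[] args) {
        Tokenizer src = new Tokenizer();
        TokenStreamComponents plain = new TokenStreamComponents(src);
        TokenStreamComponents filtered =
            new TokenStreamComponents(src, new TokenFilter(src));
        System.out.println(plain.result == plain.source);       // true
        System.out.println(filtered.result == filtered.source); // false
    }
}
```

Keeping a handle on the source separately from the result is what lets a reusable analyzer reset the same chain on a new input instead of rebuilding it.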
| Modifier and Type | Class | Description |
|---|---|---|
| class | ClassicTokenizer | A grammar-based tokenizer constructed with JFlex. |
| class | StandardTokenizer | A grammar-based tokenizer constructed with JFlex. |
| class | UAX29URLEmailTokenizer | Implements the Word Break rules from the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29. URLs and email addresses are also tokenized according to the relevant RFCs. |
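The UAX #29 word-break rules that UAX29URLEmailTokenizer implements are the same segmentation algorithm exposed by the JDK's `java.text.BreakIterator`, so the stdlib can illustrate what word-break tokenization looks like. Note the difference: BreakIterator has no special handling for URLs or email addresses, which is precisely what the Lucene class adds on top of the base rules.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustration only: UAX #29 word segmentation via the JDK, not the
// Lucene tokenizer. Punctuation-only and whitespace-only segments are
// dropped, keeping segments that contain a letter or digit.
public class Uax29Demo {
    static List<String> words(String text) {
        List<String> out = new ArrayList<>();
        BreakIterator it = BreakIterator.getWordInstance(Locale.ROOT);
        it.setText(text);
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE;
                start = end, end = it.next()) {
            String piece = text.substring(start, end);
            if (piece.codePoints().anyMatch(Character::isLetterOrDigit)) {
                out.add(piece);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(words("Hello, world! 42"));
        // prints [Hello, world, 42]
        // An address like test@example.com would be split into pieces here,
        // whereas UAX29URLEmailTokenizer keeps it as a single token.
        System.out.println(words("Send mail to test@example.com today."));
    }
}
```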
Copyright © 2000-2018 Apache Software Foundation. All Rights Reserved.