Package org.apache.lucene.analysis
Class StopFilter
- java.lang.Object
  - org.apache.lucene.util.AttributeSource
    - org.apache.lucene.analysis.TokenStream
      - org.apache.lucene.analysis.TokenFilter
        - org.apache.lucene.analysis.FilteringTokenFilter
          - org.apache.lucene.analysis.StopFilter
All Implemented Interfaces:
Closeable, AutoCloseable
public final class StopFilter extends FilteringTokenFilter
Removes stop words from a token stream.
You must specify the required Version compatibility when creating StopFilter:
- As of 3.1, StopFilter correctly handles Unicode 4.0 supplementary characters in stopwords, and position increments are preserved.
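To make "position increments are preserved" concrete, the following is a minimal plain-Java sketch of the idea. It does not use the Lucene classes: StopFilterSketch, Tok, and filter are hypothetical names introduced only for illustration, and tokens are modeled as plain strings that each occupy one position.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Plain-Java model of stop-word removal; NOT the Lucene implementation. */
final class StopFilterSketch {

    /** A kept token plus its distance (in positions) from the previously kept token. */
    static final class Tok {
        final String term;
        final int positionIncrement;

        Tok(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }

        @Override
        public String toString() {
            return term + "(+" + positionIncrement + ")";
        }
    }

    /**
     * Drops every token found in stopWords. When enablePositionIncrements is true,
     * each kept token's increment also counts the stop words removed just before it.
     */
    static List<Tok> filter(List<String> tokens, Set<String> stopWords,
                            boolean enablePositionIncrements) {
        List<Tok> kept = new ArrayList<>();
        int skipped = 0;
        for (String term : tokens) {
            if (stopWords.contains(term)) {
                skipped++; // remember the gap left by the removed word
                continue;
            }
            int increment = enablePositionIncrements ? 1 + skipped : 1;
            kept.add(new Tok(term, increment));
            skipped = 0;
        }
        return kept;
    }

    public static void main(String[] args) {
        Set<String> stops = new HashSet<>(Arrays.asList("the", "over"));
        List<String> tokens = Arrays.asList("the", "quick", "fox", "over", "the", "dog");
        System.out.println(filter(tokens, stops, true));  // [quick(+2), fox(+1), dog(+3)]
        System.out.println(filter(tokens, stops, false)); // [quick(+1), fox(+1), dog(+1)]
    }
}
```

With increments enabled, the gaps left by removed words remain visible to position-sensitive consumers such as phrase matching; with increments disabled, the kept tokens appear adjacent.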
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource
AttributeSource.AttributeFactory, AttributeSource.State
-
-
Field Summary
-
Fields inherited from class org.apache.lucene.analysis.TokenFilter
input
-
-
Constructor Summary
Constructors
- StopFilter(boolean enablePositionIncrements, TokenStream in, Set<?> stopWords)
  Deprecated. Use StopFilter(Version, TokenStream, Set) instead.
- StopFilter(boolean enablePositionIncrements, TokenStream input, Set<?> stopWords, boolean ignoreCase)
  Deprecated. Use StopFilter(Version, TokenStream, Set) instead.
- StopFilter(Version matchVersion, TokenStream in, Set<?> stopWords)
  Constructs a filter which removes words from the input TokenStream that are named in the Set.
- StopFilter(Version matchVersion, TokenStream input, Set<?> stopWords, boolean ignoreCase)
  Deprecated. Use StopFilter(Version, TokenStream, Set) instead.
-
Method Summary
Modifier and Type, Method, Description
- protected boolean accept()
  Returns true if the current token's term is not a stop word, so that the token is kept in the stream.
- static boolean getEnablePositionIncrementsVersionDefault(Version matchVersion)
  Deprecated. Use StopFilter(Version, TokenStream, Set) instead.
- static Set<Object> makeStopSet(String... stopWords)
  Deprecated. Use makeStopSet(Version, String...) instead.
- static Set<Object> makeStopSet(String[] stopWords, boolean ignoreCase)
  Deprecated. Use makeStopSet(Version, String[], boolean) instead.
- static Set<Object> makeStopSet(List<?> stopWords)
  Deprecated. Use makeStopSet(Version, List) instead.
- static Set<Object> makeStopSet(List<?> stopWords, boolean ignoreCase)
  Deprecated. Use makeStopSet(Version, List, boolean) instead.
- static Set<Object> makeStopSet(Version matchVersion, String... stopWords)
  Builds a Set from an array of stop words, appropriate for passing into the StopFilter constructor.
- static Set<Object> makeStopSet(Version matchVersion, String[] stopWords, boolean ignoreCase)
  Creates a stopword set from the given stopword array.
- static Set<Object> makeStopSet(Version matchVersion, List<?> stopWords)
  Builds a Set from a list of stop words, appropriate for passing into the StopFilter constructor.
- static Set<Object> makeStopSet(Version matchVersion, List<?> stopWords, boolean ignoreCase)
  Creates a stopword set from the given stopword list.
Methods inherited from class org.apache.lucene.analysis.FilteringTokenFilter
getEnablePositionIncrements, incrementToken, reset, setEnablePositionIncrements
-
Methods inherited from class org.apache.lucene.analysis.TokenFilter
close, end
-
Methods inherited from class org.apache.lucene.util.AttributeSource
addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState, toString
-
-
-
-
Constructor Detail
-
StopFilter
@Deprecated public StopFilter(boolean enablePositionIncrements, TokenStream input, Set<?> stopWords, boolean ignoreCase)
Deprecated. Use StopFilter(Version, TokenStream, Set) instead.
Construct a token stream filtering the given input. If stopWords is an instance of CharArraySet (true if makeStopSet() was used to construct the set), it will be used directly and ignoreCase will be ignored, since CharArraySet directly controls case sensitivity. If stopWords is not an instance of CharArraySet, a new CharArraySet will be constructed and ignoreCase will be used to specify the case sensitivity of that set.
Parameters:
enablePositionIncrements - true if token positions should record the removed stop words
input - Input TokenStream
stopWords - A Set of Strings or char[] or any other toString()-able set representing the stopwords
ignoreCase - if true, all words are lower cased first
-
StopFilter
@Deprecated public StopFilter(Version matchVersion, TokenStream input, Set<?> stopWords, boolean ignoreCase)
Deprecated. Use StopFilter(Version, TokenStream, Set) instead.
Construct a token stream filtering the given input. If stopWords is an instance of CharArraySet (true if makeStopSet() was used to construct the set), it will be used directly and ignoreCase will be ignored, since CharArraySet directly controls case sensitivity. If stopWords is not an instance of CharArraySet, a new CharArraySet will be constructed and ignoreCase will be used to specify the case sensitivity of that set.
Parameters:
matchVersion - Lucene version to enable correct Unicode 4.0 behavior in the stop set if Version > 3.0. See above for details.
input - Input TokenStream
stopWords - A Set of Strings or char[] or any other toString()-able set representing the stopwords
ignoreCase - if true, all words are lower cased first
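The rule described for the two ignoreCase constructors (reuse a set that already controls its own case sensitivity, otherwise build one from ignoreCase) can be sketched in plain Java. This is a model under stated assumptions, not Lucene code: CaseAwareSet is a hypothetical stand-in for CharArraySet, resolve is a hypothetical helper, and only String elements are handled.

```java
import java.util.AbstractSet;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Locale;
import java.util.Set;

/** Plain-Java model of how the constructor picks its stop set; not Lucene code. */
final class StopSetResolutionSketch {

    /** Hypothetical stand-in for CharArraySet: the set itself fixes case sensitivity. */
    static final class CaseAwareSet extends AbstractSet<Object> {
        private final Set<String> words = new HashSet<>();
        private final boolean ignoreCase;

        CaseAwareSet(Collection<?> source, boolean ignoreCase) {
            this.ignoreCase = ignoreCase;
            for (Object o : source) {
                words.add(normalize(o.toString()));
            }
        }

        private String normalize(String s) {
            return ignoreCase ? s.toLowerCase(Locale.ROOT) : s;
        }

        @Override
        public boolean contains(Object o) {
            return o != null && words.contains(normalize(o.toString()));
        }

        @Override
        public Iterator<Object> iterator() {
            return new ArrayList<Object>(words).iterator();
        }

        @Override
        public int size() {
            return words.size();
        }
    }

    /** Reuses a CaseAwareSet as-is; otherwise builds one using ignoreCase. */
    static Set<?> resolve(Set<?> stopWords, boolean ignoreCase) {
        if (stopWords instanceof CaseAwareSet) {
            return stopWords; // the ignoreCase argument has no effect here
        }
        return new CaseAwareSet(stopWords, ignoreCase);
    }

    public static void main(String[] args) {
        Set<Object> plain = new HashSet<Object>(Arrays.asList("The", "AND"));
        Set<?> folded = resolve(plain, true);
        Set<?> reused = resolve(folded, false);
        System.out.println(folded.contains("the")); // true
        System.out.println(reused == folded);       // true
    }
}
```

The practical consequence is that a caller who passes an already-built set cannot change its case handling through the ignoreCase flag; the flag matters only when a new set has to be created.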
-
StopFilter
@Deprecated public StopFilter(boolean enablePositionIncrements, TokenStream in, Set<?> stopWords)
Deprecated. Use StopFilter(Version, TokenStream, Set) instead.
Constructs a filter which removes words from the input TokenStream that are named in the Set.
Parameters:
enablePositionIncrements - true if token positions should record the removed stop words
in - Input stream
stopWords - A Set of Strings or char[] or any other toString()-able set representing the stopwords
See Also:
makeStopSet(Version, java.lang.String[])
-
StopFilter
public StopFilter(Version matchVersion, TokenStream in, Set<?> stopWords)
Constructs a filter which removes words from the input TokenStream that are named in the Set.
Parameters:
matchVersion - Lucene version to enable correct Unicode 4.0 behavior in the stop set if Version > 3.0. See above for details.
in - Input stream
stopWords - A Set of Strings or char[] or any other toString()-able set representing the stopwords
See Also:
makeStopSet(Version, java.lang.String[])
-
-
Method Detail
-
makeStopSet
@Deprecated public static final Set<Object> makeStopSet(String... stopWords)
Deprecated. Use makeStopSet(Version, String...) instead.
Builds a Set from an array of stop words, appropriate for passing into the StopFilter constructor. This allows the stop word set to be built once and cached when an Analyzer is constructed.
See Also:
passing false to ignoreCase
-
makeStopSet
public static final Set<Object> makeStopSet(Version matchVersion, String... stopWords)
Builds a Set from an array of stop words, appropriate for passing into the StopFilter constructor. This allows the stop word set to be built once and cached when an Analyzer is constructed.
Parameters:
matchVersion - Lucene version to enable correct Unicode 4.0 behavior in the returned set if Version > 3.0
stopWords - An array of stopwords
See Also:
passing false to ignoreCase
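The caching point can be illustrated with a small plain-Java sketch: the stop set is built once when an analyzer-like object is constructed and then reused for every input. Nothing here is Lucene API. CachedStopSetSketch, buildStopSet, and TinyAnalyzer are hypothetical names, whitespace splitting stands in for a real tokenizer, and matching is case-sensitive (the equivalent of passing false to ignoreCase).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Plain-Java model of building a stop set once and reusing it; not Lucene code. */
final class CachedStopSetSketch {

    /** Hypothetical helper modeled on a varargs stop-set factory (case-sensitive). */
    static Set<String> buildStopSet(String... stopWords) {
        return Collections.unmodifiableSet(new HashSet<>(Arrays.asList(stopWords)));
    }

    /** Hypothetical analyzer-like holder: the set is built once in the constructor. */
    static final class TinyAnalyzer {
        private final Set<String> stopSet;

        TinyAnalyzer(String... stopWords) {
            this.stopSet = buildStopSet(stopWords);
        }

        /** Splits on whitespace and drops stop words, reusing the cached set. */
        List<String> analyze(String text) {
            List<String> out = new ArrayList<>();
            for (String term : text.split("\\s+")) {
                if (!term.isEmpty() && !stopSet.contains(term)) {
                    out.add(term);
                }
            }
            return out;
        }
    }

    public static void main(String[] args) {
        TinyAnalyzer analyzer = new TinyAnalyzer("the", "a");
        System.out.println(analyzer.analyze("the quick fox")); // [quick, fox]
        System.out.println(analyzer.analyze("The quick"));     // [The, quick]
    }
}
```

Because the set is immutable and built up front, repeated calls to analyze pay only for lookups, not for rebuilding the set.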
-
makeStopSet
@Deprecated public static final Set<Object> makeStopSet(List<?> stopWords)
Deprecated. Use makeStopSet(Version, List) instead.
Builds a Set from a list of stop words, appropriate for passing into the StopFilter constructor. This allows the stop word set to be built once and cached when an Analyzer is constructed.
Parameters:
stopWords - A List of Strings or char[] or any other toString()-able list representing the stopwords
Returns:
A Set (CharArraySet) containing the words
See Also:
passing false to ignoreCase
-
makeStopSet
public static final Set<Object> makeStopSet(Version matchVersion, List<?> stopWords)
Builds a Set from a list of stop words, appropriate for passing into the StopFilter constructor. This allows the stop word set to be built once and cached when an Analyzer is constructed.
Parameters:
matchVersion - Lucene version to enable correct Unicode 4.0 behavior in the returned set if Version > 3.0
stopWords - A List of Strings or char[] or any other toString()-able list representing the stopwords
Returns:
A Set (CharArraySet) containing the words
See Also:
passing false to ignoreCase
-
makeStopSet
@Deprecated public static final Set<Object> makeStopSet(String[] stopWords, boolean ignoreCase)
Deprecated. Use makeStopSet(Version, String[], boolean) instead.
Creates a stopword set from the given stopword array.
Parameters:
stopWords - An array of stopwords
ignoreCase - If true, all words are lower cased first.
Returns:
a Set containing the words
-
makeStopSet
public static final Set<Object> makeStopSet(Version matchVersion, String[] stopWords, boolean ignoreCase)
Creates a stopword set from the given stopword array.
Parameters:
matchVersion - Lucene version to enable correct Unicode 4.0 behavior in the returned set if Version > 3.0
stopWords - An array of stopwords
ignoreCase - If true, all words are lower cased first.
Returns:
a Set containing the words
-
makeStopSet
@Deprecated public static final Set<Object> makeStopSet(List<?> stopWords, boolean ignoreCase)
Deprecated. Use makeStopSet(Version, List, boolean) instead.
Creates a stopword set from the given stopword list.
Parameters:
stopWords - A List of Strings or char[] or any other toString()-able list representing the stopwords
ignoreCase - if true, all words are lower cased first
Returns:
A Set (CharArraySet) containing the words
-
makeStopSet
public static final Set<Object> makeStopSet(Version matchVersion, List<?> stopWords, boolean ignoreCase)
Creates a stopword set from the given stopword list.
Parameters:
matchVersion - Lucene version to enable correct Unicode 4.0 behavior in the returned set if Version > 3.0
stopWords - A List of Strings or char[] or any other toString()-able list representing the stopwords
ignoreCase - if true, all words are lower cased first
Returns:
A Set (CharArraySet) containing the words
-
accept
protected boolean accept() throws IOException
Returns true if the current token's term is not a stop word, so that the token is kept in the stream.
Specified by:
accept in class FilteringTokenFilter
Throws:
IOException
-
getEnablePositionIncrementsVersionDefault
@Deprecated public static boolean getEnablePositionIncrementsVersionDefault(Version matchVersion)
Deprecated. Use StopFilter(Version, TokenStream, Set) instead.
Returns the version-dependent default for enablePositionIncrements. Analyzers that embed StopFilter use this method when creating the StopFilter. Prior to 2.9, this returns false. On 2.9 or later, it returns true.
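The version rule stated above amounts to a single comparison. The sketch below models it in plain Java with the version given as two integers; this representation and the names PositionIncrementDefaultSketch and defaultFor are assumptions made for illustration, not the Lucene Version type or the method's actual implementation.

```java
/** Plain-Java model of the version-dependent default; not Lucene code. */
final class PositionIncrementDefaultSketch {

    /**
     * Hypothetical helper: returns false before 2.9 and true on 2.9 or later.
     * The version is modeled as major and minor integers.
     */
    static boolean defaultFor(int major, int minor) {
        return major > 2 || (major == 2 && minor >= 9);
    }

    public static void main(String[] args) {
        System.out.println(defaultFor(2, 4)); // false
        System.out.println(defaultFor(2, 9)); // true
        System.out.println(defaultFor(3, 1)); // true
    }
}
```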
-
-