public class TFTermPruningPolicy extends TermPruningPolicy
A larger threshold value produces a smaller index.
See TermPruningPolicy for size vs performance considerations.
This implementation uses simple term frequency thresholds to remove all postings from documents where a given term occurs rarely (i.e. its TF in a document is smaller than the threshold).
Threshold values in this method are expressed as absolute term frequencies.
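The thresholding rule described above can be illustrated with a small standalone sketch. It does not use Lucene and is not the class's actual source: the class name `TfThresholdSketch`, both helper methods, and in particular the key convention for the `thresholds` map (a `"field:term"` entry checked first, then a bare field name, then the default) are assumptions made for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

/** Standalone sketch of absolute term-frequency thresholding; not the Lucene source. */
public class TfThresholdSketch {

    /**
     * Picks the threshold that applies to a term.
     * ASSUMPTION: keys are either "field:term" or a bare field name,
     * looked up in that order before falling back to the default.
     */
    public static int resolveThreshold(Map<String, Integer> thresholds, int defThreshold,
                                       String field, String termText) {
        Integer perTerm = thresholds.get(field + ":" + termText);
        if (perTerm != null) {
            return perTerm;
        }
        Integer perField = thresholds.get(field);
        return perField != null ? perField : defThreshold;
    }

    /** Postings for a term in a document are dropped when the TF is below the threshold. */
    public static boolean shouldPrune(int termFreqInDoc, int threshold) {
        return termFreqInDoc < threshold;
    }

    public static void main(String[] args) {
        Map<String, Integer> thresholds = new HashMap<>();
        thresholds.put("body", 2);         // hypothetical field-wide threshold
        thresholds.put("body:lucene", 4);  // hypothetical stricter threshold for one term

        int thr = resolveThreshold(thresholds, 1, "body", "lucene");
        System.out.println(shouldPrune(3, thr)); // prints "true": 3 < 4
        System.out.println(shouldPrune(4, thr)); // prints "false": 4 is not below 4
    }
}
```

Because thresholds are absolute frequencies, the comparison is a plain integer test with no normalization by document length.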
| Modifier and Type | Field | Description |
|---|---|---|
| `protected int` | `curThr` | |
| `protected int` | `defThreshold` | |
| `protected Map<String,Integer>` | `thresholds` | |


Inherited fields: DEL_ALL, DEL_PAYLOADS, DEL_POSTINGS, DEL_STORED, DEL_VECTOR, fieldFlags, in

| Constructor | Description |
|---|---|
| `TFTermPruningPolicy(IndexReader in, Map<String,Integer> fieldFlags, Map<String,Integer> thresholds, int defThreshold)` | |

| Modifier and Type | Method | Description |
|---|---|---|
| `void` | `initPositionsTerm(TermPositions in, Term t)` | Called when moving TermPositions to a new Term. |
| `boolean` | `pruneAllPositions(TermPositions termPositions, Term t)` | Prune all postings per term (invoked once per term per doc). |
| `int` | `pruneSomePositions(int docNum, int[] positions, Term curTerm)` | Prune some postings per term (invoked once per term per doc). |
| `boolean` | `pruneTermEnum(TermEnum te)` | Pruning of all postings for a term (invoked once per term). |
| `int` | `pruneTermVectorTerms(int docNumber, String field, String[] terms, int[] freqs, TermFreqVector tfv)` | Pruning of individual terms in term vectors. |

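As a rough illustration of how a frequency threshold might be applied to the parallel `terms` and `freqs` arrays that `pruneTermVectorTerms` receives, here is a standalone sketch that does not depend on Lucene. Everything in it is an assumption rather than the documented behavior: the class and method names are hypothetical, the `thresholds` keys are assumed to be either a field name or `"field:term"`, pruned entries are assumed to be marked by setting them to `null`, and the return value is assumed to be the number of pruned terms.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Standalone sketch of threshold-based term vector pruning; not the Lucene source. */
public class TermVectorPruneSketch {

    /**
     * Marks terms whose frequency is below the applicable threshold.
     * ASSUMPTIONS: pruned terms are set to null in place, and the
     * return value is the count of pruned terms.
     */
    public static int pruneTerms(String field, String[] terms, int[] freqs,
                                 Map<String, Integer> thresholds, int defThreshold) {
        // Field-wide threshold, or the default when the field has no entry.
        int fieldThr = thresholds.getOrDefault(field, defThreshold);
        int removed = 0;
        for (int i = 0; i < terms.length; i++) {
            // A per-term entry (assumed key "field:term") overrides the field threshold.
            int thr = thresholds.getOrDefault(field + ":" + terms[i], fieldThr);
            if (freqs[i] < thr) {
                terms[i] = null;
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        String[] terms = {"apache", "lucene", "index"};
        int[] freqs = {1, 5, 2};

        Map<String, Integer> thresholds = new HashMap<>();
        thresholds.put("body", 2);        // hypothetical field-wide threshold
        thresholds.put("body:index", 3);  // hypothetical per-term threshold

        int removed = pruneTerms("body", terms, freqs, thresholds, 1);
        System.out.println(removed);                 // prints "2"
        System.out.println(Arrays.toString(terms));  // prints "[null, lucene, null]"
    }
}
```
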
Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from class TermPruningPolicy: pruneAllFieldPostings, prunePayload, pruneWholeTermVector

public boolean pruneTermEnum(TermEnum te) throws IOException

Description copied from class: TermPruningPolicy
Pruning of all postings for a term (invoked once per term).
Specified by: pruneTermEnum in class TermPruningPolicy
Parameters:
te - positioned term enum.
Throws: IOException

public void initPositionsTerm(TermPositions in, Term t) throws IOException

Description copied from class: TermPruningPolicy
Called when moving TermPositions to a new Term.
Specified by: initPositionsTerm in class TermPruningPolicy
Parameters:
in - input term positions
t - current term
Throws: IOException

public boolean pruneAllPositions(TermPositions termPositions, Term t) throws IOException

Description copied from class: TermPruningPolicy
Prune all postings per term (invoked once per term per doc).
Specified by: pruneAllPositions in class TermPruningPolicy
Parameters:
termPositions - positioned term positions. Implementations MUST NOT advance this by calling TermPositions methods that advance either the position pointer (next, skipTo) or term pointer (seek).
t - current term
Throws: IOException

public int pruneTermVectorTerms(int docNumber, String field, String[] terms, int[] freqs, TermFreqVector tfv) throws IOException

Description copied from class: TermPruningPolicy
Pruning of individual terms in term vectors.
Specified by: pruneTermVectorTerms in class TermPruningPolicy
Parameters:
docNumber - document number
field - field name
terms - array of terms
freqs - array of term frequencies
tfv - the original term frequency vector
Throws: IOException

public int pruneSomePositions(int docNum, int[] positions, Term curTerm)

Description copied from class: TermPruningPolicy
Prune some postings per term (invoked once per term per doc).
Specified by: pruneSomePositions in class TermPruningPolicy
Parameters:
docNum - current document number
positions - original term positions in the document (and indirectly term frequency)
curTerm - current term

Copyright © 2000-2018 Apache Software Foundation. All Rights Reserved.