Package org.apache.lucene.facet.index
Class CategoryDocumentBuilder
java.lang.Object
  org.apache.lucene.facet.index.CategoryDocumentBuilder
Direct Known Subclasses:
EnhancementsDocumentBuilder
public class CategoryDocumentBuilder extends Object
A utility class which allows attachment of CategoryPaths or CategoryAttributes to a given document using a taxonomy.
Construction can be done with either a given FacetIndexingParams or the default implementation DefaultFacetIndexingParams.
A CategoryDocumentBuilder can be reused by repeatedly setting the categories and building the document. Categories are provided either as CategoryAttribute elements through setCategories(Iterable), or as CategoryPath elements through setCategoryPaths(Iterable).
Note that both setCategories(Iterable) and setCategoryPaths(Iterable) return this CategoryDocumentBuilder, allowing the following pattern: new CategoryDocumentBuilder(taxonomy, params).setCategories(categories).build(doc).
WARNING: This API is experimental and might change in incompatible ways in the next release.
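The reuse-and-chain pattern described above can be illustrated without Lucene on the classpath. The following is only a minimal sketch: SketchBuilder and its List<String> "document" are hypothetical stand-ins, not the Lucene classes. They mirror just the shape of a "set" method that returns this and a build method that returns the document. The sketch also assumes that each "set" call replaces whatever the previous call prepared.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for CategoryDocumentBuilder; NOT the Lucene class.
class SketchBuilder {
    // Categories prepared by the most recent "set" call.
    private final List<String> pending = new ArrayList<>();

    // Assumption: a "set" call replaces the previously prepared categories.
    // Returning this is what makes the one-line chained call possible.
    SketchBuilder setCategoryPaths(Iterable<String> categoryPaths) {
        pending.clear();
        for (String path : categoryPaths) {
            pending.add(path);
        }
        return this;
    }

    // Adds the prepared categories to the given "document" and returns it.
    List<String> build(List<String> doc) {
        doc.addAll(pending);
        return doc;
    }
}

class SketchBuilderDemo {
    public static void main(String[] args) {
        // One builder instance, reused for two documents.
        SketchBuilder builder = new SketchBuilder();
        List<String> doc1 = builder
                .setCategoryPaths(Arrays.asList("Author/Mark Twain"))
                .build(new ArrayList<>());
        List<String> doc2 = builder
                .setCategoryPaths(Arrays.asList("Year/2010"))
                .build(new ArrayList<>());
        System.out.println(doc1);
        System.out.println(doc2);
    }
}
```

In this sketch the second document receives only its own category, because the second "set" call cleared the state left by the first.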
-
-
Field Summary
Fields
Modifier and Type | Field | Description
protected Map<String,List<CategoryAttribute>> | categoriesMap |
protected ArrayList<org.apache.lucene.document.Field> | fieldList | A list of fields which is filled at ancestors' construction and used during build(Document).
protected FacetIndexingParams | indexingParams | Parameters to be used when indexing categories.
protected TaxonomyWriter | taxonomyWriter | A TaxonomyWriter for adding categories and retrieving their ordinals.
-
Constructor Summary
Constructors
Constructor | Description
CategoryDocumentBuilder(TaxonomyWriter taxonomyWriter) | Creating a facets document builder with default facet indexing parameters. See: CategoryDocumentBuilder(TaxonomyWriter, FacetIndexingParams)
CategoryDocumentBuilder(TaxonomyWriter taxonomyWriter, FacetIndexingParams params) | Creating a facets document builder with a given facet indexing parameters object.
-
Method Summary
Modifier and Type | Method | Description
org.apache.lucene.document.Document | build(org.apache.lucene.document.Document doc) | Adds the fields created in one of the "set" methods to the document.
protected void | fillCategoriesMap(Iterable<CategoryAttribute> categories) | Fills the categories mapping between a field name and a list of categories that belongs to it according to this builder's FacetIndexingParams object.
protected CategoryListTokenizer | getCategoryListTokenizer(org.apache.lucene.analysis.TokenStream categoryStream) | Get a category list tokenizer (or a series of such tokenizers) to create the category list tokens.
protected CategoryTokenizer | getCategoryTokenizer(org.apache.lucene.analysis.TokenStream categoryStream) | Get a CategoryTokenizer to create the category tokens.
protected CountingListTokenizer | getCountingListTokenizer(org.apache.lucene.analysis.TokenStream categoryStream) | Get a CountingListTokenizer for creating counting list token.
protected org.apache.lucene.analysis.TokenStream | getParentsStream(CategoryAttributesStream categoryAttributesStream) | Get a stream of categories which includes the parents, according to policies defined in indexing parameters.
CategoryDocumentBuilder | setCategories(Iterable<CategoryAttribute> categories) | Set the categories of the document builder from an Iterable of CategoryAttribute objects.
CategoryDocumentBuilder | setCategoryPaths(Iterable<CategoryPath> categoryPaths) | Set the categories of the document builder from an Iterable of CategoryPath objects.
-
-
-
Field Detail
-
taxonomyWriter
protected final TaxonomyWriter taxonomyWriter
A TaxonomyWriter for adding categories and retrieving their ordinals.
-
indexingParams
protected final FacetIndexingParams indexingParams
Parameters to be used when indexing categories.
-
fieldList
protected final ArrayList<org.apache.lucene.document.Field> fieldList
A list of fields which is filled at ancestors' construction and used during build(Document).
-
categoriesMap
protected Map<String,List<CategoryAttribute>> categoriesMap
-
-
Constructor Detail
-
CategoryDocumentBuilder
public CategoryDocumentBuilder(TaxonomyWriter taxonomyWriter) throws IOException
Creates a facets document builder with default facet indexing parameters.
See: CategoryDocumentBuilder(TaxonomyWriter, FacetIndexingParams)
Parameters:
taxonomyWriter - the taxonomy writer to which new categories will be added, and which translates known categories to ordinals
Throws:
IOException
-
CategoryDocumentBuilder
public CategoryDocumentBuilder(TaxonomyWriter taxonomyWriter, FacetIndexingParams params) throws IOException
Creates a facets document builder with a given facet indexing parameters object.
Parameters:
taxonomyWriter - the taxonomy writer to which new categories will be added, and which translates known categories to ordinals
params - holds all parameters the indexing process should use, such as category-list parameters
Throws:
IOException
-
-
Method Detail
-
setCategoryPaths
public CategoryDocumentBuilder setCategoryPaths(Iterable<CategoryPath> categoryPaths) throws IOException
Set the categories of the document builder from an Iterable of CategoryPath objects.
Parameters:
categoryPaths - An iterable of CategoryPath objects which holds the categories (facets) which will be added to the document at build(Document)
Returns:
This CategoryDocumentBuilder, to enable this one line call: new CategoryDocumentBuilder(TaxonomyWriter).setCategoryPaths(Iterable).build(Document).
Throws:
IOException
-
setCategories
public CategoryDocumentBuilder setCategories(Iterable<CategoryAttribute> categories) throws IOException
Set the categories of the document builder from an Iterable of CategoryAttribute objects.
Parameters:
categories - An iterable of CategoryAttribute objects which holds the categories (facets) which will be added to the document at build(Document)
Returns:
This CategoryDocumentBuilder, to enable this one line call: new CategoryDocumentBuilder(TaxonomyWriter).setCategories(Iterable).build(Document).
Throws:
IOException
-
getParentsStream
protected org.apache.lucene.analysis.TokenStream getParentsStream(CategoryAttributesStream categoryAttributesStream)
Get a stream of categories which includes the parents, according to policies defined in indexing parameters.
Parameters:
categoryAttributesStream - The input stream
Returns:
The parents stream.
See Also:
OrdinalPolicy (for policy of adding category tokens for parents), PathPolicy (for policy of adding category list tokens for parents)
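To make "a stream of categories which includes the parents" concrete, here is a small stand-alone sketch of the underlying idea: expanding one category path into itself plus each of its ancestors. ParentsSketch and withParents are hypothetical names that do not exist in Lucene. The sketch also assumes every ancestor is included, whereas the real method applies the configured OrdinalPolicy and PathPolicy, which may leave some parents out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, NOT part of Lucene: shows only the idea of parent expansion.
class ParentsSketch {
    // Returns the given path preceded by all of its ancestors, root-most first.
    // Assumption: no policy filtering; every ancestor level is emitted.
    static List<List<String>> withParents(List<String> path) {
        List<List<String>> result = new ArrayList<>();
        for (int depth = 1; depth <= path.size(); depth++) {
            // Copy the prefix so later changes to the input do not leak into the result.
            result.add(new ArrayList<>(path.subList(0, depth)));
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [[Date], [Date, 2010], [Date, 2010, March]]
        System.out.println(withParents(Arrays.asList("Date", "2010", "March")));
    }
}
```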
-
fillCategoriesMap
protected void fillCategoriesMap(Iterable<CategoryAttribute> categories) throws IOException
Fills the categories mapping between a field name and a list of categories that belongs to it according to this builder's FacetIndexingParams object.
Parameters:
categories - Iterable over the category attributes
Throws:
IOException
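The mapping described here, from a field name to the list of categories that belong to it, can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the Lucene implementation: CategoriesMapSketch, the String categories, and the fieldNameFor function are hypothetical stand-ins for CategoryAttribute objects and for whatever rule the builder's FacetIndexingParams applies.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch, NOT the Lucene implementation of fillCategoriesMap.
class CategoriesMapSketch {
    // Groups categories by the field name that the supplied rule assigns to each one.
    // fieldNameFor stands in for the decision made by the indexing parameters.
    static Map<String, List<String>> group(Iterable<String> categories,
                                           Function<String, String> fieldNameFor) {
        // LinkedHashMap keeps field names in first-seen order for predictable output.
        Map<String, List<String>> categoriesMap = new LinkedHashMap<>();
        for (String category : categories) {
            String fieldName = fieldNameFor.apply(category);
            categoriesMap.computeIfAbsent(fieldName, k -> new ArrayList<>()).add(category);
        }
        return categoriesMap;
    }
}
```

A possible call, with made-up field names: group(Arrays.asList("Author/Mark Twain", "Year/2010"), c -> c.startsWith("Author/") ? "$authors" : "$other") yields one list per field name, each holding the categories routed to that field.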
-
getCategoryListTokenizer
protected CategoryListTokenizer getCategoryListTokenizer(org.apache.lucene.analysis.TokenStream categoryStream)
Get a category list tokenizer (or a series of such tokenizers) to create the category list tokens.
Parameters:
categoryStream - A stream containing CategoryAttribute with the relevant data.
Returns:
The category list tokenizer (or series of tokenizers) to be used in creating category list tokens.
-
getCountingListTokenizer
protected CountingListTokenizer getCountingListTokenizer(org.apache.lucene.analysis.TokenStream categoryStream)
Get a CountingListTokenizer for creating the counting list token.
Parameters:
categoryStream - A stream containing CategoryAttributes with the relevant data.
Returns:
A counting list tokenizer to be used in creating the counting list token.
-
getCategoryTokenizer
protected CategoryTokenizer getCategoryTokenizer(org.apache.lucene.analysis.TokenStream categoryStream) throws IOException
Get a CategoryTokenizer to create the category tokens. This method can be overridden for adding more attributes to the category tokens.
Parameters:
categoryStream - A stream containing CategoryAttribute with the relevant data.
Returns:
The CategoryTokenizer to be used in creating category tokens.
Throws:
IOException
-
build
public org.apache.lucene.document.Document build(org.apache.lucene.document.Document doc)
Adds the fields created in one of the "set" methods to the document.
-
-