Class NumericUtils
- java.lang.Object
-
- org.apache.lucene.util.NumericUtils
-
public final class NumericUtils extends Object
This is a helper class that generates prefix-encoded representations of numerical values and supplies converters that represent float/double values as sortable integers/longs.

To execute range queries quickly in Apache Lucene, a range is divided recursively into multiple intervals for searching: the center of the range is searched only with the lowest possible precision in the trie, while the boundaries are matched more exactly. This reduces the number of terms dramatically.

This class generates the terms needed to achieve this. First, the numerical integer values are converted to strings: integer values (32 bit or 64 bit) are made unsigned and the bits are converted to ASCII chars, 7 bits per char. The resulting string sorts in the same order as the original integer value. Each value is also prefixed (in the first char) by the shift value (number of bits removed) used during encoding.

To also index floating point numbers, this class supplies two methods that convert them to integer values by changing their bit layout: doubleToSortableLong(double) and floatToSortableInt(float). No precision is lost by converting floating point numbers to integers and back; the integer form is, however, not usable as a numeric value in its own right. Other data types such as dates can easily be converted to longs or ints (e.g. date to long: Date.getTime()).

For easy usage, the trie algorithm is implemented for indexing inside NumericTokenStream, which can index int, long, float, and double. For querying, NumericRangeQuery and NumericRangeFilter implement the query part for the same data types.

This class can also be used to generate lexicographically sortable (according to String.compareTo(String)) representations of numeric data types for other usages (e.g. sorting).
- Since:
- 2.9
- NOTE: This API is for internal purposes only and might change in incompatible ways in the next release.
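The encoding described above can be illustrated with a small, self-contained sketch. It does not call Lucene; the class name PrefixCodedSketch, the method name encode, and the value 0x20 used as the shift offset are assumptions made for illustration (the real offset is SHIFT_START_LONG, listed under Constant Field Values).

```java
final class PrefixCodedSketch {
    // Assumption: stands in for SHIFT_START_LONG; check Constant Field Values for the real value.
    static final char SHIFT_START = 0x20;

    /** Encodes val after dropping the lowest shift bits, 7 payload bits per char. */
    static String encode(long val, int shift) {
        if (shift < 0 || shift > 63) {
            throw new IllegalArgumentException("shift must be 0..63");
        }
        int nChars = (63 - shift) / 7 + 1;            // payload chars needed for the remaining bits
        char[] buf = new char[nChars + 1];
        buf[0] = (char) (SHIFT_START + shift);        // first char records the shift
        // Flipping the sign bit makes the value unsigned, so bit order equals numeric order.
        long bits = (val ^ 0x8000000000000000L) >>> shift;
        for (int i = nChars; i >= 1; i--) {           // fill right to left, 7 bits at a time
            buf[i] = (char) (bits & 0x7f);
            bits >>>= 7;
        }
        return new String(buf);
    }

    public static void main(String[] args) {
        // Lexicographic order of the encoded terms follows numeric order.
        System.out.println(encode(-5L, 0).compareTo(encode(3L, 0)) < 0);  // true
        // At shift 4, the values 16..31 collapse into one lower-precision term.
        System.out.println(encode(16L, 4).equals(encode(31L, 4)));        // true
    }
}
```

The second check is what makes the trie useful: one lower-precision term matches a whole block of neighbouring values, so the middle of a range needs far fewer terms than its edges.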
-
-
Nested Class Summary
Nested Classes: Modifier and Type, Class, Description
- static class NumericUtils.IntRangeBuilder
  Expert: Callback for splitIntRange(org.apache.lucene.util.NumericUtils.IntRangeBuilder, int, int, int).
- static class NumericUtils.LongRangeBuilder
  Expert: Callback for splitLongRange(org.apache.lucene.util.NumericUtils.LongRangeBuilder, int, long, long).
-
Field Summary
Fields: Modifier and Type, Field, Description
- static int BUF_SIZE_INT
  Expert: The maximum term length (used for char[] buffer size) for encoding int values.
- static int BUF_SIZE_LONG
  Expert: The maximum term length (used for char[] buffer size) for encoding long values.
- static int PRECISION_STEP_DEFAULT
  The default precision step used by NumericField, NumericTokenStream, NumericRangeQuery, and NumericRangeFilter.
- static char SHIFT_START_INT
  Expert: Integers are stored at lower precision by shifting off lower bits.
- static char SHIFT_START_LONG
  Expert: Longs are stored at lower precision by shifting off lower bits.
-
Method Summary
Methods: Modifier and Type, Method, Description
- static String doubleToPrefixCoded(double val)
- static long doubleToSortableLong(double val)
  Converts a double value to a sortable signed long.
- static String floatToPrefixCoded(float val)
- static int floatToSortableInt(float val)
  Converts a float value to a sortable signed int.
- static String intToPrefixCoded(int val)
- static String intToPrefixCoded(int val, int shift)
- static int intToPrefixCoded(int val, int shift, char[] buffer)
  Expert: Returns prefix coded bits after reducing the precision by shift bits.
- static String longToPrefixCoded(long val)
- static String longToPrefixCoded(long val, int shift)
- static int longToPrefixCoded(long val, int shift, char[] buffer)
  Expert: Returns prefix coded bits after reducing the precision by shift bits.
- static double prefixCodedToDouble(String val)
- static float prefixCodedToFloat(String val)
- static int prefixCodedToInt(String prefixCoded)
- static long prefixCodedToLong(String prefixCoded)
- static float sortableIntToFloat(int val)
  Converts a sortable int back to a float.
- static double sortableLongToDouble(long val)
  Converts a sortable long back to a double.
- static void splitIntRange(NumericUtils.IntRangeBuilder builder, int precisionStep, int minBound, int maxBound)
  Expert: Splits an int range recursively.
- static void splitLongRange(NumericUtils.LongRangeBuilder builder, int precisionStep, long minBound, long maxBound)
  Expert: Splits a long range recursively.
-
-
-
Field Detail
-
PRECISION_STEP_DEFAULT
public static final int PRECISION_STEP_DEFAULT
The default precision step used by NumericField, NumericTokenStream, NumericRangeQuery, and NumericRangeFilter.
- See Also:
- Constant Field Values
-
SHIFT_START_LONG
public static final char SHIFT_START_LONG
Expert: Longs are stored at lower precision by shifting off lower bits. The shift count is stored as SHIFT_START_LONG+shift in the first character.
- See Also:
- Constant Field Values
-
BUF_SIZE_LONG
public static final int BUF_SIZE_LONG
Expert: The maximum term length (used for char[] buffer size) for encoding long values.
-
SHIFT_START_INT
public static final char SHIFT_START_INT
Expert: Integers are stored at lower precision by shifting off lower bits. The shift count is stored as SHIFT_START_INT+shift in the first character.
- See Also:
- Constant Field Values
-
BUF_SIZE_INT
public static final int BUF_SIZE_INT
Expert: The maximum term length (used for char[] buffer size) for encoding int values.
-
-
Method Detail
-
longToPrefixCoded
public static int longToPrefixCoded(long val, int shift, char[] buffer)
Expert: Returns prefix coded bits after reducing the precision by shift bits. This method is used by NumericTokenStream.
- Parameters:
val - the numeric value
shift - how many bits to strip from the right
buffer - the array that will contain the encoded chars; its length must be at least BUF_SIZE_LONG
- Returns:
- number of chars written to buffer
-
longToPrefixCoded
public static String longToPrefixCoded(long val, int shift)
-
longToPrefixCoded
public static String longToPrefixCoded(long val)
-
intToPrefixCoded
public static int intToPrefixCoded(int val, int shift, char[] buffer)
Expert: Returns prefix coded bits after reducing the precision by shift bits. This method is used by NumericTokenStream.
- Parameters:
val - the numeric value
shift - how many bits to strip from the right
buffer - the array that will contain the encoded chars; its length must be at least BUF_SIZE_INT
- Returns:
- number of chars written to buffer
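To make the buffer contract concrete, here is a hedged, Lucene-free sketch of a buffer-based int encoder together with a matching decoder. The names IntPrefixBufferSketch, encode, and decode are hypothetical, the offset 0x60 is an assumed stand-in for SHIFT_START_INT, and the buffer size of 6 is derived from the description (32 bits at 7 bits per char gives 5 payload chars, plus 1 shift char), not read from the library.

```java
final class IntPrefixBufferSketch {
    // Assumption: stands in for SHIFT_START_INT; check Constant Field Values for the real value.
    static final char SHIFT_START = 0x60;
    // 5 payload chars (32 bits / 7 bits per char, rounded up) + 1 shift char.
    static final int BUF_SIZE = 31 / 7 + 2;

    /** Writes the encoded chars into buffer and returns how many chars were written. */
    static int encode(int val, int shift, char[] buffer) {
        if (shift < 0 || shift > 31) {
            throw new IllegalArgumentException("shift must be 0..31");
        }
        int nChars = (31 - shift) / 7 + 1;
        int len = nChars + 1;
        buffer[0] = (char) (SHIFT_START + shift);
        int bits = (val ^ 0x80000000) >>> shift;   // make unsigned, then strip low bits
        while (nChars >= 1) {
            buffer[nChars--] = (char) (bits & 0x7f);
            bits >>>= 7;
        }
        return len;
    }

    /** Reverses encode; bits stripped by the shift come back as zeros. */
    static int decode(char[] buffer, int len) {
        int shift = buffer[0] - SHIFT_START;
        int bits = 0;
        for (int i = 1; i < len; i++) {
            bits = (bits << 7) | buffer[i];
        }
        return (bits << shift) ^ 0x80000000;
    }

    public static void main(String[] args) {
        char[] buf = new char[BUF_SIZE];
        int len = encode(-12345, 0, buf);
        System.out.println(len + " chars, decoded: " + decode(buf, len));
    }
}
```

Reusing one caller-supplied buffer avoids allocating a String per term, which is presumably why the token stream uses this variant; a larger shift yields a shorter term because fewer payload bits remain.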
-
intToPrefixCoded
public static String intToPrefixCoded(int val, int shift)
-
intToPrefixCoded
public static String intToPrefixCoded(int val)
-
prefixCodedToLong
public static long prefixCodedToLong(String prefixCoded)
-
prefixCodedToInt
public static int prefixCodedToInt(String prefixCoded)
-
doubleToSortableLong
public static long doubleToSortableLong(double val)
Converts a double value to a sortable signed long. The value is converted by getting its IEEE 754 floating-point "double format" bit layout and then swapping some bits so that the result can be compared as a long. Precision is not reduced by this conversion, yet the value can easily be used as a long. The sort order (including Double.NaN) is defined by Double.compareTo(java.lang.Double); NaN is greater than positive infinity.
- See Also:
sortableLongToDouble(long)
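One way such a conversion can work is sketched below. This is an illustration under assumptions, not necessarily the library's exact code: the class and method names are hypothetical, and the sketch uses Double.doubleToLongBits so that every NaN collapses to the canonical NaN bit pattern.

```java
final class SortableDoubleSketch {
    /** Maps a double to a long whose signed order matches Double.compare. */
    static long toSortableLong(double val) {
        long bits = Double.doubleToLongBits(val);
        // Negative doubles have the sign bit set, and a larger magnitude means a larger
        // bit pattern. Flipping the 63 non-sign bits reverses their order.
        if (bits < 0) {
            bits ^= 0x7fffffffffffffffL;
        }
        return bits;
    }

    public static void main(String[] args) {
        double[] ascending = {
            Double.NEGATIVE_INFINITY, -2.5, -0.0, 0.0, 1.0e-300, 7.0,
            Double.POSITIVE_INFINITY, Double.NaN
        };
        for (int i = 1; i < ascending.length; i++) {
            boolean ordered = toSortableLong(ascending[i - 1]) < toSortableLong(ascending[i]);
            System.out.println(ascending[i - 1] + " < " + ascending[i] + " : " + ordered);
        }
    }
}
```

Non-negative doubles already compare correctly as raw bit patterns, so only the negative half needs adjusting; NaN ends up above positive infinity because its canonical bit pattern is larger.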
-
doubleToPrefixCoded
public static String doubleToPrefixCoded(double val)
-
sortableLongToDouble
public static double sortableLongToDouble(long val)
Converts a sortable long back to a double.
- See Also:
doubleToSortableLong(double)
-
prefixCodedToDouble
public static double prefixCodedToDouble(String val)
-
floatToSortableInt
public static int floatToSortableInt(float val)
Converts a float value to a sortable signed int. The value is converted by getting its IEEE 754 floating-point "float format" bit layout and then swapping some bits so that the result can be compared as an int. Precision is not reduced by this conversion, yet the value can easily be used as an int. The sort order (including Float.NaN) is defined by Float.compareTo(java.lang.Float); NaN is greater than positive infinity.
- See Also:
sortableIntToFloat(int)
-
floatToPrefixCoded
public static String floatToPrefixCoded(float val)
-
sortableIntToFloat
public static float sortableIntToFloat(int val)
Converts a sortable int back to a float.
- See Also:
floatToSortableInt(float)
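The round trip between the two float conversions can be sketched as follows. The names are hypothetical and the bit manipulation is one plausible implementation, shown only to illustrate why no precision is lost: the same XOR that produces the sortable form also undoes it, because the sign bit is never changed.

```java
final class SortableFloatSketch {
    /** Maps a float to an int whose signed order matches Float.compare. */
    static int toSortableInt(float val) {
        int bits = Float.floatToIntBits(val);
        if (bits < 0) {
            bits ^= 0x7fffffff;   // reverse the order of negative values
        }
        return bits;
    }

    /** Inverse of toSortableInt: the sign bit is untouched, so the same test applies. */
    static float fromSortableInt(int val) {
        if (val < 0) {
            val ^= 0x7fffffff;
        }
        return Float.intBitsToFloat(val);
    }

    public static void main(String[] args) {
        float[] samples = {-3.25f, -0.0f, 0.0f, 1.5f, Float.MAX_VALUE, Float.NaN};
        for (float f : samples) {
            float back = fromSortableInt(toSortableInt(f));
            // Float.compare treats NaN as equal to NaN and distinguishes -0.0f from 0.0f.
            System.out.println(f + " round-trips: " + (Float.compare(f, back) == 0));
        }
    }
}
```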
-
prefixCodedToFloat
public static float prefixCodedToFloat(String val)
-
splitLongRange
public static void splitLongRange(NumericUtils.LongRangeBuilder builder, int precisionStep, long minBound, long maxBound)
Expert: Splits a long range recursively. You may implement a builder that adds clauses to a BooleanQuery for each call to its NumericUtils.LongRangeBuilder.addRange(String,String) method.
This method is used by NumericRangeQuery.
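The splitting idea can be sketched without Lucene as follows. This is an illustrative reconstruction under assumptions, not the library's code: the class RangeSplitSketch and its split method are hypothetical, and instead of calling a builder's addRange(String,String) with prefix-coded terms it collects plain {min, max, shift} triples.

```java
import java.util.ArrayList;
import java.util.List;

final class RangeSplitSketch {
    /** Returns sub-ranges as {min, max, shift}; together they cover [minBound, maxBound] exactly. */
    static List<long[]> split(int precisionStep, long minBound, long maxBound) {
        if (precisionStep < 1) {
            throw new IllegalArgumentException("precisionStep must be >= 1");
        }
        List<long[]> out = new ArrayList<>();
        if (minBound > maxBound) {
            return out;
        }
        for (int shift = 0; ; shift += precisionStep) {
            long diff = 1L << (shift + precisionStep);
            long mask = ((1L << precisionStep) - 1L) << shift;   // bits handled at this level
            long lowBits = (1L << shift) - 1L;                   // bits already stripped
            boolean hasLower = (minBound & mask) != 0L;          // ragged lower edge
            boolean hasUpper = (maxBound & mask) != mask;        // ragged upper edge
            long nextMin = (hasLower ? minBound + diff : minBound) & ~mask;
            long nextMax = (hasUpper ? maxBound - diff : maxBound) & ~mask;
            boolean wrapped = nextMin < minBound || nextMax > maxBound;
            if (shift + precisionStep >= 64 || nextMin > nextMax || wrapped) {
                // No coarser level is available: emit what is left at this precision.
                out.add(new long[] {minBound, maxBound | lowBits, shift});
                break;
            }
            if (hasLower) {
                out.add(new long[] {minBound, minBound | mask | lowBits, shift});
            }
            if (hasUpper) {
                out.add(new long[] {maxBound & ~mask, maxBound | lowBits, shift});
            }
            minBound = nextMin;   // continue with the aligned center at lower precision
            maxBound = nextMax;
        }
        return out;
    }

    public static void main(String[] args) {
        for (long[] r : split(4, 3L, 300L)) {
            System.out.println("[" + r[0] + ", " + r[1] + "] at shift " + r[2]);
        }
    }
}
```

For the range 3 to 300 with a precision step of 4, the sketch yields two full-precision edge ranges (3 to 15 and 288 to 300) and one center range (16 to 287) at shift 4. The center then needs only 17 lower-precision terms instead of 272 full-precision ones.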
-
splitIntRange
public static void splitIntRange(NumericUtils.IntRangeBuilder builder, int precisionStep, int minBound, int maxBound)
Expert: Splits an int range recursively. You may implement a builder that adds clauses to a BooleanQuery for each call to its NumericUtils.IntRangeBuilder.addRange(String,String) method.
This method is used by NumericRangeQuery.
-
-