java.lang.Object
  org.apache.lucene.search.Similarity
    org.apache.lucene.search.DefaultSimilarity
      org.apache.lucene.misc.SweetSpotSimilarity
public class SweetSpotSimilarity extends DefaultSimilarity
A similarity with a lengthNorm that provides for a "plateau" of equally good lengths, and tf helper functions.
For lengthNorm, a global min/max can be specified to define the plateau of lengths that should all have a norm of 1.0. Below the min and above the max, the lengthNorm falls off as 1/sqrt of the distance from the plateau.
A per field min/max can be specified if different fields have different sweet spots.
For tf, baselineTf and hyperbolicTf functions are provided, which subclasses can choose between.
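To make the plateau concrete, the lengthNorm formula listed in the method summary can be expanded piecewise. This is only a derivation from that formula, writing x for numTerms and s for steepness, and assuming min ≤ max:

```latex
\[
\mathrm{lengthNorm}(x)
  = \frac{1}{\sqrt{s\,\bigl(|x-\min| + |x-\max| - (\max-\min)\bigr) + 1}}
  = \begin{cases}
      \dfrac{1}{\sqrt{2s\,(\min - x) + 1}} & x < \min \\[2ex]
      1 & \min \le x \le \max \\[1ex]
      \dfrac{1}{\sqrt{2s\,(x - \max) + 1}} & x > \max
    \end{cases}
\]
```

Inside the plateau the two absolute values sum to exactly max-min, so the term under the root is 1. Outside it, the norm depends only on the distance to the nearest edge of the plateau.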
Constructor Summary |
---|
SweetSpotSimilarity() |
Method Summary | |
---|---|
float | baselineTf(float freq) Implemented as: (x <= min) ? base : sqrt(x+(base**2)-min) ...but with a special case check for 0. |
float | hyperbolicTf(float freq) Uses a hyperbolic tangent function that allows for a hard max... |
float | lengthNorm(String fieldName, int numTerms) Implemented as: 1/sqrt( steepness * (abs(x-min) + abs(x-max) - (max-min)) + 1 ). |
void | setBaselineTfFactors(float base, float min) Sets the baseline and minimum function variables for baselineTf. |
void | setHyperbolicTfFactors(float min, float max, double base, float xoffset) Sets the function variables for the hyperbolicTf functions. |
void | setLengthNormFactors(int min, int max, float steepness) Sets the default function variables used by lengthNorm when no field-specific variables have been set. |
void | setLengthNormFactors(String field, int min, int max, float steepness) Sets the function variables used by lengthNorm for a specific named field. |
float | tf(int freq) Delegates to baselineTf. |
Methods inherited from class org.apache.lucene.search.DefaultSimilarity |
---|
coord, idf, queryNorm, sloppyFreq, tf |
Methods inherited from class org.apache.lucene.search.Similarity |
---|
decodeNorm, encodeNorm, getDefault, getNormDecoder, idf, idf, scorePayload, setDefault |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public SweetSpotSimilarity()
Method Detail |
---|
public void setBaselineTfFactors(float base, float min)
Sets the baseline and minimum function variables for baselineTf.
See Also: baselineTf(float)
public void setHyperbolicTfFactors(float min, float max, double base, float xoffset)
Sets the function variables for the hyperbolicTf functions.
Parameters:
min - the minimum tf value to ever be returned (default: 0.0)
max - the maximum tf value to ever be returned (default: 2.0)
base - the base value to be used in the exponential for the hyperbolic function (default: e)
xoffset - the midpoint of the hyperbolic function (default: 10.0)
See Also: hyperbolicTf(float)
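public void setLengthNormFactors(int min, int max, float steepness)
Sets the default function variables used by lengthNorm when no field-specific variables have been set.
See Also: lengthNorm(java.lang.String, int)
public void setLengthNormFactors(String field, int min, int max, float steepness)
Sets the function variables used by lengthNorm for a specific named field.
See Also: lengthNorm(java.lang.String, int)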
public float lengthNorm(String fieldName, int numTerms)
Implemented as: 1/sqrt( steepness * (abs(x-min) + abs(x-max) - (max-min)) + 1 ).
This degrades to 1/sqrt(x) when min and max are both 1 and steepness is 0.5.
TODO: a potential optimization is to return 1.0f immediately if numTerms is between min and max.
Overrides: lengthNorm in class DefaultSimilarity
Parameters:
fieldName - the name of the field
numTerms - the total number of tokens contained in fields named fieldName of doc
See Also: setLengthNormFactors(int, int, float)
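The length-norm formula and the per-field fallback described above can be tried out with a small standalone sketch. This is not the Lucene source: the class name `LengthNormSketch`, the `Factors` holder, and the starting values (min = max = 1, steepness = 0.5) are assumptions made for illustration, and the class has no dependency on Lucene.

```java
import java.util.HashMap;
import java.util.Map;

/** Standalone sketch of the sweet-spot length norm; not the Lucene source. */
final class LengthNormSketch {

    /** Hypothetical holder for one set of min/max/steepness values. */
    private static final class Factors {
        final int min;
        final int max;
        final float steepness;

        Factors(int min, int max, float steepness) {
            this.min = min;
            this.max = max;
            this.steepness = steepness;
        }
    }

    // Assumed starting values: with min = max = 1 and steepness = 0.5
    // the formula reduces to 1/sqrt(x).
    private Factors defaults = new Factors(1, 1, 0.5f);

    // Per-field overrides, consulted before the global defaults.
    private final Map<String, Factors> perField = new HashMap<>();

    void setLengthNormFactors(int min, int max, float steepness) {
        defaults = new Factors(min, max, steepness);
    }

    void setLengthNormFactors(String field, int min, int max, float steepness) {
        perField.put(field, new Factors(min, max, steepness));
    }

    /** 1/sqrt( steepness * (abs(x-min) + abs(x-max) - (max-min)) + 1 ) */
    float lengthNorm(String fieldName, int numTerms) {
        Factors f = perField.getOrDefault(fieldName, defaults);
        double inner = f.steepness
                * (Math.abs(numTerms - f.min) + Math.abs(numTerms - f.max) - (f.max - f.min))
                + 1.0;
        return (float) (1.0 / Math.sqrt(inner));
    }

    public static void main(String[] args) {
        LengthNormSketch sim = new LengthNormSketch();
        sim.setLengthNormFactors(3, 10, 0.5f);          // global plateau: 3..10 terms
        sim.setLengthNormFactors("title", 1, 2, 0.5f);  // shorter plateau for one field
        for (int n : new int[] {1, 3, 5, 10, 12}) {
            System.out.println(n + " terms: body=" + sim.lengthNorm("body", n)
                    + " title=" + sim.lengthNorm("title", n));
        }
    }
}
```

Every length inside a field's plateau yields exactly 1.0, so documents in that range are not penalized relative to each other. Only lengths outside the range are scaled down.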
public float tf(int freq)
Delegates to baselineTf.
Overrides: tf in class Similarity
Parameters:
freq - the frequency of a term within a document
See Also: baselineTf(float)
public float baselineTf(float freq)
Implemented as: (x <= min) ? base : sqrt(x+(base**2)-min)
...but with a special case check for 0.
This degrades to sqrt(x) when min and base are both 0.
See Also: setBaselineTfFactors(float, float)
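As a rough illustration of the baseline formula, the following standalone sketch evaluates it with `base` and `min` passed as arguments. It is not the Lucene source: the class name `BaselineTfSketch` is hypothetical, and the sketch assumes the "special case check for 0" means a frequency of 0 yields a tf of 0, which the text above does not spell out.

```java
/** Standalone sketch of the baseline tf formula; not the Lucene source. */
final class BaselineTfSketch {

    /**
     * (x <= min) ? base : sqrt(x + base^2 - min), with freq == 0 mapped to 0.
     * The zero handling is an assumption of this sketch.
     */
    static float baselineTf(float freq, float base, float min) {
        if (freq == 0.0f) {
            return 0.0f; // assumed meaning of the special case for 0
        }
        if (freq <= min) {
            return base; // flat region: every frequency up to min scores the same
        }
        return (float) Math.sqrt(freq + (base * base) - min);
    }

    public static void main(String[] args) {
        float base = 1.5f;
        float min = 3.0f;
        for (float freq : new float[] {0f, 1f, 3f, 4f, 10f}) {
            System.out.println("freq=" + freq + " tf=" + baselineTf(freq, base, min));
        }
    }
}
```

At x = min the square-root branch evaluates to sqrt(base**2) = base, so the flat region and the curve meet without a jump.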
public float hyperbolicTf(float freq)
Uses a hyperbolic tangent function that allows for a hard max...
tf(x)=min+(max-min)/2*(((base**(x-xoffset)-base**-(x-xoffset))/(base**(x-xoffset)+base**-(x-xoffset)))+1)
This code is provided as a convenience for subclasses that want to use a hyperbolic tf function.
See Also: setHyperbolicTfFactors(float, float, double, float)
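The shape of the hyperbolic curve is easier to see by evaluating the formula directly. The sketch below is standalone and is not the Lucene source: the class name `HyperbolicTfSketch` is hypothetical, the four factors are passed as arguments, and returning `max` when the computation overflows to NaN is an assumption of this sketch.

```java
/** Standalone sketch of the hyperbolic tf formula; not the Lucene source. */
final class HyperbolicTfSketch {

    /**
     * min + (max-min)/2 * ( (b^d - b^-d) / (b^d + b^-d) + 1 ), with d = x - xoffset.
     * The fraction equals tanh(d * ln(b)), so the result stays between min and max.
     */
    static float hyperbolicTf(float freq, float min, float max, double base, float xoffset) {
        double d = freq - xoffset;
        double up = Math.pow(base, d);
        double down = Math.pow(base, -d);
        float result = min + (float) ((max - min) / 2.0f * (((up - down) / (up + down)) + 1.0));
        // Assumption of this sketch: for very large freq, up overflows to
        // infinity and the fraction becomes NaN; treat that as the hard max.
        return Float.isNaN(result) ? max : result;
    }

    public static void main(String[] args) {
        // Factors taken from the documented defaults: min 0.0, max 2.0, base e, xoffset 10.0.
        for (float freq : new float[] {1f, 5f, 10f, 15f, 30f, 1000f}) {
            System.out.println("freq=" + freq + " tf="
                    + hyperbolicTf(freq, 0.0f, 2.0f, Math.E, 10.0f));
        }
    }
}
```

With these factors the curve passes through the midpoint (min+max)/2 at x = xoffset, approaches min for small frequencies, and flattens out at max for large ones.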