org.apache.lucene.analysis.shingle
Class ShingleAnalyzerWrapper

java.lang.Object
  extended by org.apache.lucene.analysis.Analyzer
      extended by org.apache.lucene.analysis.shingle.ShingleAnalyzerWrapper

public class ShingleAnalyzerWrapper
extends Analyzer

A ShingleAnalyzerWrapper wraps a ShingleFilter around another analyzer. A shingle is another name for a token-based n-gram.
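
To make the term concrete, here is a minimal sketch of what a token-based n-gram is, written in plain Java with no Lucene dependency. The class ShingleSketch and its shingles method are hypothetical names introduced for illustration only; they are not part of Lucene and do not reproduce ShingleFilter's actual token stream (position increments, offsets, and similar details are not modeled).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, NOT part of Lucene: illustrates token-based n-grams.
class ShingleSketch {

    // Returns every run of `size` adjacent tokens, joined by a single space.
    static List<String> shingles(List<String> tokens, int size) {
        List<String> out = new ArrayList<String>();
        for (int start = 0; start + size <= tokens.size(); start++) {
            out.add(String.join(" ", tokens.subList(start, start + size)));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("please", "divide", "this", "sentence");
        // prints [please divide, divide this, this sentence]
        System.out.println(shingles(tokens, 2));
    }
}
```

In the sketch, the tokens are assumed to have already been produced by some analyzer; the wrapper described here plays that role by delegating to the wrapped analyzer first.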


Field Summary
protected  Analyzer defaultAnalyzer
           
protected  int maxShingleSize
           
protected  boolean outputUnigrams
           
 
Constructor Summary
ShingleAnalyzerWrapper()
          Wraps StandardAnalyzer.
ShingleAnalyzerWrapper(Analyzer defaultAnalyzer)
           
ShingleAnalyzerWrapper(Analyzer defaultAnalyzer, int maxShingleSize)
           
ShingleAnalyzerWrapper(int nGramSize)
           
 
Method Summary
 int getMaxShingleSize()
          The maximum shingle (n-gram) size.
 boolean isOutputUnigrams()
           
 void setMaxShingleSize(int maxShingleSize)
          Set the maximum size of output shingles
 void setOutputUnigrams(boolean outputUnigrams)
          Sets whether the filter passes the original tokens (the "unigrams") to the output stream.
 TokenStream tokenStream(String fieldName, Reader reader)
          Creates a TokenStream which tokenizes all the text in the provided Reader.
 
Methods inherited from class org.apache.lucene.analysis.Analyzer
getPositionIncrementGap, getPreviousTokenStream, reusableTokenStream, setPreviousTokenStream
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

defaultAnalyzer

protected Analyzer defaultAnalyzer

maxShingleSize

protected int maxShingleSize

outputUnigrams

protected boolean outputUnigrams

Constructor Detail

ShingleAnalyzerWrapper

public ShingleAnalyzerWrapper(Analyzer defaultAnalyzer)

ShingleAnalyzerWrapper

public ShingleAnalyzerWrapper(Analyzer defaultAnalyzer,
                              int maxShingleSize)

ShingleAnalyzerWrapper

public ShingleAnalyzerWrapper()
Wraps StandardAnalyzer.


ShingleAnalyzerWrapper

public ShingleAnalyzerWrapper(int nGramSize)

Method Detail

getMaxShingleSize

public int getMaxShingleSize()
Returns the maximum shingle (n-gram) size.

Returns:
the maximum shingle (n-gram) size

setMaxShingleSize

public void setMaxShingleSize(int maxShingleSize)
Set the maximum size of output shingles

Parameters:
maxShingleSize - max shingle size

isOutputUnigrams

public boolean isOutputUnigrams()

setOutputUnigrams

public void setOutputUnigrams(boolean outputUnigrams)
Sets whether the filter passes the original tokens (the "unigrams") to the output stream.

Parameters:
outputUnigrams - Whether or not the filter shall pass the original tokens to the output stream
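
The combined effect of the two settings, maxShingleSize and outputUnigrams, can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the Lucene implementation: ShingleOptionsSketch is a hypothetical class, the sketch assumes shingles of every size from 2 up to the maximum are produced, and the order in which the real ShingleFilter emits tokens may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, NOT part of Lucene: models the two settings on a token list.
class ShingleOptionsSketch {

    // Emits, for each start position, shingles of size 2..maxShingleSize,
    // preceded by the single token itself when outputUnigrams is true.
    static List<String> shingles(List<String> tokens, int maxShingleSize, boolean outputUnigrams) {
        List<String> out = new ArrayList<String>();
        int minSize = outputUnigrams ? 1 : 2;
        for (int start = 0; start < tokens.size(); start++) {
            for (int size = minSize;
                 size <= maxShingleSize && start + size <= tokens.size();
                 size++) {
                out.add(String.join(" ", tokens.subList(start, start + size)));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("a", "b", "c");
        // prints [a, a b, b, b c, c]
        System.out.println(shingles(tokens, 2, true));
        // prints [a b, b c]
        System.out.println(shingles(tokens, 2, false));
    }
}
```

With unigrams enabled the original tokens remain searchable on their own; with unigrams disabled only the multi-token shingles survive in the sketch's output.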

tokenStream

public TokenStream tokenStream(String fieldName,
                               Reader reader)
Description copied from class: Analyzer
Creates a TokenStream which tokenizes all the text in the provided Reader. Implementations must be able to handle a null field name for backward compatibility.

Specified by:
tokenStream in class Analyzer


Copyright © 2000-2008 Apache Software Foundation. All Rights Reserved.