org.apache.lucene.analysis
Class LowerCaseTokenizer

java.lang.Object
  extended by org.apache.lucene.analysis.TokenStream
      extended by org.apache.lucene.analysis.Tokenizer
          extended by org.apache.lucene.analysis.CharTokenizer
              extended by org.apache.lucene.analysis.LetterTokenizer
                  extended by org.apache.lucene.analysis.LowerCaseTokenizer

public final class LowerCaseTokenizer
extends LetterTokenizer

LowerCaseTokenizer performs the function of LetterTokenizer and LowerCaseFilter together. It divides text at non-letters and converts the resulting tokens to lower case. Although it is functionally equivalent to chaining LetterTokenizer and LowerCaseFilter, doing both tasks in a single pass has a performance advantage, hence this (redundant) implementation.

Note: this works reasonably well for most European languages, but poorly for some Asian languages, in which words are not separated by spaces.
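
The equivalence described above can be illustrated with a minimal, standalone sketch that uses only the JDK. It is not Lucene's implementation: the class LowerCaseSketch and its methods letterTokens, lowerCaseEach, and lowerCaseTokens are hypothetical names introduced for illustration. The sketch assumes that "letter" means Character.isLetter(char) and that lowercasing means Character.toLowerCase(char), and it compares a two-pass approach (split, then lowercase) with a single pass that does both at once.

```java
import java.util.ArrayList;
import java.util.List;

public class LowerCaseSketch {

    // Two-step approach, step 1: split at non-letters and keep the original case.
    static List<String> letterTokens(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(c);
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    // Two-step approach, step 2: lowercase every token, one char at a time.
    static List<String> lowerCaseEach(List<String> tokens) {
        List<String> result = new ArrayList<>();
        for (String token : tokens) {
            StringBuilder lowered = new StringBuilder(token.length());
            for (int i = 0; i < token.length(); i++) {
                lowered.append(Character.toLowerCase(token.charAt(i)));
            }
            result.add(lowered.toString());
        }
        return result;
    }

    // Single-pass approach: split at non-letters and lowercase while collecting.
    static List<String> lowerCaseTokens(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "Hello, World! It's 2008.";
        System.out.println(lowerCaseTokens(text));              // [hello, world, it, s]
        System.out.println(lowerCaseEach(letterTokens(text)));  // [hello, world, it, s]
    }
}
```

Both calls produce the same tokens; the single-pass version simply avoids a second traversal of the text. Note that the apostrophe and the digits act as separators, so "It's" yields two tokens and "2008" yields none.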


Field Summary
 
Fields inherited from class org.apache.lucene.analysis.Tokenizer
input
 
Constructor Summary
LowerCaseTokenizer(Reader in)
          Constructs a new LowerCaseTokenizer that reads from the given Reader.
 
Method Summary
protected  char normalize(char c)
          Converts the character to lower case using Character.toLowerCase(char).
 
Methods inherited from class org.apache.lucene.analysis.LetterTokenizer
isTokenChar
 
Methods inherited from class org.apache.lucene.analysis.CharTokenizer
next, reset
 
Methods inherited from class org.apache.lucene.analysis.Tokenizer
close
 
Methods inherited from class org.apache.lucene.analysis.TokenStream
next, reset
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

LowerCaseTokenizer

public LowerCaseTokenizer(Reader in)
Constructs a new LowerCaseTokenizer that reads from the given Reader.

Method Detail

normalize

protected char normalize(char c)
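Converts the character to lower case using Character.toLowerCase(char). Selecting which characters belong to a token is handled by isTokenChar, inherited from LetterTokenizer.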

Overrides:
normalize in class CharTokenizer
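
The division of labor between isTokenChar and normalize can be sketched with a small standalone hierarchy. This is an illustration under stated assumptions, not Lucene's source: SketchCharTokenizer, SketchLetterTokenizer, SketchLowerCaseTokenizer, and the tokenize method are hypothetical stand-ins, and the sketch assumes the base class's normalize returns its argument unchanged unless a subclass overrides it.

```java
import java.util.ArrayList;
import java.util.List;

public class NormalizeHookSketch {

    // Hypothetical stand-in for a character-based tokenizer base class.
    static abstract class SketchCharTokenizer {
        // Decides whether a character belongs to a token.
        protected abstract boolean isTokenChar(char c);

        // Hook applied to each accepted character; identity by default (assumed).
        protected char normalize(char c) {
            return c;
        }

        // Collects runs of token characters, normalizing each one as it is added.
        final List<String> tokenize(String text) {
            List<String> tokens = new ArrayList<>();
            StringBuilder current = new StringBuilder();
            for (int i = 0; i < text.length(); i++) {
                char c = text.charAt(i);
                if (isTokenChar(c)) {
                    current.append(normalize(c));
                } else if (current.length() > 0) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
            }
            if (current.length() > 0) {
                tokens.add(current.toString());
            }
            return tokens;
        }
    }

    // Accepts letters only; leaves case untouched.
    static class SketchLetterTokenizer extends SketchCharTokenizer {
        @Override
        protected boolean isTokenChar(char c) {
            return Character.isLetter(c);
        }
    }

    // Inherits the letter test and overrides only the normalize hook.
    static final class SketchLowerCaseTokenizer extends SketchLetterTokenizer {
        @Override
        protected char normalize(char c) {
            return Character.toLowerCase(c);
        }
    }

    public static void main(String[] args) {
        String text = "Apache Lucene 2.3";
        System.out.println(new SketchLetterTokenizer().tokenize(text));    // [Apache, Lucene]
        System.out.println(new SketchLowerCaseTokenizer().tokenize(text)); // [apache, lucene]
    }
}
```

Because the subclass changes only normalize, token boundaries are identical in both tokenizers; only the characters stored in each token differ.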


Copyright © 2000-2008 Apache Software Foundation. All Rights Reserved.