org.apache.lucene.benchmark.byTask.feeds
Class TrecDocMaker
java.lang.Object
org.apache.lucene.benchmark.byTask.feeds.BasicDocMaker
org.apache.lucene.benchmark.byTask.feeds.TrecDocMaker
- All Implemented Interfaces:
- DocMaker
public class TrecDocMaker
- extends BasicDocMaker
A DocMaker that uses the (compressed) TREC collection as its input.
Config properties:
- work.dir=<path to the root of the docs and indexes directories | Default: work>
- docs.dir=<path to the docs directory | Default: trec>
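The two properties above can be read as a small directory-resolution rule. The following is a minimal sketch of that rule, not the class's actual code: `TrecDirsSketch` and `resolveDataDir` are hypothetical names, `java.util.Properties` stands in for the benchmark `Config` class, and treating a relative docs.dir as a child of work.dir is an assumption.

```java
import java.io.File;
import java.util.Properties;

// Hypothetical helper, not part of the Lucene benchmark package.
final class TrecDirsSketch {

    // Resolves the docs directory from the two properties listed above.
    // Assumption: a relative docs.dir is taken relative to work.dir,
    // while an absolute docs.dir is used as given.
    static File resolveDataDir(Properties props) {
        File workDir = new File(props.getProperty("work.dir", "work"));
        File docsDir = new File(props.getProperty("docs.dir", "trec"));
        return docsDir.isAbsolute() ? docsDir : new File(workDir, docsDir.getPath());
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        // With no properties set, both documented defaults apply.
        System.out.println(resolveDataDir(props).getPath());
        // Overriding docs.dir changes only the last path element.
        props.setProperty("docs.dir", "trec-subset");
        System.out.println(resolveDataDir(props).getPath());
    }
}
```

With no properties set, the sketch falls back to the documented defaults, so the data directory is the `trec` directory under `work`.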
Fields inherited from class org.apache.lucene.benchmark.byTask.feeds.BasicDocMaker:
BODY_FIELD, BYTES_FIELD, config, DATE_FIELD, forever, ID_FIELD, indexVal, NAME_FIELD, storeVal, termVecVal, TITLE_FIELD
Methods inherited from class org.apache.lucene.benchmark.byTask.feeds.BasicDocMaker:
addBytes, addUniqueBytes, collectFiles, getByteCount, getCount, getHtmlParser, makeDocument, makeDocument, numUniqueBytes, printDocStatistics, resetUniqueBytes, setHTMLParser
Methods inherited from class java.lang.Object:
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
dateFormat
protected ThreadLocal dateFormat
dataDir
protected File dataDir
inputFiles
protected ArrayList inputFiles
nextFile
protected int nextFile
iteration
protected int iteration
reader
protected BufferedReader reader
TrecDocMaker
public TrecDocMaker()
setConfig
public void setConfig(Config config)
- Description copied from interface:
DocMaker
- Set the properties
- Specified by:
setConfig
in interface DocMaker
- Overrides:
setConfig
in class BasicDocMaker
openNextFile
protected void openNextFile()
throws NoMoreDataException,
Exception
- Throws:
NoMoreDataException
Exception
closeInputs
protected void closeInputs()
read
protected StringBuffer read(String prefix,
StringBuffer sb,
boolean collectMatchLine,
boolean collectAll)
throws Exception
- Throws:
Exception
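This page does not describe what `read` does, so the following is only a sketch of one plausible reading of its parameters: scan lines until one starts with `prefix`, optionally keeping the matching line (`collectMatchLine`) and the lines skipped before it (`collectAll`). `PrefixReadSketch` is a hypothetical name, the reader is passed in explicitly rather than taken from the `reader` field, `StringBuilder` replaces `StringBuffer`, and the sketch simply stops at end of input.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical illustration; the semantics are inferred from parameter names only.
final class PrefixReadSketch {

    static StringBuilder read(BufferedReader reader, String prefix, StringBuilder sb,
                              boolean collectMatchLine, boolean collectAll) {
        StringBuilder out = (sb == null) ? new StringBuilder() : sb;
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                boolean matches = line.startsWith(prefix);
                // Keep the matching line or the skipped lines, depending on the flags.
                if ((matches && collectMatchLine) || (!matches && collectAll)) {
                    if (out.length() > 0) {
                        out.append('\n');
                    }
                    out.append(line);
                }
                if (matches) {
                    break; // stop at the first line that starts with the prefix
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        String doc = "<DOC>\n<DOCNO> 1 </DOCNO>\nbody text\n</DOC>\n";
        BufferedReader reader = new BufferedReader(new StringReader(doc));
        // Skip ahead to the document number line and keep it.
        System.out.println(read(reader, "<DOCNO>", null, true, false));
        // Collect everything up to, but not including, the closing tag.
        System.out.println(read(reader, "</DOC>", null, false, true));
    }
}
```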
getNextDocData
protected DocData getNextDocData()
throws NoMoreDataException,
Exception
- Description copied from class:
BasicDocMaker
- Returns the data of the next document.
All current implementations can create documents forever:
when the input data is exhausted, the input files are iterated again from the start.
This re-iteration can be avoided by setting doc.maker.forever to false (the default is true).
- Specified by:
getNextDocData
in class BasicDocMaker
- Returns:
- data of the next document.
- Throws:
NoMoreDataException
- if the data is exhausted (and 'forever' is set to false).
Exception
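The re-iteration behavior described for `getNextDocData` can be modeled with a simple file cursor. This is a sketch under stated assumptions, not the class's implementation: `CyclingInputsSketch` and `ExhaustedException` are hypothetical names (the latter an unchecked stand-in for `NoMoreDataException`), file names stand in for opened files, and the `nextFile` and `iteration` fields are assumed to act as a position and a pass counter.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical unchecked stand-in for NoMoreDataException.
class ExhaustedException extends RuntimeException {
    ExhaustedException(String message) {
        super(message);
    }
}

// Hypothetical model of cycling over input files.
final class CyclingInputsSketch {
    private final List<String> inputFiles;
    private final boolean forever; // mirrors the doc.maker.forever property
    private int nextFile = 0;      // index of the next file to hand out
    private int iteration = 0;     // number of completed passes over the inputs

    CyclingInputsSketch(List<String> inputFiles, boolean forever) {
        this.inputFiles = inputFiles;
        this.forever = forever;
    }

    // Returns the next file name, wrapping around only when 'forever' is true.
    String openNextFile() {
        if (inputFiles.isEmpty()) {
            throw new ExhaustedException("no input files");
        }
        if (nextFile >= inputFiles.size()) {
            if (!forever) {
                throw new ExhaustedException("input exhausted and forever is false");
            }
            nextFile = 0;
            iteration++;
        }
        return inputFiles.get(nextFile++);
    }

    int iteration() {
        return iteration;
    }

    public static void main(String[] args) {
        CyclingInputsSketch inputs = new CyclingInputsSketch(Arrays.asList("a.gz", "b.gz"), true);
        for (int i = 0; i < 5; i++) {
            System.out.println(inputs.openNextFile() + " (pass " + inputs.iteration() + ")");
        }
    }
}
```

With `forever` set to false, the third call on a two-file list throws instead of wrapping around, which corresponds to the documented `NoMoreDataException` case.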
getDateFormat
protected DateFormat getDateFormat(int n)
parseDate
protected Date parseDate(String dateStr)
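The `ThreadLocal` type of the `dateFormat` field suggests per-thread formatter instances, which is the usual way to share `DateFormat` objects safely, since they are not thread-safe. The sketch below illustrates that pattern only. `ThreadLocalDatesSketch` is a hypothetical name, the two patterns are illustrative (this page does not list the patterns the class uses), and reading the `n` argument as an index into a per-thread array of formats is an assumption.

```java
import java.text.DateFormat;
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Hypothetical illustration of per-thread date formats.
final class ThreadLocalDatesSketch {

    // Illustrative patterns only; not taken from TrecDocMaker.
    private static final String[] PATTERNS = {
        "EEE, dd MMM yyyy HH:mm:ss z",
        "yyyy-MM-dd"
    };

    // Each thread lazily builds its own array of formatters.
    private static final ThreadLocal<DateFormat[]> FORMATS = ThreadLocal.withInitial(() -> {
        DateFormat[] formats = new DateFormat[PATTERNS.length];
        for (int i = 0; i < PATTERNS.length; i++) {
            formats[i] = new SimpleDateFormat(PATTERNS[i], Locale.US);
        }
        return formats;
    });

    // Returns the n-th formatter owned by the calling thread.
    static DateFormat getDateFormat(int n) {
        return FORMATS.get()[n];
    }

    // Tries each pattern in order; returns null if none of them parses the string.
    static Date parseDate(String dateStr) {
        for (int n = 0; n < PATTERNS.length; n++) {
            Date parsed = getDateFormat(n).parse(dateStr.trim(), new ParsePosition(0));
            if (parsed != null) {
                return parsed;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(parseDate("Tue, 15 Jan 2008 10:30:00 GMT"));
        System.out.println(parseDate("2008-01-15"));
        System.out.println(parseDate("not a date"));
    }
}
```

Returning null for an unparseable string is a design choice of this sketch; it keeps a single malformed date from aborting a run over many documents.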
resetInputs
public void resetInputs()
- Description copied from interface:
DocMaker
- Resets the inputs so that the test run behaves, input-wise, as if it had just started.
- Specified by:
resetInputs
in interface DocMaker
- Overrides:
resetInputs
in class BasicDocMaker
numUniqueTexts
public int numUniqueTexts()
- Description copied from interface:
DocMaker
- Returns the number of real unique texts available, or 0 if not applicable.
Copyright © 2000-2008 Apache Software Foundation. All Rights Reserved.