org.apache.lucene.benchmark.byTask.feeds
Interface DocMaker

All Known Implementing Classes:
BasicDocMaker, DirDocMaker, EnwikiDocMaker, LineDocMaker, ReutersDocMaker, SimpleDocMaker, SortableSimpleDocMaker, TrecDocMaker

public interface DocMaker

Create documents for the test.
Each call to makeDocument creates the next document. When the input is exhausted, the DocMaker iterates over the input again, providing a source for an unlimited number of documents, though not all of them are unique.
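
The wrap-around behavior described above can be sketched as follows. This is a minimal illustration only: CyclingSource is a hypothetical stand-in that returns plain strings, not one of the implementing classes listed here, and it makes no claim about how those classes read their input.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in (not a Lucene class): cycles over a fixed list of texts. */
final class CyclingSource {
    private final List<String> texts;
    private int next = 0;   // index of the next text to hand out
    private int count = 0;  // documents made so far

    CyclingSource(List<String> texts) {
        if (texts.isEmpty()) {
            throw new IllegalArgumentException("need at least one text");
        }
        this.texts = texts;
    }

    /** Returns the next text, wrapping around once the input is exhausted. */
    String makeDocument() {
        String text = texts.get(next);
        next = (next + 1) % texts.size();
        count++;
        return text;
    }

    /** Number of distinct texts available, regardless of how many were made. */
    int numUniqueTexts() { return texts.size(); }

    int getCount() { return count; }

    public static void main(String[] args) {
        CyclingSource source = new CyclingSource(Arrays.asList("a", "b", "c"));
        // Ask for more documents than there are unique texts.
        for (int i = 0; i < 5; i++) {
            System.out.println(source.makeDocument());
        }
        System.out.println(source.getCount() + " made, "
                + source.numUniqueTexts() + " unique");
    }
}
```

Requesting five documents from three texts yields a, b, c, a, b: the count keeps growing while the number of unique texts stays at three.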


Method Summary
 long getByteCount()
          Return total byte size of docs made since last reset.
 int getCount()
          Return number of docs made since last reset.
 HTMLParser getHtmlParser()
          Return the HTML parser.
 Document makeDocument()
          Create the next document.
 Document makeDocument(int size)
          Create the next document, of the given size by input bytes.
 long numUniqueBytes()
          Return total bytes of all available unique texts, 0 if not applicable.
 int numUniqueTexts()
          Return how many real unique texts are available, 0 if not applicable.
 void printDocStatistics()
          Print some statistics on docs available/added/etc.
 void resetInputs()
          Reset inputs so that the test run behaves, input-wise, as if it had just started.
 void setConfig(Config config)
          Set the properties.
 void setHTMLParser(HTMLParser htmlParser)
          Set the HTML parser to use, when appropriate.
 

Method Detail

makeDocument

Document makeDocument(int size)
                      throws Exception
Create the next document, of the given size by input bytes. If the implementation does not support control over size, an exception is thrown.

Parameters:
size - size of document, or 0 if there is no size requirement.
Throws:
Exception - if the document cannot be made, or if size > 0 was specified but this feature is not supported.
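
One way an implementation might honor the size contract is sketched below. SizedSource and its sizeSupported flag are hypothetical and exist only for illustration. The sketch also assumes that size is measured in characters rather than input bytes, that a non-positive size means "no requirement", and that a requested size is met by repeating or truncating the text. None of these choices is taken from the implementing classes.

```java
/** Hypothetical stand-in (not a Lucene class) illustrating the size contract. */
final class SizedSource {
    private final String text;
    private final boolean sizeSupported;

    SizedSource(String text, boolean sizeSupported) {
        if (text.isEmpty()) {
            throw new IllegalArgumentException("text must not be empty");
        }
        this.text = text;
        this.sizeSupported = sizeSupported;
    }

    /**
     * size == 0 means no size requirement; size > 0 requests exactly that many
     * characters (assumption: characters, not bytes, to keep the sketch short).
     */
    String makeDocument(int size) throws Exception {
        if (size <= 0) {
            return text; // assumption: negative sizes are treated like 0
        }
        if (!sizeSupported) {
            throw new Exception("size control is not supported by this source");
        }
        // Repeat the text, truncating the last copy, until the size is reached.
        StringBuilder sb = new StringBuilder(size);
        while (sb.length() < size) {
            int take = Math.min(text.length(), size - sb.length());
            sb.append(text, 0, take);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        SizedSource sized = new SizedSource("abc", true);
        System.out.println(sized.makeDocument(0));
        System.out.println(sized.makeDocument(5));

        SizedSource unsized = new SizedSource("abc", false);
        try {
            unsized.makeDocument(5);
        } catch (Exception e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

A caller that does not care about size can always pass 0, which both variants accept.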

makeDocument

Document makeDocument()
                      throws Exception
Create the next document.

Throws:
Exception

setConfig

void setConfig(Config config)
Set the properties.


resetInputs

void resetInputs()
Reset inputs so that the test run behaves, input-wise, as if it had just started.


numUniqueTexts

int numUniqueTexts()
Return how many real unique texts are available, 0 if not applicable.


numUniqueBytes

long numUniqueBytes()
Return total bytes of all available unique texts, 0 if not applicable.


getCount

int getCount()
Return number of docs made since last reset.


getByteCount

long getByteCount()
Return total byte size of docs made since last reset.
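
The relationship between getCount, getByteCount and resetInputs can be sketched as follows. CountingSource is a hypothetical stand-in that returns plain strings. Measuring document size as UTF-8 encoded length is an assumption made for this sketch, since the interface does not say how bytes are counted.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in (not a Lucene class): tracks counters since last reset. */
final class CountingSource {
    private final String[] texts;
    private int next = 0;        // index of the next text to hand out
    private int count = 0;       // docs made since last reset
    private long byteCount = 0;  // bytes made since last reset

    CountingSource(String... texts) {
        if (texts.length == 0) {
            throw new IllegalArgumentException("need at least one text");
        }
        this.texts = texts;
    }

    String makeDocument() {
        String text = texts[next];
        next = (next + 1) % texts.length;
        count++;
        // Assumption: size is the UTF-8 encoded length of the text.
        byteCount += text.getBytes(StandardCharsets.UTF_8).length;
        return text;
    }

    /** Rewinds the input and zeroes both counters, as if the run just started. */
    void resetInputs() {
        next = 0;
        count = 0;
        byteCount = 0;
    }

    int getCount() { return count; }

    long getByteCount() { return byteCount; }

    public static void main(String[] args) {
        CountingSource source = new CountingSource("ab", "cde");
        for (int i = 0; i < 3; i++) {
            source.makeDocument();
        }
        System.out.println(source.getCount() + " docs, " + source.getByteCount() + " bytes");
        source.resetInputs();
        System.out.println(source.getCount() + " docs, " + source.getByteCount() + " bytes");
    }
}
```

After a reset, both counters read zero and the next call to makeDocument starts again from the first text.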


printDocStatistics

void printDocStatistics()
Print some statistics on docs available/added/etc.


setHTMLParser

void setHTMLParser(HTMLParser htmlParser)
Set the HTML parser to use, when appropriate.


getHtmlParser

HTMLParser getHtmlParser()
Return the HTML parser.



Copyright © 2000-2008 Apache Software Foundation. All Rights Reserved.