org.apache.lucene.benchmark.byTask.feeds
Class DemoHTMLParser
java.lang.Object
org.apache.lucene.benchmark.byTask.feeds.DemoHTMLParser
- All Implemented Interfaces:
- HTMLParser
public class DemoHTMLParser
- extends Object
- implements HTMLParser
HTML Parser that is based on Lucene's demo HTML parser.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
DemoHTMLParser
public DemoHTMLParser()
parse
public DocData parse(String name,
Date date,
Reader reader,
DateFormat dateFormat)
throws IOException,
InterruptedException
- Description copied from interface:
HTMLParser
- Parse the input Reader and return DocData.
A provided name or date is used for the result, otherwise an attempt is
made to set them from the parsed data.
- Specified by:
parse
in interface HTMLParser
- Parameters:
name
- name of the result doc data. If null, an attempt is made to set it from the parsed data.
date
- date of the result doc data. If null, an attempt is made to set it from the parsed data.
reader
- reader of the HTML text to parse.
dateFormat
- date formatter to use for extracting the date.
- Returns:
- Parsed doc data.
- Throws:
IOException
InterruptedException
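The fallback rule described above (a provided name or date wins, otherwise the value is taken from the parsed data) can be illustrated with a minimal, self-contained sketch. It does not use the Lucene classes and is not the parser's actual code: the class ParseFallbackSketch, the helpers resolveName and resolveDate, and the choice to leave an unparseable date unset are all assumptions made for illustration.

```java
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Hypothetical illustration only; not part of org.apache.lucene.benchmark.
public class ParseFallbackSketch {

    // Prefer the caller-supplied name; otherwise fall back to the title
    // extracted from the document (which may itself be null).
    static String resolveName(String providedName, String parsedTitle) {
        return providedName != null ? providedName : parsedTitle;
    }

    // Prefer the caller-supplied date; otherwise try to parse the date text
    // found in the document with the given DateFormat.
    static Date resolveDate(Date providedDate, String parsedDateText, DateFormat dateFormat) {
        if (providedDate != null) {
            return providedDate;
        }
        if (parsedDateText == null || dateFormat == null) {
            return null; // nothing to fall back on
        }
        try {
            return dateFormat.parse(parsedDateText.trim());
        } catch (ParseException e) {
            return null; // assumption: an unparseable date is left unset
        }
    }

    public static void main(String[] args) {
        DateFormat fmt = new SimpleDateFormat("yyyy-MM-dd", Locale.US);
        System.out.println(resolveName(null, "Page title"));
        System.out.println(resolveName("doc-1", "Page title"));
        System.out.println(resolveDate(null, "2008-01-15", fmt));
    }
}
```

With the real API, the same idea would mean passing null for name or date when the values should come from the HTML itself, and passing non-null values to override them.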
parse
public DocData parse(String name,
Date date,
StringBuffer inputText,
DateFormat dateFormat)
throws IOException,
InterruptedException
- Description copied from interface:
HTMLParser
- Parse the inputText and return DocData.
- Specified by:
parse
in interface HTMLParser
- Parameters:
inputText
- the HTML text to parse.
- Throws:
IOException
InterruptedException
- See Also:
HTMLParser.parse(String, Date, Reader, DateFormat)
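The See Also link suggests the two parse overloads are closely related. One plausible way to bridge them, which this page does not confirm, is to wrap the StringBuffer in a java.io.StringReader and hand it to the Reader-based overload. The sketch below shows only that wrapping step with standard JDK classes; BufferToReaderSketch, asReader, and readAll are hypothetical names.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical illustration only; not part of org.apache.lucene.benchmark.
public class BufferToReaderSketch {

    // Expose the buffered HTML text as a Reader, the input type expected
    // by parse(String, Date, Reader, DateFormat).
    static Reader asReader(StringBuffer inputText) {
        return new StringReader(inputText.toString());
    }

    // Drain a Reader into a String, to check that the text survives intact.
    static String readAll(Reader reader) {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1024];
        try {
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        StringBuffer html = new StringBuffer("<html><title>Demo</title></html>");
        System.out.println(readAll(asReader(html)));
    }
}
```

A caller that already holds the text in memory can therefore use either overload; the StringBuffer variant simply saves the wrapping step.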
Copyright © 2000-2008 Apache Software Foundation. All Rights Reserved.