Class websphinx.HTMLParser
java.lang.Object
|
+----websphinx.HTMLParser
- public class HTMLParser
- extends Object
HTML parser. Parses an input stream or String and
converts it to a sequence of Tags and a tree of Elements.
HTMLParser is used by Page to parse pages.
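To illustrate what "a sequence of Tags" means, here is a minimal sketch that is independent of websphinx. The class TagSequenceSketch and its tagSequence method are hypothetical names invented for this example, and the regular expression is a simplification (it ignores comments, scripts, and attribute values containing '>'). It is not how HTMLParser is implemented, and it does not build the tree of Elements.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of the websphinx package.
class TagSequenceSketch {

    // Matches a start tag (<a ...>) or an end tag (</a>).
    // Group 1 is "/" for end tags, group 2 is the tag name.
    private static final Pattern TAG =
            Pattern.compile("<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*>");

    // Returns the tag names in document order, lower-cased,
    // with end tags prefixed by "/".
    static List<String> tagSequence(String html) {
        List<String> tags = new ArrayList<>();
        Matcher m = TAG.matcher(html);
        while (m.find()) {
            tags.add(m.group(1) + m.group(2).toLowerCase());
        }
        return tags;
    }

    public static void main(String[] args) {
        // prints [p, a, /a, /p]
        System.out.println(tagSequence("<P>Hello <A HREF=\"x.html\">link</A></P>"));
    }
}
```

A real parser would also record each tag's attributes and its position in the source text, and would pair start and end tags to form elements.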
-
HTMLParser()
- Make an HTMLParser.
-
HTMLParser(DownloadParameters)
- Make an HTMLParser which retrieves pages
using the specified download parameters.
-
dontParse(Page, InputStream)
- Download an input stream without parsing it.
-
dontParse(Page, Reader)
- Download a character stream without parsing it.
-
main(String[])
-
-
parse(Page, InputStream)
- Parse an input stream.
-
parse(Page, Reader)
- Parse a character stream.
-
parse(Page, String)
- Parse a string.
HTMLParser
public HTMLParser()
- Make an HTMLParser.
HTMLParser
public HTMLParser(DownloadParameters dp)
- Make an HTMLParser which retrieves pages
using the specified download parameters. Pages
larger than dp.getMaxPageSize() are rejected by parse()
with an IOException.
- Parameters:
- dp - download parameters used during parsing
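The size limit described above can be pictured with the following sketch, which does not use websphinx. BoundedReadSketch and readAtMost are hypothetical names, and the limit is counted in characters purely for illustration. This page does not state the unit of dp.getMaxPageSize() or how HTMLParser enforces it.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical illustration only; not part of the websphinx package.
class BoundedReadSketch {

    // Reads all characters from `in`, failing with an IOException as soon
    // as the content would exceed maxChars characters.
    static String readAtMost(Reader in, int maxChars) throws IOException {
        StringBuilder content = new StringBuilder();
        char[] buffer = new char[1024];
        int n;
        while ((n = in.read(buffer)) != -1) {
            if (content.length() + n > maxChars) {
                throw new IOException("content exceeds " + maxChars + " characters");
            }
            content.append(buffer, 0, n);
        }
        return content.toString();
    }

    public static void main(String[] args) throws IOException {
        // Within the limit: the content is returned unchanged.
        System.out.println(readAtMost(new StringReader("<b>ok</b>"), 100));
        try {
            // Over the limit: the read is abandoned with an IOException.
            readAtMost(new StringReader("<b>too long</b>"), 4);
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Callers of parse() with a size-limited parser should therefore be prepared to catch IOException for oversized pages as well as for ordinary I/O failures.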
parse
public void parse(Page page,
InputStream stream) throws IOException
- Parse an input stream.
- Parameters:
- page - Page to receive parsed HTML
- stream - input stream containing HTML
parse
public void parse(Page page,
Reader stream) throws IOException
- Parse a character stream.
- Parameters:
- page - Page to receive parsed HTML
- stream - character stream containing HTML
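This page does not say which character encoding the InputStream overload assumes. When the encoding of the bytes is known, one option is to decode them yourself and pass the resulting Reader to parse(Page, Reader). The sketch below shows only the decoding step, using standard java.io classes; ReaderFromStreamSketch, decode, and readAll are hypothetical names, and the websphinx call itself is left as a comment.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not part of the websphinx package.
class ReaderFromStreamSketch {

    // Wraps a byte stream in a Reader that decodes it with an explicit charset.
    static Reader decode(InputStream bytes, Charset charset) {
        return new InputStreamReader(bytes, charset);
    }

    // Drains a Reader into a String (stands in for the parser consuming it).
    static String readAll(Reader in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = in.read()) != -1) {
            sb.append((char) c);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = "<p>caf\u00e9</p>".getBytes(StandardCharsets.UTF_8);
        Reader reader = decode(new ByteArrayInputStream(raw), StandardCharsets.UTF_8);
        // A websphinx caller would pass `reader` to parse(page, reader) here.
        System.out.println(readAll(reader));
    }
}
```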
parse
public void parse(Page page,
String content) throws IOException
- Parse a string.
- Parameters:
- page - Page to receive parsed HTML
- content - String containing HTML
dontParse
public void dontParse(Page page,
InputStream stream) throws IOException
- Download an input stream without parsing it.
- Parameters:
- page - Page to receive the downloaded content
- stream - input stream containing content
dontParse
public void dontParse(Page page,
Reader stream) throws IOException
- Download a character stream without parsing it.
- Parameters:
- page - Page to receive the downloaded content
- stream - character stream containing content
main
public static void main(String args[]) throws Exception