Class websphinx.StandardClassifier
java.lang.Object
|
+----websphinx.StandardClassifier
- public class StandardClassifier
- extends Object
- implements Classifier
Standard classifier, installed in every crawler by default.
On the page as a whole, this classifier sets the following label:
- root: page is the root page of a Web site. For instance,
"http://www.digital.com/" and "http://www.digital.com/index.html" are both
marked as root, but "http://www.digital.com/about" is not.
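To illustrate the root label, here is a minimal sketch of one way such a check might be written. It is not the WebSPHINX implementation: the names RootLabelSketch and looksLikeRoot are hypothetical, and the rule (empty path, "/", or "/index.html") is an assumption drawn only from the examples above.

```java
import java.net.URI;

final class RootLabelSketch {

    // Assumption: a page counts as "root" when its URL path is empty,
    // "/", or "/index.html". The real classifier may use other rules.
    static boolean looksLikeRoot(String url) {
        String path = URI.create(url).getPath();
        if (path == null || path.isEmpty() || path.equals("/")) {
            return true;
        }
        return path.equals("/index.html");
    }

    public static void main(String[] args) {
        System.out.println(looksLikeRoot("http://www.digital.com/"));
        System.out.println(looksLikeRoot("http://www.digital.com/index.html"));
        System.out.println(looksLikeRoot("http://www.digital.com/about"));
    }
}
```

With this rule the first two URLs are reported as root and the third is not, matching the examples given for the label.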
It also sets one or more of the following labels on every link:
- hyperlink: link is a hyperlink (A, AREA, or FRAME tags) to another page on the Web (using the http, file, ftp, or gopher protocols).
- image: link is an inline image (IMG).
- form: link is a form (FORM tag). A form generally requires some parameters to use.
- code: link points to code (APPLET, EMBED, or SCRIPT).
- remote: link points to a different Web server.
- local: link points to the same Web server.
- same-page: link points to the same page (e.g., by an anchor reference like "#top").
- sibling: a local link that points to a page in the same directory (e.g., "sibling.html").
- descendent: a local link that points downwards in the directory structure (e.g., "deep/deeper/deepest.html").
- ancestor: a link that points upwards in the directory structure (e.g., "../..").
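The link labels above can be approximated with plain URL arithmetic. The following is a sketch only, not the classifier's actual code. The names LinkLabelSketch, typeLabel, directoryOf, and locationLabels are hypothetical. It assumes that "same Web server" means the same host name (the port is ignored), that the directory labels apply only to local links, and that a same-page link receives no directory label. It also ignores the protocol restriction on hyperlink.

```java
import java.net.URI;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

final class LinkLabelSketch {

    // Maps an HTML tag name to the link-type label listed above.
    // Returns null for tags the list does not mention.
    static String typeLabel(String tagName) {
        switch (tagName.toUpperCase(Locale.ROOT)) {
            case "A":
            case "AREA":
            case "FRAME":
                return "hyperlink";
            case "IMG":
                return "image";
            case "FORM":
                return "form";
            case "APPLET":
            case "EMBED":
            case "SCRIPT":
                return "code";
            default:
                return null;
        }
    }

    // Directory part of a URL path, e.g. "/a/b/c.html" -> "/a/b/".
    static String directoryOf(String path) {
        if (path == null || path.isEmpty()) {
            return "/";
        }
        return path.substring(0, path.lastIndexOf('/') + 1);
    }

    // Computes location labels for a link found on the given page.
    static Set<String> locationLabels(String pageUrl, String href) {
        URI page = URI.create(pageUrl);
        URI target = page.resolve(href);
        Set<String> labels = new LinkedHashSet<>();

        // Assumption: same host name means same Web server.
        String host = page.getHost();
        if (host == null || !host.equalsIgnoreCase(target.getHost())) {
            labels.add("remote");
            return labels;
        }
        labels.add("local");

        // A link that differs from the page only by its fragment.
        if (page.getPath().equals(target.getPath())) {
            labels.add("same-page");
            return labels;
        }

        String pageDir = directoryOf(page.getPath());
        String targetDir = directoryOf(target.getPath());
        if (targetDir.equals(pageDir)) {
            labels.add("sibling");
        } else if (targetDir.startsWith(pageDir)) {
            labels.add("descendent");
        } else if (pageDir.startsWith(targetDir)) {
            labels.add("ancestor");
        }
        return labels;
    }

    public static void main(String[] args) {
        String page = "http://example.com/docs/guide/index.html";
        System.out.println(locationLabels(page, "#top"));
        System.out.println(locationLabels(page, "sibling.html"));
        System.out.println(locationLabels(page, "deep/deeper/deepest.html"));
        System.out.println(locationLabels(page, "../other.html"));
        System.out.println(locationLabels(page, "http://other.example.org/"));
    }
}
```

The example URLs (example.com, other.example.org) are placeholders. Comparing directory prefixes keeps the sketch short; a production classifier would also need to handle malformed hrefs, which URI.create rejects with an exception.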
- priority: Priority of this classifier.
- StandardClassifier(): Make a StandardClassifier.
- classify(Page): Classify a page.
- getPriority(): Get priority of this classifier.
priority
public static final float priority
- Priority of this classifier.
StandardClassifier
public StandardClassifier()
- Make a StandardClassifier.
classify
public void classify(Page page)
- Classify a page.
- Parameters:
- page - Page to classify
getPriority
public float getPriority()
- Get priority of this classifier.
- Returns:
- the priority of this classifier.
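The sketch below shows how a classify/getPriority pair of this shape could be used to run several classifiers over one page. Everything in it is a hypothetical stand-in: FakePage and FakeClassifier are not WebSPHINX types, and the rule that classifiers run in ascending priority order is an assumption, since this page does not state how the priority is interpreted.

```java
import java.util.ArrayList;
import java.util.List;

final class ClassifierOrderSketch {

    // Hypothetical stand-in for a page; records labels in the order set.
    static final class FakePage {
        final String url;
        final List<String> labels = new ArrayList<>();

        FakePage(String url) {
            this.url = url;
        }
    }

    // Hypothetical stand-in mirroring the two methods documented above.
    interface FakeClassifier {
        void classify(FakePage page);

        float getPriority();
    }

    // Builds a trivial classifier that adds one fixed label to the page.
    static FakeClassifier labeler(final String label, final float priority) {
        return new FakeClassifier() {
            @Override
            public void classify(FakePage page) {
                page.labels.add(label);
            }

            @Override
            public float getPriority() {
                return priority;
            }
        };
    }

    // Assumption: classifiers run in ascending order of priority.
    static void runAll(List<FakeClassifier> classifiers, FakePage page) {
        List<FakeClassifier> ordered = new ArrayList<>(classifiers);
        ordered.sort((a, b) -> Float.compare(a.getPriority(), b.getPriority()));
        for (FakeClassifier classifier : ordered) {
            classifier.classify(page);
        }
    }

    public static void main(String[] args) {
        FakePage page = new FakePage("http://example.com/");
        List<FakeClassifier> classifiers = new ArrayList<>();
        classifiers.add(labeler("custom", 1.0f));
        classifiers.add(labeler("standard", 0.0f));
        runAll(classifiers, page);
        System.out.println(page.labels);
    }
}
```

Under the ascending-order assumption, a classifier that depends on labels set by another one would simply report a larger priority value.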