Protocol interface, see also org.apache.nutch.net.protocols.

| Interface | Description |
|---|---|
| Protocol | A retriever of URL content. |
| ProtocolStatusCodes | |
| RobotRules | This class holds the rules which were parsed from a robots.txt file, and can test paths against those rules. |

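
The RobotRules description above (rules parsed from a robots.txt file that can test paths) can be illustrated with a minimal, self-contained sketch. The class `SimpleRobotRules` and its methods `parse` and `isAllowed` are hypothetical names chosen for illustration; they are not the Nutch or crawler-commons API. The sketch assumes plain `Allow`/`Disallow` prefix rules with longest-match precedence and ignores user-agent groups and wildcards.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only: this is not the Nutch RobotRules interface.
class SimpleRobotRules {
    // One Allow/Disallow line: a path prefix and whether it permits access.
    private static final class Rule {
        final String prefix;
        final boolean allow;

        Rule(String prefix, boolean allow) {
            this.prefix = prefix;
            this.allow = allow;
        }
    }

    private final List<Rule> rules = new ArrayList<>();

    // Parses only "Allow:" and "Disallow:" lines; every other line is ignored.
    static SimpleRobotRules parse(String robotsTxt) {
        SimpleRobotRules result = new SimpleRobotRules();
        for (String raw : robotsTxt.split("\\r?\\n")) {
            String line = raw.trim();
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue;
            }
            String field = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();
            if (value.isEmpty()) {
                continue; // a rule with an empty path restricts nothing in this sketch
            }
            if (field.equals("allow")) {
                result.rules.add(new Rule(value, true));
            } else if (field.equals("disallow")) {
                result.rules.add(new Rule(value, false));
            }
        }
        return result;
    }

    // The longest matching prefix wins; a path with no matching rule is allowed.
    boolean isAllowed(String path) {
        Rule best = null;
        for (Rule r : rules) {
            boolean longer = best == null || r.prefix.length() > best.prefix.length();
            if (path.startsWith(r.prefix) && longer) {
                best = r;
            }
        }
        return best == null || best.allow;
    }

    public static void main(String[] args) {
        SimpleRobotRules rules =
            parse("User-agent: *\nDisallow: /private/\nAllow: /private/press/\n");
        System.out.println(rules.isAllowed("/index.html"));
        System.out.println(rules.isAllowed("/private/data.html"));
        System.out.println(rules.isAllowed("/private/press/a.html"));
    }
}
```

A production crawler would more likely delegate this to a tested parser, as the RobotRulesParser entry in this package does with crawler-commons.
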
| Class | Description |
|---|---|
| Content | |
| ProtocolFactory | Creates and caches Protocol plugins. |
| ProtocolOutput | Simple aggregate to pass from protocol plugins both content and protocol status. |
| ProtocolStatusUtils | |
| RobotRulesParser | This class uses crawler-commons for handling the parsing of robots.txt files. |

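
The "creates and caches" behaviour described for ProtocolFactory can be sketched as a small create-on-first-use cache. Everything below is an assumption made for illustration: the class `CachingHandlerFactory`, its `getHandler` method, and the choice of the URL scheme as the cache key are hypothetical and do not reproduce the actual Nutch plugin lookup.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical illustration only: this is not the Nutch ProtocolFactory API.
class CachingHandlerFactory<H> {
    // Handlers already created, keyed by lower-cased URL scheme (assumed key).
    private final Map<String, H> cache = new ConcurrentHashMap<>();
    // Builds a handler for a scheme the first time that scheme is requested.
    private final Function<String, H> creator;

    CachingHandlerFactory(Function<String, H> creator) {
        this.creator = creator;
    }

    // Returns the cached handler for the URL's scheme, creating it on first use.
    H getHandler(String url) {
        String scheme = URI.create(url).getScheme();
        if (scheme == null) {
            throw new IllegalArgumentException("URL has no scheme: " + url);
        }
        return cache.computeIfAbsent(scheme.toLowerCase(Locale.ROOT), creator);
    }

    public static void main(String[] args) {
        CachingHandlerFactory<Object> factory =
            new CachingHandlerFactory<>(scheme -> new Object());
        Object first = factory.getHandler("http://example.com/a");
        Object second = factory.getHandler("http://example.org/b");
        // Both URLs share the "http" scheme, so the same instance is reused.
        System.out.println(first == second);
    }
}
```

Using `ConcurrentHashMap.computeIfAbsent` keeps creation and caching in one atomic step, which matters if several fetcher threads request a handler for the same scheme at once.
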
| Exception | Description |
|---|---|
| ProtocolException | |
| ProtocolNotFound | |


Copyright © 2019 The Apache Software Foundation