Visit Collectors
A visit collector is a way to collect visits (instances of org.polliwog.data.Visit), based on certain criteria, as polliwog is processing a log file.

Once collected, the visits can be processed and sections/pages generated to display the information. For instance, you may wish to collect visits from potential bots/spiders (ones that aren't listed in the robots_list.txt file).

The collector definitions are stored in the file data/visit-collectors.xml. This is an XML file that has the following elements/attributes:
XML Definition Help
The Children column shows the child elements that can be used within the specified element. Child elements can appear in any order (there is no enforcement via a DTD).
  • A + after the element name indicates that at least 1 child element with that name must be present.
  • A * after the element name indicates that 0 or more elements can be present.
  • A ? after the element name indicates that either 1 or no elements with that name can be present.
  • If no symbol is provided after the name then exactly one element must be provided.
The Attributes column shows the attributes that can be used on the specified element. Attribute definitions take the form: name(value_type,required|optional), where value_type is one of:
  • string - A string value, this can be anything.
  • integer - An integer value.
Required and optional are represented as: R and O respectively.
Name | Root | Children | Parent(s) | Attributes | Description
-----|------|----------|-----------|------------|------------
collectors | Y | collector+ | NONE | NONE | The root element. Each child collector element defines a collection.
collector | N | ANY | collectors | name, class | Defines a particular collection. The class attribute defines the class that will represent the collection and implements the org.polliwog.collectors.VisitCollector interface. The name attribute defines a unique name (within the list of visit collections) for the collection; this identifier can be used in sections to identify the collection to use as the input.
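As a rough illustration of the structure described above, a minimal data/visit-collectors.xml might look like the sketch below. The collector name myVisits and the class com.example.MyVisitCollector are hypothetical placeholders, not part of polliwog; what (if anything) goes inside the collector element depends on the class used.

```xml
<collectors>
  <!-- name: unique among the visit collectors; sections refer to the collection by it. -->
  <!-- class: com.example.MyVisitCollector is an assumed, hypothetical class that
       would implement org.polliwog.collectors.VisitCollector. -->
  <collector name="myVisits"
             class="com.example.MyVisitCollector">
    <!-- Content here is interpreted by the collector class. -->
  </collector>
</collectors>
```

The root element needs at least one collector child, and each collector should carry both the name and class attributes.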
Basic Visit Collector
Whilst it is possible to provide your own implementation of the VisitCollector interface, a basic implementation that lets you define the criteria for collecting visits via a JoSQL WHERE clause is available in the org.polliwog.collectors.BasicVisitCollector class.

To use this class just use a value of:
org.polliwog.collectors.BasicVisitCollector
for the class attribute of the collector element.

The WHERE clause is then provided by the content of the collector element. Remember that the class for any accessors is Visit. Any valid WHERE clause can be used; the expression is evaluated to a boolean true/false value, which is returned from the accept(org.polliwog.data.Hit) method.

Example
Find all potential bots/spiders.
<collector name="potentialBots"
           class="org.polliwog.collectors.BasicVisitCollector">
  userAgent $IN LIKE ('%http://%', '%bot%', '%spider%', '%crawler%')
</collector>
Example
Find hot linked image visits (replace www.mysite.com with the name of your website).
<collector name="hotLinkedImageHits"
           class="org.polliwog.collectors.BasicVisitCollector">
 (SELECT *
  FROM   pages
  WHERE  :_allobjs.size = 1
  AND    requestURI.path $IN LIKE ('%.gif', '%.jpg', '%.png')
  AND    refererURI.toString LIKE 'http://%'
  AND    refererURI.toString NOT LIKE 'http://www.mysite.com%')
</collector>