Hit Collectors
A hit collector is a way to collect hits (instances of org.polliwog.data.Hit), based on certain criteria, as polliwog is processing a log file.

Once collected, the hits can be processed and sections/pages generated to display the information. For instance, you may wish to collect "image hot-linking hits" (that is, hits from other websites that link directly to images on your site).

Each hit collection defines the types of hits that it wants to collect.

The collector definitions are stored in the file data/hit-collectors.xml. This is an XML file that has the following elements/attributes:
XML Definition Help
The Children column shows the child elements that can be used within the specified element. Child elements can appear in any order (there is no enforcement via a DTD).
  • A + after the element name indicates that at least 1 child element with that name must be present.
  • A * after the element name indicates that 0 or more elements can be present.
  • A ? after the element name indicates that either 1 or no elements with that name can be present.
  • If no symbol is provided after the name then one element must be provided.
The Attributes column shows the attributes that can be used on the specified element. Attribute definitions are written as: name(value_type,required|optional), where value_type is one of:
  • string - A string value, this can be anything.
  • integer - An integer value.
Required and optional are represented as: R and O respectively.
| Name | Root | Children | Parent(s) | Attributes | Description |
|---|---|---|---|---|---|
| collectors | Y | collector+ | NONE | NONE | The root element. Each child collector element defines a collection. |
| collector | N | ANY | collectors | on, name, class | Defines a particular collection. The class attribute defines the class that will represent the collection and that implements the org.polliwog.collectors.HitCollector interface. The name attribute defines a unique name (within the list of hit collections) for the collection; it is this identifier that can be used in sections to identify the collection to use as the input. The on attribute defines the types of hit that should be collected, and should be a comma-separated list of one or more of the values: nonPage, page, filtered. The usual value for this attribute is: nonPage, page. |
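
As a rough illustration of how the two elements above fit together, a minimal data/hit-collectors.xml might look like the following sketch. The XML declaration is an assumption (this page does not say whether one is required), and the single collector shown reuses the searchHits definition from the examples further down this page.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Root element: must contain at least one collector child -->
<collectors>
  <!-- on:    types of hit to collect
       name:  unique identifier, referenced from sections
       class: an implementation of org.polliwog.collectors.HitCollector -->
  <collector on="page"
             name="searchHits"
             class="org.polliwog.collectors.BasicHitCollector">
    get (requestParameters, "search") != NULL
  </collector>
</collectors>
```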
Types of Hits
There are 3 types of hits that are relevant to hit collections. The types relate to how polliwog categorizes the hit. Note that, in terms of hit collections, a hit is classified as only one of the types. The types are:
  • Filtered - The hit is categorized as filtered if there is a hit filter in use and it rejects the hit, i.e. it has been filtered and won't contribute to any visit/pages statistics. To collect this type of hit, use a value of: filtered for the on attribute of the collector element.
  • Non page - The hit is categorized as not a page if the page collector decides that the hit does not constitute a page. To collect this type of hit, use a value of: nonPage for the on attribute of the collector element.
  • Page - The hit is categorized as a page if the page collector decides that the hit does constitute a page. To collect this type of hit, use a value of: page for the on attribute of the collector element.
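
To make the mapping from hit type to the on attribute concrete, the sketch below defines two collectors. The collector names and the WHERE clauses are hypothetical and chosen only for illustration. The on values (filtered, nonPage, page) come from the list above, and the class used is the BasicHitCollector described in the next section.

```xml
<collectors>
  <!-- Hypothetical: collect only hits that were rejected by the hit filter -->
  <collector on="filtered"
             name="filteredGifHits"
             class="org.polliwog.collectors.BasicHitCollector">
    requestURI.path LIKE '%.gif'
  </collector>
  <!-- Hypothetical: collect both non-page and page hits (the usual value for on) -->
  <collector on="nonPage,page"
             name="httpRefererHits"
             class="org.polliwog.collectors.BasicHitCollector">
    refererURI.scheme LIKE 'http%'
  </collector>
</collectors>
```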
Basic Hit Collector
Whilst it is possible to provide your own implementation of the HitCollector interface, a basic implementation that allows you to define criteria for collecting hits via a JoSQL WHERE clause is available in the org.polliwog.collectors.BasicHitCollector class.

To use this class just use a value of:
org.polliwog.collectors.BasicHitCollector
for the class attribute of the collector element.

The WHERE clause is then provided by the content of the collector element. Remember that the class for any accessors is: Hit. Any valid WHERE clause can be used; the expression is evaluated to a boolean true/false value, and this value is returned from the accept(org.polliwog.data.Hit) method.

Example
Find all page hits that have a request parameter named search.
<collector on="page"
           name="searchHits"
           class="org.polliwog.collectors.BasicHitCollector">
  get (requestParameters, "search") != NULL
</collector>
Example
Find hot-linked image hits (replace www.mysite.com with the name of your website).
<collector on="nonPage,page"
           name="hotLinkedImageHits"
           class="org.polliwog.collectors.BasicHitCollector">
  refererURI NOT LIKE 'http://www.mysite.com'
  AND
  refererURI.scheme LIKE 'http%'
  AND
  requestURI.path IN $LIKE ('%.gif', '%.jpg')
</collector>