A page title resolver converts the requested url, including the query string, into a meaningful, human-readable title.
For example, it is not uncommon to have urls like:
http://www.webmasterworld.com/forum10/3968.htm
This has little meaning to a human being looking at a log file. A more meaningful title would be:
Forums : Webmaster General : Strange URLs
The referer, if present, is also resolved if polliwog detects that it refers to your site.
A page title resolver is created by implementing the
org.polliwog.resolvers.PageTitleResolver interface.
The implementation of the interface can perform whatever tasks are needed to generate the mapping.
polliwog creates only a single instance of the resolver, which is then used for the entire log. The instance is initialized by calling the
init(org.polliwog.data.VisitorEnvironment) method.
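As a rough illustration of the shape such a resolver might take, the sketch below maps request paths to titles from a fixed table. The method signatures of the real PageTitleResolver interface are not reproduced in this section, so the class TitleResolverSketch and its resolve method are hypothetical stand-ins rather than polliwog's API, and the no-argument init stands in for the real init(org.polliwog.data.VisitorEnvironment) method.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a page title resolver. This is NOT the real
// org.polliwog.resolvers.PageTitleResolver interface.
class TitleResolverSketch {
    private final Map<String, String> titles = new HashMap<>();

    // Stand-in for init(VisitorEnvironment): called once before any lookups,
    // since a single instance serves the entire log.
    void init() {
        titles.put("/forum10/3968.htm", "Forums : Webmaster General : Strange URLs");
    }

    // Returns the title mapped to the path, or null when no mapping exists.
    String resolve(String path) {
        return titles.get(path);
    }

    public static void main(String[] args) {
        TitleResolverSketch resolver = new TitleResolverSketch();
        resolver.init();
        System.out.println(resolver.resolve("/forum10/3968.htm"));
        System.out.println(resolver.resolve("/unknown.htm"));
    }
}
```

Because only one instance is created, any lookup table of this kind would be built once in init and then reused for every hit in the log.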
The title will only be resolved if a page title resolver is provided in the
properties file for property:
pageTitleResolverClass
If no resolver is provided, or the resolver returns
null
for the title, then polliwog will use the url as the page title. By default polliwog also considers the query parameters in the url to be part of the page title; this can be controlled by using
property:
queryParmsArePartOfPage
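The fallback rule described above can be sketched as follows. This is an illustration only: the class and method names are hypothetical, and it assumes that queryParmsArePartOfPage is a boolean and that setting it to false simply drops the query string from the url used as the title. polliwog's actual handling may differ.

```java
// Hypothetical sketch of the title fallback rule; the names used here are
// illustrative and are not part of polliwog.
class TitleFallbackSketch {
    // resolvedTitle is what the resolver returned, or null when there is no
    // resolver or no mapping. queryParmsArePartOfPage mirrors the property
    // (assumed here to be a boolean).
    static String pageTitle(String resolvedTitle, String url, boolean queryParmsArePartOfPage) {
        if (resolvedTitle != null) {
            return resolvedTitle;
        }
        int queryStart = url.indexOf('?');
        if (!queryParmsArePartOfPage && queryStart >= 0) {
            // Assumption: the query string is dropped from the fallback title.
            return url.substring(0, queryStart);
        }
        return url;
    }

    public static void main(String[] args) {
        System.out.println(pageTitle(null, "/search.htm?q=logs", true));
        System.out.println(pageTitle(null, "/search.htm?q=logs", false));
        System.out.println(pageTitle("Search", "/search.htm?q=logs", false));
    }
}
```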
The resolved title is then set in the
Hit class. The resolved title is also transferred to the
org.polliwog.data.HitPage class.
A basic implementation of the page title resolver is provided by the
org.polliwog.resolvers.BasicPageTitleResolver class. This class uses an xml file to provide the mappings between urls and page titles. It is a simple implementation and may be used where there is a straightforward mapping between the two.
The format of the xml file is as follows:
XML Definition Help
The Children column shows the child elements that can be used within the specified element. Child elements can appear in any order (there is no enforcement via a DTD).
- A + after the element name indicates that at least 1 child element with that name must be present.
- A * after the element name indicates that 0 or more elements can be present.
- A ? after the element name indicates that either 1 or no elements with that name can be present.
If no symbol is provided after the name then one element must be provided.
The Attributes column shows the attributes that can be used on the specified element. Attribute definitions are written as: name(value_type,required|optional), where value_type is one of:
- string - A string value, this can be anything.
- integer - An integer value.
- class - A fully qualified classname.
- enum{values} - A specific value, one or more of those given in the brackets, which will be comma-separated.
- boolean - Either true or false.
Required and optional are represented as: R and O respectively.
Name | Root | Children | Parent(s) | Attributes | Description |
---|---|---|---|---|---|
page-titles | Y | type+, page+ | NONE | NONE | The root element. Each child page element defines a mapping between a url and a title. Each child type element defines a type of match that can be used within each page element. |
page | N | match+ | page-titles | id(string,O) name(string,R) | Defines a mapping between a url and a title. The id attribute, now deprecated in favour of using a JoSQL match type, defines the url, which should be the path part of the url including the query string. The name attribute defines the title of the page. |
match | N | TEXT | page | type(string,R) | Defines a match. The type attribute should map to one of the type elements (via the id attribute on the type element) so that the resolver knows what kind of match to create. When a JoSQLMatchType is used the content of the match element should be a JoSQL WHERE clause that will be executed to determine whether the url maps to the page title. The class available to the WHERE clause is: AbstractURIField. |
type | N | NONE | page-titles | id(string,R) class(string,R) | Defines a match type. The id attribute is the name of the match type; the type attribute on the match element should map to one of the ids. The class attribute should be a fully qualified classname that polliwog instantiates when a match is created. The class specified should implement the MatchType interface. |
The resolver uses property:
basicPageTitleResolverTitlesFile
to define the location of the xml file to use.
Example
<page-titles>
  <type id="josql"
        class="org.polliwog.resolvers.JoSQLMatchType" />
  <page name="Front Page">
    <match type="josql">
      path IN ('/index.html', '/')
    </match>
  </page>
</page-titles>
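To make the structure of the file concrete, the following sketch reads a titles file with the JDK's DOM parser and lists each page name with the type and text of its first match element. It is not how BasicPageTitleResolver is implemented and it does not evaluate the JoSQL WHERE clause; the class PageTitlesFileSketch and its readPages method are hypothetical names used for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper that only inspects the file structure; it does not
// run the JoSQL clauses and is not part of polliwog.
class PageTitlesFileSketch {
    // The example file from this section, inlined as a string.
    static final String EXAMPLE =
        "<page-titles>"
        + "<type id=\"josql\" class=\"org.polliwog.resolvers.JoSQLMatchType\" />"
        + "<page name=\"Front Page\">"
        + "<match type=\"josql\"> path IN ('/index.html', '/') </match>"
        + "</page>"
        + "</page-titles>";

    // Returns page name -> "matchType: match text" for the first match of each page.
    static Map<String, String> readPages(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> pages = new LinkedHashMap<>();
            NodeList pageNodes = doc.getElementsByTagName("page");
            for (int i = 0; i < pageNodes.getLength(); i++) {
                Element page = (Element) pageNodes.item(i);
                Element match = (Element) page.getElementsByTagName("match").item(0);
                if (match == null) {
                    continue; // a page without a match element carries no rule
                }
                pages.put(page.getAttribute("name"),
                    match.getAttribute("type") + ": " + match.getTextContent().trim());
            }
            return pages;
        } catch (Exception e) {
            throw new IllegalStateException("Could not read titles file", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readPages(EXAMPLE));
    }
}
```

In the real resolver the match text would be handed to the class named by the matching type element, rather than stored as a string as it is here.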
A more complex resolver, used for the pygmy possum site, is provided by class:
org.polliwog.resolvers.PygmyPossumPageTitleResolver.