Log Entry Formats
By default, polliwog can detect and process logs in the following formats:
  • Apache Combined Log format
  • W3C Extended Log File format (used by IIS)
  • Java Application Server
The formats are defined in XML files and passed to polliwog through the property:
logEntryFormats
This is a comma-separated list of the files that hold the formats polliwog should use. polliwog tries each format in turn to determine which one matches your log file.
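The "try each format in turn" behaviour can be pictured with a short Java sketch. It is only an illustration: the class FormatSelection, its method names, and the matcher callback are hypothetical and are not part of polliwog, whose real detection logic is not shown on this page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical illustration; none of these names come from polliwog.
class FormatSelection {

    // Split a comma-separated property value into trimmed, non-empty file names.
    static List<String> splitFormatList(String propertyValue) {
        List<String> files = new ArrayList<>();
        for (String part : propertyValue.split(",")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                files.add(name);
            }
        }
        return files;
    }

    // Return the first format file the matcher accepts, or null if none fits.
    // The matcher stands in for "does this format parse the log file?".
    static String firstMatching(List<String> formatFiles, Predicate<String> matches) {
        for (String file : formatFiles) {
            if (matches.test(file)) {
                return file;
            }
        }
        return null;
    }
}
```

Because the formats are tried in order, the order of the files in the list presumably decides which format wins when more than one could parse the log.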

polliwog can, however, be configured to process almost any log format.

To create a custom log format, it is recommended that you copy an existing format (the standard ones can be found in the data directory) and modify it as needed.

The following types of log field are supported:
| Name | Log Entry Format Ids | Associated polliwog class |
|---|---|---|
| Hostname/IP address | %h, c-ip, %client.name% | HostnameField |
| Date/Time | %t, date time, %datetime% | DateTimeField |
| Request Line | %r, cs-method cs-uri-stem cs-uri-query, %request% | RequestLineField (for a W3C format log an instance of W3CRequestLineField is used instead) |
| HTTP Status Code | %>s, sc-status, %status% | StatusCodeField |
| Bytes Returned | %b, sc-bytes, %response.length% | SizeField |
| Referer Header | %{Referer}i, cs(Referer), %header.referer% | RefererHeaderField |
| User Agent Header | %{User-agent}i, cs(User-Agent), %header.user-agent% | RequestHeaderField |
Other fields, such as the username, could be supported, but polliwog does not currently make use of them, so they have not been added.

When parsing a log file entry, polliwog tokenizes the line according to the format information. Tokens are separated by whitespace. For each field you can specify how many tokens make up the field (the default is 1) and whether the tokens are quoted.
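As a rough picture of whitespace tokenizing with quoting, here is a minimal Java sketch. It is not polliwog's tokenizer: the class QuotedTokenizer is hypothetical, and it applies one pair of quote characters to the whole line, whereas polliwog configures quoting per field.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not polliwog's own tokenizer.
class QuotedTokenizer {

    // Split a line on whitespace, keeping text between openQuote and closeQuote
    // together as one token. escape is the character that escapes the next
    // character inside a quoted token, or null if there is none.
    static List<String> tokenize(String line, char openQuote, char closeQuote, Character escape) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        boolean sawQuote = false; // lets an empty quoted token survive
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (escape != null && c == escape && i + 1 < line.length()) {
                    current.append(line.charAt(++i)); // keep the escaped character literally
                } else if (c == closeQuote) {
                    inQuotes = false;
                } else {
                    current.append(c);
                }
            } else if (c == openQuote) {
                inQuotes = true;
                sawQuote = true;
            } else if (Character.isWhitespace(c)) {
                if (current.length() > 0 || sawQuote) {
                    tokens.add(current.toString());
                    current.setLength(0);
                    sawQuote = false;
                }
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0 || sawQuote) {
            tokens.add(current.toString());
        }
        return tokens;
    }
}
```

Calling tokenize with '"' for both quote characters keeps a quoted request line such as "GET /index.html HTTP/1.0" as a single token, with the quote characters themselves stripped.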

For example, the three W3C log fields cs-method, cs-uri-stem and cs-uri-query together carry the same information as the single %r field of an Apache log. For Apache the format entry would be:
<field class="org.polliwog.fields.RequestLineField"
       openQuote='"'
       closeQuote='"'
       escapedBy="" />
And in W3C it would be:
<field class="org.polliwog.fields.W3CRequestLineField"
       tokenCount="3" />
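With a token count of 3, three whitespace-separated tokens are handed to the field as one value. The Java sketch below illustrates that idea only: TokenMerge and its methods are hypothetical names, and the real merging and splitting happen inside polliwog and its field classes.

```java
import java.util.List;

// Hypothetical illustration; these helpers are not part of polliwog.
class TokenMerge {

    // Join tokenCount tokens, starting at index start, into one token
    // separated by single spaces.
    static String merge(List<String> tokens, int start, int tokenCount) {
        return String.join(" ", tokens.subList(start, start + tokenCount));
    }

    // What a field implementation might then do: pull the merged token
    // apart again to get at the individual parts.
    static String[] splitMerged(String merged) {
        return merged.split(" ");
    }
}
```

For a W3C entry, merge would combine the cs-method, cs-uri-stem and cs-uri-query tokens, and splitMerged would recover them inside the field.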
The log format file has the following elements/attributes:
| Name | Root | Children | Parent(s) | Attributes | Description |
|---|---|---|---|---|---|
| config | Y | field+ | NONE | NONE | The root element. Each child field element describes a different field to parse in the log file. |
| field | N | param* | config | class(class,R), blank(boolean,O), openQuote(string,O), closeQuote(string,O), escapedBy(string,O), tokenCount(integer,O) | Defines a field in the log file. See the attribute notes below. |
| param | N | NONE | field | id(string,R), value(string,R) | Defines a parameter used to configure the field object instance created (as specified by the class attribute on the field element). The values supported depend on the field instance created. |

Attributes of the field element:

  • class: tells polliwog which object to use to model the data in the field (see above for details of the fields supported). It is only required if the blank attribute is not given.
  • blank: indicates that the field/token should be skipped. Use it for fields such as the username.
  • openQuote and closeQuote: when a field is quoted, these give the single character used to open and close the quoting (multiple characters are not supported).
  • escapedBy: if the field is quoted, this gives the single character used to escape the openQuote and closeQuote characters.
  • tokenCount: if the field spans multiple tokens, this gives how many there are. If the token count is greater than 1, polliwog merges the specified number of tokens into a single token, separated by a whitespace character. It is then the responsibility of the field implementation to pull the token apart and interpret the value.
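To make the element and attribute layout concrete, the following Java sketch reads the field elements of a format definition with the standard JDK DOM parser. It is an assumption-laden illustration and not polliwog's loader: FormatFileReader and FieldSpec are hypothetical names, and the sketch assumes the blank attribute is written as "true".

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical illustration; polliwog's real loader is not shown on this page.
class FormatFileReader {

    // Plain holder for the attributes of one field element.
    static class FieldSpec {
        String className;   // null when the class attribute is absent
        boolean blank;      // assumed to be written as blank="true"
        int tokenCount = 1; // the documented default
        String openQuote;
        String closeQuote;
        String escapedBy;
    }

    // Parse the XML text and return one FieldSpec per field element, in order.
    static List<FieldSpec> read(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getDocumentElement().getElementsByTagName("field");
            List<FieldSpec> specs = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element e = (Element) nodes.item(i);
                FieldSpec spec = new FieldSpec();
                spec.className = e.hasAttribute("class") ? e.getAttribute("class") : null;
                spec.blank = Boolean.parseBoolean(e.getAttribute("blank"));
                if (e.hasAttribute("tokenCount")) {
                    spec.tokenCount = Integer.parseInt(e.getAttribute("tokenCount"));
                }
                spec.openQuote = e.hasAttribute("openQuote") ? e.getAttribute("openQuote") : null;
                spec.closeQuote = e.hasAttribute("closeQuote") ? e.getAttribute("closeQuote") : null;
                spec.escapedBy = e.hasAttribute("escapedBy") ? e.getAttribute("escapedBy") : null;
                specs.add(spec);
            }
            return specs;
        } catch (Exception ex) {
            throw new IllegalArgumentException("Could not read format definition", ex);
        }
    }
}
```

Since each field element describes the next field on the log line, the order of the returned list mirrors the order of the fields in the log entry.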
Custom Log Format
The usual reason to create a custom log format is that your web server does not write a particular field (as defined in the standard log formats known to polliwog) to the log. For example, some web servers do not write the User Agent header or Referer header to the log by default.

In this case a custom log format is required. The best way to create one is to copy an existing format, change its name, and then modify the:
logEntryFormats
property to tell polliwog that it should use the new format. (Note: it is recommended that you use a property override for this rather than modifying the logEntryFormats property in the properties.xml file.)
Creating custom field implementations
It may be that the standard field classes (see the fields package for details) are not adequate for your log file. To create your own field class, extend org.polliwog.fields.AbstractField and then use the class name in the class attribute of the relevant field element.

Note: the Hit class ONLY supports the standard fields, so your implementation must return one of the existing field id values from the getFieldId() method. For example, if your implementation models a referer field then it should return the same field id as RefererHeaderField. The simplest way to do this is to extend the standard class:
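import java.util.Map;

// Assumed to live in the same package as the other standard field classes.
import org.polliwog.fields.RefererHeaderField;

public class MyRefererField extends RefererHeaderField
{

   // Override the init method. getFieldId() is inherited from
   // RefererHeaderField, so the Hit class still sees a referer field.
   public void init (Map params, String value)
   {

      // Process the field value here.

   }

}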