Analyzing Web Server Log Files with RapidMiner (Part 2): Sessions

The Read Server Log operator returns an example set. Each entry of the example set corresponds to an entry in the log file. The operator automatically associates a session attribute value with each of the examples. Many examples may share the same session attribute value. But hey, what exactly is a session?

Example set returned by the Read Server Log operator

The operator has an important parameter named session timeout whose value must be a non-negative integer. The value of this parameter is measured in milliseconds. A session is a series of log file entries that correspond to HTTP requests initiated by the same user agent from the same host, so that the time difference between the first and the last log file entry in the session is always less or equal to session timeout.

When the first hit by a specific user agent from a specific host is found in the log file, a new session attribute value is assigned to the corresponding example. All subsequent hits by the same user agent from the same site within session timeout are associated with the same session attribute value. Once session timeout is reached, a new session attribute value is generated for the same user agent from the same site.

The operator automatically converts date and time values to integers. The time attribute of the resulting example set stores the number of minutes since January 1, 1970, 00:00:00 GMT. (See the source code of the com.rapidminer.operator.io.loganalysis.LogFileSourceOperator class for implementation details.) Note that some information is lost during date and time conversion, since seconds in time values are discarded.

Advertisements
Tagged ,

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: