This document is intended for readers who already have some knowledge of how Google Analytics works and what it can do. Readers who are not familiar with the tool should first study the free materials published in the Google Analytics Academy.

Installing the Google Analytics Tracking Code

To use Google Analytics (hereinafter GA), first install the GA tracking code on the pages generated by the reader application. By default, the code is already embedded in the page templates of the reader application, but the tracking identifier must be provided. The identifier is available in the Google Analytics administration panel, in the “Service > Tracking information > Tracking code” section, and has the following form: UA-<number sequence>-<number sequence>.

Once the identifier has been copied from the Google Analytics pages, it should be placed in the WEB-INF/components/templates/AnalyticsComponent.vm file as the value of the ................ configuration parameter. The tracking code will be embedded correctly once the changes have been saved in the configuration file and the reader application has been restarted. As a result, the tracking code will be visible in the HTML code of the web pages of the reader application, and data will be reported to GA.
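Before the identifier is pasted into the configuration file, its form can be checked with a short script. The following is only a minimal sketch based on the UA-<number sequence>-<number sequence> form described above; the function name and the sample identifiers are made up for illustration.

```python
import re

# Form given in the text: UA-<number sequence>-<number sequence>.
TRACKING_ID_PATTERN = re.compile(r"UA-[0-9]+-[0-9]+")


def is_valid_tracking_id(value):
    """Return True if the value has the form UA-<digits>-<digits>."""
    return TRACKING_ID_PATTERN.fullmatch(value.strip()) is not None


if __name__ == "__main__":
    # Both identifiers below are invented examples, not real properties.
    print(is_valid_tracking_id("UA-12345678-1"))
    print(is_valid_tracking_id("UA-12345678"))
```

A check like this only catches copy-and-paste mistakes; it does not confirm that the identifier belongs to an existing GA account.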

Advanced Google Analytics Configuration

Apart from the installation of the tracking code, configuring advanced Google Analytics functions is also recommended. The most important advanced settings are described below.

Removing the Unnecessary Parameters from Website Addresses

Parameters that are unnecessary in a given context can be removed from web addresses in the administrative options of GA: in the “View” section, under “View settings”, enter the value “navref,ref” in the “Exclude URL query parameters” field.
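The effect of this setting can be pictured with a small script that drops the two parameters from an address. This is a sketch of the idea only, not of how GA implements it; the sample addresses and parameter values are assumptions.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Parameters listed in the "Exclude URL query parameters" setting.
EXCLUDED_PARAMETERS = {"navref", "ref"}


def strip_excluded_parameters(url):
    """Return the URL without the excluded query parameters."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in EXCLUDED_PARAMETERS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


if __name__ == "__main__":
    # Hypothetical address; the values of navref and ref are invented.
    print(strip_excluded_parameters("/dlibra/publication/123?navref=abc&ref=xyz"))
```

With the parameters removed, addresses that differ only in navref or ref are reported as one page instead of many separate rows.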

Configuring the Search Tracking inside the Digital Library

Search tracking inside the digital library can be configured in the administrative options of Google Analytics, in the “View” section, in the “View settings” option. There, enable the “Site search tracking” and “Search categories” options and enter:

  • the “q,search_value1,q,search_value2,q,search_value3,q,search_value4” sequence in the “Query parameter” field, so that simple and advanced queries (the first four constituents) are tracked; and
  • the “qf1,qf2,qf3,qf4,qf5” sequence in the “Category parameter” field, so that up to five search filters applied by users are tracked at once.

Site search reports can be found in the analytical part of GA, in the “Behavior > Search terms” section.
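To see which parts of a results address these settings refer to, the parameters can be read out with a few lines of Python. The sketch below only lists the values found in an address; it does not reproduce how GA combines or reports them. The sample address is an assumption, and the repeated “q” from the configured sequence is listed once.

```python
from urllib.parse import parse_qs, urlsplit

# Parameter names taken from the two sequences above.
QUERY_PARAMETERS = ["q", "search_value1", "search_value2",
                    "search_value3", "search_value4"]
CATEGORY_PARAMETERS = ["qf1", "qf2", "qf3", "qf4", "qf5"]


def extract_site_search(url):
    """Return (search terms, search filters) found in the URL."""
    params = parse_qs(urlsplit(url).query)
    terms = [v for name in QUERY_PARAMETERS for v in params.get(name, [])]
    filters = [v for name in CATEGORY_PARAMETERS for v in params.get(name, [])]
    return terms, filters


if __name__ == "__main__":
    # Hypothetical simple search narrowed with one filter.
    url = "/dlibra/results?q=chopin&action=SimpleSearchAction&qf1=music"
    print(extract_site_search(url))
```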


Configuring Goals

In GA, goals track the number of user sessions in which a particular action, important from the site owner's point of view, is performed. In the case of an online store, such an action could be adding a product to the cart, making a purchase, or sending a contact form. For a digital library, we suggest defining the series of goals described below.

The goals are defined in the administrative part of GA, in the “View > Goals” section. Once the goals have been defined, GA starts collecting the data needed to display goal statistics. The first results can be checked after a day or two in the analytical part of GA, in the “Conversions > Goals > Overview” section.

 

The User Has Used the Simple Search during the Session

Goal settings as in the screenshot below. The value of the regular expression: /dlibra/results\?q\=.+\&action=SimpleSearchAction\&.+
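The expression can be tried out locally before it is entered in GA. The sketch below assumes that the expression may match anywhere in the page address, which is why re.search is used; the sample addresses and their parameter values are invented.

```python
import re

# Regular expression from the goal definition above.
SIMPLE_SEARCH_GOAL = re.compile(
    r"/dlibra/results\?q\=.+\&action=SimpleSearchAction\&.+"
)


def is_simple_search(page):
    """Return True if the page address matches the simple search goal."""
    return SIMPLE_SEARCH_GOAL.search(page) is not None


if __name__ == "__main__":
    # Hypothetical addresses; only the first is a simple search result page.
    print(is_simple_search("/dlibra/results?q=chopin&action=SimpleSearchAction&type=-6"))
    print(is_simple_search("/dlibra/results?action=AdvancedSearchAction&type=-3"))
```

Note that the expression requires a non-empty query and at least one more parameter after the action parameter, so an address ending directly with action=SimpleSearchAction would not match.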

The User Has Used the Advanced Search during the Session

Goal settings as in the screenshot below. The values of particular fields:

  • the value in the “Begins with” field: /dlibra/results?action=AdvancedSearchAction
  • the “Path” switch: ON
  • the value in the “Step 1” section:
    • name: advanced search form
    • screen/page: /dlibra/advsearch
    • required: YES
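The intent of this configuration, a results page reached after the advanced search form, can be modelled as below. This is a simplified reading of the settings, not GA's own attribution logic; the function name and the sample session are assumptions.

```python
# Values from the goal definition above.
DESTINATION_PREFIX = "/dlibra/results?action=AdvancedSearchAction"
REQUIRED_STEP = "/dlibra/advsearch"


def advanced_search_goal_reached(session_pages):
    """Return True if the form page was viewed before a matching results page.

    session_pages is the ordered list of page addresses viewed in one session.
    """
    form_seen = False
    for page in session_pages:
        if page.startswith(REQUIRED_STEP):
            form_seen = True
        elif form_seen and page.startswith(DESTINATION_PREFIX):
            return True
    return False


if __name__ == "__main__":
    # Hypothetical session: start page, search form, then the results.
    session = [
        "/dlibra",
        "/dlibra/advsearch",
        "/dlibra/results?action=AdvancedSearchAction&type=-3",
    ]
    print(advanced_search_goal_reached(session))
```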


The User Has Used Search Result Filtering during the Session

Goal settings as in the screenshot below. The value of the regular expression:  /dlibra/results\?.+\&qf1=.+

The User Has Used Search Result Sorting during the Session

Goal settings as in the screenshot below. The value of the regular expression: /dlibra/results\?.+\&sf=.+
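The filtering and sorting expressions differ only in the parameter they look for (qf1 and sf), so they can be checked side by side. The sketch again assumes that a match anywhere in the address is sufficient; the sample addresses and the values of qf1 and sf are invented.

```python
import re

# Regular expressions from the filtering and sorting goal definitions.
FILTER_GOAL = re.compile(r"/dlibra/results\?.+\&qf1=.+")
SORT_GOAL = re.compile(r"/dlibra/results\?.+\&sf=.+")


def matched_goals(page):
    """Return the names of the goals whose expression matches the address."""
    goals = []
    if FILTER_GOAL.search(page):
        goals.append("filtering")
    if SORT_GOAL.search(page):
        goals.append("sorting")
    return goals


if __name__ == "__main__":
    base = "/dlibra/results?q=chopin&action=SimpleSearchAction"
    print(matched_goals(base + "&qf1=music"))           # filtered results
    print(matched_goals(base + "&qf1=music&sf=title"))  # filtered and sorted
```

Both expressions expect at least one other parameter before qf1 or sf, so an address in which qf1 or sf is the very first parameter would not be counted.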

The User Has Browsed Metadata Value Indexes during the Session

Goal settings as in the screenshot below. The value in the “Begins with” field: /dlibra/indexsearch

The User Has Displayed the Content of a Digital Object during the Session

Goal settings as in the screenshot below. The value of the regular expression: /dlibra/publication/[0-9]+/edition/[0-9]+/content.*

The User Has Remained for at Least Five Minutes on the Page with the Content of a Digital Object

Goal settings as in the screenshot below. The values of particular fields:

  • category: reading
  • action: heartbeat
  • label: remains empty
  • value: remains empty
  • Use the action value as the goal value during conversion: YES

The time spent on a page with the content of a digital object is tracked by dedicated reporting scripts built into the dLibra system, on the page that presents the content of the object. While the page is open, an event described as [category: reading, action: heartbeat] is sent to GA every 5 minutes, so the period during which the user interacts with the content of the object (for example, browses a PDF file) can be tracked. Without this additional support from the digital library system, such interaction is not captured by the GA tracking scripts, which leads to a significant underestimation of user session times.
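The relation between reading time and the number of heartbeat events can be sketched as follows. The sketch assumes that the first event is sent after the first full five minutes and that no event is lost; the function names are made up.

```python
HEARTBEAT_INTERVAL_SECONDS = 5 * 60  # one event every 5 minutes


def heartbeat_count(seconds_open):
    """Number of heartbeat events sent while the page stayed open."""
    return seconds_open // HEARTBEAT_INTERVAL_SECONDS


def tracked_minutes(seconds_open):
    """Reading time, in minutes, that the heartbeat events account for."""
    return heartbeat_count(seconds_open) * 5


if __name__ == "__main__":
    # A page left open for 21 minutes.
    print(heartbeat_count(21 * 60), tracked_minutes(21 * 60))
```

Under these assumptions, the time derived from heartbeat events is a lower bound: up to five minutes at the end of each reading period are not counted.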

The User Has Downloaded a Digital Object in the Form of a ZIP Archive during the Session

Goal settings as in the screenshot below. The value of the regular expression: /Content/[0-9]+/zip/
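A quick local check of this expression shows which addresses count as ZIP downloads. The sketch assumes a match anywhere in the address; the identifier and the file name in the samples are invented.

```python
import re

# Regular expression from the ZIP download goal definition.
ZIP_DOWNLOAD_GOAL = re.compile(r"/Content/[0-9]+/zip/")


def is_zip_download(page):
    """Return True if the address matches the ZIP download goal."""
    return ZIP_DOWNLOAD_GOAL.search(page) is not None


if __name__ == "__main__":
    print(is_zip_download("/Content/456/zip/"))      # ZIP archive download
    print(is_zip_download("/Content/456/file.pdf"))  # online display of a file
```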

The User Has Spent at Least Five Minutes in Total on a Page of the Digital Library during the Session

Goal settings as in the screenshot below.

Data Analysis Samples

The examples below show how user traffic in a digital library can be analyzed. They are by no means exhaustive; they have been chosen to highlight GA capabilities that are particularly interesting in the context of digital libraries.

Object Popularity – the Number of Metadata Page Impressions and of Online Impressions


  1. Select the “Behavior > Content analysis” report.
  2. In the table, in the “Level 1 of the page path” column, click the “/dlibra/” item.
  3. In the table, in the “Level 2 of the page path” column, click the “/publication/” item.
  4. The table with publication identifiers is visible; the default sorting order is by page views, in descending order. In this context, a page view is:
    1. for a publication with content – an impression of the metadata page or an impression of online content;
    2. for a publication without content (for example, higher-order elements of group publications, like a whole journal) – an impression of the page with the description/structure of an element of a group publication.
  5. When a publication identifier in the “Level 3 of the page path” column is clicked, a table showing how users interacted with the publication is displayed. The table should contain two rows:
    1. /zip/ – for downloading the publication as a ZIP archive; and
    2. /<main file name> – for displaying the publication online.
  6. In step 3 above, additional filtering may be introduced to only display downloads. For that purpose, the user should:
    1. add “Additional dimension” (the button over the table) named “Level 3 of the page path”;
    2. enable advanced filtering of table rows (the “Advanced” link on the right, over the table) and select:
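As background to the steps above, the way a page address is divided into path levels can be sketched in a few lines. This is a simplified model, assuming that every directory segment becomes one level with a trailing slash; it is not taken from the GA documentation, and the sample addresses are invented.

```python
def page_path_levels(page):
    """Split a page address into path levels such as "/dlibra/"."""
    path = page.split("?", 1)[0]  # the query string is not part of the path
    segments = [segment for segment in path.split("/") if segment]
    levels = []
    for index, segment in enumerate(segments):
        is_last = index == len(segments) - 1
        # Directories end with a slash; a final file or page name does not.
        suffix = "" if is_last and not path.endswith("/") else "/"
        levels.append("/" + segment + suffix)
    return levels


if __name__ == "__main__":
    print(page_path_levels("/dlibra/publication/123/edition/456/content"))
    print(page_path_levels("/Content/456/zip/"))
```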

Object Popularity – the Number of Online Impressions and of Downloads in the Form of a ZIP Archive

  1. Select the “Behavior > Content analysis” report.
  2. In the table, in the “Level 1 of the page path” column, click the “/Content/” item.
  3. The table with publication identifiers is visible; the default sorting order is by page views, in descending order. In this context, a page view is the number of online impressions and of downloads in the form of a ZIP archive.
  4. When a publication identifier in the “Level 2 of the page path” column is clicked, a table showing how users interacted with the publication is displayed. The table contains two rows:
    1. /zip/ – for downloading the publication as a ZIP archive; and
    2. /<main file name> – for displaying the publication online.
  5. In step 3 above, additional filtering may be introduced to only display downloads. For that purpose, the user should:
    1. add “Additional dimension” (the button over the table) named “Level 3 of the page path”;
    2. enable advanced filtering of table rows (the “Advanced” link on the right, over the table) and select:
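The distinction that this report makes between online impressions and ZIP downloads can also be reproduced outside GA, for example on a list of exported page addresses. The sketch below assumes the /Content/<identifier>/... address layout used in the steps above; the input list is invented.

```python
from collections import Counter


def count_interactions(pages):
    """Count online impressions and ZIP downloads per identifier."""
    online_views = Counter()
    downloads = Counter()
    for page in pages:
        segments = [s for s in page.split("?", 1)[0].split("/") if s]
        if len(segments) < 3 or segments[0] != "Content":
            continue  # not an address of object content
        identifier = segments[1]
        if segments[2] == "zip":
            downloads[identifier] += 1
        else:
            online_views[identifier] += 1
    return online_views, downloads


if __name__ == "__main__":
    # Hypothetical page views taken from an export.
    pages = ["/Content/456/zip/", "/Content/456/file.pdf", "/Content/789/zip/"]
    print(count_interactions(pages))
```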

Interest in Objects – the Time Spent on Object Content Pages

  1. Select the “Behavior > Events > The most frequent events” report.
  2. In the table with events, click the “reading” item (in the “Event category” column).
  3. In the table with events, click the “heartbeat” item (in the “Event action” column).
  4. In the “Event label” column, the table lists the identifiers of the editions whose content users have viewed the longest. Every label has the following structure: edition-<numerical identifier of the edition>-format-<numerical identifier of the format>; the last element is only filled in for multi-format objects.
  5. The columns are interpreted in the following way:
    1. the total number of events – the total number of the five-minute intervals that users have spent viewing the content of the edition;
    2. unique events – the number of unique user sessions, in which a user has spent at least five minutes viewing the content of the edition; and
    3. event value – the total number of minutes that users have spent viewing the content of the edition.
  6. To facilitate data analysis, an additional dimension can be added to the table (the button just over the table). The suggested dimensions are:
    1. page – a column with the links to the objects will be added; and
    2. page title – a column with the titles of the objects will be added.
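The label structure and the three columns described above can be illustrated with a short script that processes a list of heartbeat events. This is a sketch under several assumptions: the events are given as (session, label, value) tuples, each event carries a value of 5 minutes, and a label without a format is written either as edition-<id> or as edition-<id>-format- with nothing after it.

```python
import re
from collections import defaultdict

# Label form: edition-<edition id>-format-<format id>; the format is optional.
LABEL_PATTERN = re.compile(r"edition-(\d+)(?:-format-(\d*))?")


def parse_label(label):
    """Return (edition id, format id or None), or None for other labels."""
    match = LABEL_PATTERN.fullmatch(label)
    if match is None:
        return None
    format_id = int(match.group(2)) if match.group(2) else None
    return int(match.group(1)), format_id


def summarize(events):
    """Compute total events, unique events, and event value per label."""
    rows = defaultdict(lambda: {"total": 0, "sessions": set(), "value": 0})
    for session_id, label, value in events:
        row = rows[label]
        row["total"] += 1              # one more five-minute interval
        row["sessions"].add(session_id)
        row["value"] += value          # minutes, if each event carries 5
    return {
        label: {
            "total_events": row["total"],
            "unique_events": len(row["sessions"]),
            "event_value": row["value"],
        }
        for label, row in rows.items()
    }


if __name__ == "__main__":
    # Hypothetical events: two sessions reading the same edition.
    events = [("s1", "edition-1", 5), ("s1", "edition-1", 5), ("s2", "edition-1", 5)]
    print(parse_label("edition-123-format-2"))
    print(summarize(events))
```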

 
