Elasticsearch Data Store - BloomReach Experience - Open Source CMS

This article covers Hippo CMS version 10. An updated version is available that covers our most recent release.

23-09-2016

Elasticsearch Data Store

Introduction

To use the Trends panel and to see the experiment servings, visits must be stored in Elasticsearch.

Installation

Download and install Elasticsearch 1.7 (Hippo currently does not support Elasticsearch 2.x).

Then add the following Maven artifact to your site module's pom.xml:

<dependency>
  <groupId>com.onehippo.cms7</groupId>
  <artifactId>hippo-addon-targeting-state-elastic</artifactId>
  <scope>runtime</scope>
  <version>${hippo.addon-targeting.version}</version>
</dependency>

For guidance on installing, configuring, deploying, and administering Elasticsearch, please refer to the Elasticsearch documentation. Hippo does not require any special or additional steps to set up Elasticsearch. For production environments, we recommend a cluster of at least two Elasticsearch nodes for high availability.
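
Since only Elasticsearch 1.7 is supported, it can help to confirm which version a node reports before wiring it into the CMS. The following is a minimal sketch, not an official procedure: it assumes the node's root endpoint returns JSON with a version "number" field, and the ES_URL default and the function names are illustrative.

```shell
# Sketch: check that an Elasticsearch node reports a 1.7.x version.
# ES_URL is an assumption; point it at one of your own nodes.
ES_URL="${ES_URL:-http://localhost:9200/}"

# Extract the "number" field from the JSON returned by the root endpoint.
es_version_from_json() {
  printf '%s\n' "$1" \
    | sed -n 's/.*"number"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' \
    | head -n 1
}

# Succeed only for 1.7 releases (2.x is not supported by this Hippo version).
is_supported_es_version() {
  case "$1" in
    1.7|1.7.*) return 0 ;;
    *)         return 1 ;;
  esac
}

# Query the node and report whether its version is usable.
check_es_node() {
  version=$(es_version_from_json "$(curl -s -S "$ES_URL")")
  if is_supported_es_version "$version"; then
    echo "Elasticsearch $version: supported"
  else
    echo "Elasticsearch ${version:-unknown}: not supported" >&2
    return 1
  fi
}
```

Running check_es_node against a live node prints the detected version; the two helper functions can also be used on their own with a saved response.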

Configure Visits Data Store

Change the visits data store configuration using the Console application:

/targeting:targeting/targeting:datastores/targeting:visits
  - targeting:storefactoryclass =
         com.onehippo.cms7.targeting.storage.elastic.ElasticStoreFactory

Available properties

Property | Type | Default | Description
--- | --- | --- | ---
locations | strings (multiple) | http://localhost:9200/ | A multi-valued string property containing the URLs of nodes in the Elasticsearch cluster to connect to. One location is enough to connect to the cluster. Specifying multiple locations makes the startup process more robust.
indexName | string | visits | The name of the Elasticsearch index (roughly equivalent to a database in an RDBMS) to use.
authentication | string | none | Optional username and password, separated by a semicolon, for when Elasticsearch requires authenticated access.
timeToLiveSeconds | long | infinite | How long items stored by this store should be kept, in seconds.
maxConnections | long | 20 | Maximum number of client threads in the connection pool used to connect to Elasticsearch.
retrieveVisitorTimeoutMillis | long | 20 | The time in milliseconds to wait for a connection with the relevance data store to be established.
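
Because timeToLiveSeconds is expressed in seconds, retention periods usually need converting from days. A small sketch of that arithmetic follows; the 90-day figure and the function name are only illustrations, not recommended values.

```shell
# Convert a retention period in days to seconds (for timeToLiveSeconds).
days_to_seconds() {
  echo $(( $1 * 24 * 60 * 60 ))
}

days_to_seconds 90   # prints 7776000
```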

Configure Visitor & Requests Data Stores (Development Environments Only)

Using Elasticsearch for the Visitor and Requests data stores is only supported in development environments. In production environments, Couchbase is required. See Relevance Data Stores for more information.

To configure Elasticsearch for the requestlog data store:

/targeting:targeting/targeting:datastores/targeting:requestlog
  - targeting:storefactoryclass =
         com.onehippo.cms7.targeting.storage.elastic.ElasticStoreFactory

The default value for the indexName property here is request-log.

To configure Elasticsearch for the visitors data store:

/targeting:targeting/targeting:datastores/targeting:visitors
  - targeting:storefactoryclass =
         com.onehippo.cms7.targeting.storage.elastic.ElasticStoreFactory

The default value for the indexName property here is targeting-data.

Enable Visits Aggregator and Model Trainer Jobs

To use Trends, the Visits Aggregator job must be enabled:

/targeting:targeting/targeting:dataflow/visitsAggregator
  - running = true

To use Experiments, the Model Trainer job must be enabled:

/targeting:targeting/targeting:dataflow/modelTrainer
  - running = true

Configure Elasticsearch

Create the indices configured above (referred to by the indexName property) in Elasticsearch, e.g. using curl:

curl -s -S -XPUT http://elastic.host:9200/indexname

The indices must be readable and writable by the user configured in the authentication property. How to set this up is outside the scope of this document because it depends on the deployment scenario of your Elasticsearch instance. Please consult your administrator to find out how you can create the index in your Elasticsearch instance.
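
When all three data stores use Elasticsearch with their default index names (visits, request-log, and targeting-data), the indices can be created in one pass. The sketch below wraps the curl command shown above in a loop. The ES_URL, ES_AUTH, and DRY_RUN variables and the function name are assumptions made for illustration. ES_AUTH uses curl's user:password form and only applies if your endpoint is protected with HTTP basic authentication.

```shell
# Sketch: create the indices used by the Relevance data stores.
# ES_URL is a placeholder host; ES_AUTH (user:password) is optional.
ES_URL="${ES_URL:-http://elastic.host:9200}"

create_indices() {
  for index in "$@"; do
    if [ -n "${DRY_RUN:-}" ]; then
      # Only print what would be requested; nothing is sent.
      echo "PUT ${ES_URL}/${index}"
    elif [ -n "${ES_AUTH:-}" ]; then
      curl -s -S -u "$ES_AUTH" -XPUT "${ES_URL}/${index}"
    else
      curl -s -S -XPUT "${ES_URL}/${index}"
    fi
  done
}

# Example (prints the requests without sending them):
#   DRY_RUN=1 create_indices visits request-log targeting-data
```

If you changed any indexName property, pass your own index names instead of the defaults.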

Troubleshooting

If you configured Elasticsearch as the Visits data store and you see the following error message in the logs:

[INFO] [talledLocalContainer] 03.12.2015 17:15:58 WARN  dataflowScheduling-1 [AbstractDataFlowService.run:153] Exception while processing job 'VisitsAggregatorJob': java.lang.UnsupportedOperationException: retrieveLogInterval has not yet been implemented

then you are using the In-Memory store for the request log. The Visits Aggregator job, which aggregates data from the request log, is not supported by the In-Memory store.

In a development environment, use either Elasticsearch or Couchbase for the request log.

In a production environment, use Couchbase.
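
To check whether a given log file contains this error, a simple search is usually enough. This is a minimal sketch; the function name is hypothetical and the log file path is whatever your container writes to, since no default location is assumed here.

```shell
# Sketch: count occurrences of the Visits Aggregator error in a log file.
# The caller passes the log file path as the first argument.
count_aggregator_errors() {
  grep -c 'retrieveLogInterval has not yet been implemented' "$1"
}

# Example: count_aggregator_errors /path/to/your/site.log
```

A count above zero suggests the request log is still backed by the In-Memory store.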
