Run an Updater Script

Introduction

Goal

Run a Groovy Updater Script to perform bulk changes on repository content.

Background 

The Updater Editor allows developers to create, manage and run updater scripts against a running repository from within the CMS UI. Updater scripts can perform bulk changes to existing content.

See Write an Updater Script for more information. This page explains the execution options available in the Updater Editor.

With Great Power Comes Great Responsibility

Updater scripts can modify large parts of your repository. Use them with care.

Security

The scripts are executed via a custom Groovy ClassLoader which protects against obvious and trivial mistakes and misuse (for example, invoking System.exit()). However, this is not intended to provide a fully protected Groovy sandbox: technically, Groovy updater scripts can still be used to execute external programs, possibly compromising the server environment.
Therefore, protection against incorrect usage of Groovy updater scripts must be enforced by limiting access and usage to trusted developers and administrators only.

Manage Updater Scripts

The left side of the Updater Editor consists of three parts:

  • Registry
    Contains all created updater scripts. Select a script from the registry to execute it.
  • Queue
    Contains all scripts that are being executed or waiting to be executed. The scripts are executed in the order in which they were added to the queue. Only one script is executed at a time, even in a clustered environment. You can stop the currently executing script and delete queued scripts from the queue. Stopping a script lets the current NodeUpdateVisitor#doUpdate call finish before the script actually stops. The output of the script is shown in the bottom part of the screen and is refreshed every few seconds.
  • History
    Contains all scripts that have been (fully or partially) executed. Scripts that have been executed can be reverted from here, provided they support this feature by having implemented undoUpdate.
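The stop behaviour of the queue can be pictured as cooperative cancellation. The following is a minimal, self-contained sketch and not the engine's actual code: StopSketch, the stopRequested flag and the Consumer standing in for doUpdate are hypothetical names. The flag is checked only between calls, so a call that has already started always completes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

/** Hypothetical model of stopping a running script; not the actual updater engine. */
final class StopSketch {

    /** Visits nodes in order and returns the nodes whose update call completed. */
    static List<String> run(List<String> nodes, AtomicBoolean stopRequested,
                            Consumer<String> doUpdate) {
        List<String> completed = new ArrayList<>();
        for (String node : nodes) {
            if (stopRequested.get()) {
                break;                 // the flag is checked only between calls
            }
            doUpdate.accept(node);     // a started call is never cut short
            completed.add(node);
        }
        return completed;
    }

    public static void main(String[] args) {
        AtomicBoolean stop = new AtomicBoolean();
        // The stop request arrives while node "b" is being processed.
        List<String> done = run(List.of("a", "b", "c"), stop,
                n -> { if (n.equals("b")) stop.set(true); });
        System.out.println(done); // prints [a, b]
    }
}
```

Node "b" is still completed because the request arrived during its update; node "c" is never visited.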

Execution Options

Node Selection

The updater engine uses the visitor pattern. The Select node using option (Repository path, XPath query, or Updater) specifies which nodes are visited:

  • Repository path is an absolute path in the repository, for example: /content/documents or /hst:hst/hst:configurations. All nodes below the path will be visited, including the node specified by the path itself.
  • XPath query is an XPath query that selects the nodes to visit. Example queries are:
     //element(*, hippo:document)  all nodes of type 'hippo:document'
     /jcr:root/hst:hst/hst:configurations//element(*, hst:sitemapitem)  all nodes of type 'hst:sitemapitem' below /hst:hst/hst:configurations
     //*[@example:title='foo']  all nodes that have the property 'example:title' set to the value 'foo'
  • Updater indicates that the script itself provides the logic for selecting the nodes to visit. The script must implement (override) the firstNode and nextNode methods provided by the BaseNodeUpdateVisitor base class.
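For the Updater option, the division of labour between script and engine can be sketched as below. This is a simplified model under stated assumptions: nodes are plain path strings, the method signatures are reduced compared to the real base class, and returning null to signal "no more nodes" is an assumption of this sketch. CustomNavigationSketch and visitAll are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical model of script-driven node selection; signatures are simplified. */
final class CustomNavigationSketch {

    private Iterator<String> remaining;

    /** Called once by the engine to obtain the first node to visit. */
    String firstNode(List<String> candidates) {
        remaining = candidates.iterator();
        return nextNode();
    }

    /** Called repeatedly; null means there is nothing left to visit (assumption). */
    String nextNode() {
        return remaining.hasNext() ? remaining.next() : null;
    }

    /** Stand-in for the engine loop that drives the two methods above. */
    static List<String> visitAll(CustomNavigationSketch script, List<String> candidates) {
        List<String> visited = new ArrayList<>();
        for (String node = script.firstNode(candidates); node != null; node = script.nextNode()) {
            visited.add(node);
        }
        return visited;
    }

    public static void main(String[] args) {
        List<String> visited = visitAll(new CustomNavigationSketch(),
                List.of("/content/a", "/content/b"));
        System.out.println(visited); // prints [/content/a, /content/b]
    }
}
```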

Performance

Changes to visited nodes are saved in batches. Each executed script can specify a Batch Size and a Throttle value:

  • batch size is the number of nodes that have to be modified before changes are written to the repository. The engine counts updated nodes by checking the return value of the #doUpdate(Node) method: true means updated, false means skipped, and an exception or error means failed. Keep the batch size reasonably low, say fifty or a hundred, to avoid large changesets that consume a lot of memory.
    See the Reporting of Execution section for more details.
  • throttle is the number of milliseconds to wait after each batch. This prevents a running repository from being swamped with changes and becoming unresponsive to other users.
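How batch size and throttle interact can be sketched with a small, self-contained model. It is not the engine's implementation: BatchingSketch, the Predicate standing in for doUpdate, and the save counter are hypothetical. Only nodes reported as updated count towards a batch, each full batch triggers one save followed by a pause, and the sketch assumes that a final, partially filled batch is saved as well.

```java
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical model of batched saving; not the actual updater engine. */
final class BatchingSketch {

    /**
     * Visits each node, counts those reported as updated, and "saves" whenever
     * batchSize updates have accumulated. Returns the number of saves performed.
     */
    static int run(List<String> nodes, Predicate<String> doUpdate,
                   int batchSize, long throttleMillis) {
        int pending = 0; // updated nodes that have not been saved yet
        int saves = 0;
        for (String node : nodes) {
            if (doUpdate.test(node)) {   // true = updated, false = skipped
                pending++;
            }
            if (pending == batchSize) {
                saves++;                 // stands in for writing the batch
                pending = 0;
                pause(throttleMillis);   // throttle: wait after each batch
            }
        }
        if (pending > 0) {
            saves++;                     // final, partially filled batch (assumption)
        }
        return saves;
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
        }
    }

    public static void main(String[] args) {
        List<String> nodes = List.of("a", "b", "skip", "c", "d", "e");
        int saves = run(nodes, n -> !n.equals("skip"), 2, 10);
        System.out.println("saves: " + saves); // prints "saves: 3"
    }
}
```

With five updated nodes and a batch size of two, the model saves after the second and fourth update and once more for the remaining node. The skipped node does not count towards any batch.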

Logging

  • Log Level
    You can set the log level of an Updater script to TRACE, DEBUG or INFO. DEBUG is the default. For example, if you set the log level to INFO in the Updater Editor, log messages that the script writes at TRACE or DEBUG level are not printed.
  • Log Target (available since Bloomreach Experience Manager versions 14.7.7 and 15.0.1)
    You can select the target for the Updater script to write log messages to from one of the following: LOG FILES or REPOSITORY. When LOG FILES is selected, log messages are written to regular log files using the logger for org.onehippo.repository.update.UpdaterExecutionReport and not displayed in the UI. When REPOSITORY is selected, log messages are written to JCR nodes and displayed in the UI when running the script.
    The Log Target option is available only when running in local development mode (i.e. using the cargo.run profile) or when the system property groovy.persist.logs.supported is set to true. In all other scenarios, the Log Target option is not available and log messages are written to log files only.
    Avoid Log Target REPOSITORY for frequently run scripts that produce large amounts of log output, as this fills up the datastore quickly and leads to performance issues.
    In Bloomreach Experience Manager versions 14.7.6 and 15.0.0:
    • Only when using the cargo.run profile (local development), log messages by Groovy Updater Scripts are written to JCR nodes and displayed in the UI when running the script.
    • In all other scenarios, log messages are written to regular log files using the logger for org.onehippo.repository.update.UpdaterExecutionReport and not displayed in the UI.
    In Bloomreach Experience Manager versions 14.7.5 and earlier, log messages by Groovy Updater Scripts are always written to JCR nodes and displayed in the UI when running the script.
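The effect of the Log Level setting amounts to a threshold check, which can be sketched as follows. This is an illustrative model only: LogLevelSketch and isPrinted are hypothetical names, and only the three levels selectable in the Updater Editor are modelled.

```java
/** Hypothetical model of log level filtering; not the actual logging code. */
final class LogLevelSketch {

    /** Levels in increasing order of severity, as offered by the Updater Editor. */
    enum Level { TRACE, DEBUG, INFO }

    /** A message is printed when its level is at or above the configured level. */
    static boolean isPrinted(Level message, Level configured) {
        return message.compareTo(configured) >= 0; // enums compare by declaration order
    }

    public static void main(String[] args) {
        System.out.println(isPrinted(Level.DEBUG, Level.INFO));  // prints false
        System.out.println(isPrinted(Level.INFO, Level.INFO));   // prints true
        System.out.println(isPrinted(Level.DEBUG, Level.TRACE)); // prints true
    }
}
```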

Parameters

Scripts can accept Parameters:

  • Parameters can be specified with a valid JSON string which defines a map of parameter name (String) and parameter value (Object) pairs.
    Example: { "basePath": "/content/documents/myproject/news", "tag" : "gogreen" }
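Inside a script, the parsed parameters behave like an ordinary map of names to values. The sketch below mirrors the example above with a plain Java map. ParametersSketch and stringParam are hypothetical helpers, and the "locale" parameter is invented for illustration; how a real script obtains the map is covered in Write an Updater Script.

```java
import java.util.Map;

/** Hypothetical helper for reading script parameters defensively. */
final class ParametersSketch {

    /** Returns the named parameter if it is a String, otherwise the fallback. */
    static String stringParam(Map<String, Object> params, String name, String fallback) {
        Object value = params.get(name);
        return value instanceof String ? (String) value : fallback;
    }

    public static void main(String[] args) {
        // Equivalent of: { "basePath": "/content/documents/myproject/news", "tag": "gogreen" }
        Map<String, Object> params = Map.of(
                "basePath", "/content/documents/myproject/news",
                "tag", "gogreen");
        System.out.println(stringParam(params, "tag", "none"));  // prints gogreen
        System.out.println(stringParam(params, "locale", "en")); // prints en (not supplied)
    }
}
```

Falling back to a default keeps a script usable when the Parameters field is left empty or a value has an unexpected type.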

Execution Mode

There are two ways to execute a script:

  • Execute will visit all specified nodes and save the changes to the repository after each batch. The UUIDs of all modified nodes are logged in case the script has to be undone later.

  • Dry run will also visit all specified nodes, but never writes any changes to the repository (i.e. the engine calls Session.refresh(false) after each batch).

Use dry run to try out new scripts without risk.
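The difference between the two modes can be illustrated with a small in-memory model. This is a sketch only: DryRunSketch and its two maps are hypothetical stand-ins for the repository and for unsaved session changes, not the real JCR API.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory model of Execute versus Dry run; not the real engine. */
final class DryRunSketch {

    private final Map<String, String> persisted = new HashMap<>();
    private final Map<String, String> pending = new HashMap<>();

    /** Records a change that is not yet written, like an unsaved session change. */
    void setTitle(String path, String title) {
        pending.put(path, title);
    }

    /** Ends a batch: writes pending changes, or discards them when dryRun is true. */
    void endBatch(boolean dryRun) {
        if (!dryRun) {
            persisted.putAll(pending); // Execute: the batch is saved
        }
        // Either way no unsaved changes remain; in a dry run this discard
        // is comparable to calling Session.refresh(false).
        pending.clear();
    }

    String persistedTitle(String path) {
        return persisted.get(path);
    }

    public static void main(String[] args) {
        DryRunSketch repo = new DryRunSketch();
        repo.setTitle("/content/doc", "draft");
        repo.endBatch(true);
        System.out.println(repo.persistedTitle("/content/doc")); // prints null
        repo.setTitle("/content/doc", "final");
        repo.endBatch(false);
        System.out.println(repo.persistedTitle("/content/doc")); // prints final
    }
}
```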

Automatically Execute Updater Scripts on Startup

It's possible to automatically execute scripts on startup by using the repository-data-application module to add the scripts as content definitions to /hippo:configuration/hippo:update/hippo:queue. Once the application has started, it will execute any scripts in the queue.

As of version 13, the updater execution module is configured by default to run scripts on full CMS nodes only.

Technically, it is possible to automatically execute a script in a delivery-tier-only environment as long as both of the following are true:

  • At the node /hippo:configuration/hippo:modules/updater-execution, the property hipposys:cmsonly is set to false.
  • The updater script only depends on libraries available on the classpath in that environment (typically this does not include CMS libraries!).

Undo Updates

An updater script can support undo of its modifications by implementing the undoUpdate method.

Scripts in the History that have been executed can be undone by clicking the Undo button. The updater engine will then visit only those nodes again that were modified before by the doUpdate method. For these modified nodes it will call the method undoUpdate.

Items in the History that were dry runs or that are themselves the result of an undo run cannot be undone.
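The bookkeeping behind undo can be sketched with a simplified model. It is not the engine's code: UndoSketch, the tag-based update and the plain string identifiers are hypothetical. The point it illustrates is that only nodes actually modified by the update are revisited, so nodes that were skipped keep their original state.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of undo bookkeeping; not the actual updater engine. */
final class UndoSketch {

    /** Adds the tag where it is missing and returns the ids of the modified nodes. */
    static List<String> execute(Map<String, Set<String>> nodes, String tag) {
        List<String> modified = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : nodes.entrySet()) {
            if (entry.getValue().add(tag)) {  // true only if the tag was absent
                modified.add(entry.getKey()); // recorded so the change can be undone
            }
        }
        return modified;
    }

    /** Revisits only the recorded nodes and reverses the change. */
    static void undo(Map<String, Set<String>> nodes, List<String> modified, String tag) {
        for (String id : modified) {
            nodes.get(id).remove(tag);
        }
    }

    public static void main(String[] args) {
        Map<String, Set<String>> nodes = new LinkedHashMap<>();
        nodes.put("id-1", new HashSet<>());                  // will be modified
        nodes.put("id-2", new HashSet<>(Set.of("gogreen"))); // already tagged, skipped
        List<String> modified = execute(nodes, "gogreen");
        System.out.println(modified); // prints [id-1]
        undo(nodes, modified, "gogreen");
        System.out.println(nodes);    // prints {id-1=[], id-2=[gogreen]}
    }
}
```

Because "id-2" was skipped during the update, the undo run leaves its pre-existing tag untouched.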

Bootstrapping of Updater Scripts

As of Bloomreach Experience Manager version 14.1.0, an updater script that a developer adds through the CMS UI is stored as config instead of as content. Newly added updater scripts are also stored in separate YAML files. This mechanism allows developers to bootstrap updater scripts without creating content actions for them. For more information about bootstrapping of content and config, see the Manage Content page.

Strict Mode

This feature is available since Bloomreach Experience Manager 15.4.0.

Strict Mode disables creating and modifying Updater Scripts, so CMS users with admin privileges can only run existing scripts. This way, implementation projects can opt to have Updater Scripts controlled solely by developers through a deployment workflow.

When Strict Mode is enabled, the Updater Editor is shown as read-only, except for the Parameters field, the log level and the log target.

Strict Mode can be enabled by setting the system property groovy.strict.mode to true, for example by passing -Dgroovy.strict.mode=true as a JVM argument.
