Run an Updater Script - BloomReach Experience - Open Source CMS

This article covers Hippo CMS version 12. An updated version is available that covers our most recent release.

02-08-2018

Run an Updater Script

Introduction

Goal

Run a Groovy Updater Script to perform bulk changes on repository content.

Background 

The Updater Editor allows developers to create, manage and run updater scripts against a running repository from within the CMS UI. Updater scripts can perform bulk changes to existing content.

See Write an Updater Script for more information. This page explains the execution options available in the Updater Editor.

With Great Power Comes Great Responsibility

Updater scripts can modify large parts of your repository. Use them with care.

Security

The scripts are executed via a custom Groovy ClassLoader which protects against obvious and trivial mistakes and misuse (for example, invoking System.exit()). However, this is not intended to provide a fully protected Groovy sandbox. Technically, Groovy updater scripts can still be used to execute external programs, possibly compromising the server environment.
Therefore, protect against incorrect usage of Groovy updater scripts by limiting access and usage to trusted developers and administrators only.

Manage Updater Scripts

The left side of the Updater Editor consists of three parts:

  • Registry
    Contains all created updater scripts. Select a script from the registry to execute it.
  • Queue
    Contains all scripts that are being executed or are waiting to be executed. The scripts are executed in the order in which they were added to the queue. Only one script is executed at a time, even in a clustered environment. You can stop the currently executing script, and delete queued scripts from the queue. Stopping a script lets the current NodeUpdateVisitor#doUpdate call finish before the script actually stops. The output of the script is shown in the bottom part of the screen and is updated live every few seconds.
  • History
    Contains all scripts that have been (fully or partially) executed. Executed scripts can be reverted from here, provided they implement undoUpdate.

Execution Options

Node Selection

The updater engine uses the visitor pattern. The Select node using option (Repository path, XPath query, or Updater) specifies which nodes are visited:

  • Repository path is an absolute path in the repository, for example: /content/documents or /hst:hst/hst:configurations. All nodes below the path will be visited, including the node specified by the path itself.
  • XPath query is an XPath query that selects the nodes to visit. Example queries are:
      //element(*, hippo:document)  selects all nodes of type 'hippo:document'
      /jcr:root/hst:hst/hst:configurations//element(*, hst:sitemapitem)  selects all nodes of type 'hst:sitemapitem' below /hst:hst/hst:configurations
      //*[@example:title='foo']  selects all nodes that have the property 'example:title' set to the value 'foo'
  • Updater indicates that the script itself provides the logic for determining which nodes to visit. The script must implement (override) the firstNode and nextNode methods provided by the BaseNodeUpdateVisitor base class.
    This feature is available since Bloomreach Experience Manager v12.1.1 (also backported to v12.0.4, v11.2.5 and v10.2.9).
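
The Updater option can be pictured with a small stand-alone model. This is only a sketch: CustomIterationSketch and visitAll are hypothetical names, plain path strings stand in for JCR nodes, and the convention that a null return ends the iteration is an assumption rather than something this page states.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a script that supplies its own nodes to visit.
// Real scripts work with JCR nodes; strings are used here to stay self-contained.
class CustomIterationSketch {

    private Iterator<String> remaining;

    /** Starts the iteration and returns the first path to visit, or null if there is none. */
    String firstNode(List<String> candidates) {
        remaining = candidates.iterator();
        return nextNode();
    }

    /** Returns the next path to visit, or null when nothing is left (assumed convention). */
    String nextNode() {
        return remaining.hasNext() ? remaining.next() : null;
    }

    /** Drives the iteration the way an engine might: first node, then next node until null. */
    static List<String> visitAll(List<String> candidates) {
        CustomIterationSketch script = new CustomIterationSketch();
        List<String> visited = new ArrayList<>();
        for (String path = script.firstNode(candidates); path != null; path = script.nextNode()) {
            visited.add(path);
        }
        return visited;
    }

    public static void main(String[] args) {
        System.out.println(visitAll(List.of("/content/documents/a", "/content/documents/b")));
    }
}
```

The point of the model is that the script, not a path or query, decides the order and the end of the visit.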

Performance

Changes to visited nodes are saved in batches. Each executed script can specify a Batch Size and a Throttle value:

  • batch size is the number of nodes that have to be modified before changes are written to the repository. The engine counts updated nodes by checking the return value of the #doUpdate(Node) method: true means updated, false means skipped, and an exception or error means failed. Keep the batch size reasonably low, say fifty or a hundred, to avoid large changesets that consume a lot of memory.
    See the Reporting of Execution section for more detail.
  • throttle is the number of milliseconds to wait after each batch. This prevents a running repository from being swamped with changes and becoming unresponsive to other users.
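
How batch size and throttle interact can be modelled in a few lines. The sketch below is a simplified model, not the engine's code: BatchLoopSketch and its methods are hypothetical names, strings stand in for nodes, and writing a final partial batch at the end is an assumption.

```java
import java.util.List;
import java.util.function.Predicate;

// Simplified model of the batching loop; not the actual engine code.
class BatchLoopSketch {

    /** Returns how many times changes would be written for the given input. */
    static int run(List<String> paths, Predicate<String> doUpdate,
                   int batchSize, long throttleMillis) {
        int updatedInBatch = 0;
        int writes = 0;
        for (String path : paths) {
            // Only nodes for which doUpdate returns true count towards the batch.
            if (!doUpdate.test(path)) {
                continue;
            }
            updatedInBatch++;
            if (updatedInBatch == batchSize) {
                writes++;               // stands in for writing the batch
                updatedInBatch = 0;
                pause(throttleMillis);  // throttle: wait after each batch
            }
        }
        if (updatedInBatch > 0) {
            writes++;                   // assumption: a final partial batch is written too
        }
        return writes;
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
    }

    public static void main(String[] args) {
        List<String> paths = List.of("/a", "/b", "/c", "/d", "/e");
        // All five nodes count as updated; batch size 2 gives batches of 2, 2 and 1.
        System.out.println(run(paths, path -> true, 2, 10L)); // prints 3
    }
}
```

Skipped nodes do not fill a batch, so a script that skips most of what it visits writes far less often than the number of visited nodes suggests.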

Logging

  • Log Level: select the log level of an updater script from TRACE, DEBUG, and INFO. DEBUG is the default. For example, if you set the log level to INFO in the Updater Editor, log messages at TRACE or DEBUG level in the script are not printed.
    This feature is available since Bloomreach Experience Manager 12.2.0.
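
The filtering rule can be written down as a small model. This is an illustrative sketch only: LogLevelSketch and isPrinted are hypothetical names, and only the three selectable levels are modelled.

```java
// Simplified model of log-level filtering; not the CMS implementation.
class LogLevelSketch {

    // Ordered from most to least verbose, matching the selectable levels.
    enum Level { TRACE, DEBUG, INFO }

    /** A message is printed when its level is at or above the selected level. */
    static boolean isPrinted(Level messageLevel, Level selectedLevel) {
        return messageLevel.compareTo(selectedLevel) >= 0;
    }

    public static void main(String[] args) {
        for (Level level : Level.values()) {
            System.out.println(level + " printed at INFO: " + isPrinted(level, Level.INFO));
        }
    }
}
```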

Parameters

Scripts can accept Parameters:

  • Parameters are specified as a valid JSON string that defines a map of parameter name (String) and parameter value (Object) pairs.
    Example: { "basePath": "/content/documents/myhippoproject/news", "tag" : "gogreen" }

Execution Mode

There are two ways to execute a script:

  • Execute will visit all specified nodes and save the changes to the repository after each batch. The UUIDs of all modified nodes are logged in case the script has to be undone later.

  • Dry run will also visit all specified nodes, but never write any changes to the repository (that is, the engine calls Session.refresh(false) after each batch).

Use dry run to try out new scripts without risk.
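
The difference between the two modes comes down to what happens to pending changes at the end of each batch. The following sketch models that with two maps; DryRunSketch is a hypothetical class and does not use the JCR API.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of a session with pending changes; not the JCR API.
class DryRunSketch {

    private final Map<String, String> persisted = new HashMap<>();
    private final Map<String, String> pending = new HashMap<>();

    /** Records a change that has not been written yet. */
    void setProperty(String path, String value) {
        pending.put(path, value);
    }

    /** Ends a batch: writes the pending changes, or discards them in a dry run. */
    void endBatch(boolean dryRun) {
        if (!dryRun) {
            persisted.putAll(pending);
        }
        pending.clear();
    }

    /** Reads the written value, or null if nothing was written for the path. */
    String read(String path) {
        return persisted.get(path);
    }

    public static void main(String[] args) {
        DryRunSketch session = new DryRunSketch();
        session.setProperty("/content/documents/a", "changed");
        session.endBatch(true); // dry run: the change is dropped
        System.out.println(session.read("/content/documents/a")); // prints null
    }
}
```

In both modes the script runs the same code path, which is what makes a dry run a useful rehearsal.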

Automatically Execute Updater Scripts on Startup

It's possible to automatically execute scripts on startup by using the repository-data-application module to add the scripts as content definitions to /hippo:configuration/hippo:update/hippo:queue. Once the application has started, it will execute any scripts in the queue.

Note: an automatically executed updater script may face additional limitations on the accessible classpath, depending on the environment in which it is executed.
In a delivery-tier-only environment, only the functionality provided by Hippo Repository might be available on the classpath.

Undo Updates

An updater script can support undo of its modifications by implementing the undoUpdate method.

Scripts in the History that have been executed can be undone by clicking the Undo button. The updater engine then revisits only the nodes that were previously modified by the doUpdate method and calls undoUpdate for each of them.

Items in the History that were dry run or were the result of an undo run cannot be undone.
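
The undo flow can be modelled as follows. This is a sketch under stated assumptions: UndoSketch is a hypothetical class, string identifiers and titles stand in for nodes and their properties, and the "[archived] " prefix is an invented example change.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model of execute-then-undo; not the actual engine code.
class UndoSketch {

    static final String PREFIX = "[archived] ";

    final Map<String, String> titles = new LinkedHashMap<>(); // identifier -> title
    final List<String> modifiedIds = new ArrayList<>();       // recorded during execute

    /** Applies the change; returns false when the item is skipped. */
    boolean doUpdate(String id) {
        String title = titles.get(id);
        if (title == null || title.startsWith(PREFIX)) {
            return false;
        }
        titles.put(id, PREFIX + title);
        return true;
    }

    /** Reverts the change made by doUpdate; returns false when there is nothing to revert. */
    boolean undoUpdate(String id) {
        String title = titles.get(id);
        if (title == null || !title.startsWith(PREFIX)) {
            return false;
        }
        titles.put(id, title.substring(PREFIX.length()));
        return true;
    }

    /** Visits every item and remembers which ones were actually modified. */
    void execute() {
        for (String id : new ArrayList<>(titles.keySet())) {
            if (doUpdate(id)) {
                modifiedIds.add(id);
            }
        }
    }

    /** Revisits only the items recorded as modified. */
    void undo() {
        for (String id : modifiedIds) {
            undoUpdate(id);
        }
        modifiedIds.clear();
    }

    public static void main(String[] args) {
        UndoSketch sketch = new UndoSketch();
        sketch.titles.put("a", "News");
        sketch.titles.put("b", "[archived] Old");
        sketch.execute();
        sketch.undo();
        System.out.println(sketch.titles); // prints {a=News, b=[archived] Old}
    }
}
```

Item b was skipped during execute, so undo never touches it. That is why its pre-existing prefix survives.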
