Robots.txt Plugin Configuration - BloomReach Experience - Open Source CMS

This article covers Hippo CMS version 10. There's an updated version available that covers our most recent release.

03-07-2015

Robots.txt Plugin Configuration

Installation

The Robots.txt plugin can be added to your project using Hippo's setup application.

Prerequisites:

Instructions:

  1. Using Hippo's setup application, add Robots.txt to your project.
  2. Rebuild and restart your project. On startup, the setup application performs some additional steps to set up the plugin and indicates that a second rebuild is required.
  3. Rebuild and restart your project one more time.
  4. Point your web browser to http://localhost:8080/site/robots.txt to see the robots.txt.

See Plugin Installation for more information on the two required rebuilds.

Configuration

The robots.txt file is configured through a special document in the CMS.

  1. Select the Content Perspective.
  2. Browse to the administration folder.
  3. Edit and publish the robots document.

Multiple sections are supported. Each section represents a User-agent: line followed by zero or more Disallow: lines.

The shown configuration has the effect that for the User-agent called Googlebot, all URLs under [your.site]/abc and [your.site]/def are disallowed. In addition, for all User-agents, all URLs under [your.site]/skip/this/url are disallowed.
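
As a rough sketch, the configuration described above would be expected to render as something like the following robots.txt. The exact ordering and whitespace depend on the template, and any automatically added entries are omitted here.

```
User-agent: Googlebot
Disallow: /abc
Disallow: /def

User-agent: *
Disallow: /skip/this/url
```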

On top of this, the Robots.txt plugin automatically adds all faceted navigation URLs as disallowed. This behavior can only be overridden in the CMS Console, by adding the Boolean property robotstxt:disallowfacnav with the value false to the Robots.txt configuration document.
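
In node notation, the override might look roughly as follows. The first line is a placeholder, not a real path: the location and node type of the Robots.txt configuration document are not stated here and should be looked up in the Console.

```
[Robots.txt configuration document node]
  - robotstxt:disallowfacnav = false
```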

Delivery Tier Configuration

The setup application adds all the required delivery tier configuration. This configuration will work for typical projects. If needed, the configuration can be adapted to the requirements of your project.

A component configuration and a FreeMarker template are added to the hst:default configuration:

/hst:hst/hst:configurations/hst:default/hst:pages/robotstxt [hst:component]
  - hst:componentclassname = org.onehippo.forge.robotstxt.components.RobotstxtComponent
  - hst:template = robotstxt.ftl
/hst:hst/hst:configurations/hst:default/hst:templates/robotstxt.ftl [hst:template]
  - hst:script = [freemarker template source code]

A sitemap item (i.e. URL) is added to your project's configuration (e.g. myhippoproject):

/hst:hst/hst:configurations/myhippoproject/hst:sitemap/robots.txt [hst:sitemapitem]
  - hst:componentconfigurationid = hst:pages/robotstxt
  - hst:relativecontentpath = ../administration/robots

Example

The screenshot below shows sample output for the Hippo demo website, which has quite a few faceted navigation links. Note how they show up after the section-specific paths in every section.
