Skip User Agents - BloomReach Experience - Open Source CMS
07-01-2019

Skip User Agents

For a relevant customer experience, crawlers such as those used by Google, Yahoo, Bing and others do not need to receive targeted pages. It is also undesirable for crawlers to influence Scoring and Normalization averages. The same holds for link checker tools and services such as WatchMouse, W3C-checklink or Xenu Link Sleuth. You most likely do not want these kinds of crawler or link checker visitors to show up in the Real-Time Visitor Analysis screen either. From a crawler's point of view, indexing a targeted page does not make sense: for example, a "stores nearby" block on the right of a page adds no value to a search engine index.

Hence, by default, the Relevance Module does not target these kinds of requests at all: it ignores requests from a large set of commonly used crawlers and link checkers. It does so by checking whether the request's User-Agent header contains a string that indicates a robot or link checker. You can add extra user agents to skip in the repository, in the multi-valued property at: /targeting:targeting/targeting:skipUserAgents
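
The check described above can be illustrated with a minimal sketch. This is not the Relevance Module's actual implementation: the class name, the method name, the sample tokens, the case-insensitive comparison and the handling of a missing header are all assumptions made for illustration. It only shows the general idea of testing whether a User-Agent header contains one of the configured strings.

```java
import java.util.List;
import java.util.Locale;

public class SkipUserAgentCheck {

    // Hypothetical sample values standing in for the multi-valued
    // targeting:skipUserAgents property; not the real default list.
    public static final List<String> SKIP_USER_AGENTS =
            List.of("Googlebot", "bingbot", "W3C-checklink", "Xenu Link Sleuth");

    /**
     * Returns true when the User-Agent header contains any configured token.
     * Assumption: matching is case-insensitive and a missing header is not skipped.
     */
    public static boolean shouldSkip(String userAgentHeader, List<String> skipUserAgents) {
        if (userAgentHeader == null || userAgentHeader.isEmpty()) {
            return false;
        }
        String userAgent = userAgentHeader.toLowerCase(Locale.ROOT);
        for (String token : skipUserAgents) {
            if (userAgent.contains(token.toLowerCase(Locale.ROOT))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // A crawler request: prints true
        System.out.println(shouldSkip(
                "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
                SKIP_USER_AGENTS));
        // A regular browser request: prints false
        System.out.println(shouldSkip(
                "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:64.0) Gecko/20100101 Firefox/64.0",
                SKIP_USER_AGENTS));
    }
}
```

Because the match is a substring test, a short token can match more user agents than intended, so it is worth choosing values that are specific to the crawler or link checker in question.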

It pays to review the collected request log data to find further agents that do not need to be targeted, so that you can add them to the skip list.
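
One way to review such data is to tally the User-Agent values that are not yet covered by the skip list. The sketch below assumes the User-Agent strings have already been extracted from the request log into a list; how to obtain them depends on your setup and is not shown. The class and method names are hypothetical, and the case-insensitive substring match is an assumption.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

public class UserAgentTally {

    /**
     * Counts how often each logged User-Agent occurs, leaving out those that
     * already contain one of the skip tokens (case-insensitive, an assumption).
     */
    public static Map<String, Integer> countUnskipped(List<String> loggedUserAgents,
                                                      List<String> skipUserAgents) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String userAgent : loggedUserAgents) {
            String lower = userAgent.toLowerCase(Locale.ROOT);
            boolean alreadySkipped = skipUserAgents.stream()
                    .anyMatch(token -> lower.contains(token.toLowerCase(Locale.ROOT)));
            if (!alreadySkipped) {
                counts.merge(userAgent, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical sample data; "ExampleMonitor" is a made-up agent name.
        List<String> logged = List.of(
                "Googlebot/2.1",
                "ExampleMonitor/1.0",
                "ExampleMonitor/1.0",
                "Mozilla/5.0 Firefox/64.0");
        Map<String, Integer> counts = countUnskipped(logged, List.of("Googlebot"));
        counts.forEach((agent, count) -> System.out.println(count + "  " + agent));
    }
}
```

Agents that occur frequently and clearly belong to monitoring tools, crawlers or link checkers are candidates for the skip list.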

The default list of skipped user agents can be found in hippo-addon-targeting-repository at src/main/resources/hcm-config/targeting-configuration.yaml.
