HtmlCleanerService - BloomReach Experience - Open Source CMS

This article covers Hippo CMS version 10. An updated version is available that covers our most recent release.

22-05-2015

HtmlCleanerService

Purpose

The HtmlCleanerService ensures that the content of an HTML field is correct XHTML by cleaning the content before it is saved (as the unpublished version). This matters when a user has added content by copy/paste or has edited in HTML mode. It also helps prevent XSS attacks by removing javascript: handlers from element attributes.
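
As a rough illustration of the kind of check involved in removing javascript: handlers, the following sketch flags attribute values that use the javascript: scheme. It is not the HtmlCleanerService implementation; the class name JavascriptUrlCheck and the method isJavascriptUrl are hypothetical, and the normalization rule (dropping whitespace and control characters before comparing) is an assumption.

```java
import java.util.Locale;

/** Illustrative sketch only; not the actual HtmlCleanerService code. */
final class JavascriptUrlCheck {

    private JavascriptUrlCheck() {
    }

    /**
     * Returns true when the attribute value starts with "javascript:" after
     * all whitespace and control characters are removed and the value is
     * lower-cased. A null value is treated as safe.
     */
    static boolean isJavascriptUrl(String attributeValue) {
        if (attributeValue == null) {
            return false;
        }
        // Assumption: strip characters that could be used to disguise the scheme.
        String normalized = attributeValue
                .replaceAll("[\\s\\p{Cntrl}]", "")
                .toLowerCase(Locale.ROOT);
        return normalized.startsWith("javascript:");
    }
}
```

A cleaner built on such a check could drop any attribute for which isJavascriptUrl returns true, for example an href with the value "  JavaScript:alert(1)".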

Typing in normal mode cannot result in incorrect HTML, so the HtmlCleanerService normally leaves such content unchanged.

The HtmlCleanerService only preserves XHTML tags that are configured to be preserved. Non-HTML tags cannot be preserved. Tags that are not preserved are changed into a p or a div tag. If opening and closing tags do not match, the HtmlCleanerService removes the offending content. This happens without warning.

Configuration

The configuration of the HtmlCleanerService is kept in the Hippo repository (together with all content). It can be changed via the Console under the node: /hippo:configuration/hippo:frontend/cms/cms-services/htmlCleanerService.

preserved tags

Each tag to preserve should be listed in the whitelist as a frontend:pluginconfig. Valid attributes for that tag should be listed in its multi-valued 'attributes' property.

no warning

If you add a non-HTML tag to the configuration, for example due to a typo, you will not get an error message, either in the Console or in the logs of the repository and CMS.

non-HTML tags

Non-HTML tags in the configuration have no effect on preservation. They will still not be preserved, as that would not result in correct XHTML.
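
Because a mistyped tag name is silently ignored, it can help to check the configured names yourself. The sketch below is one possible way to do that and is not part of Hippo CMS: the class WhitelistTypoCheck, the method findUnknownTags and the deliberately incomplete sample of HTML element names are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Illustrative sketch only; not part of the HtmlCleanerService. */
final class WhitelistTypoCheck {

    // Assumption: a small, incomplete sample of HTML element names.
    // Extend this set before relying on the check.
    private static final Set<String> KNOWN_HTML_TAGS = new HashSet<>(Arrays.asList(
            "a", "b", "blockquote", "br", "div", "em", "h1", "h2", "h3", "img", "li",
            "ol", "p", "pre", "span", "strong", "table", "tbody", "td", "th", "tr", "ul"));

    private WhitelistTypoCheck() {
    }

    /** Returns the configured names that are not in the sample set, in input order. */
    static List<String> findUnknownTags(List<String> configuredTags) {
        List<String> unknown = new ArrayList<>();
        for (String tag : configuredTags) {
            // Compare case-insensitively and ignore surrounding whitespace.
            if (!KNOWN_HTML_TAGS.contains(tag.trim().toLowerCase(Locale.ROOT))) {
                unknown.add(tag);
            }
        }
        return unknown;
    }
}
```

Passing the names configured in the whitelist to findUnknownTags returns the entries that are probably typos, such as "stron" instead of "strong".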

preserved attributes

Attributes that are listed in the multi-valued 'attributes' property are preserved for the particular tag. Other attributes will be removed.
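
The per-tag attribute rule can be pictured as a simple filter. The following is a minimal sketch under the assumption that the attributes of one element and the allowed names for its tag are already available as Java collections; the class AttributeWhitelistSketch and the method keepAllowed are hypothetical and do not reflect the actual implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch only; not the actual HtmlCleanerService code. */
final class AttributeWhitelistSketch {

    private AttributeWhitelistSketch() {
    }

    /**
     * Returns a new map holding only the attributes whose names appear in
     * allowedNames, keeping the original order. The input map is not modified.
     */
    static Map<String, String> keepAllowed(Map<String, String> attributes, Set<String> allowedNames) {
        Map<String, String> kept = new LinkedHashMap<>();
        for (Map.Entry<String, String> attribute : attributes.entrySet()) {
            if (allowedNames.contains(attribute.getKey())) {
                kept.put(attribute.getKey(), attribute.getValue());
            }
        }
        return kept;
    }
}
```

For instance, if only href and title are allowed for a tag, an element that also carries an onclick attribute keeps href and title and loses onclick.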

comments

Comments can be removed by setting the property 'omitComments' to true.

formatting

The serialization format of the cleaner is set by the 'serializer' property, which can be 'simple', 'pretty' or 'compact'. The default is 'simple'.