SiteMapItem Matching - BloomReach Experience - Open Source CMS

This article covers Hippo CMS version 12. An updated version is available that covers our most recent release.

26-04-2018

SiteMapItem Matching

After the URL has been matched to a Mount, the HST attempts to match the remaining part of the URL to a SiteMapItem. So if the URL http://localhost:8080/site/fr/home matches the Mount fr, the remaining part of the URL to be matched in the SiteMap is home. By default, the SiteMap is configured at /hst:hst/hst:configurations/{myproject}/hst:sitemap. The basic idea behind SiteMap matching is to provide flexible rules for matching specific URLs or complete URL spaces, and to map URLs to component configurations. The SiteMap (more precisely, its inverse) is also used for link rewriting. The SiteMap is a composite structure containing a hierarchy of SiteMapItems. It can contain SiteMapItems with explicit names, or with wildcards that match any name. A SiteMapItem named _index_ has a special meaning and is available since CMS 11.2. See the usage of _index_ SiteMapItems below.

From now on, when we talk about a path segment, we refer to the part of the URL between two slashes. The SiteMap supports the following wildcards:

1 _default_          this is equivalent to a *, matching any single path segment
2 _any_              this is equivalent to a **, matching any ending of a URL
3 _default_.ext      where 'ext' can be some extension, for example *.html
4 _any_.ext          where 'ext' can be some extension, for example **.xml

The ** and **.ext matchers (_any_ and _any_.ext) are only allowed as leaf SiteMapItems in the composite structure.
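As an illustration of the wildcard names and the leaf-only rule, here is a minimal Python sketch. It is not HST code: the nested-dict representation of a SiteMap and the function names to_glob and misplaced_any_items are assumptions made for this example only.

```python
def to_glob(name):
    """Translate a SiteMapItem name into the * / ** notation used in this article."""
    for prefix, glob in (("_default_", "*"), ("_any_", "**")):
        if name == prefix or name.startswith(prefix + "."):
            return glob + name[len(prefix):]
    return name  # an explicit name is returned unchanged


def misplaced_any_items(sitemap, path=""):
    """Return the paths of _any_ / _any_.ext items that have children.

    `sitemap` is a hypothetical nested dict: item name -> dict of child items.
    """
    problems = []
    for name, children in sitemap.items():
        item_path = f"{path}/{name}"
        is_any = name == "_any_" or name.startswith("_any_.")
        if is_any and children:
            problems.append(item_path)
        problems += misplaced_any_items(children, item_path)
    return problems
```

For example, to_glob("_any_.xml") gives the ** form with the .xml extension, and a SiteMap in which news/_any_ has a child item is reported as a violation of the leaf-only rule.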

During the SiteMapItem matching phase, the HST attempts to match the remainder of the URL after the Mount to the best SiteMapItem. The best SiteMapItem is the one that matches the earliest path segments most specifically:

1 An exact (explicit) match is considered more specific than a wildcard match
2 * is more specific than a **
3 *.html is more specific than a *
4 **.html is more specific than **
5 * is more specific than **.html

Up to rule 4 this is straightforward. Rule 5 is debatable, but we chose to treat * as more specific than **.html. For example, consider the following (contrived) SiteMap:

/hst:hst:
  /hst:configurations:
    /example:
      /hst:sitemap:
        /home:
        /news:
          /_any_.html:
          /_any_:
        /agenda:
          /_any_.html:
          /_any_:
          /2011:
            /_default_:
              /_default_:
        /_any_:

The following URLs (after the Mount part) match to the following SiteMapItems:

 /home --> home
 /news --> news
 /news/2011 --> news/_any_
 /news/2011/myNewsItem.html --> news/_any_.html
 /agenda/2010 --> agenda/_any_
 /agenda/2011/foo --> agenda/2011/_default_
 /agenda/2011/foo/bar --> agenda/2011/_default_/_default_
 /agenda/2011/foo/myAgendaItem.html --> agenda/2011/_default_/_default_
 /agenda/2011/foo/bar/lux --> agenda/_any_
 /agenda/2011/foo/bar/myAgendaItem.html --> agenda/_any_.html
 /home/foo/bar --> _any_

Pay special attention to /agenda/2011/foo/bar/lux and /agenda/2011/foo/bar/myAgendaItem.html. Make sure you understand why agenda/2011/_default_/_default_ does not fit (it matches exactly two path segments after 2011, not three), and why the matcher therefore falls back to agenda/_any_ and agenda/_any_.html. The _any_ matcher at the root of the SiteMap is typically the catch-all matcher that creates a 404 page.
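The matching rules and the fallback behavior can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions, not the HST implementation: the SiteMap is represented as a nested dict, and the names SITEMAP, match and match_url are invented for this example. Single-segment candidates (explicit name, then *.ext, then *) are tried first with backtracking, and only then the ** matchers of the current level.

```python
# The contrived SiteMap from above as a nested dict (item name -> children).
SITEMAP = {
    "home": {},
    "news": {"_any_.html": {}, "_any_": {}},
    "agenda": {
        "_any_.html": {},
        "_any_": {},
        "2011": {"_default_": {"_default_": {}}},
    },
    "_any_": {},
}

WILDCARD_PREFIXES = ("_default_", "_any_")


def _ext_children(node, prefix, segment):
    """Child names of the form '<prefix>.ext' whose extension ends `segment`."""
    return [name for name in node
            if name.startswith(prefix + ".") and segment.endswith(name[len(prefix):])]


def match(node, segments, path=()):
    """Return the path of the best matching item, or None if nothing matches."""
    if not segments:
        return "/".join(path)
    head, rest = segments[0], segments[1:]

    # Candidates that consume exactly one segment, most specific first.
    single = []
    if head in node and not head.startswith(WILDCARD_PREFIXES):
        single.append(head)                            # explicit name
    single += _ext_children(node, "_default_", head)   # *.ext
    if "_default_" in node:
        single.append("_default_")                     # *
    for name in single:
        found = match(node[name], rest, path + (name,))
        if found is not None:
            return found

    # Fallback: leaf matchers that consume the rest of the URL.
    trailing = _ext_children(node, "_any_", segments[-1])  # **.ext
    if "_any_" in node:
        trailing.append("_any_")                           # **
    if trailing:
        return "/".join(path + (trailing[0],))
    return None


def match_url(url_path):
    """Match the part of the URL after the Mount, e.g. '/news/2011'."""
    return match(SITEMAP, [s for s in url_path.split("/") if s])
```

In this model, match_url("/agenda/2011/foo/bar/lux") first descends into agenda/2011/_default_/_default_, finds no item for the remaining segment lux, and backtracks up to agenda, where _any_ takes over.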

After a SiteMapItem is matched, the HST request processing is invoked with a flyweight runtime instance of this SiteMapItem. The most important properties of a SiteMapItem are:

  1. hst:componentconfigurationid: The relative path to the hst:component (tree) below /hst:hst/hst:configurations/{myproject}. For example, for the SiteMapItem home it might be hst:pages/home, and for news/_any_ it might be hst:pages/newsoverview. REST pipelines that use a sitemap do not use hst:componentconfigurationid: it is only used for website development based on aggregating content through HST Components.
  2. hst:relativecontentpath: The content path relative to /hst:hst/hst:sites/{myproject}/hst:content. For example, for the SiteMapItem home it might be common/homepage. Note that the relativecontentpath property can contain references to wildcards from the SiteMap. References are indicated by property placeholders with the syntax ${integer} or ${parent}, where ${parent} means: use the relativecontentpath of the parent SiteMapItem. For example, the relativecontentpath for the following SiteMapItems could be:
    • news/_any_ : news/${1}
    • news/_default_/_default_/_any_ : news/${1}/${2}/${3}
      - Note that the ${1} always refers to the top matched ancestor containing a _default_, ${2} to the second, etc.
      - Also note that if one of the property placeholders cannot be resolved for a request, the entire value is resolved to null.
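The placeholder behavior described above can be sketched as follows. This is a simplified Python model, not HST code: the function name resolve and its arguments are assumptions for illustration, and the caller is assumed to supply the URL parts matched by the wildcards in top-down order.

```python
import re

# Matches ${1}, ${2}, ... and ${parent}
PLACEHOLDER = re.compile(r"\$\{(\d+|parent)\}")


def resolve(value, wildcard_matches, parent_path=None):
    """Resolve property placeholders in a hst:relativecontentpath value.

    wildcard_matches: URL parts matched by the wildcards, topmost first
                      (assumed to be provided by the matching phase).
    parent_path:      resolved relativecontentpath of the parent item, if any.
    Returns None if any placeholder cannot be resolved.
    """
    unresolved = False

    def substitute(m):
        nonlocal unresolved
        key = m.group(1)
        if key == "parent":
            replacement = parent_path
        else:
            index = int(key) - 1  # ${1} refers to the first wildcard match
            in_range = 0 <= index < len(wildcard_matches)
            replacement = wildcard_matches[index] if in_range else None
        if replacement is None:
            unresolved = True
            return ""
        return replacement

    resolved = PLACEHOLDER.sub(substitute, value)
    return None if unresolved else resolved
```

With this sketch, resolve("news/${1}/${2}/${3}", ["2011", "05", "item"]) yields news/2011/05/item, while resolve("news/${2}", ["2011"]) yields None because ${2} has no matching wildcard.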

SiteMapItem _index_ 

The _index_ SiteMapItem is available in Bloomreach Experience Manager 11.2.0 and higher.

There is a special SiteMapItem name: _index_. It is somewhat comparable to the Apache DirectoryIndex directive, although we do not require the pathInfo to end with a '/'. The _index_ SiteMapItem works as follows:

  1. If a requested URL matches a SiteMapItem called foo that contains a child SiteMapItem called _index_, and the hst:relativecontentpath of that _index_ item points to an existing document or folder, then the final matched SiteMapItem will be the _index_ SiteMapItem.

  2. If an _index_ SiteMapItem exists below the SiteMapItem matched by the PathInfo, but its hst:relativecontentpath does not point to an existing document or folder, then the SiteMapItem matched by the PathInfo is returned.

  3. During link rewriting, documents that match an _index_ SiteMapItem get a link that matches the parent of the _index_ SiteMapItem. During the matching phase, the _index_ SiteMapItem will be used anyway.

  4. The _index_ SiteMapItem is supported below both explicit SiteMapItems and * SiteMapItems. The _index_ SiteMapItem is not supported directly below the hst:sitemap though. It is not supported below **, **.html or *.html SiteMapItems, either.

  5. The hst:relativecontentpath of _index_ SiteMapItems can use property placeholders like ${1}, ${2} and ${parent}.
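Rules 1, 2 and 4 above can be summarized in a short Python sketch. It is a hypothetical model rather than the HST implementation: resolve_index, the sitemap_items mapping (item path to already resolved hst:relativecontentpath) and the content_exists callback are all names assumed for this example.

```python
def resolve_index(matched_path, sitemap_items, content_exists):
    """Return the final SiteMapItem path, taking a child _index_ item into account.

    matched_path:   path of the item matched by the PathInfo, e.g. "foo"
    sitemap_items:  dict of item path -> resolved hst:relativecontentpath (or None)
    content_exists: callable returning True if a content path points to an
                    existing document or folder
    """
    last_name = matched_path.rsplit("/", 1)[-1]
    # Rule 4: no _index_ directly below hst:sitemap, nor below **, **.ext or *.ext.
    if not matched_path or last_name.startswith(("_any_", "_default_.")):
        return matched_path

    index_path = matched_path + "/_index_"
    if index_path not in sitemap_items:
        return matched_path

    content_path = sitemap_items[index_path]
    if content_path is not None and content_exists(content_path):
        return index_path   # rule 1: the _index_ item wins
    return matched_path     # rule 2: fall back to the matched item
```

A usage example: with sitemap_items containing foo/_index_ pointing to content/foo/index, and a content_exists callback that confirms that path, resolve_index("foo", ...) returns foo/_index_; if the content does not exist, it returns foo.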

Additional properties of a SiteMapItem

Property name: hst:namedpipeline
Example: JaxrsRestContentPipeline
Description: The pipeline to use for the further HST request processing. If not present, the parent namedpipeline is used, and if there is no parent, the Mount namedpipeline is used. If it is not configured on the Mount either, the HST uses the default value DefaultSitePipeline, which is a pipeline that invokes the HstComponent based request processing.

Property name: hst:refId
Example: homeId
Description: Optional property. It must be unique within a single sitemap item tree. With this property value, you can create a link to the SiteMapItem with this refId value instead of using the path of the SiteMapItem. For example, instead of using path values to SiteMapItems, you can configure refId values in 'hst:referencesitemapitem' of a SiteMenuItem, or in 'hst:homepage' or 'hst:pagenotfound' of a Mount configuration. This can be very useful if you have different SiteMapItem nodes for each language, where each SiteMapItem has the same 'hst:refId' value, such as 'home', because they are just multilingual variants of the same sitemap item. HST link creating components look up a SiteMapItem configuration by refId first, and then by path if it is not found by refId.

Property name: hst:excludedforlinkrewriting
Example: true
Description: Do not use this sitemapitem for linkrewriting if set to true. This is an important property if you want to support REST sitemap items next to normal website sitemap items.

Property name: hst:locale
Example: en_US
Description: The locale for the sitemapitem and its descendants. If not configured, the value is inherited from the Mount.

Property name: hst:parameternames
Example: pageSize
Description: Keys which can be retrieved during HST request processing. The multi-valued properties parameternames and parametervalues must have an equal number of items; otherwise, they are all skipped.

Property name: hst:parametervalues
Example: 5
Description: Values which can be retrieved during HST request processing. Property placeholders like ${1}, ${2} are supported. The multi-valued properties parameternames and parametervalues must have an equal number of items; otherwise, they are all skipped.

Property name: hst:authenticated / hst:roles / hst:users
Example: see Delivery Tier Authorization Configuration
Description: For securing the sitemapitem.

Property name: hst:responseheaders
Example: ["Access-Control-Allow-Origin: http://localhost:3000", "Access-Control-Allow-Credentials: true"]
Description: Applicable since Bloomreach Experience Manager 12.3.0. Custom HTTP response header(s) that are always written for a request on this sitemap item and its descendant sitemap items, unless the property is overridden. For example, when Cross-Origin Resource Sharing (CORS) is required for this sitemap item, you can configure the related response headers through this property. This property is to be set to a string array, each item of which should be in the form ( header_name + ':' + header_value ), as in the example.
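Three of the property rules described here (the namedpipeline fallback chain, the pairing of parameternames with parametervalues, and the header_name:header_value format of hst:responseheaders) can be modeled in Python as follows. This is only a sketch under assumptions: the function names and argument shapes are invented for illustration, and skipping a header entry without a colon is an assumption, not documented HST behavior.

```python
DEFAULT_PIPELINE = "DefaultSitePipeline"


def effective_pipeline(item_pipelines, mount_pipeline=None):
    """Pick the pipeline for a matched item.

    item_pipelines: hst:namedpipeline values from the matched item up to its
                    topmost ancestor, with None where the property is not set.
    """
    for pipeline in item_pipelines:
        if pipeline:
            return pipeline
    return mount_pipeline or DEFAULT_PIPELINE


def pair_parameters(names, values):
    """Combine hst:parameternames and hst:parametervalues into a dict.

    If the two multi-valued properties differ in length, all are skipped.
    """
    if len(names) != len(values):
        return {}
    return dict(zip(names, values))


def parse_response_headers(entries):
    """Split 'header_name: header_value' strings into a dict."""
    headers = {}
    for entry in entries:
        name, separator, value = entry.partition(":")  # split at the first ':'
        if not separator:
            continue  # assumption: ignore entries without a colon
        headers[name.strip()] = value.strip()
    return headers
```

Splitting at the first colon only matters for values such as http://localhost:3000, which contain colons themselves.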
