Lucene Analyzer - BloomReach Experience - Open Source CMS

This article covers Hippo CMS version 11. There's an updated version available that covers our most recent release.

04-07-2016

Lucene Analyzer

Introduction

Hippo Repository uses org.hippoecm.repository.query.lucene.StandardHippoAnalyzer as the default Lucene analyzer for the stored content. This analyzer strips stopwords for English, German, Dutch, French, Spanish and Brazilian Portuguese. It also applies an ISO Latin 1 accent filter, which replaces accented letters with their unaccented equivalents, for example ç with c and ï with i.
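To make the effect of accent folding concrete, here is a minimal sketch in plain Java. It is not the code of StandardHippoAnalyzer or of its accent filter; it only approximates the behaviour described above using java.text.Normalizer from the JDK. The class and method names are hypothetical, and characters without a canonical decomposition (for example ø) pass through unchanged in this sketch.

```java
import java.text.Normalizer;

// Hypothetical illustration class, not part of Hippo or Lucene.
class AccentFoldingSketch {

    // Decompose each character into base letter + combining marks (NFD),
    // then drop the combining marks so only the base letter remains.
    static String foldAccents(String input) {
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // \u00e7 is c with cedilla, \u00ef is i with diaeresis.
        System.out.println(foldAccents("fa\u00e7ade na\u00efve")); // prints "facade naive"
    }
}
```

Because the same folding is applied to indexed content, a search for "facade" can also find text that was written with the accented spelling.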

You can configure a custom language analyzer that, for example, also applies stemming. A side effect is that stemming breaks wildcard searching. Explaining why is beyond the scope of this page, as it involves general concepts of inverted indexes such as Lucene's. We advise sticking with the StandardHippoAnalyzer if you want to avoid wildcard searching issues.
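As a rough illustration only, the following plain Java sketch shows one way stemming and wildcard searching can conflict. It does not use Lucene. The stemmer is a deliberately naive, hypothetical one, and the sketch assumes that a wildcard prefix is compared against the indexed terms as typed, without being stemmed itself.

```java
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.UnaryOperator;

// Hypothetical illustration class, not part of Hippo or Lucene.
class StemmingWildcardSketch {

    // Deliberately naive stemmer (assumption): strips a trailing "ing" or "s".
    static String naiveStem(String token) {
        if (token.length() > 5 && token.endsWith("ing")) {
            return token.substring(0, token.length() - 3);
        }
        if (token.length() > 3 && token.endsWith("s")) {
            return token.substring(0, token.length() - 1);
        }
        return token;
    }

    // Builds the set of indexed terms: lowercase, split on whitespace,
    // then apply the given token filter to every token.
    static Set<String> indexTerms(String text, UnaryOperator<String> filter) {
        Set<String> terms = new TreeSet<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            terms.add(filter.apply(token));
        }
        return terms;
    }

    // Evaluates a "prefix*" wildcard against the indexed terms.
    // The prefix is used as typed; it is not stemmed.
    static boolean prefixMatches(Set<String> terms, String prefix) {
        for (String term : terms) {
            if (term.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String text = "Wildcard searching";
        Set<String> plain = indexTerms(text, UnaryOperator.identity());
        Set<String> stemmed = indexTerms(text, StemmingWildcardSketch::naiveStem);

        // Without stemming the index holds "searching", so "searchi*" matches.
        System.out.println(prefixMatches(plain, "searchi"));   // true
        // With stemming the index holds "search", so "searchi*" finds nothing.
        System.out.println(prefixMatches(stemmed, "searchi")); // false
    }
}
```

The document that contains "searching" is still in the index in both cases; only the stored term differs, which is why the wildcard query stops matching once stemming is applied.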

Modify the Analyzer class

The Analyzer class is configured in the repository.xml file.

 

Change the value of

<param name="analyzer" value="org.hippoecm.repository.query.lucene.StandardHippoAnalyzer"/>

to the fully qualified class name of your analyzer.

See Repository deployment settings for how to use your customized repository.xml.
