Format your Data
Bloomreach's API-based Catalog Data Management lets you send and manage your content search data. Your site's content data must comply with the data format prescribed by Bloomreach so that it is searchable along with its metadata and easy for Bloomreach to ingest.
Let’s first understand the following terms: items, catalogs, and collections.
Each piece of content is called an item, such as "Awesome Omelette Recipe" or "How to Prepare a Lasagna". Items are grouped into catalogs for specific types of content, such as Recipes, Blogs, or Videos. Collections gather the catalogs of a specific content type.
Items
An item is any piece of content or page on your live site that you want Bloomreach to index and return in search results. You can define the attributes within your items as you wish, but make sure that each item has the following components:
- op determines how catalog data should be modified
- path identifies an item record to be modified
- Attributes describe the item, such as title, description, tags, etc. Attributes may be defined differently for each item. For example, a video could have an attribute “duration” defining the total duration of the video, but you would not have to include “duration” for blogs
- Views (optional) allow you to display only a certain version of the content to specific user groups
Catalog
A catalog is simply a grouping of items, such as a set of blog posts, news articles, or videos. Bloomreach understands and tracks your items using a catalog. Each catalog has a preconfigured unique name, which is also unique to a domain (if you have multiple sites), and an automatically generated unique identifier. Bloomreach preconfigures catalogs in the Dashboard; however, you can change the display name.
For example:
Homeoasis.com is a lifestyle, food, and fashion site that has blog posts on various types of cooking recipes. Homeoasis is provided with a preconfigured catalog for “Best Potluck Recipes” with a unique identifier: “best_potluck_recipes_1”. This catalog contains many recipes (items) such as “Potato Salad”, “Cheese Chicken Fritters”, etc. that have unique identifiers such as "Potato_Salad", "cheese_chicken_fritters", etc.
Sample Data
Sample files
Here are some sample catalog data files to get you started.
Sample recipe: "Awesome Omelette"
{ "op":"add", "path":"/items/awesome_omelette", "value":{ "attributes":{ "title":"Awesome Omelette", "url":"https://www.homeoasis.com/recipe/awesome-omelette.html", "description":"Omelettes can be a little intimidating. Omelette also falls on the healthier end of the spectrum, whereas some omelettes are oozing with cheese. The pan-roasted tomatoes are one of favorite additions, but they can be skipped if you’re in a hurry or substituted with another juicy vegetable of your choosing (sautéed mushrooms, zucchini or eggplant, for example). Serve it up with a side of fruit and a steaming cup of coffee or tea and you’re all set!", "medium_image_url":"https://www.homeoasis.com/images/recipe/201851/img1.jpg", "rating":4.7, "reviews":22, "prep_time_mins":10, "cook_time_mins":10, "servings":10, "ingredients":[ "10 Eggs", "240g of grape or cherry tomatoes, halved", "1 tablespoon ghee or olive oil", "2-3 tablespoons pistachio pesto, or other", "100g Olives", "Salt: white & fine" ], "category":[ "Breakfast", "Brunch" ], "directions":"Melt about 1 teaspoon of ghee/oil in an 8-inch cast iron [or non-stick] pan over medium heat. Once hot, add the tomatoes to the pan and sprinkle with salt. Let cook for about 12-18 minutes, flipping every few minutes until the liquid has mostly cooked off and they look caramelized [refer photo 1]. Reduce heat to medium-low and let the pan cool down for a few minutes. Add in remaining ghee/oil. Whisk the eggs briskly for about 30 seconds. Pour eggs into the pan and swirl around to evenly distribute. It should sizzle a bit but not go crazy. You want the eggs to cook slowly. Let the eggs cook without stirring for about 2 minutes until the edges and bottom start to set. Once the omelet starts to set gently lift up the edges with a spatula and tilt the pan towards that edge to help some of the uncooked egg run beneath. Dollop the pesto on one half of the omelet and sprinkle the same half with roasted tomatoes. Loosen the edges of the side with no toppings and carefully fold it over to cover the toppings. Let cook 1 more minute then slice in half and serve immediately. Top with salt + pepper as desired." } } }
Sample video: “How to Make Our Awesome Omelette”
{ "op":"add", "path":"/items/awesome_omelette_video", "value":{ "attributes":{ "title":"How to Make Our Awesome Omelette", "url":"https://www.homeoasis.com/video/awesome-omelette-video.html", "description":"Follow along our Awesome Omelette recipe with this companion video.", "medium_image_url":"https://www.homeoasis.com/images/recipe/201851/img1.jpg", "rating":4.7, "video_id":"HDRS2748", "video_duration":5, "category":[ "Videos", "Breakfast" ] } } }
Sample PDF: “Awesome Omelette”
{ "op":"add", "path":"/items/awesome_omelette_pdf", "value":{ "attributes":{ "title":"Awesome Omelette", "url":"https://www.homeoasis.com/pdf/awesome-omelette.pdf", "medium_image_url":"https://www.homeoasis.com/images/recipe/201851/img1.jpg", "rating":4.7, "category":[ "PDF", "Breakfast" ] }, "@import":{ "path":"/pdfs/awesome_omelette.pdf" } } }
Attributes
op and path
Every item requires op and path.
- op defines the type of operation to be performed on the catalog with this record. Values allowed are “add” (which can also replace) and “remove”.
- path identifies a specific item record, or a specific portion of an item record, to be operated on. To operate on an entire item record, the path value should be "/items/{item_id}", where item_id is a unique identifier. The item_id for a given content item should be the same in both your catalog data and your pixel.
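As a minimal sketch of how op and path fit together, the Python snippet below builds one "add" record and one "remove" record for a whole item and prints each as compact JSON. The helper name make_item_operation is hypothetical and not part of any Bloomreach library; the attribute values are taken from the sample recipe above.

```python
import json


def make_item_operation(op, item_id, value=None):
    """Build one record that targets an entire item (hypothetical helper)."""
    if op not in ("add", "remove"):
        raise ValueError(f"op must be 'add' or 'remove', got {op!r}")
    record = {"op": op, "path": f"/items/{item_id}"}
    if op == "add":
        # Only "add" carries a value; "remove" is fully described by op and path.
        record["value"] = value
    return record


add_record = make_item_operation(
    "add",
    "awesome_omelette",
    {"attributes": {"title": "Awesome Omelette", "rating": 4.7}},
)
remove_record = make_item_operation("remove", "awesome_omelette")

# Print each record as compact JSON, one record per line.
for record in (add_record, remove_record):
    print(json.dumps(record))
```

Because "add" can also replace, sending the same "add" record again with a changed value would overwrite the existing item rather than create a duplicate.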
@import (PDFs)
If your catalog is configured for document search (PDFs), you must include the following attribute:
Field name | Description | Example |
@import | Extracts the contents of the PDF. Within this field, you must include the path, which is the relative FTP path to the PDF. This will extract “title” (the title of the PDF) and “body” (the text content parsed from the PDF) from the PDF as item attributes. If you have already provided values for “title” or “body” as item attributes, then those take precedence over the extracted values. | "path":"/pdfs/awesome_omelette.pdf" |
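The precedence rule for extracted PDF fields can be illustrated with a short Python sketch. It only models the rule described in the table; it is not how Bloomreach performs the extraction. The function names and the extracted values are hypothetical.

```python
def pdf_item_record(item_id, pdf_path, attributes):
    """Build an "add" record whose value carries an @import block."""
    return {
        "op": "add",
        "path": f"/items/{item_id}",
        "value": {"attributes": attributes, "@import": {"path": pdf_path}},
    }


def resolve_pdf_attributes(provided, extracted):
    """Model the precedence rule: provided values win over extracted ones."""
    return {**extracted, **provided}


record = pdf_item_record(
    "awesome_omelette_pdf",
    "/pdfs/awesome_omelette.pdf",
    {"title": "Awesome Omelette", "rating": 4.7},
)

# Hypothetical extraction result, for illustration only.
extracted = {"title": "omelette_print_version", "body": "Melt about 1 teaspoon of ghee..."}

resolved = resolve_pdf_attributes(record["value"]["attributes"], extracted)
print(resolved["title"])  # Awesome Omelette (the provided title wins)
print(resolved["body"])   # taken from the extraction, since no body was provided
```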
Custom attributes
You can create your own attributes. Attributes consist of a name and value; for example, an attribute named “title” with a value of “Awesome Omelette”.
Field name | Description | Example |
title | The actual title of the content that you are making searchable. | Awesome Omelette |
url | The URL of the HTML page within which this content lies. | https://www.homeoasis.com/recipe/awesome-omelette.html |
description | This is the body of your content. You can use it to capture the summary of the content piece. | |
publication_date | The date when the content was published online. | 1556803380000 |
author_name | The name of the author who wrote the content piece. | John Smith |
category | The category or categories that the content belongs to, provided as an array. | Recipes |
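The publication_date example looks like a count of milliseconds since the Unix epoch, although the table does not state the unit. Under that assumption, the following Python sketch converts between a timezone-aware datetime and such a value. The function names are illustrative only.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(moment):
    """Convert a timezone-aware datetime to whole milliseconds since the epoch."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    # Integer division of timedeltas avoids floating-point rounding.
    return (moment - EPOCH) // timedelta(milliseconds=1)


def from_epoch_millis(millis):
    """Convert milliseconds since the epoch back to a UTC datetime."""
    return EPOCH + timedelta(milliseconds=millis)


published = datetime(2019, 5, 2, 13, 23, tzinfo=timezone.utc)
print(to_epoch_millis(published))                    # 1556803380000
print(from_epoch_millis(1556803380000).isoformat())  # 2019-05-02T13:23:00+00:00
```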
Content search does not require items to have any specific attributes, but we recommend providing at least a title to ensure the item is searchable.
On the other hand, we recommend excluding attributes that are irrelevant to search. Sending attributes or information that are not relevant to your desired search experience will increase both search request latency and index generation time.
Rules for naming item_id and custom attributes
- item_id and attribute names may only use alphanumeric characters (A to Z, a to z, 0 to 9) or underscores ( _ )
- Attribute names should not start with a number
- Attribute values can be one of the following types:
  - string
  - integer
  - float
  - boolean
  - A homogeneous array of any types above
- Objects are not currently supported, so if you have objects of arbitrary depths, you would need to flatten them out.
- The maximum length of any attribute value is 32 KB. For arrays, the maximum length of any single value is 32 KB
Examples:
- “awesome_omelette_123” is a valid item_id
- “awesome:omelette_123” is not a valid item_id because “:” is not a valid character
- “prep_time_mins” is a valid attribute name
- “1st_prep_time_mins” is not a valid attribute name because it starts with “1”
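These rules can be checked locally before you send data. The Python sketch below is one possible pre-flight check, not an official validator. It assumes that lowercase letters are allowed (the valid examples above use them), that "32 KB" means 32 × 1024 bytes of UTF-8 text, and that nested objects are flattened by joining keys with an underscore. All function names and the nutrition sample are hypothetical.

```python
import re

NAME_RE = re.compile(r"[A-Za-z0-9_]+")
MAX_VALUE_BYTES = 32 * 1024  # assumption: 32 KB measured as UTF-8 bytes
SCALARS = (str, int, float, bool)


def valid_item_id(item_id):
    return bool(NAME_RE.fullmatch(item_id))


def valid_attribute_name(name):
    # Same character set as item_id, but must not start with a digit.
    return bool(NAME_RE.fullmatch(name)) and not name[0].isdigit()


def _fits(value):
    # Only strings are measured; numbers and booleans are always small.
    return not isinstance(value, str) or len(value.encode("utf-8")) <= MAX_VALUE_BYTES


def valid_attribute_value(value):
    if isinstance(value, list):
        # Arrays must be homogeneous: every element has the same scalar type.
        kinds = {type(v) for v in value}
        return len(kinds) <= 1 and all(isinstance(v, SCALARS) and _fits(v) for v in value)
    return isinstance(value, SCALARS) and _fits(value)


def flatten(obj, prefix=""):
    """Flatten nested dicts into one level, joining keys with underscores."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}_{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


print(valid_item_id("awesome:omelette_123"))       # False
print(valid_attribute_name("1st_prep_time_mins"))  # False
print(flatten({"nutrition": {"calories": 250, "fat": {"grams": 12}}}))
# {'nutrition_calories': 250, 'nutrition_fat_grams': 12}
```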
Views
You can specify views to show different versions of the same content item to different viewers. This scenario requires a multi-view catalog setup and is typically used in cases like Contracts, Price Lists, and Entitlements, where you want to show different versions of the same content to different viewers. For example, you could use views to show specific content to logged-in or premium users only, or to show different content for different regions. For sites in different languages, when you have different data for each language, you can use domain keys to distinguish between the sites.
To integrate content search with views, you will have to modify your pixel and catalog data. Refer to the Content Search Pixel Integration Scenarios to integrate the content search pixel.
Adding Views to Catalog Data
To specify views for an item, you must include the following fields:
- View ID - Unique identifier linked to a specific view. The view ID must be unique from all other views. Example: “Basic”, “Premium”
- View attributes - Item attributes defined for the view specified by the view ID. Attributes nested inside a view ID only apply to that view, while attributes nested outside of “views” apply to all views.
{
  "op":"add",
  "path":"/items/awesome_omelette",
  "value":{
    "attributes":{
      //Attributes shared across all views
    },
    "views":{
      "Basic":{
        "attributes":{
          "title":
          "url":
          "description":
          ...
        }
      },
      "Premium":{
        "attributes":{
          "title":
          "url":
          "description":
          ...
        }
      }
    }
  }
}
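To make the scoping of shared and view-specific attributes concrete, the Python sketch below computes the attributes that would apply to a given view. It assumes that a view-specific attribute overrides a shared attribute of the same name, which the description above does not spell out. The function name and the sample titles are hypothetical.

```python
def attributes_for_view(item_value, view_id):
    """Combine shared attributes with those nested under one view ID."""
    shared = item_value.get("attributes", {})
    view = item_value.get("views", {}).get(view_id, {})
    # Assumption: on a name clash, the view-specific value wins.
    return {**shared, **view.get("attributes", {})}


item_value = {
    "attributes": {"url": "https://www.homeoasis.com/recipe/awesome-omelette.html"},
    "views": {
        "Basic": {"attributes": {"title": "Awesome Omelette (preview)"}},
        "Premium": {"attributes": {"title": "Awesome Omelette"}},
    },
}

print(attributes_for_view(item_value, "Basic")["title"])    # Awesome Omelette (preview)
print(attributes_for_view(item_value, "Premium")["title"])  # Awesome Omelette
print(attributes_for_view(item_value, "Premium")["url"])    # shared across all views
```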
Patch operations
Use the proper values for op and path to modify your catalog data. The table below has op and path values for various use cases.
Description | Op | Path | Value Schema |
---|---|---|---|
Add or replace an item | add | /items/{item_id} | Item |
Remove an item | remove | /items/{item_id} | n/a |
Replace all attributes of an item | add | /items/{item_id}/attributes | Attributes |
Add or replace a single attribute | add | /items/{item_id}/attributes/{name} | Attribute value |
Remove a single attribute | remove | /items/{item_id}/attributes/{name} | n/a |
Sample patch operations
Remove the entire Awesome Omelette item
{ "op": "remove", "path": "/items/awesome_omelette" }
Replace the Awesome Omelette rating value with 4.8 and reviews value with 24
{ "op": "add", "path": "/items/awesome_omelette/attributes/rating", "value": 4.8 }
{ "op": "add", "path": "/items/awesome_omelette/attributes/reviews", "value": 24 }
Remove the Awesome Omelette reviews attribute, and replace rating value with 4.9
{ "op": "remove", "path": "/items/awesome_omelette/attributes/reviews" }
{ "op": "add", "path": "/items/awesome_omelette/attributes/rating", "value": 4.9 }
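To see what these operations do to a record, the Python sketch below applies "add" and "remove" records to a plain in-memory dictionary. It is a local simulation for illustration only, not Bloomreach's implementation. The function name apply_operation is hypothetical, and the attribute names follow the sample recipe ("rating", "reviews").

```python
def apply_operation(catalog, operation):
    """Apply one add/remove record to a dict shaped like {"items": {...}}."""
    op = operation["op"]
    if op not in ("add", "remove"):
        raise ValueError(f"unsupported op: {op!r}")
    *parents, last = operation["path"].strip("/").split("/")
    node = catalog
    for key in parents:
        if key not in node:
            if op == "remove":
                return catalog  # nothing to remove
            node[key] = {}      # create missing containers on "add"
        node = node[key]
    if op == "add":
        node[last] = operation["value"]  # "add" also replaces
    else:
        node.pop(last, None)             # removing a missing key is a no-op here
    return catalog


catalog = {"items": {}}
operations = [
    {"op": "add", "path": "/items/awesome_omelette",
     "value": {"attributes": {"title": "Awesome Omelette", "rating": 4.7, "reviews": 22}}},
    {"op": "add", "path": "/items/awesome_omelette/attributes/rating", "value": 4.8},
    {"op": "remove", "path": "/items/awesome_omelette/attributes/reviews"},
]
for operation in operations:
    apply_operation(catalog, operation)

print(catalog["items"]["awesome_omelette"]["attributes"])
# {'title': 'Awesome Omelette', 'rating': 4.8}
```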
Patch Operations for Views
You can use the following patch operations to modify values within a view.
Description | Op | Path | Value Schema |
---|---|---|---|
Add or replace all views of an item | add | /items/{item_id}/views | Views |
Add or replace a view of an item | add | /items/{item_id}/views/{view_id} | View |
Remove a view from an item | remove | /items/{item_id}/views/{view_id} | n/a |
Replace all attributes of a view | add | /items/{item_id}/views/{view_id}/attributes | Attributes |
Add or replace an attribute of a view of an item | add | /items/{item_id}/views/{view_id}/attributes/{name} | Attribute value |
Remove an attribute from a view of an item | remove | /items/{item_id}/views/{view_id}/attributes/{name} | n/a |
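As a rough aid when generating view operations, the Python sketch below encodes the op and path combinations from the table above and reports which row, if any, a given operation matches. The names VIEW_OPERATIONS and describe_view_operation are hypothetical. The check treats any combination not listed in the table as unrecognized, which is an assumption of this sketch and not a documented rule.

```python
import re

SEGMENT = r"[^/]+"  # any non-empty path segment
VIEW_OPERATIONS = [
    ("add", rf"/items/{SEGMENT}/views",
     "Add or replace all views of an item"),
    ("add", rf"/items/{SEGMENT}/views/{SEGMENT}",
     "Add or replace a view of an item"),
    ("remove", rf"/items/{SEGMENT}/views/{SEGMENT}",
     "Remove a view from an item"),
    ("add", rf"/items/{SEGMENT}/views/{SEGMENT}/attributes",
     "Replace all attributes of a view"),
    ("add", rf"/items/{SEGMENT}/views/{SEGMENT}/attributes/{SEGMENT}",
     "Add or replace an attribute of a view of an item"),
    ("remove", rf"/items/{SEGMENT}/views/{SEGMENT}/attributes/{SEGMENT}",
     "Remove an attribute from a view of an item"),
]


def describe_view_operation(op, path):
    """Return the matching table description, or None if no row matches."""
    for allowed_op, pattern, description in VIEW_OPERATIONS:
        if op == allowed_op and re.fullmatch(pattern, path):
            return description
    return None


print(describe_view_operation("add", "/items/awesome_omelette/views/Premium/attributes/title"))
# Add or replace an attribute of a view of an item
print(describe_view_operation("remove", "/items/awesome_omelette/views"))
# None
```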
Next Step
After your catalog data has been formatted properly, send your data to Bloomreach for ingestion.