Data best practices
This guide shows how to conduct quality assurance (QA) for data flowing into and out of Bloomreach Engagement.
Data QA is useful at various stages. It helps ensure integrity, accuracy, and reliability, so that your data is ready for use in applications and analyses.
Find incorrect data types
Wrong data types and formats are common causes of various issues. It is often difficult to spot where a type error occurs. Follow the steps below to locate and fix the incorrect data types.
Supported data types:
Numbers: 'numberValue':11.3
Strings: 'stringValue1':'hello world!'
Boolean: 'test2':false
Objects: 'object1':{'string':'hello world!','number':12 }
Arrays: 'array':['hello world!',2,{'num':11}]
Date: 'now':Math.round(new Date().getTime()/1000)
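As a rough illustration, the Python sketch below builds a set of properties that uses each supported type and checks that it serializes cleanly to JSON. The property names mirror the list above. The `build_properties` helper is hypothetical and is not part of any Bloomreach SDK.

```python
import json
import time


def build_properties(now=None):
    """Return example properties covering each supported data type."""
    # Dates are sent as Unix timestamps in seconds, not as text.
    timestamp = round(time.time() if now is None else now)
    return {
        "numberValue": 11.3,                                  # Number
        "stringValue1": "hello world!",                       # String
        "test2": False,                                       # Boolean
        "object1": {"string": "hello world!", "number": 12},  # Object
        "array": ["hello world!", 2, {"num": 11}],            # Array
        "now": timestamp,                                     # Date
    }


# A fixed "now" keeps the example reproducible.
properties = build_properties(now=1700000000.4)
payload = json.dumps(properties)
```

Note that the Boolean is a real `False` rather than the text "false", and the date is a whole number of seconds.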
Use Data Manager
Show tracked values
You can view a sample of 100 values for customer and event properties directly from Data Manager.
For events, sort by the Last 24 Hours column to see the most recent data. Data Manager shows raw data, so dates should appear as numeric values (timestamps) rather than text.
The Show tracked values button provides a mini report of your data. If there are issues to investigate, we advise creating a full report.
Check event/property names and types
When data is sent into the platform for the first time, event and property names and types are automatically created.
Date and Datetime values are initially seen as numbers and will need to be changed to the appropriate data type.
Customer Profile
Info icons on customer attributes
You will see an information icon (i) where the stored data is of a different data type than the data type defined in Data Manager.
A red (i) icon indicates the data is not understood and cannot be interpreted as the right type. It will not work as expected.
A grey (i) icon indicates the data is being automatically converted for display, filters, etc. It may work as expected.
Automatic conversion issues
You will see a grey (i) icon when automatic conversion is applied. This may not give the correct result.
For example, the text “Y” in a Boolean property will be treated as false. Ideally, it should be yes or no.
Dates are often problematic. For example, a date that uses the “/” separator is treated as a US date format.
Always make sure that the data is sent as the correct type.
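One way to avoid relying on automatic conversion is to normalize values in your own pipeline before sending them. The Python sketch below is only an illustration of that idea. The helpers `to_bool` and `to_unix_seconds`, and the sets of accepted text values, are assumptions rather than anything defined by the platform.

```python
from datetime import datetime, timezone

# Hypothetical lists of text values to accept; adjust to your own data.
TRUE_TEXT = {"y", "yes", "true", "1"}
FALSE_TEXT = {"n", "no", "false", "0"}


def to_bool(value):
    """Map common yes/no text to a real Boolean; reject anything ambiguous."""
    if isinstance(value, bool):
        return value
    text = str(value).strip().lower()
    if text in TRUE_TEXT:
        return True
    if text in FALSE_TEXT:
        return False
    raise ValueError(f"Cannot interpret {value!r} as a Boolean")


def to_unix_seconds(text, fmt):
    """Parse a date with an explicit format and return UTC Unix seconds."""
    parsed = datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())


# The same text gives two different dates depending on the assumed format.
uk_reading = to_unix_seconds("03/04/2024", "%d/%m/%Y")  # 3 April 2024
us_reading = to_unix_seconds("03/04/2024", "%m/%d/%Y")  # 4 March 2024
```

Stating the format explicitly removes the ambiguity of the “/” separator, and raising an error on unknown Boolean text surfaces bad values instead of silently storing them.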
Advanced: individual events
There is no (i) icon for individual events, so it may not always be clear whether the data is stored correctly. To check, use the Network tab of the Developer Tools in your browser (press F12 or right-click > Inspect). Look for the last request to “list”, and check the response.
Useful filters
Wrong date data type
Use this filter to help you spot data stored in the wrong type and find affected customers. The most common problem is with dates stored as text.
Wrong phone number format
Phone numbers can be stored in many formats, which can be tricky to correct. We recommend the E.164 standard, for example: “+441234567890”.
This filter finds anyone with a value that does not start with “+”.
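If you also want to check phone values outside the platform, for example in an export, the same rule can be sketched in Python. `needs_fixing` mirrors the filter described above. `is_e164` is a stricter, hypothetical check against the general E.164 shape (a “+” followed by up to 15 digits), and neither function is part of the product.

```python
import re

# E.164 shape: "+" followed by 2 to 15 digits, the first of which is not 0.
E164_PATTERN = re.compile(r"\+[1-9]\d{1,14}")


def needs_fixing(phone):
    """Flag any value that does not start with "+", as the filter does."""
    return not str(phone).startswith("+")


def is_e164(phone):
    """Stricter check: the whole value must match the E.164 shape."""
    return E164_PATTERN.fullmatch(str(phone)) is not None
```

The stricter check also catches values that start with “+” but contain spaces or other separators.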
Fix incorrect data types
Once you locate your incorrect data types, follow the steps below to update and/or convert your data.
Scenarios to fix data
Update customer properties
You can use scenarios to reformat and correct data, either as a regular process or as a one-off. They make it easy to rewrite and update customer properties.
The following example shows how to fix a Boolean property set to “Y”. Make sure to set the correct type.
Convert text dates
The following example shows how to correct incoming textual dates. This method uses a Regular Expression filter to match the date text pattern and Jinja to convert it to a “proper” date at noon (to avoid time zone issues).
Regular Expression filter:
[12][0-9][0-9][0-9]-[01][0-9]-[0-3][0-9]
Jinja:
{{ (customer.birth_date | to_timestamp("%Y-%m-%d") | int) + 43200 }}
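To test the conversion logic outside a scenario, the sketch below reproduces it in Python. It assumes the regular expression must match the whole value and that the parsed date is interpreted as midnight UTC before the 12-hour offset is added. The function name `text_date_to_noon_timestamp` is hypothetical.

```python
import re
from datetime import datetime, timezone

# Same pattern as the Regular Expression filter above.
DATE_PATTERN = re.compile(r"[12][0-9][0-9][0-9]-[01][0-9]-[0-3][0-9]")
NOON_OFFSET_SECONDS = 43200  # 12 hours, as in the Jinja expression


def text_date_to_noon_timestamp(text):
    """Return Unix seconds for noon UTC on a YYYY-MM-DD date, or None."""
    if not DATE_PATTERN.fullmatch(text):
        return None
    try:
        midnight = datetime.strptime(text, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    except ValueError:
        # The pattern accepts some impossible dates, such as month 13.
        return None
    return int(midnight.timestamp()) + NOON_OFFSET_SECONDS
```

The pattern alone does not guarantee a valid calendar date, so the sketch treats unparseable values as non-matches rather than failing.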
Convert UK phone numbers
The following is an example of how to convert UK phone numbers:
{#- Strip non-numerics #}
{%- set phone_number = customer.phone | map("int", "") | select("number") | join %}
{#- Check whether we have valid number #}
{%- if not phone_number %}
{%- abort : "Phone does not contain any number" %}
{%- endif %}
{#- Get into the right format #}
{%- set phone_number = "+44"~phone_number[1:] if phone_number[0] == "0" else phone_number %}
{%- set phone_number = "+44"~phone_number if phone_number[0] == "7" else phone_number %}
{{- phone_number }}
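For testing sample values before running the scenario, the template's logic can be approximated in Python. This is a rough port rather than the platform's own implementation, and `normalize_uk_phone` is a hypothetical name.

```python
def normalize_uk_phone(raw):
    """Strip non-digits, then add the +44 prefix as the template does."""
    digits = "".join(ch for ch in str(raw) if ch in "0123456789")
    if not digits:
        raise ValueError("Phone does not contain any number")
    if digits[0] == "0":
        # National format: replace the leading 0 with the country code.
        return "+44" + digits[1:]
    if digits[0] == "7":
        # Mobile number written without the leading 0.
        return "+44" + digits
    # Anything else is returned as bare digits, without a "+".
    return digits
```

In this port, a value that already carries the country code, such as “+44 7123 456789”, comes back as bare digits without the “+”, so you may want to handle that case separately.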
Imports to fix data
Event data cannot be changed once in the system. To fix incorrect data, you would have to re-import the data and delete the incorrect events. In some cases, this means you have to go back to the original source or run exports and imports, which can be complex.
Luckily, if you have a small number of events to fix (<10k), you can use a report + import combination:
- Create a report of the incorrect events with the hard ID, necessary properties, and timestamp.
- Create a new event import. You can use a Bloomreach report as the import source, so choose the report you created in the first step.
- Map the columns to the hard ID, event properties, and timestamp. You have to map each one manually: the report headings are prefixed with “# number”, so the names will not match automatically.
- Add new columns with static values.
- Run the import.
For more complex changes, use expressions and include them in your report.