Dissecting a CDP’s Segmentation Engine
By Julien Kervizic

CDPs can perform different flavors of segmentation, and there are different ways to engineer a segmentation engine.

Segmentation types

CDPs implement different types of segmentation. A distinction is generally made between segmentation that produces MECE segments (mutually exclusive, collectively exhaustive) and segmentation that produces overlapping segments. Another difference is the method of segmentation: static or dynamic (also referred to as adaptive segmentation).

MECE Segments (Segment assignment): With the MECE segment type, the engine assigns each customer to exactly one of several segments, following the MECE principle. This assignment can happen, for instance, when leveraging methods such as K-Means clustering or running RFM segmentation (Recency / Frequency / Monetary value).
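As an illustration, a rule-based RFM assignment could look like the sketch below. The thresholds, segment labels, and the `rfm_segment` function are assumptions made for the example, not values from any particular CDP. The point is only that every customer receives exactly one label and that the final branch catches everyone else.

```python
from datetime import date

def rfm_segment(last_purchase: date, orders: int, revenue: float, today: date) -> str:
    """Assign exactly one label per customer (hypothetical thresholds)."""
    recency_days = (today - last_purchase).days
    # Branches are checked in order, so the labels are mutually exclusive.
    if recency_days <= 30 and orders >= 5 and revenue >= 500:
        return "champion"
    if recency_days <= 90:
        return "active"
    if recency_days <= 365:
        return "at_risk"
    # Catch-all branch keeps the scheme collectively exhaustive.
    return "lapsed"

# Illustrative data: (last purchase date, number of orders, total revenue)
customers = {
    "c1": (date(2024, 5, 20), 8, 900.0),
    "c2": (date(2024, 3, 15), 2, 120.0),
    "c3": (date(2022, 1, 1), 1, 40.0),
}
today = date(2024, 6, 1)
assignments = {cid: rfm_segment(*attrs, today=today) for cid, attrs in customers.items()}
```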

Overlapping Segments (Audience Eligibility Rule): The overlapping segments type evaluates independent eligibility rules, for instance, whether the customer should receive different types of communication. Each overlapping segment evaluation only considers one set of segmentation rules at a time, so a customer can belong to several audiences at once.
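A minimal sketch of overlapping audiences, assuming each audience is described by its own independent predicate. The audience names and profile fields (`email_opt_in`, `lifetime_value`) are hypothetical.

```python
from typing import Any, Callable, Dict, Set

Customer = Dict[str, Any]
Rule = Callable[[Customer], bool]

# Each audience has its own eligibility rule, evaluated independently.
audience_rules: Dict[str, Rule] = {
    "newsletter": lambda c: bool(c["email_opt_in"]),
    "vip_offers": lambda c: c["lifetime_value"] >= 1000,
}

def memberships(customer: Customer, rules: Dict[str, Rule]) -> Set[str]:
    """Return every audience whose rule the customer satisfies."""
    return {name for name, rule in rules.items() if rule(customer)}
```

Unlike a MECE assignment, the result is a set that may contain zero, one, or several audiences for the same customer.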

Besides the distinction between MECE segments and overlapping segments, there is also a difference in the method of segmentation: static or dynamic.

Static segmentation refers to one-off segmentation, which can be useful for ad-hoc campaigns such as newsletter mailing campaigns. This type of segmentation can easily be implemented by running a query on a database, filtering a customer list in Excel, or running a script on a dataset.

Dynamic or “Adaptive” Segmentation refers to segmentation that doesn’t result in a fixed set of users belonging to the segment (as a single query run would), but rather in a segment that users can join and drop out of based on certain conditions. These conditions can be re-evaluated in two ways:

  • Event-triggered. The segmentation engine evaluates the segment conditions when events are triggered. This evaluation can happen, for instance, with front-end SDKs or when leveraging an event-based architecture. Use cases for event-triggered evaluation include geofencing and A/B testing eligibility checks.
  • Recurring / Schedule-triggered. Schedule-based triggers are needed when segmentation relies on dates or includes calculated time-dependent attributes (e.g., trailing-twelve-month purchases), as customers can drop off and join at each “epoch”/time interval. An example of this type of adaptive segmentation is a birthday segment, which identifies the customers whose birth date is today. Another segment requiring this kind of trigger is customers who need to receive a repurchase reminder: it requires identifying customers who made a purchase X months ago but haven’t repurchased since.
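The two schedule-triggered examples can be sketched as functions that a scheduler would call once per day. The profile fields (`birth_date`, `last_purchase`) are assumptions, and the "X months" window is approximated here by a fixed number of days.

```python
from datetime import date, timedelta

def birthday_segment(customers, run_date):
    """Customers whose birth month and day match the run date."""
    return [
        c["id"] for c in customers
        if (c["birth_date"].month, c["birth_date"].day) == (run_date.month, run_date.day)
    ]

def repurchase_reminder_segment(customers, run_date, days_since=90):
    """Customers whose most recent purchase was exactly `days_since` days ago.

    Because `last_purchase` is the latest purchase, a match implies that
    the customer has not repurchased since.
    """
    target = run_date - timedelta(days=days_since)
    return [c["id"] for c in customers if c["last_purchase"] == target]

# Illustrative profiles
customers = [
    {"id": "c1", "birth_date": date(1990, 6, 1), "last_purchase": date(2024, 5, 10)},
    {"id": "c2", "birth_date": date(1985, 12, 24), "last_purchase": date(2024, 3, 3)},
]
run_date = date(2024, 6, 1)
birthdays_today = birthday_segment(customers, run_date)
reminders_today = repurchase_reminder_segment(customers, run_date)
```

Running the same functions on the next day yields a different membership, which is what makes the segment adaptive even though no customer attribute changed.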

System design

For a CDP to properly tackle both static and dynamic segmentation, as well as cater to real-time processing requirements, an effective system design is needed.

Different components are needed to power this segmentation engine: a Segment Metadata Store, which contains the segmentation rules and segment definitions, and a data loader, which provides the initial load of customer data for the segmentation to happen. Additional components, a segment store and a segment loader, are sometimes also needed to make the data easy to extract and to handle the connection to the target systems more efficiently.

Segmentation Engine

The segmentation engine is responsible for providing the specific segment output. There are two methods for handling the segmentation: segmentation through filtering and segmentation through assignment.

Some CDPs, such as Unomi, instead manage the logic of the segmentation engine in a more generic “Rule Engine.”

Segmentation through filtering works well on an initial customer base and can cater to “Overlapping Segments.” It does not, however, handle changes in audience membership well. Segmentation through filtering also works well when evaluating entry/eligibility conditions for marketing journeys. There are ways to mitigate the drawbacks around changes in audience membership, such as leveraging a segment store and having a separate workflow for customers who already belong to the audience, but this comes at the cost of increased technical complexity. Another way is to leverage the platform in a purely event-based manner, with entry and exit triggers.
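The purely event-based variant with entry and exit triggers could be sketched as follows. The `EventAudience` class and the event names (`cart_updated`, `order_completed`) are illustrative assumptions, and a real platform would persist membership rather than keep it in memory.

```python
class EventAudience:
    """Audience whose membership is driven only by entry and exit events."""

    def __init__(self, entry_event: str, exit_event: str):
        self.entry_event = entry_event
        self.exit_event = exit_event
        self.members = set()

    def handle(self, event: dict) -> None:
        # An entry event adds the customer; an exit event removes them.
        if event["type"] == self.entry_event:
            self.members.add(event["customer_id"])
        elif event["type"] == self.exit_event:
            self.members.discard(event["customer_id"])

# Hypothetical abandoned-cart audience
abandoned_cart = EventAudience(entry_event="cart_updated", exit_event="order_completed")
for event in [
    {"type": "cart_updated", "customer_id": "c1"},
    {"type": "cart_updated", "customer_id": "c2"},
    {"type": "order_completed", "customer_id": "c1"},
]:
    abandoned_cart.handle(event)
```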

Segmentation through assignment takes in a reducing function and provides a classification of the customer, either binary (i.e., in the audience or not) or multi-class. As such, it is a more flexible and comprehensive method of segmentation than filtering: it can handle both the “MECE Segment” classification and the “Overlapping Segments” classification. The downside is that it can generate a fairly large amount of data downstream, since every evaluated profile receives a classification. It is therefore sometimes necessary to track which profiles have already been processed, using a segment store, so that only the deltas are processed downstream.
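One possible reading of the "reducing function" is a fold over a customer's records that collapses them into a single class label. The sketch below takes that interpretation. The order structure, thresholds, and labels are assumptions made for the example.

```python
from functools import reduce

def fold_order(state: dict, order: dict) -> dict:
    """Reducer step: accumulate order count and total spend."""
    return {"orders": state["orders"] + 1, "spend": state["spend"] + order["amount"]}

def assign(orders: list) -> str:
    """Reduce a customer's orders to one multi-class label."""
    totals = reduce(fold_order, orders, {"orders": 0, "spend": 0.0})
    if totals["orders"] == 0:
        return "prospect"
    return "high_value" if totals["spend"] >= 500 else "regular"
```

Every profile passed to `assign` produces a label, including profiles whose label has not changed since the last run, which is where the downstream data volume comes from.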

Segment Metadata Store

The Segment Metadata Store contains the specific segment definitions. In the case of simple filtering or binary classification through filter evaluation, these can be expressed as segmentation rules built from simple conditions and compound/complex conditions (combined using and/or operators).

Below, an example of how the segmentation metadata can be stored:
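A minimal sketch of what such stored metadata could look like, assuming a JSON-like structure. The field names (`segment_id`, `rule`, `field`, `op`, `value`), the operator codes, and the `evaluate` helper are hypothetical and not taken from any specific CDP.

```python
import operator

# Hypothetical operator codes mapped to Python comparison functions.
OPS = {
    "eq": operator.eq,
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}

# Compound rule: country is FR AND (lifetime_value >= 1000 OR orders > 10)
segment_definition = {
    "segment_id": "high_value_fr",
    "rule": {
        "and": [
            {"field": "country", "op": "eq", "value": "FR"},
            {"or": [
                {"field": "lifetime_value", "op": "gte", "value": 1000},
                {"field": "orders", "op": "gt", "value": 10},
            ]},
        ]
    },
}

def evaluate(rule: dict, profile: dict) -> bool:
    """Recursively evaluate a simple or compound rule against a profile."""
    if "and" in rule:
        return all(evaluate(r, profile) for r in rule["and"])
    if "or" in rule:
        return any(evaluate(r, profile) for r in rule["or"])
    return OPS[rule["op"]](profile[rule["field"]], rule["value"])
```

Storing the rule as data rather than code lets the engine load, validate, and evaluate segment definitions without a deployment.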

The segment metadata store can handle more complex pieces of logic, for instance, for MECE Segment assignment using case statement logic:
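Case-statement logic could be stored as an ordered list of conditions with a default, where the first matching case wins. The structure below (`cases`, `when`, `then`, `else`) and the `assign_case` helper are assumptions for illustration only.

```python
import operator

OPS = {"gte": operator.ge, "lt": operator.lt, "eq": operator.eq}

# Hypothetical MECE definition: cases are evaluated top to bottom.
tier_definition = {
    "segment_id": "value_tier",
    "cases": [
        {"when": {"field": "lifetime_value", "op": "gte", "value": 1000}, "then": "gold"},
        {"when": {"field": "lifetime_value", "op": "gte", "value": 250}, "then": "silver"},
    ],
    "else": "bronze",
}

def assign_case(definition: dict, profile: dict) -> str:
    """Return the label of the first matching case, or the default."""
    for case in definition["cases"]:
        cond = case["when"]
        if OPS[cond["op"]](profile[cond["field"]], cond["value"]):
            return case["then"]
    return definition["else"]
```

The ordering makes the segments mutually exclusive, and the `else` branch makes them collectively exhaustive.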

Finally, a much more sophisticated approach would be to embed predictive modeling formats such as Predictive Model Markup Language (PMML) into the segmentation engine.

Time-based comparisons, such as identifying customers who have purchased in the past 30 days, can be handled through a templating engine that interpolates the expected comparison value into the rule string.
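A minimal sketch of this interpolation using Python's standard `string.Template`. The rule syntax, the `last_purchase_date` column, and the `render_rule` helper are assumptions for the example.

```python
from datetime import date, timedelta
from string import Template

# Stored rule with a placeholder instead of a hard-coded date.
rule_template = Template("last_purchase_date >= '$cutoff'")

def render_rule(template: Template, run_date: date, window_days: int) -> str:
    """Resolve the placeholder relative to the date of the run."""
    cutoff = run_date - timedelta(days=window_days)
    return template.substitute(cutoff=cutoff.isoformat())

rendered = render_rule(rule_template, date(2024, 6, 1), 30)
# rendered == "last_purchase_date >= '2024-05-02'"
```

If the rendered string ends up in a SQL query, passing the cutoff as a bound parameter is generally safer than interpolating it into the text.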

Data Loader

The data loader’s role is to provide the initial set of customers to evaluate for the segment. Depending on the approach taken, filtering or assignment conditions might already have been applied in this initial load. For more complex cases, such as running a predictive model, some datastores allow for in-database prediction, but it is often preferable from a resource utilization perspective to keep the processes separated.

The data loader can also be responsible for scheduled/recurring runs over the dataset, which are needed when there are time-based conditions in the segment rules.

Segment Store

The segment store allows us to retain the history of the customers belonging to the different audiences. It also helps manage the load on the downstream export flow, either by providing a means to check the delta of audience memberships/segment attribution or by branching out the evaluation logic for customers already belonging to the audience.
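The delta check can be sketched with plain set operations, assuming the segment store exposes the membership persisted by the previous run. The `membership_delta` function and variable names are hypothetical.

```python
from typing import Dict, Set

def membership_delta(previous: Set[str], current: Set[str]) -> Dict[str, Set[str]]:
    """Compare stored membership with the latest evaluation."""
    return {
        "entered": current - previous,  # newly eligible customers
        "exited": previous - current,   # customers no longer eligible
    }

stored_members = {"c1", "c2", "c3"}     # membership persisted by the last run
evaluated_members = {"c2", "c3", "c4"}  # membership from the current run
delta = membership_delta(stored_members, evaluated_members)
```

Only the `entered` and `exited` sets need to be exported, so unchanged members (`c2` and `c3` here) generate no downstream traffic.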

Segment Loader

The segment loader allows us to directly export already calculated audience memberships/segment membership to downstream systems.

Summary

Several components and approaches go into designing a segmentation engine for a CDP. In setting up the segmentation approach, it is essential to take into account the tradeoffs between speed, evaluation frequency, and downstream data volume.

© 2024 WiseAnalytics