There is some debate about the effectiveness and future potential of Customer Data Platforms (CDPs). One aspect is the increasing competition from cloud data (warehouse) platforms. The overlap seems to be growing, as it has become much easier to build some of the functionality that existing CDPs offer. A term I hear more and more often for this approach is the Headless (Customer) Data Platform. What is this headless approach all about, and how does it compare to existing stand-alone tools and suites?
There is no exact definition of the Headless Customer Data Platform (CDP) or Data Management Platform (DMP), and there probably never will be, but the term refers to an increasingly popular solution with the following characteristics:
- A cloud data warehouse acts as the foundation (think of GCP / AWS / Azure). Different data sources are ingested, transformed and joined.
- Create insights and actions on all kinds of levels (not limited to customer data). Think of product or campaign level data.
- Activate data in channels, tools and platforms directly from the data warehouse environment.
- Without a stand-alone CDP tool or suite, but a combination of cloud (data) services and specialized tools or frameworks.
Headless architecture gained popularity in the Content Management System (CMS) world and is widely used these days. The concept: decoupling the back-end (content management) from the front-end (website). The CMS acts as a central content hub. Through an API, content can be shared with all sorts of applications, such as websites, apps or even dynamically generated emails. The front-end is no longer intertwined with the content management system. The result: all content is centralized in one system, and front-end applications can be developed independently from the CMS.
The headless CDP or DMP acts as a (marketing) data hub where insights and actions can be created on all kinds of levels and are activated / integrated with channels, tools or platforms (think of audiences, events or triggers). This is done directly from the cloud data warehouse, without the need for a stand-alone CDP tool or suite. However, there is no centralized user interface. The solution consists of a combination of services within the cloud environment (building blocks) and other (specialized) tools that fulfill a specific need (e.g. connectivity or decision making).
Not limited to customer data only
I'm comparing the headless approach to existing stand-alone CDP solutions, which are mainly built around customer data. However, a headless data management system is not limited to customer data. Some examples:
- Identity management, building a single (360°) customer view and (customer) audiences.
- Enriching product data / product feeds. Gaining insights into product performance.
- Creating models, such as product recommenders, purchase or churn prediction models, or sales forecasts (both in batch and in real time).
- Creating business rules or “triggers” (decisioning) for follow-up / next-best actions.
- Sales and engagement reports, based on multiple data sources. Creating notifications and alerts based on deviations within these reports.
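To make the audience example a bit more concrete, here is a minimal sketch of a business rule expressed as SQL on top of joined data. It rests on several assumptions: an in-memory SQLite database stands in for the cloud data warehouse, the `customers` and `orders` tables and their columns are made up for illustration, and the 90-day "lapsed customer" rule is a hypothetical definition, not a standard one.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the cloud data warehouse.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, email TEXT);
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER,
    order_date TEXT,   -- ISO dates, so text comparison matches date order
    amount REAL
);
INSERT INTO customers VALUES
    (1, 'a@example.com'), (2, 'b@example.com'), (3, 'c@example.com');
INSERT INTO orders VALUES
    (10, 1, '2024-01-05', 40.0),
    (11, 1, '2024-05-20', 25.0),
    (12, 2, '2024-01-10', 60.0);
""")

# Hypothetical business rule: a customer is "lapsed" when they have ordered
# before, but not in the 90 days leading up to the reference date.
LAPSED_AUDIENCE_SQL = """
SELECT c.customer_id, c.email, MAX(o.order_date) AS last_order_date
FROM customers AS c
JOIN orders AS o ON o.customer_id = c.customer_id
GROUP BY c.customer_id, c.email
HAVING MAX(o.order_date) < DATE(:reference_date, '-90 days')
ORDER BY c.customer_id
"""


def lapsed_audience(connection, reference_date):
    """Return the lapsed-customer audience as a list of dicts."""
    rows = connection.execute(
        LAPSED_AUDIENCE_SQL, {"reference_date": reference_date}
    )
    return [
        {"customer_id": r[0], "email": r[1], "last_order_date": r[2]}
        for r in rows
    ]


audience = lapsed_audience(conn, "2024-06-01")
```

With the sample data, only customer 2 ends up in the audience: customer 1 ordered recently and customer 3 never ordered. The point is that the rule lives in one place, in the warehouse, and every channel that receives this audience uses the same definition.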
The headless approach has a lot of overlap with the idea behind the Modern Data Stack. However, I will try to make the comparison with existing CDP tools, so I will limit myself to the functionality these platforms generally offer.
A data warehouse 'with benefits'
Many organizations recently invested in a data warehouse in one of the well-known cloud providers such as AWS, Azure or GCP. Data can be retrieved from various sources and joined together. Usually this is based on periodic batches (once per day or hour), but real-time data ingestion is also perfectly possible (due to the rise of real-time architecture). Think of (live) orders or clickstream data from websites / apps.
Working with such a data warehouse and all the different services within these platforms has become much easier. Services within the cloud platforms can be stacked on top of each other (building blocks), making it easier to import, transform and analyze large amounts of data. Skills that are important for working with such a data environment, such as SQL, R or Python, are also increasingly present within organizations (but yes, in most cases not nearly enough).
Activating this data in channels and/or tools directly from the data warehouse is what increases the overlap with existing CDPs, and what turns the data warehouse into an alternative.
The headless approach vs. CDP tools: overlap in functionality
Lately, when a CDP is discussed while a cloud data warehouse environment is already present (or will be), the following points generally come up:
- The data sources that already flow into your data warehouse also have to be integrated with the CDP.
- Capturing and collecting real-time (on-site) behavioral data is most of the time already done by systems such as Google Analytics or Snowplow (and more recently by leveraging server-side tag management solutions). So why is it necessary to create another online behavioral database (in which the measurement methodology will certainly differ)?
- Duplicating business rules already present in your warehouse environment. Think of identity resolution, creation of a 360 customer view or audience selections.
Whether the above applies to your organization depends on your specific situation, but in most cases there will be overlap. And most CDP and data management tools do not (yet) work together in such a way that they can "piggyback" on everything that is already collected and calculated in your data warehouse.
Disadvantages of a headless CDP
Of course, stand-alone CDPs have their advantages, mainly in time and cost. First of all, if you don't have a data warehouse and/or resources are limited, a stand-alone CDP suite could fit your needs and get you started quickly, given the out-of-the-box functionality. The cost factor depends heavily on the type of system, volume and integrations. When zooming in, the headless CDP has the following disadvantages:
- No central user interface & ease of use: a headless system does not have an integrated user interface where settings can be easily adjusted or where all processes can be inspected. It is certainly possible to gain insight into the processes of a headless system, but you have to create those insights yourself, or they are spread across multiple systems. Some technical / SQL knowledge is also required to set up audiences or journeys, for example.
- (Real-time) decision engine / flow builder: making decisions based on real-time data flows (e.g. web or app data). Creating such a real-time decision engine (e.g. online personalization use cases) within the cloud is possible, but is much more complex than clicking some rules together in a user interface. Also, creating flows / journeys is easily done within most CDP solutions.
- Connectivity: in a stand-alone CDP, the connections to channels and tools are maintained by the platform. In a headless CDP, you have to build the integrations yourself (mostly through API connections), although several (open-source) frameworks are currently available, such as Google Tentacles. An interesting development is the rise of "reverse ETL" tools.
This last one deserves some explanation. There are already a lot of tools that ingest data into your warehouse, but fewer tools that can activate that data in your marketing channels or other platforms. Reverse ETL tries to solve this problem. Ultimately, a tool like this should enable you to focus on building use cases without having to worry about connector development and maintenance (e.g. for the Facebook API). In addition, the tool provides insight into the process and into the data that flows out of your cloud platform. It also seems that some of these tools are developing a user interface where non-technical users can create audiences or flows on top of your data warehouse data.
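To illustrate what such a sync roughly involves, here is a minimal sketch of the core loop: compare the current audience rows in the warehouse with what was sent last time, and push only the differences to the destination in batches. Everything in it is an assumption for illustration: the function names, the `customer_id` key and the `send_batch` callback are hypothetical, and a real tool adds authentication, rate limiting, retries and field mapping per destination.

```python
import hashlib
import json


def row_fingerprint(row):
    """Stable hash of a row, used to detect changes between syncs."""
    payload = json.dumps(row, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def plan_sync(rows, previous_state, key="customer_id"):
    """Diff current warehouse rows against the state of the last sync.

    Returns (rows to upsert, keys to remove, new state).
    """
    new_state = {str(row[key]): row_fingerprint(row) for row in rows}
    to_upsert = [
        row for row in rows
        if previous_state.get(str(row[key])) != new_state[str(row[key])]
    ]
    to_remove = sorted(k for k in previous_state if k not in new_state)
    return to_upsert, to_remove, new_state


def batched(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def run_sync(rows, previous_state, send_batch, batch_size=2):
    """Send only new, changed and removed records to the destination."""
    to_upsert, to_remove, new_state = plan_sync(rows, previous_state)
    for batch in batched(to_upsert, batch_size):
        send_batch("upsert", batch)
    for batch in batched(to_remove, batch_size):
        send_batch("remove", batch)
    return new_state


# Usage with a fake destination that just records what it receives.
sent = []


def fake_destination(action, batch):
    sent.append((action, batch))


state = run_sync(
    [{"customer_id": 1, "segment": "lapsed"},
     {"customer_id": 2, "segment": "vip"}],
    {},
    fake_destination,
)
# Second run: customer 1 changed segment, customer 2 left the audience.
state = run_sync(
    [{"customer_id": 1, "segment": "active"}], state, fake_destination
)
```

The first run sends both customers. The second run sends one upsert for customer 1 and one removal for customer 2. Keeping the state of the previous sync is what keeps API usage low, and it is exactly the kind of plumbing you would rather not build and maintain per channel.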
An example setup
When to choose a headless approach?
When your organization already has a cloud data warehouse or is planning to set one up, the headless approach can be interesting if you:
- Don't want to purchase and implement an (expensive) CDP tool, for example if you just want to share audiences, triggers or other data directly from the data warehouse with your (marketing) channels or (external) platforms.
- Want to realize advanced and more complex use cases based on different sources and combinations, which are impossible or hard to achieve in stand-alone tools (by clicking stuff together), or which would require moving the data in and out of the CDP to perform the necessary actions.
- Do not want to duplicate data and recreate business rules in different environments (single source of truth).
- Want to unlock the potential of non-customer data, such as product or campaign data, and activate it.
Using various cloud services and dedicated / specialized tools (such as reverse ETL), it's possible to connect data and tools relatively easily. Keep in mind that certain skills must be present in the team that will work with the headless CDP on a daily basis and at scale. Think of data engineers, tech-savvy marketers and analysts. This is especially important as use cases become more complex.
Start with your use cases
But most importantly, when choosing between a headless or a stand-alone CDP (or whatever solution), start with defining clear use cases and determine the roles / persons that will work with such a system. “The ability to create audiences in a drag-and-drop interface and share these with our marketing channels” is not a solid use case. Creating 100 audiences, in which the same customer is in 80 of these 100 audiences, and is targeted 80 times with a different message, does not work. Try to work out the details.
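A quick way to test a set of audience definitions against this problem is to count how many audiences each customer ends up in. The sketch below assumes audiences are available as sets of customer IDs. The audience names, the IDs and the threshold of two are all made up for illustration.

```python
from collections import Counter


def audience_counts(audiences):
    """Count, per customer ID, the number of audiences they belong to."""
    counts = Counter()
    for members in audiences.values():
        counts.update(members)
    return counts


def over_targeted(audiences, max_audiences):
    """Return customers that appear in more than `max_audiences` audiences."""
    counts = audience_counts(audiences)
    return {cid: n for cid, n in counts.items() if n > max_audiences}


# Hypothetical audiences, keyed by name.
audiences = {
    "lapsed_90_days": {"c1", "c2"},
    "newsletter_subscribers": {"c1", "c3"},
    "high_value": {"c1", "c2", "c4"},
}

flagged = over_targeted(audiences, max_audiences=2)
```

Here only customer `c1` is flagged, because it sits in all three audiences. A check like this does not solve the prioritization question, but it makes the overlap visible before the audiences are shared with channels.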
Insight and ease of use are certainly advantages of a stand-alone CDP, although the promise that every marketer can work with a CDP is often not fulfilled. Whether that is down to the marketer or the tool will differ per situation and person. In practice, the more technical marketers and/or data engineers become the main users of such systems once the "low-hanging-fruit" use cases are live and the CDP needs to be built out.
So, the best tip I can give is to involve the right people, work out clear requirements and detailed use cases, and then pick a solution.