The pitch for a composable, warehouse-native CDP is clean and it lands. Stop paying a vendor to store a second copy of customer data you already keep in Snowflake. Stop letting marketing's profile drift out of sync with the business. Run identity resolution, segmentation, and activation directly on the warehouse, and pay only for a thin activation layer on top. The data never moves. The duplicate copy disappears. The math looks better.
It often is better. But the part that gets quiet in the demo is this: a composable CDP does not remove the cost of running a CDP, it moves that cost. The work still happens, and the work is computation: every identity match, every segment rebuild, every sync. In a packaged CDP that computation is folded into a license you negotiate once a year. In a composable CDP it runs on metered warehouse compute, and the meter has no annual cap. The bill lands on the Snowflake or BigQuery invoice rather than the martech one, and it grows with how active your customers are rather than how many seats you bought.
This is not an argument against composable. It is an argument for knowing where the meter is before you turn it on.
How the packaged model hid the cost
For most of the CDP era the cost question was simple, because packaged CDPs answered it for you. A packaged vendor (Adobe, Salesforce, Treasure Data, Tealium, the older Segment model) ingests a copy of your customer data into its own infrastructure and runs everything inside that walled environment. Identity resolution, profile storage, segment computation, and activation queries all happen on the vendor's hardware, and the vendor absorbs that compute into the price.
You pay a license, usually keyed to profiles or monthly tracked records. The number is large and climbs as your customer base grows, but a finance team likes it for two reasons. It is predictable, agreed in advance for the contract term. And it is bounded: the vendor cannot bill you more this quarter than the contract says, even if your customers triple their activity. The compute risk sits with the vendor.
The composable model unbundles exactly that. You keep the data in your own warehouse, bring your own tools for resolution and activation, and pay each piece separately: a reverse ETL or composable platform such as Hightouch or Census, the warehouse underneath, and whatever else the stack needs. The platform fee is genuinely smaller than a packaged license. CDP.com puts a typical composable stack at roughly 80,000 to 250,000 dollars a year across four to six tools. What that figure leaves out is the warehouse compute those tools consume, which lands on a different invoice, billed by the drink.
Where the meter actually runs
To see why composable compute can outrun a packaged license, you have to know how a cloud warehouse charges. Snowflake bills credits for the time a virtual warehouse spends running, per second after a 60-second minimum each time it resumes. A credit costs about 2 to 4 dollars depending on edition, and each step up in warehouse size roughly doubles the credit burn per hour. BigQuery and Databricks differ in detail but share the principle: you pay for the compute a query consumes, every time it runs.
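The billing rule above is easy to put into a few lines. This is a minimal sketch, not Snowflake's actual rating logic: the size-to-credit table assumes the standard ladder starting at one credit per hour for the smallest warehouse, the 3-dollar credit price is an assumed midpoint of the 2 to 4 dollar range, and idle time before auto-suspend is ignored.

```python
# Assumed credits per hour by warehouse size; each step up doubles the burn.
CREDITS_PER_HOUR = {"XS": 1, "S": 2, "M": 4, "L": 8, "XL": 16}

# Minimum billed time each time the warehouse resumes.
MIN_BILLED_SECONDS = 60


def run_cost(size: str, runtime_seconds: float, dollars_per_credit: float = 3.0) -> float:
    """Dollar cost of one resume-run-suspend cycle on a warehouse of `size`.

    `dollars_per_credit` is an illustrative assumption, not a quoted price.
    """
    billed_seconds = max(runtime_seconds, MIN_BILLED_SECONDS)
    credits = CREDITS_PER_HOUR[size] * billed_seconds / 3600
    return credits * dollars_per_credit
```

Under these assumptions a 10-second query costs the same as a 60-second one, which is why many short, frequent runs are more expensive than their runtime suggests.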
A composable CDP is, mechanically, a generator of warehouse queries. Four jobs drive the spend.
Segment recomputation. A segment is a query. "Customers who bought in the last 90 days, opened no email in 30, and have lifetime value above 500 dollars" is SQL that scans tables and joins them. Defined once, that costs little. The cost is in how often it reruns. Audiences are not static: a customer crosses the 90-day line constantly, so the segment has to be recomputed to stay current. Run it nightly and you pay for one scan a day. Run it hourly and you pay for 24. Many of these queries do a full scan of large history tables on every pass. Multiply that by dozens of live segments and the warehouse is rarely idle.
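To make "a segment is a query" concrete, here is a sketch of that example audience as SQL, run against an in-memory SQLite database standing in for the warehouse. The table and column names (customers, orders, email_opens, lifetime_value, and so on) are hypothetical, and the three demo customers are made up for illustration.

```python
import sqlite3

# The segment from the text: bought in the last 90 days, opened no email
# in the last 30, lifetime value above 500. Schema is hypothetical.
SEGMENT_SQL = """
SELECT c.customer_id
FROM customers AS c
WHERE c.lifetime_value > 500
  AND EXISTS (
        SELECT 1 FROM orders AS o
        WHERE o.customer_id = c.customer_id
          AND o.ordered_on >= date(:as_of, '-90 days'))
  AND NOT EXISTS (
        SELECT 1 FROM email_opens AS e
        WHERE e.customer_id = c.customer_id
          AND e.opened_on >= date(:as_of, '-30 days'))
ORDER BY c.customer_id
"""


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory dataset with three made-up customers."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, lifetime_value REAL);
        CREATE TABLE orders (customer_id INTEGER, ordered_on TEXT);
        CREATE TABLE email_opens (customer_id INTEGER, opened_on TEXT);
        INSERT INTO customers VALUES (1, 900.0), (2, 900.0), (3, 120.0);
        INSERT INTO orders VALUES (1, '2024-05-20'), (2, '2024-05-20'), (3, '2024-05-20');
        INSERT INTO email_opens VALUES (2, '2024-06-10');
    """)
    return conn


def segment_members(conn: sqlite3.Connection, as_of: str) -> list:
    """Evaluate the segment as of an ISO date string such as '2024-06-15'."""
    rows = conn.execute(SEGMENT_SQL, {"as_of": as_of})
    return [row[0] for row in rows]
```

With this demo data, customer 1 qualifies as of 2024-06-15 and has dropped out by 2024-09-01, even though no row changed in between. Time alone moves people across the 90-day line, which is why the query has to keep rerunning.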
Identity-graph rebuilds. Identity resolution, the stitching of anonymous sessions, devices, emails, and accounts into one profile, is among the heaviest things a CDP does. It is fuzzy matching across large datasets, and it is never finished, because new events arrive continuously and the graph has to be recomputed to absorb them. CDP.com reports that CDP-related workloads routinely raise a Snowflake or BigQuery bill by two to three times. The trap is that the obvious way to cut that cost, running the matching less often or over less data, directly lowers profile quality, the thing the whole CDP exists to produce.
Real-time and near-real-time refresh. This is the single sharpest cost lever, and the one buyers reach for without pricing it. Faster activation means recomputing more often, and the cost does not rise gently. Research published by mParticle, a packaged vendor and therefore an interested party, found that moving audience refreshes from daily to hourly raised compute cost about 25 times, and going from daily to a 5-minute refresh raised it at least 50 times. The reason is structural: a warehouse query carries fixed overhead each run, so running it constantly forfeits the efficiency of batching. A "near-real-time" sync can quietly become one of the largest line items in the stack.
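The structural point can be shown with a toy model. It is a sketch of the shape of the problem, not a reproduction of mParticle's figures: it assumes each run pays a fixed overhead (minimum billing, startup, rescanning) and that the data-dependent work per day stays the same however it is split. Both cost inputs are hypothetical and have to come from your own measurements.

```python
def daily_cost(runs_per_day: int, fixed_per_run: float, data_cost_per_day: float) -> float:
    """Daily compute cost when the day's data is split across `runs_per_day` runs.

    Assumes the fixed overhead is paid on every run and the data-dependent
    cost is the same in total regardless of how often the job runs.
    """
    return runs_per_day * fixed_per_run + data_cost_per_day


def cost_ratio_vs_daily(runs_per_day: int, fixed_per_run: float, data_cost_per_day: float) -> float:
    """How many times more a cadence costs than a single daily run."""
    once_a_day = daily_cost(1, fixed_per_run, data_cost_per_day)
    return daily_cost(runs_per_day, fixed_per_run, data_cost_per_day) / once_a_day
```

In this model, the larger the fixed overhead is relative to the data-dependent work, the closer the ratio gets to the raw number of runs per day: 24 for hourly, 288 for every five minutes.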
Reverse ETL syncs. Reverse ETL is the plumbing that pushes a computed segment from the warehouse out to the email tool, the ad platforms, the contact center. It carries cost on two sides. On the warehouse side, every sync runs a query to read the rows it needs. On the tool side, composable platforms commonly price by rows synced or by destination, which scales with both audience size and frequency. CDP.com gives the shape of it directly: a reverse ETL sync that costs 500 dollars a month at launch can reach 5,000 a month at scale as channels and frequency grow. Full-refresh syncs over large tables are the expensive case; incremental syncs that move only what changed are far cheaper, so how a tool syncs matters as much as what it costs per row.
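The difference between full-refresh and incremental syncs can be sketched as a diff against what was sent last time. This is an illustration of the general technique, not how Hightouch, Census, or any specific tool implements it. The `customer_id` key and the fingerprint store are assumptions.

```python
import hashlib
import json


def row_fingerprint(row: dict) -> str:
    """Stable hash of a row's contents, used to detect changes between syncs."""
    payload = json.dumps(row, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def plan_incremental_sync(rows: list, last_synced: dict, key: str = "customer_id"):
    """Compare current rows with the fingerprints stored after the previous sync.

    Returns (rows to upsert, keys to delete, fingerprints to store for next time).
    """
    current = {row[key]: row_fingerprint(row) for row in rows}
    upserts = [row for row in rows if last_synced.get(row[key]) != current[row[key]]]
    deletes = [k for k in last_synced if k not in current]
    return upserts, deletes, current
```

A full refresh would push every row on every run; this plan pushes only the rows whose contents changed. Note that the sketch diffs in application memory, so it cuts the rows sent to the destination. Cutting the warehouse-side scan as well would need a changed-since filter in the query itself.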
The pattern across all four: cost scales with audience activity and data volume, not with a seat count. A composable compute bill is a function of how many customers you have, how much they do, how many segments you run, and how fresh you insist those segments be. Those are the same variables that go up when your marketing program succeeds.
Why the bill surprises people
If the mechanics are this knowable, why do teams get caught? Three reasons, and they compound.
The bill arrives on the wrong invoice. The compute does not show up as a martech line item. It shows up as a larger Snowflake or BigQuery bill, owned by data engineering or central IT, often in a different budget from the marketing team that drove the spend. By the time anyone connects the CDP project to the warehouse overage, a quarter has passed.
The cost is invisible at the decision. Compare a packaged license against a composable platform fee and the composable number wins, because the warehouse compute is not in the comparison. It is real, sometimes larger than the platform fee itself. CDP.com makes the point plainly: composable stacks often look cheaper on tool licenses alone, yet the three-year total cost of ownership, once warehouse compute, sync pricing at scale, and engineering headcount are added, frequently exceeds a packaged platform.
It scales the wrong way. Teams model the cost at launch, with a handful of segments and modest volume, and assume linear growth. Composable compute does not grow linearly. It compounds across several axes at once: more rows, times more segments, times higher refresh frequency, times more destinations. The curve bends upward exactly as the program gets more ambitious.
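A quick piece of arithmetic shows how fast the axes compound. This is a deliberately crude sketch that assumes the four factors multiply cleanly, which real workloads will only approximate. The growth factors in the usage note are made up for illustration.

```python
import math


def compute_multiplier(growth_factors: dict) -> float:
    """Combined growth in compute when each axis grows by its own factor.

    Assumes the axes multiply, as the text describes; real workloads
    will deviate from a clean product.
    """
    return math.prod(growth_factors.values())
```

For example, `compute_multiplier({"rows": 2, "segments": 3, "refresh_frequency": 24, "destinations": 2})` gives 288. No single axis grew by more than 24 times, yet the combined load is 288 times the launch figure.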
The other bill: engineering
Warehouse compute is the metered cost. The second bill does not appear on any invoice at all.
A composable CDP is assembled, not bought. Someone has to build the data models the segments query, write and tune the identity-resolution logic, configure the syncs, monitor them, and debug them when they fail. That is data engineering work, and it is continuous, not a one-time setup. CDP.com estimates at least one full-time data engineer for ongoing operations, often more, and puts a typical composable stack at three to five dedicated engineers once warehouse modeling, identity pipelines, connector maintenance, and on-call rotation are counted. CDP.com pegs that at roughly 450,000 to 1 million dollars a year in staffing, a cost that never shows up as a CDP line item.
There is a debugging tax inside that headcount. A packaged CDP is one vendor, so a broken cart-abandonment flow has one support line to call. A composable stack has four or five moving parts, so a flow that fails to fire could be a warehouse query that timed out, a reverse ETL sync that hit an API rate limit, a schema change that broke an upstream model, or an identity job that has not finished. The modularity that makes composable flexible makes every incident a cross-system investigation, and investigations cost engineer hours.
And the warehouse becomes mission-critical. When the CDP runs on Snowflake, a slow warehouse is no longer just slow reporting; it is a marketing program that stops activating. That raises the reliability bar, and with it the effort to keep the warehouse fast and monitored.
How to model and control the bill
None of this means do not go composable. For a team with warehouse maturity, real data engineering capacity, and activation needs that mostly tolerate daily or hourly freshness, composable can be both cheaper and cleaner than a packaged CDP. The point is to size the real number and manage it. Four concrete moves.
Model three years, not the platform fee. Build a total cost of ownership that includes the platform licenses, the projected warehouse compute at the volume and refresh cadence you actually intend to run, sync pricing at your expected scale, and the engineering headcount to operate it. Ask each vendor for a three-year model at your real numbers, not a starter footprint. Composable can still win that comparison. It should win it on the honest figure.
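A three-year model does not need to be elaborate to be useful. The sketch below assumes four annual cost lines and lets warehouse compute and sync fees grow each year while licenses and headcount stay flat. That split, the growth rates, and every input figure are assumptions to replace with your own numbers.

```python
from dataclasses import dataclass


@dataclass
class AnnualCosts:
    """Year-one cost lines in dollars. All values are inputs you supply."""
    platform_licenses: float
    warehouse_compute: float
    sync_fees: float
    engineering: float


def three_year_tco(year_one: AnnualCosts, compute_growth: float, sync_growth: float) -> float:
    """Sum three years of cost, growing compute and sync fees annually.

    Growth rates are fractions per year (1.0 means doubling). Licenses and
    engineering are held flat here, which is a simplifying assumption.
    """
    total = 0.0
    compute = year_one.warehouse_compute
    sync = year_one.sync_fees
    for _ in range(3):
        total += year_one.platform_licenses + compute + sync + year_one.engineering
        compute *= 1 + compute_growth
        sync *= 1 + sync_growth
    return total
```

As a usage example, take the low ends of the CDP.com ranges cited above (80,000 in licenses, 450,000 in staffing), 6,000 a year in sync fees, and a placeholder 60,000 in warehouse compute. With no growth, the three-year total is 1,788,000 dollars. If compute doubles each year, it rises to 2,028,000. The warehouse figure is the one input here that is pure assumption, and it is the one the platform-fee comparison leaves out.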
Set refresh frequency per use case, not as one global default. Refresh cadence is the most powerful cost dial you hold. Most marketing work does not need real-time: weekly audience syncs for paid media, daily segment refreshes, churn scoring, and lifecycle email triggers all run fine on a batch. Reserve minute-level or streaming refresh for the few cases that genuinely move revenue in the moment, such as in-session personalization or a high-value cart abandon, and price that decision knowing the jump from daily to hourly can be roughly 25 times the compute. Paying for real-time everywhere is the most common and most expensive mistake.
Engineer the queries for cost, not just correctness. Incremental processing that touches only changed rows can cut warehouse compute substantially against full-refresh approaches. Materialize resolved profiles and heavy aggregates once and read them many times instead of recomputing on every segment pass. A dedicated, right-sized warehouse for CDP workloads, with auto-suspend tuned tight so it is not billing while idle, keeps the spend contained.
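Incremental processing and materialization can be sketched together: keep a small table of per-customer totals, and on each refresh fold in only the events that arrived since the last run. SQLite stands in for the warehouse. The table names and the assumption that `event_id` only ever increases are hypothetical, and late-arriving events would need separate handling.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """In-memory stand-in for a warehouse: raw events plus a materialized totals table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE events (event_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
        CREATE TABLE customer_totals (customer_id INTEGER PRIMARY KEY, lifetime_value REAL);
    """)
    return conn


def refresh_totals(conn: sqlite3.Connection, watermark: int) -> int:
    """Fold events newer than `watermark` into customer_totals; return the new watermark."""
    new_rows = conn.execute(
        "SELECT customer_id, SUM(amount), MAX(event_id) FROM events "
        "WHERE event_id > ? GROUP BY customer_id",
        (watermark,),
    ).fetchall()
    for customer_id, delta, max_event_id in new_rows:
        # Create the customer's row if it is missing, then add the new amount.
        conn.execute("INSERT OR IGNORE INTO customer_totals VALUES (?, 0)", (customer_id,))
        conn.execute(
            "UPDATE customer_totals SET lifetime_value = lifetime_value + ? WHERE customer_id = ?",
            (delta, customer_id),
        )
        watermark = max(watermark, max_event_id)
    return watermark
```

Segments that need lifetime value then read `customer_totals` instead of re-aggregating the full event history on every pass, and a refresh with no new events does no work at all.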
Make the warehouse bill someone's job. Composable spend surprises people because no single owner watches it. Put CDP-driven compute under active FinOps: tag the workloads, attribute the cost back to the marketing program that creates it, alert on spikes, review it monthly. A composable CDP turns the warehouse into operational infrastructure for marketing, and it has to be budgeted and monitored like infrastructure, not like a fixed subscription.
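The attribution and alerting steps can start very small. This sketch assumes you can already export a query log where each entry carries a workload tag and a dollar cost. The field names `tag` and `cost_usd`, the 7-day window, and the 1.5 times threshold are illustrative choices, not a standard.

```python
from collections import defaultdict
from statistics import mean


def cost_by_tag(query_log: list) -> dict:
    """Total cost per workload tag; entries without a tag are grouped as 'untagged'."""
    totals = defaultdict(float)
    for entry in query_log:
        totals[entry.get("tag") or "untagged"] += entry["cost_usd"]
    return dict(totals)


def spike_alerts(daily_costs: list, window: int = 7, threshold: float = 1.5) -> list:
    """Indexes of days whose cost exceeds `threshold` times the trailing-window mean."""
    alerts = []
    for i in range(window, len(daily_costs)):
        baseline = mean(daily_costs[i - window:i])
        if baseline > 0 and daily_costs[i] > threshold * baseline:
            alerts.append(i)
    return alerts
```

The size of the "untagged" bucket is itself a useful number: it is the share of warehouse spend that nobody can yet attribute to a program.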
Composable vendors push back on the cost critique, and parts of the pushback hold. Hightouch argues that cloud warehouse compute has grown cheaper over time, that composable tools are built to make queries efficient, and that packaged CDPs incur their own warehouse costs anyway through the constant reconciliation needed to keep their separate copy in sync. That last point is fair: a packaged CDP is not compute-free either. The honest synthesis is not that composable is expensive and packaged is cheap. It is that composable converts a fixed, vendor-absorbed cost into a variable, self-managed one. That can be the better deal. It is simply a different shape of risk, and it has to be modeled, owned, and watched as one.
Where this is heading
The cost question is about to get sharper, because the next consumer of the CDP is an AI agent, and agents query customer data far more than dashboards do. A human builds a segment a few times a day. An agent reading profiles, deciding, and acting generates a continuous stream of lookups and recomputations. Gartner's 2026 Magic Quadrant frames the CDP's future around agentic AI and describes a market splitting two ways: platformization, the CDP as the foundation of a broad application suite, and agentification, the CDP as thin plumbing feeding autonomous agents. The agentification path runs straight on warehouse compute: more autonomous activity on the profile means more metered queries.
That makes the discipline above a prerequisite, not a nicety. The teams that succeed with composable, and later agentic, customer data will be the ones who treated compute as a forecasted operating cost from the start. The composable CDP genuinely can be the cleaner and cheaper architecture, but only for the team that knows where the meter is. The pitch is right that the duplicate copy is waste. It is incomplete about what replaces it: not nothing, but a bill that grows every time your marketing works.
Council summary
This post argues that a composable CDP does not delete the cost of running a CDP, it converts a fixed, vendor-absorbed license into variable warehouse compute that scales with customer activity and lands on the Snowflake or BigQuery invoice. The council verified the load-bearing figures against their sources: mParticle's finding that daily-to-hourly refresh raises compute about 25 times and daily-to-5-minute at least 50 times, CDP.com's 80,000 to 250,000 dollar composable stack range, its reverse ETL example of 500 dollars a month growing to 5,000 at scale, and Snowflake's per-second billing after a 60-second minimum. We corrected the engineering-headcount passage to match CDP.com exactly, including its 450,000 to 1 million dollar staffing estimate, trimmed an identity-resolution claim that overstated the source, and tightened the agentic-AI section to attribute the platformization-versus-agentification framing solely to Gartner. The takeaway: composable can be the cheaper and cleaner architecture, but only for a team that models a three-year total cost of ownership, sets refresh frequency per use case, and puts the warehouse bill under active FinOps ownership.