A retail bank in 2012 ran a model that did one job well. A customer logged into online banking, and in the few hundred milliseconds before the page rendered, the model scored that person against a shelf of offers: a credit card, a personal loan, a savings product, a fee waiver. It picked the one with the highest predicted propensity to convert, and a banner appeared. That was next best action, and for the channel it ran in, it mostly worked.
Look closely at what that model never decided. It did not choose whether to show a banner at all. It did not pick the channel, because it lived inside one channel. It did not pick the moment, because the moment was whenever the customer happened to log in. It did not choose the tone, and it had no concept of restraint. Given a customer and a login, it always had an answer, because producing an answer was the only thing it was built to do.
That gap is the subject of this piece. Next best action started as a recommender: given a person, which offer. The work happening now under the label AI decisioning is expanding it into something broader, what some vendors call next best everything. Not just what to say, but the channel, the timing, the tone, the order across a sequence, and the decision the old model could never make, whether to say anything at all.
Where next best action came from
Next best action grew out of two industries with the same problem: banking and telecom. Both had millions of customers, long relationships, expensive churn, and a fat catalog of products to cross-sell. Both had transaction histories rich enough to predict behavior. And both had a structural flaw the discipline was invented to fix.
The flaw was product-led marketing. A bank would decide to push personal loans this quarter, build a campaign around them, then go looking for customers who looked like loan prospects. Next best action inverted that. It started from the customer and asked which single action, out of everything available, best fit this person right now. The framing is customer-centric by construction, and genuinely better than blasting a product at a list.
The engine underneath was the propensity model: a supervised machine learning model trained on history. It learns the pattern of customers who took an action, then scores everyone else on how closely they match. Propensity to buy a credit card. Propensity to churn. Propensity to respond to a retention offer. Run several of these, rank the scores, show the top one. That is the original next best action in one sentence.
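The rank-and-pick step is simple enough to sketch. The Python below is a minimal illustration, not any vendor's implementation: the offer names and scores are invented, and the scores stand in for the outputs of separately trained propensity models.

```python
from typing import Dict


def next_best_action(propensities: Dict[str, float]) -> str:
    """Return the offer with the highest predicted propensity to convert."""
    if not propensities:
        raise ValueError("at least one scored offer is required")
    # Rank by score and keep the top one; ties go to the first offer listed.
    return max(propensities, key=propensities.get)


# Hypothetical scores for one customer at login time.
scores = {
    "credit_card": 0.12,
    "personal_loan": 0.07,
    "savings_product": 0.21,
    "fee_waiver": 0.15,
}
print(next_best_action(scores))  # savings_product
```

Note what the function cannot return: there is no entry for showing nothing, so any non-empty shelf of offers always yields a banner.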
The commercial lineage runs through a company most marketers have never heard of. Chordiant built predictive decision management software used heavily by banks and telecom operators. In March 2010, Pegasystems agreed to acquire Chordiant for roughly 161.5 million dollars, and that technology became the spine of what Pega now sells as Customer Decision Hub, still one of the best-known next best action engines in enterprise software.
Where the old model hit its ceiling
Propensity-based next best action has three limits, and they are not bugs. They are consequences of how the approach was built.
The first is that a propensity model only knows the past. Trained on what already happened, it can recommend variations of what has already worked, but it cannot try something genuinely untested, because there is no historical data for the untested thing. The model converges on a comfortable answer and stays there. If a slightly different offer, sent at a different hour, would have worked better, a pure propensity model has no way to find out. It was not built to experiment.
The second limit is timing, and telecom shows it sharply. Operators have spent two decades building churn propensity models, and a churn model can be accurate and still useless, because it usually fires too late. As Databricks describes the pattern, the typical churn journey runs from a service problem, to declining usage, to a support call, to the customer leaving weeks later. A model that flags the customer on the way out is closing the barn door after the horse has gone. The score was right. The window was already shut.
The third limit is the one in the bank example. The old model decides which action, never whether to act. It runs inside a channel and assumes the channel. It treats every eligible moment as a moment to message. Stack several single-channel decisioners across email, SMS, app, and web, each independently certain it has a worthwhile thing to say, and the customer gets hit from four directions at once, each hit defensible and the total a mess. Next best action with no shared brake over-messages by design.
What changes when an agent decides everything together
AI decisioning is the attempt to widen the question. Instead of which offer, the system decides action, channel, timing, sequence, and silence as one joint decision, for one customer, continuously. Vendors including Braze, Hightouch, Treasure Data, and BlueConic now describe their products in roughly these terms, and the shared phrase is next best everything.
The technical change underneath is a move from propensity modeling to reinforcement learning. A reinforcement learning system is built to do the thing a propensity model cannot: explore. It tries actions, observes the outcome (opened, clicked, bought, unsubscribed, ignored), and updates toward whatever earns the reward. It tests the untested option, because testing is how it learns. That is what lets the decision widen past the historical comfort zone.
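The try, observe, update loop can be sketched with an epsilon-greedy bandit, about the simplest member of the reinforcement learning family. This is an illustration under stated assumptions, not a description of what any vendor named here ships: the action names, the conversion rates in the simulated environment, and the 10 percent exploration rate are all hypothetical.

```python
import random
from collections import defaultdict


class EpsilonGreedy:
    """Mostly pick the best-known action, but sometimes try another one."""

    def __init__(self, actions, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.epsilon = epsilon            # share of decisions spent exploring
        self.rng = random.Random(seed)
        self.counts = defaultdict(int)    # times each action was tried
        self.values = defaultdict(float)  # running mean reward per action

    def choose(self):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)  # explore
        return max(self.actions, key=lambda a: self.values[a])  # exploit

    def update(self, action, reward):
        self.counts[action] += 1
        n = self.counts[action]
        # Incremental mean: nudge the estimate toward the observed reward.
        self.values[action] += (reward - self.values[action]) / n


# Simulated environment with made-up conversion rates per action.
true_rates = {"email_morning": 0.03, "email_evening": 0.05, "push_evening": 0.08}
bandit = EpsilonGreedy(true_rates, epsilon=0.1, seed=42)
env = random.Random(7)
for _ in range(5000):
    action = bandit.choose()
    reward = 1.0 if env.random() < true_rates[action] else 0.0
    bandit.update(action, reward)
print(dict(bandit.counts))
```

The exploration branch is the part a pure propensity ranker lacks: a fixed share of decisions goes to actions that do not currently look best, which is the only way the estimates for those actions ever get corrected.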
Here is what each new dimension means.
Action stays, but it stops meaning only an offer. It can be a service nudge, a piece of content, a loyalty reminder, a question, or a hard discount, all competing in the same decision.
Channel becomes a decision rather than an assumption. The same message has very different value as an email, an SMS, a push notification, or an in-app card, and the right one depends on the person and the moment. A single-channel decisioner cannot weigh that. A joint decisioner can.
Timing stops being whenever the customer showed up. The system can act now, wait for a better hour, or hold for a behavioral trigger. Catching the telecom customer at the declining-usage stage rather than the exit call is a timing decision, and it separates a retention offer that works from one that arrives at a funeral.
Sequence enters the frame. A single decision is one message. A sequence is the order and spacing of messages over weeks, and a system that optimizes the sequence can decide the right move today is the one that sets up a better move next Tuesday.
And then silence. This is the genuinely new capability and the one worth dwelling on. A next best everything system can decide that the best action for this customer today is no action. Treasure Data names this directly: sometimes no message is the optimal output. The old propensity model could not produce that answer, because it ranked offers and something always ranked first. A reinforcement learning system can learn that messaging a particular person right now lowers long-term value, and hold. Restraint becomes a move the system can choose, not a frequency cap bolted on afterward.
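One way to picture silence as a first-class option is to score it alongside every message, with a baseline value of zero. The sketch below is a static expected-value comparison rather than a learning system, and every number in it is an assumption: the probabilities, the 20-dollar conversion value, and the 150-dollar cost of losing a subscriber are invented for illustration.

```python
from typing import Dict, Tuple

NO_MESSAGE = "no_message"  # hypothetical label for the do-nothing action


def expected_value(p_convert: float, p_unsubscribe: float,
                   conversion_value: float, unsubscribe_cost: float) -> float:
    """Expected gain from converting minus expected loss from unsubscribing."""
    return p_convert * conversion_value - p_unsubscribe * unsubscribe_cost


def choose_action(candidates: Dict[str, Tuple[float, float]],
                  conversion_value: float = 20.0,
                  unsubscribe_cost: float = 150.0) -> str:
    """Pick the best message, or silence if no message beats doing nothing."""
    best_action, best_value = NO_MESSAGE, 0.0  # silence is worth zero
    for action, (p_convert, p_unsubscribe) in candidates.items():
        value = expected_value(p_convert, p_unsubscribe,
                               conversion_value, unsubscribe_cost)
        if value > best_value:
            best_action, best_value = action, value
    return best_action


# (p_convert, p_unsubscribe) per candidate message; all values hypothetical.
engaged = {"email_offer": (0.10, 0.002), "sms_offer": (0.06, 0.010)}
fatigued = {"email_offer": (0.02, 0.010), "sms_offer": (0.01, 0.030)}
print(choose_action(engaged))   # email_offer
print(choose_action(fatigued))  # no_message
```

The structural difference from a propensity ranker is the starting point of the comparison. Because silence is a candidate, a message has to beat doing nothing before it is sent, so the cost of fatigue sits inside the decision instead of in a cap applied afterward.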
Why silence is the part that pays
It is tempting to treat silence as a soft brand-safety nicety. The data says it is closer to a hard financial control.
Optimove's 2026 Marketing Fatigue Report, based on a survey of 1,034 consumers run in late 2025, is blunt about the cost of saying too much. Fifty-five percent reported switching brands multiple times because of marketing bombardment. Eighty-three percent said they unsubscribe from repeated offers, because they know the same offer will reach them on another channel anyway, the exact failure mode of several single-channel decisioners in parallel. Ninety-three percent reported receiving poorly timed messages. Seventy-nine percent said brands that send fewer but well-targeted messages earn loyalty faster.
Read those together and the picture is clear. Over-messaging is not a tone problem. It is a churn driver. The classic next best action setup, every channel with its own model, every model always producing an answer, manufactures exactly the bombardment that pushes customers out. A system that can stay quiet, and that decides it centrally with a view across every channel, removes a real and measurable cost. Silence is not the absence of a decision. It is one of the more valuable decisions the system makes.
Tone belongs in the same conversation. The same recommendation can be framed as urgency, as reassurance, or as a quiet heads-up. With generative models now writing the message itself, tone becomes another variable the system can vary and learn from, not a fixed property of a template.
The CDP and the real-time signal
None of this works without a data foundation, and this is where the honest version of the story has to slow down. A joint decision across action, channel, timing, and silence is only as good as the profile it reads. Every vendor selling AI decisioning depends, underneath, on a customer data platform or an equivalent unified profile. Treasure Data puts it about as plainly as a vendor will: the hard part is the data, and without a unified foundation, AI decisioning is just a more elaborate way to send emails.
Two properties of that foundation matter most. One is completeness. The decision needs purchase history, channel preferences, consent state, service interactions, and value, drawn into one profile rather than scattered across systems that do not talk. The second is freshness. A decision about whether to act now is worthless on a profile that is a day stale. The telecom timing problem is a data-latency problem in disguise: catching the customer at declining usage rather than the exit call means the usage signal has to reach the decision close to real time. A profile rebuilt overnight cannot support a decision that needs to be made this afternoon.
This is the link to the wider shift in customer data infrastructure. Real-time signals, warehouse-native architectures, and agent-readable profiles keep coming up because decisioning is the thing consuming them. Next best everything is one of the clearest answers to what all that real-time data plumbing is actually for.
What is real, and what is still a pitch
The line between shipping and aspirational matters here, because the marketing has run ahead of the product.
What is real today: reinforcement learning systems that optimize across channel, timing, and frequency, and can choose to send nothing, are in production. Optimizing the message, the channel, and the cadence as a joint decision is a current capability, not a promise.
What is closer to the frontier: a system that designs the options itself, rather than picking among variants a human prepared, is mostly still emerging, as is fully autonomous, no-human-in-the-loop decisioning across an entire customer base. Treasure Data, to its credit, frames the agent-designs-the-strategy step as where things are heading rather than where they are. Most live deployments are reinforcement learning choosing among human-supplied options inside human-supplied guardrails. That is genuinely useful. It is not an autonomous agent, and a buyer should not pay for one and receive the other.
Then there is governance, the real bottleneck. A system that decides action, channel, timing, and silence on its own makes thousands of consequential choices a day with no human reading each one. That demands real guardrails: frequency caps, budget limits, brand and tone rules, and consent and regulatory constraints enforced as hard boundaries the system cannot cross. It also demands measurement, and measurement is hard here. When the system changes channel, timing, and tone at once and never stops adjusting, you cannot isolate the effect of any single change the way an A/B test does. The discipline that survives is the holdout group: a randomized, persistent slice of customers kept out of the decisioning entirely, so lift is measured against people the system never touched. If a vendor cannot explain how they prove incremental lift against a clean holdout, the optimization claim is unverified.
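A persistent holdout and the lift arithmetic can be sketched in a few lines. This is a minimal illustration under assumptions, not a prescribed method: the salt string, the 5 percent holdout share, and the conversion counts are hypothetical, and a real analysis would add confidence intervals before trusting the difference.

```python
import hashlib
from typing import Optional, Tuple


def in_holdout(customer_id: str, holdout_fraction: float = 0.05,
               salt: str = "decisioning-holdout-v1") -> bool:
    """Assign a customer to the holdout deterministically from their ID.

    Hashing keeps the assignment effectively random across customers but
    stable over time, so the same people stay out of the decisioning.
    """
    digest = hashlib.sha256(f"{salt}:{customer_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000  # 0..9999
    return bucket < holdout_fraction * 10_000


def incremental_lift(treated_conversions: int, treated_size: int,
                     holdout_conversions: int, holdout_size: int
                     ) -> Tuple[float, Optional[float]]:
    """Return (absolute lift, relative lift) of treated over holdout."""
    if treated_size <= 0 or holdout_size <= 0:
        raise ValueError("both groups must be non-empty")
    treated_rate = treated_conversions / treated_size
    holdout_rate = holdout_conversions / holdout_size
    absolute = treated_rate - holdout_rate
    # Relative lift is undefined when the holdout never converts.
    relative = absolute / holdout_rate if holdout_rate > 0 else None
    return absolute, relative


# Hypothetical counts: 6.0% conversion when treated, 5.0% in the holdout.
absolute, relative = incremental_lift(540, 9000, 50, 1000)
print(f"absolute lift: {absolute:.3f}, relative lift: {relative:.0%}")
```

Deriving the assignment from a salted hash rather than a stored random draw means any system can recompute who is in the holdout, and changing the salt draws a fresh holdout when one is needed.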
Where this is heading
The direction is set by where the rest of the stack is going. Gartner expects 40 percent of enterprise applications to feature task-specific AI agents by 2026, up from under 5 percent in 2025. Decisioning is a natural home for those agents, because deciding action, channel, timing, and silence for one customer at a time, continuously, is an agent-shaped job.
The honest near-term picture: the capability is real and widening, the autonomy is partial, and the hard part is unglamorous. Most of what ships is reinforcement learning inside human guardrails, not an unsupervised agent, and that gap is where the marketing is loosest. The data foundation has to be unified and fresh, the guardrails enforced rather than decorative, the lift provable against a holdout. Get those right and next best action stops being a banner that always appears, and becomes a system that knows, sometimes, to leave the customer alone.
Council summary
This post argues that next best action has outgrown its original job. It started as an offer recommender and is widening, under the label AI decisioning, into a joint decision over action, channel, timing, sequence, and the genuinely new option of silence. The review checked the load-bearing figures against primary sources: the Pega acquisition of Chordiant for roughly 161.5 million dollars agreed in March 2010, the Optimove 2026 survey of 1,034 consumers with its 55, 83, 93, and 79 percent findings, the Databricks account of the telecom churn timing gap, and the Gartner forecast of 40 percent of enterprise apps carrying task-specific AI agents by 2026. All held up; the Gartner and Optimove lines were reworded to track the sources exactly. The takeaway for a buyer: the capability is real, but most live systems run reinforcement learning inside human guardrails rather than as autonomous agents, so make any vendor prove incremental lift against a clean holdout before believing the optimization claim.