Ask an older marketing mix model what television returned last quarter and it gave you a number. One point four times its cost. A clean figure, a single row in a slide, the kind of output a finance team could paste into a budget without a second thought. The trouble was the certainty. The model was never that sure, and the number never told you so.
Modern marketing measurement has quietly stopped working that way. The open-source frameworks most new marketing mix modeling programs are built on, Google's Meridian and PyMC-Marketing, do not answer with a point. They answer with a distribution: a curve of possible values, each with a probability attached. The headline figure is still there, but it now arrives wrapped in a range, and the range is the honest part. This is the Bayesian turn, and understanding it changes how a data-minded marketer should read every measurement result that crosses their desk.
Where the idea came from
The mathematics is old. In December 1763, two years after Thomas Bayes had died, his friend Richard Price had an essay of Bayes's read to the Royal Society, titled "An Essay Towards Solving a Problem in the Doctrine of Chances." It described how to revise a probability when new evidence arrives. Pierre-Simon Laplace independently rediscovered and generalized the idea into the form used today, applying it to problems in astronomy and jurisprudence around the turn of the nineteenth century. The approach picked up the name Bayesian only in the twentieth century, and for much of that century mainstream statistics kept it at arm's length, partly because the calculations were intractable by hand.
Computing changed that. Markov chain Monte Carlo methods, which approximate a probability distribution by drawing many samples from it, made Bayesian models practical for problems too messy to solve with a clean formula. That is the technique now humming inside an MMM. Meridian uses a sampler called the No-U-Turn Sampler (NUTS) and runs several independent chains, checking that they agree before it trusts the result. The old method finally became usable, and marketing measurement, with its thin and awkward data, was a good fit for it.
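To make "several chains, checked for agreement" concrete, here is a minimal sketch. It uses a plain random-walk Metropolis sampler, not NUTS, because it fits in a few lines. The target distribution, the starting points, and the step size are all hypothetical, and the agreement check is a simplified version of the Gelman-Rubin R-hat statistic.

```python
import math
import random
import statistics

def log_posterior(roi):
    # Hypothetical target: a Normal(2.0, 0.5) belief about one channel's ROI.
    return -0.5 * ((roi - 2.0) / 0.5) ** 2

def run_chain(seed, start, n_steps=4000, step_size=0.5):
    """Random-walk Metropolis: propose a nearby value, accept it with
    probability min(1, posterior ratio), otherwise stay put."""
    rng = random.Random(seed)
    current = start
    draws = []
    for _ in range(n_steps):
        proposal = current + rng.gauss(0.0, step_size)
        log_ratio = log_posterior(proposal) - log_posterior(current)
        # 1 - random() lies in (0, 1], so the log is always defined.
        if math.log(1.0 - rng.random()) < log_ratio:
            current = proposal
        draws.append(current)
    return draws[n_steps // 2:]  # discard the first half as warm-up

def r_hat(chains):
    """Simplified Gelman-Rubin statistic: compares the spread between
    chain means with the spread inside each chain."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)

# Four chains started far apart on purpose.
starts = [-5.0, 0.0, 4.0, 10.0]
chains = [run_chain(seed=s, start=x0) for s, x0 in enumerate(starts)]
posterior_mean = statistics.fmean([x for c in chains for x in c])
convergence = r_hat(chains)
```

An R-hat close to 1 says the chains ended up describing the same distribution despite their different starting points. A value well above 1 says they disagree and the draws should not be trusted yet.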
The core idea in plain language
Bayesian inference is one sentence: you start with a belief, you observe data, and you update to a revised belief. The starting belief is the prior. The data enters through the likelihood, which scores how well each possible answer explains what you actually saw. The revised belief is the posterior. The relationship, in the way practitioners state it, is that the posterior is proportional to the likelihood times the prior.
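The proportionality can be worked through by brute force for a single channel. The sketch below is an illustration under made-up numbers: a prior centered on an ROI of 2, four hypothetical noisy readings, and an assumed noise level. It scores every candidate ROI on a grid, multiplies likelihood by prior, and normalizes.

```python
import math

def normal_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

# Candidate ROI values from 0.00 to 6.00 in steps of 0.01.
grid = [i / 100 for i in range(0, 601)]

# Prior: a hypothetical belief that ROI is around 2, give or take 1.
prior = [normal_pdf(roi, 2.0, 1.0) for roi in grid]

# Data: hypothetical noisy ROI readings, with an assumed noise level.
observed = [1.4, 1.1, 1.6, 1.3]
noise_sd = 0.5

def likelihood(roi):
    # How well this candidate ROI explains every observation at once.
    return math.prod(normal_pdf(y, roi, noise_sd) for y in observed)

# Posterior is proportional to likelihood times prior; dividing by the
# total turns the products into probabilities that sum to 1.
unnormalized = [likelihood(roi) * p for roi, p in zip(grid, prior)]
total = sum(unnormalized)
posterior = [u / total for u in unnormalized]

posterior_mean = sum(roi * p for roi, p in zip(grid, posterior))
data_mean = sum(observed) / len(observed)
```

The posterior mean lands between the prior's center and the average of the data, closer to the data because four readings at this noise level outweigh a prior this loose. Real MMMs have far too many parameters for a grid, which is why they sample instead.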
The output is the part that matters. A classic regression hands back a single best estimate of each channel's effect. A Bayesian model hands back a posterior distribution, a whole probability curve over the channel's likely return. From that curve you can read a most likely value, but you can also read how wide the uncertainty is and which way it leans. The model is not refusing to answer. It is answering with everything it knows, including how much it does not know.
One useful property follows from the structure. When data is plentiful and clear, the likelihood dominates and the posterior is pulled firmly toward what the data says. When data is thin or noisy, the posterior leans more on the prior. The model degrades gracefully instead of producing a confident figure built on almost nothing.
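That behavior can be seen in the one case where the update has a closed form: a normal prior combined with normally distributed observations. The numbers below are hypothetical, and a real MMM is far more complicated, but the weighting logic is the same: each source of information counts in proportion to its precision.

```python
def update_normal(prior_mean, prior_sd, data_mean, noise_sd, n):
    """Conjugate normal update: the posterior mean is a precision-weighted
    average of the prior mean and the data mean."""
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = n / noise_sd ** 2
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior_mean + data_precision * data_mean)
    return post_mean, post_var ** 0.5

# Same prior (ROI near 2.0) and same data average (1.0) in both cases;
# only the number of observations changes.
thin = update_normal(prior_mean=2.0, prior_sd=0.5, data_mean=1.0, noise_sd=1.5, n=4)
rich = update_normal(prior_mean=2.0, prior_sd=0.5, data_mean=1.0, noise_sd=1.5, n=400)
```

With four observations the posterior mean stays near 1.69, still close to the prior. With four hundred it moves to about 1.02, and the posterior standard deviation shrinks along with it.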
What a prior actually is, and why it is a feature
The word prior makes some marketers uneasy. It sounds like an opportunity to put a thumb on the scale, to tell the model the answer you wanted before it looked at anything. That suspicion is worth taking seriously, and it is also mostly backwards.
A prior is a statement of what you believed before this dataset, expressed as a range rather than a guess. Meridian lets you set priors in terms a marketer already uses: not abstract regression coefficients but return on investment, marginal return on investment, and contribution percentage. You are telling the model something like "based on last year's geo test, paid search ROI is probably around 2, and I am fairly but not completely sure." The width you give that statement is itself information. A narrow prior says you are confident. A wide one says you are mostly guessing and the data should lead.
Here is why this is a feature and not a flaw in marketing specifically. Marketing data is thin: two years of weekly sales is about 104 rows, a small sample for separating many channels. It is noisy, because sales move for a hundred reasons that have nothing to do with media. And it is correlated, because marketers raise spend across channels together and often push hardest exactly when demand is already rising. Hand a model data like that with no structure and it will chase noise, confidently crediting the wrong channel because two channels moved in lockstep and the math cannot tell them apart. A sensible prior is a guardrail. It tells the model that a channel returning forty times its cost is not plausible, so it should not produce that estimate just because a quirk in 104 noisy rows allows it.
A prior is not a way to avoid evidence. It is a way to bring the evidence you already have, your past experiments and your category knowledge, into a model that would otherwise pretend it knew nothing. Meridian even ships sensible defaults for teams with no priors of their own. When the outcome is revenue, its default ROI prior is a LogNormal distribution that puts the mean return across channels a little below 2. When the outcome is not revenue, a per-channel ROI prior cannot be reasoned about directly, so it instead places a prior on the total share of the outcome driven by all paid channels, with a mean of 40 percent and a standard deviation of 20 percent. Either way the default is a deliberately loose belief, wide enough for the data to move it but structured enough to keep the model sane.
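It helps to see how loose a LogNormal prior of that kind really is. The sketch below assumes the parameters LogNormal(0.2, 0.9), which is our reading of Meridian's documented default and not something stated above. Check the documentation for your version before relying on it.

```python
import math
from statistics import NormalDist

# Assumed parameters of the default ROI prior (log-scale mean and sd).
MU, SIGMA = 0.2, 0.9

# Mean and median of a LogNormal(MU, SIGMA) distribution.
mean_roi = math.exp(MU + SIGMA ** 2 / 2)
median_roi = math.exp(MU)

# Central 90 percent range: transform the normal quantiles back to ROI scale.
z = NormalDist().inv_cdf(0.95)
interval_90 = (math.exp(MU - z * SIGMA), math.exp(MU + z * SIGMA))
```

Under these assumed parameters the mean is a little above 1.8 and the median is near 1.2. The central 90 percent range stretches from roughly 0.3 to more than 5, so the prior rules out absurd returns without committing to much else.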
The credible interval, and the number a marketer can actually use
This is where the Bayesian turn pays off in practice. A Bayesian model reports its answer as a credible interval, and a credible interval means exactly what most people wrongly assume a confidence interval means.
A 90 percent credible interval can be read as: there is a 90 percent probability the true value sits inside this range. Plain, direct, and the thing a budget owner actually wants. A frequentist confidence interval, the kind a classic regression produces, does not support that reading. In frequentist logic the true value is fixed and the interval is the random thing, so a 90 percent confidence level refers to the long-run behavior of the procedure across many hypothetical repeats, not the probability that this particular interval caught the value. Almost every marketer who has ever seen a confidence interval has read it as a credible interval. With a Bayesian model that reading is finally correct.
The practical difference shows up the moment two channels compete for budget. Suppose channel A posts a 3.0 ROAS with a credible interval running from 1.0 to 5.0, and channel B posts 2.8 with an interval from 2.7 to 2.9. On point estimates alone, A wins. Read the ranges and the picture flips. The model barely knows what A does; its return could be excellent or close to breakeven. It knows B almost exactly. Unless you have a real appetite for risk, the safer place for the next dollar is B, and a point estimate would have hidden that completely. This example follows the kind of comparison practitioners use to teach the Bayesian advantage, and it is the whole argument in miniature.
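The comparison can be turned into the probabilities a budget owner actually needs. The sketch below rests on two assumptions the example does not state: that the intervals are central 90 percent intervals, and that each posterior is roughly normal and independent of the other. The ROAS hurdle of 2.0 is hypothetical. With real model output you would compute the same quantities directly from the posterior draws.

```python
from statistics import NormalDist

# Half-width multiplier for a central 90 percent interval of a normal.
Z90 = NormalDist().inv_cdf(0.95)

def posterior_from_interval(low, high):
    """Approximate a posterior as a normal whose central 90 percent
    interval is (low, high). An assumption made for illustration."""
    mean = (low + high) / 2
    sd = (high - low) / (2 * Z90)
    return NormalDist(mean, sd)

channel_a = posterior_from_interval(1.0, 5.0)  # point estimate 3.0
channel_b = posterior_from_interval(2.7, 2.9)  # point estimate 2.8

# Probability that each channel falls short of a hypothetical hurdle.
hurdle = 2.0
p_a_below = channel_a.cdf(hurdle)
p_b_below = channel_b.cdf(hurdle)

# Probability that A truly beats B, treating the two as independent:
# the difference of two normals is itself normal.
diff = NormalDist(
    channel_a.mean - channel_b.mean,
    (channel_a.stdev ** 2 + channel_b.stdev ** 2) ** 0.5,
)
p_a_beats_b = 1 - diff.cdf(0.0)
```

Under these assumptions A has about a one-in-five chance of returning less than 2.0, while B has essentially none. The chance that A really is the better channel is only a little better than a coin flip.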
A single ROAS number is false precision. It invites a decision it cannot support. A range changes the decision itself: a wide interval is not a result to act on, it is a result that says run an experiment here before you move money. The model has told you where it is confident enough to scale and where it is only guessing.
Why Meridian and PyMC-Marketing are Bayesian, and Robyn feels different
The two open-source frameworks behind most new MMM work are Bayesian by design. Meridian is fully Bayesian and built to sample posteriors. PyMC-Marketing, built on the PyMC probabilistic programming library, is too, with deep support for custom priors and hierarchical structure. Both report credible intervals as a matter of course, and both let a team feed a past experiment in as a prior.
Meta's Robyn, the package that did the most to popularize open-source MMM, works on a different principle. Robyn uses ridge regression, a regularized linear regression, and pairs it with an automated search over thousands of model configurations. It is frequentist. Any single Robyn model outputs a point estimate, and rather than asking an analyst to state priors it tries to find good settings by optimization. It is a coherent answer to the same problem: automate the hard choices instead of stating them. Robyn is still maintained and still in use. But the package is explicitly labelled experimental, its release cadence has slowed, and the loudest momentum in new MMM work has moved to the two Bayesian frameworks. The shape of the field now reflects a view: when data is this thin and correlated, a method that quantifies its own uncertainty and can absorb prior evidence is the better default. For the wider story of why this technique came back at all, see our piece on the MMM renaissance, and for a closer comparison of the three frameworks, open-source MMM compared.
One more thing the Bayesian structure makes possible. Because a prior is just a belief with a width, the result of an incrementality experiment can be poured straight into the model as one. Meridian's own guidance describes using an experiment's point estimate as the prior mean and its standard error as the prior width. That is how a correlational model gets anchored to causal ground truth: the experiment measures real lift, the prior carries that lift into the MMM, and the model is no longer guessing in the dark on that channel.
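As a rough sketch of that translation: if the prior is a LogNormal, as ROI priors often are, an experiment's point estimate and standard error can be converted into the distribution's two parameters by moment matching. The function name and the experiment numbers below are hypothetical, and this is the standard lognormal identity, not a call into Meridian's API.

```python
import math

def lognormal_from_experiment(point_estimate, standard_error):
    """Return (mu, sigma) of a LogNormal whose mean equals the experiment's
    point estimate and whose standard deviation equals its standard error.
    Requires a positive point estimate."""
    sigma_sq = math.log(1.0 + (standard_error / point_estimate) ** 2)
    mu = math.log(point_estimate) - sigma_sq / 2.0
    return mu, math.sqrt(sigma_sq)

# Hypothetical geo test: paid search ROI of 2.0 with a standard error of 0.5.
mu, sigma = lognormal_from_experiment(point_estimate=2.0, standard_error=0.5)

# Sanity check: recover the mean and sd implied by the fitted parameters.
implied_mean = math.exp(mu + sigma ** 2 / 2)
implied_sd = implied_mean * math.sqrt(math.exp(sigma ** 2) - 1)
```

The resulting `mu` and `sigma` are what you would hand to the framework's LogNormal prior for that channel. A tighter standard error produces a smaller `sigma`, so a cleaner experiment constrains the model more.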
The honest tradeoffs
The Bayesian turn is not free, and pretending otherwise would be its own kind of false precision.
Priors require judgment, and judgment can be argued over. A prior is a choice, and in an organization with marketing, finance, and an agency in the room, reaching consensus on what a prior should be is genuinely hard. Worse, a prior that is too strong and too narrow can lock in a channel's result, forcing the model to explain everything else with whatever is left and producing distorted estimates elsewhere. And the calibration story has a dark side: feed in an experiment that was badly designed or contaminated, and you have laundered bad data through rigorous-looking math. Garbage in still gets garbage out. The Bayesian frame just makes the garbage harder to spot.
The models are also heavier. MCMC sampling across many channels and geographies costs real compute and real time, and successive samples are correlated, so a run needs far more iterations than the raw draw count suggests to produce an honest answer. When the sampler fails to converge, the analyst is left debugging whether the priors or the spend data are at fault, and that debugging can take longer than building the model did.
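The cost of correlated draws has a name: effective sample size, the number of independent draws a chain is actually worth. The sketch below uses a deliberately simple estimator that sums autocorrelations until the first non-positive one. Production libraries use more careful versions. The two chains are synthetic stand-ins for sampler output, not output from any real model.

```python
import random
import statistics

def autocorrelation(draws, lag):
    mean = statistics.fmean(draws)
    var = sum((x - mean) ** 2 for x in draws)
    cov = sum((draws[i] - mean) * (draws[i + lag] - mean)
              for i in range(len(draws) - lag))
    return cov / var

def effective_sample_size(draws, max_lag=200):
    """Simplified estimate: n / (1 + 2 * sum of positive autocorrelations),
    truncated at the first non-positive lag."""
    rho_sum = 0.0
    for lag in range(1, max_lag + 1):
        rho = autocorrelation(draws, lag)
        if rho <= 0.0:
            break
        rho_sum += rho
    return len(draws) / (1.0 + 2.0 * rho_sum)

def synthetic_chain(n, stickiness, seed):
    """AR(1) series: each draw keeps a fraction of the previous one."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = stickiness * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out

sticky = synthetic_chain(5000, stickiness=0.9, seed=1)
independent = synthetic_chain(5000, stickiness=0.0, seed=1)

ess_sticky = effective_sample_size(sticky)
ess_independent = effective_sample_size(independent)
```

Both chains hold 5,000 draws, but the sticky one carries only a small fraction of that in independent information. Thousands of iterations can therefore still produce a shaky interval.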
Then there is the room. A point estimate is easy to present. A posterior distribution is not, and explaining to a skeptical executive that the answer is a curve, that the honest output is a range, that you want more budget for an experiment precisely because the model is uncertain, is a harder conversation than reading one confident number off a slide. The more sophisticated the model, the harder it travels outside the data team.
The practical takeaway
None of this argues for distrusting the model. It argues for reading it correctly.
Trust the ranges, not the midpoints. A credible interval is the most useful thing on the page, because it tells you the difference between a result solid enough to scale and a result that is really a request for an experiment. Interrogate the priors. Ask what beliefs went into the model, how wide they were, and where they came from, because a prior built on a clean geo test and a prior built on a vendor's self-reported number are not the same input even when they look identical. And stop asking the model for one number. The single confident figure was always a fiction; the Bayesian turn just stopped the model from telling it.
A marketer who reads a measurement result as a distribution, who treats a wide interval as a question rather than an answer, and who knows which prior is doing the heavy lifting, will allocate budget better than one still hunting for the one true ROAS. The number was never the point. The honest range always was.
Council summary
This post argues that the move to Bayesian marketing mix models, the engines inside Google's Meridian and PyMC-Marketing, changes what a measurement result means: the output is a posterior distribution, and the credible interval around a channel's return is the honest part. It explains priors, posteriors, and the real difference between a credible interval and a confidence interval in terms a non-statistician can hold onto, and it is candid that priors demand judgment, that calibration can launder bad experiments, and that a curve is harder to present than a number. The reader's takeaway is concrete: trust the range over the midpoint, treat a wide interval as a prompt to run an experiment, and always ask which prior is carrying the weight. The council judged the piece accurate after correcting the Bayes publication date, the Meridian default-prior description, and an overstated claim that Robyn had been abandoned.