Do this once and you will not look at a platform dashboard the same way again. Open Google Ads, Meta Ads Manager, your TikTok report and your Amazon console. Write down the conversions each one claims for last month. Add them up. Now compare that total to the orders in your commerce system.

The platform total will be bigger. Often much bigger. Practitioners who run this exercise routinely find the summed conversions land somewhere between 150 and 250 percent of actual closed sales, a pattern documented by Databox and echoed across vendor troubleshooting guides. It can run higher still. One marketer's example: Google Ads claims 47, Meta claims 52, LinkedIn claims 31, and the CRM shows 38 real customers. That is 130 claimed conversions against 38 customers, so the platforms together describe a business more than three times the size of the one that actually exists.
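The arithmetic behind that example is worth running on your own numbers. A minimal sketch, using only the figures quoted above; the variable names are illustrative, not any platform's export format:

```python
# Platform-claimed conversions from the example above, against CRM customers.
platform_claims = {"Google Ads": 47, "Meta": 52, "LinkedIn": 31}
crm_customers = 38

# Summing the dashboards is exactly the mistake this post warns about;
# it is done here only to measure how far the sum drifts from reality.
claimed_total = sum(platform_claims.values())
inflation_ratio = claimed_total / crm_customers

print(f"claimed by platforms: {claimed_total}")
print(f"customers in the CRM: {crm_customers}")
print(f"platform total as a share of real customers: {inflation_ratio:.0%}")
```

Swap in last month's figures from each dashboard and your order count to get your own ratio.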

Nobody filed a false report. Every platform measured something real. The problem is who is holding the measuring tape. Each of these platforms sells you the ads, serves the ads, and then writes the report card on how well the ads worked. A vendor that grades its own exam will not fail itself.

Origin: how the garden got its walls

The phrase walled garden is older than the problem it now describes. In digital advertising it means a closed ecosystem where one company controls the whole stack. As Skai defines it, the platform owns or controls all the user data, the ad inventory, and the reporting. It collects first-party data from logged-in users, builds audience segments from that data, sells placements against those segments, serves the creative, and then reports the results. Nothing crosses the wall. That is the point of the wall.

Google, Meta and Amazon are the canonical three, with TikTok and Apple now firmly inside the same category. Their combined grip is not marginal. Skai notes that a handful of giant publishers account for roughly 81 percent of US digital ad revenue, and AI Digital puts Google, Meta and Amazon alone at around 55 cents of every US digital ad dollar in recent years. Each has a different source of strength. Google sells intent. Meta sells identity and engagement. Amazon sells proximity to the moment of purchase. What they share is the structure: the seller and the scorekeeper are the same entity.

That structure was not built to deceive. It was built because logged-in users, owned inventory and closed reporting are a genuinely good ad product. The measurement conflict is a side effect. But a side effect that pays the bills tends not to get fixed.

Present: the specific mechanics of flattering yourself

Self-attribution bias is not one trick. It is a set of small, individually defensible choices that all point the same way. Walk through them and the inflation stops being mysterious.

Each platform sets its own attribution window. Meta's default counts a conversion if a user clicked an ad within the last 7 days or saw one within the last 1 day, a window Measured describes plainly as designed to capture as much credit as possible. Google Ads windows can stretch to 30 days or more depending on settings. LinkedIn uses a 90-day click window for its longer B2B cycles. None of these is wrong. But a 90-day window will catch conversions a 1-day window never sees, so the same purchase can fall inside several windows at once, and every platform whose window it touches will claim it.

Then there is the view-through conversion, the most generous mechanism of all. A view-through means the platform showed someone an ad, the person did not click it, and later they converted. The platform counts that as a win. Sometimes it is. Often the ad scrolled past unwatched and the purchase was driven by a search, a friend, or a decision already made. An impression that precedes a conversion did not necessarily cause it, yet the impression gets the credit anyway. As campaigns reach millions of people across wide windows, view-through counting inevitably sweeps up conversions that would have happened with no ad at all.

Add modeled conversions. When a user declines tracking, platforms no longer see the outcome directly, so they estimate it statistically and fill the gap. Reasonable, given the privacy constraints. But the estimate is produced by the same party that benefits when the estimate is high.

Stack these together and you get the double-counting problem. Every platform is engineered to capture every conversion it might plausibly have influenced. A single customer who saw a Meta ad, later clicked a Google ad, and then bought, generates one conversion in Meta's report, one in Google's, and one real order in your warehouse. Two platforms, one sale, both honestly following their own rules. Run four channels at once and you have four overlapping claims on the same finite set of buyers. The arithmetic cannot reconcile because the systems were never built to reconcile. They were built to each maximise their own number.
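The single-customer case above can be sketched in a few lines. This is an illustration under stated assumptions, not how any platform actually implements attribution: the window lengths are placeholders (real settings vary by account and campaign), and `claiming_platforms` is a hypothetical helper.

```python
from datetime import datetime, timedelta

# Illustrative windows only. Each platform checks its own window and
# has no knowledge of the other platform's touches.
WINDOWS = {
    "Meta": {"click": timedelta(days=7), "view": timedelta(days=1)},
    "Google Ads": {"click": timedelta(days=30), "view": timedelta(days=1)},
}

def claiming_platforms(touches, order_time):
    """Return the platforms that would claim the order under their own window."""
    claimed = set()
    for platform, kind, touch_time in touches:
        age = order_time - touch_time
        # A touch counts if it happened before the order and inside the window.
        if timedelta(0) <= age <= WINDOWS[platform][kind]:
            claimed.add(platform)
    return sorted(claimed)

order_time = datetime(2025, 3, 10, 12, 0)
touches = [
    ("Meta", "view", datetime(2025, 3, 9, 20, 0)),          # saw a Meta ad
    ("Google Ads", "click", datetime(2025, 3, 10, 11, 30)),  # later clicked a Google ad
]

claims = claiming_platforms(touches, order_time)
print(f"real orders: 1, platform claims: {len(claims)} ({', '.join(claims)})")
```

Neither check is wrong in isolation. The overcount only appears when the two reports are added together.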

It helps to be precise about what this is and is not. This is not fraud. As one attribution explainer puts it, no platform is lying, they are just measuring different things, and each chooses the measurement that flatters its own ads. Call it designed optimism. The incentive is structural and obvious: a platform that reports a higher return gets a bigger share of next quarter's budget. Generous windows, liberal view-through counting and confident modeling all push the reported number up, and the reported number is what your spend decisions feed on. If you want a deeper tour of why two honest systems disagree, that is the subject of why your attribution numbers never match. The short version: the gap is not a glitch to be fixed, it is the designed result of letting each seller keep its own books.

The evidence that platform numbers overstate

If platform self-reporting were roughly accurate, holdout tests would confirm it. They do not.

The cleaner way to measure an ad's effect is a counterfactual: split the audience, show ads to one group, withhold them from a statistically identical control group, and measure the difference. That difference is incremental lift, the conversions that would not have happened otherwise. Attribution describes which ads were near a sale. Incrementality tests whether the ads caused it. They are not the same question, and they do not give the same answer.

When marketers run incrementality tests against platform reports, the platform number is consistently the higher one. Haus describes a direct-to-consumer brand whose platform showed a 4 to 1 return, while an incrementality test found only about 60 percent of those attributed sales were truly incremental, putting the real return closer to 2.4 to 1. The pattern is sharpest in the channels that sit near the finish line. Haus, which runs these tests for a living, reports that branded search and retargeting are the channels where platform numbers and tested reality diverge most. In a 2025 benchmark of direct-to-consumer advertisers, platform-reported return in those channels commonly ran 5 to 10 times the true incremental return, and branded search came in at an incremental return of just 0.7 to 1. In one Haus case study, three separate brand-search holdouts, run in slow season, peak season and a new market, found at most 1 percent lift and twice found none at all. Those are channels harvesting demand that already existed, then booking it as demand they created.
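The 4 to 1 versus 2.4 to 1 gap is simple arithmetic once a holdout has told you what share of attributed sales was incremental. A minimal sketch: the function names and the group sizes are hypothetical, chosen only so the control group implies the 60 percent share quoted above.

```python
def incremental_conversions(treated_conversions, treated_size,
                            control_conversions, control_size):
    """Conversions in the exposed group beyond what the control rate predicts."""
    control_rate = control_conversions / control_size
    expected_without_ads = control_rate * treated_size
    return treated_conversions - expected_without_ads

def incremental_roas(platform_roas, incremental_share):
    """Scale platform-reported return by the share of sales the test found incremental."""
    if not 0.0 <= incremental_share <= 1.0:
        raise ValueError("incremental_share must be between 0 and 1")
    return platform_roas * incremental_share

# Hypothetical test: 1,000 conversions among 100,000 people who saw ads,
# 400 conversions among 100,000 statistically similar people who did not.
extra = incremental_conversions(1000, 100_000, 400, 100_000)
share = extra / 1000

print(f"incremental share of attributed sales: {share:.0%}")
print(f"platform-reported return: 4.0, incremental return: {incremental_roas(4.0, share):.1f}")
```

A real test would also report a confidence interval around the lift. This sketch shows only the point estimate.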

The academic record is older and blunter. A field experiment with eBay, written up for CEPR by the economists who ran it, switched off branded paid search and found that almost all of the lost paid clicks were immediately recaptured by organic results. The paid ads had been claiming conversions that organic search would have delivered free. More broadly, Randall Lewis and Justin Rao examined 25 large advertising field experiments in a paper in the Quarterly Journal of Economics bluntly titled "The Unfavorable Economics of Measuring the Returns to Advertising." Their finding: real ad effects are small relative to the noise in sales data, which makes observational and attribution-style methods deeply vulnerable to selection bias. The people who see an ad are already different from the people who do not. Crediting the ad with the difference is the original measurement error.

Brands have lived this. When Procter and Gamble cut what it considered ineffective digital spend, it reported to investors that removing well over 100 million dollars had no negative effect on its growth rate. Spend that platform reporting had presented as productive turned out, when withheld, to be carrying conversions that arrived anyway.

Future and impact: building a scorekeeper that does not sell ads

The fix is not to catch the platforms lying. They are not lying. The fix is to stop treating a seller's self-report as the verdict and to add measurement the seller does not control.

Independent attribution and measurement vendors are the first layer. Tools that sit across all your channels and reconcile platform claims against your own revenue at least produce one deduplicated view instead of four overlapping ones. They cannot create truth on their own, but they remove the worst of the double counting.

Marketing mix modeling is the strategic layer. MMM regresses aggregate sales on aggregate spend by channel, typically week by week, with no user-level tracking at all. Because it never needed the cookie, privacy changes did not break it, and because it sees every channel at once it cannot be gamed by any single platform's window. It answers the portfolio question: if you moved budget between channels, what would actually change.
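To make the regression idea concrete, here is a deliberately stripped-down sketch: ordinary least squares of weekly sales on weekly spend for two channels, in plain Python. Everything in it is an assumption for illustration. The data is synthetic and noise-free, the channel names are placeholders, and a production MMM would add carryover, saturation, seasonality and uncertainty estimates that are omitted here.

```python
def solve_linear_system(a, b):
    """Solve a small dense system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_mix_model(spend_rows, sales):
    """Least squares fit of: sales = baseline + coef_1 * spend_1 + coef_2 * spend_2 + ..."""
    design = [[1.0] + list(row) for row in spend_rows]
    k = len(design[0])
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, sales)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Synthetic weekly spend: (search, social). Sales are generated from a known rule
# so the fit can be checked: baseline 500, search returns 2.0, social returns 0.5.
weekly_spend = [(100, 200), (150, 180), (120, 260), (200, 220), (170, 300), (90, 240)]
weekly_sales = [500 + 2.0 * search + 0.5 * social for search, social in weekly_spend]

baseline, search_coef, social_coef = fit_mix_model(weekly_spend, weekly_sales)
print(f"baseline sales per week: {baseline:.2f}")
print(f"sales per unit of search spend: {search_coef:.2f}")
print(f"sales per unit of social spend: {social_coef:.2f}")
```

The point of the design is that no platform's attribution window appears anywhere in the inputs. The model sees only what was spent and what was sold.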

Incrementality and geo testing are the causal layer, and the closest thing to proof. A geo holdout that turns a channel off in matched markets and watches what happens to sales gives you a real counterfactual. The eBay branded-search experiment was exactly this kind of test, and P&G's spend cut was the same logic at full scale: withhold, then watch. It is slower and it costs experimental design, but it answers the question attribution cannot.
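The withhold-then-watch logic reduces to a small calculation. The sketch below is one simple way to read a geo holdout, assuming matched markets and using control-market growth as the counterfactual. The sales figures and the `geo_holdout_lift` helper are hypothetical, and real tests use more careful market matching and significance checks.

```python
def geo_holdout_lift(holdout_pre, holdout_test, control_pre, control_test):
    """Estimate the sales a channel was driving in markets where it was switched off.

    The counterfactual is the holdout markets' pre-period sales scaled by the
    growth observed in control markets, where the channel kept running.
    Returns (lost_sales, lost_sales as a share of the counterfactual).
    """
    control_growth = sum(control_test) / sum(control_pre)
    expected = sum(holdout_pre) * control_growth
    actual = sum(holdout_test)
    lost_sales = expected - actual
    return lost_sales, lost_sales / expected

# Hypothetical sales per market, before and during the test period.
holdout_pre, holdout_test = [1000, 1200], [1040, 1250]   # channel switched off
control_pre, control_test = [900, 1100], [945, 1155]     # channel left on

lost, lift_share = geo_holdout_lift(holdout_pre, holdout_test, control_pre, control_test)
print(f"sales attributable to the channel: {lost:.0f} ({lift_share:.1%} of expected)")
```

If the platform dashboard credited the channel with far more than the sales that disappeared when it was switched off, the difference is demand that would have arrived anyway.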

Data clean rooms are the more neutral middle ground, and the most nuanced piece. A clean room lets two parties match data and get aggregate answers without either seeing the other's raw records. Amazon Marketing Cloud and Google Ads Data Hub are the best known. They are genuinely useful, and they are still inside the wall. As AdExchanger has reported, a platform-run clean room is plugged into that platform's own ad business and identity graph, so it can verify platform-specific truth but is not a neutral cross-channel referee. Independent clean rooms, from providers such as Snowflake, LiveRamp and InfoSum, are designed as a neutral meeting place across platforms. That is the more impartial version, though interoperability is still a work in progress. The fuller picture is in data clean rooms explained.

The industry knows the single-scorekeeper model is broken. In February 2026 the IAB launched Project Eidos, an industry-wide effort to replace fragmented, channel-specific measurement with interoperable standards, after its own research found most buyers consider current measurement short on rigor and trust. Whether a body that includes the platforms can produce a referee the platforms do not control is the open question.

For a marketing decision-maker the practical posture is simple. Read platform dashboards as the seller's pitch, useful for tactical signal and fast iteration, never as the audited result. Never sum conversions across platforms and treat the total as real, because it never is. Hold the big budget decisions to incrementality tests and marketing mix modeling, the methods with no stake in the answer. The walled gardens are not dishonest. They are just sitting an exam they wrote, in a room they own, and marking it themselves. You would not accept that from anyone else. Bring your own examiner.

Council summary

This post argues that platform self-attribution is not fraud but a structural conflict: the company that sells the ads also writes the report card, and every defensible choice it makes, from generous attribution windows to liberal view-through counting to confident modeling, points the reported number upward. It shows why summed cross-platform conversions routinely land at 150 to 250 percent of real sales, then sets that against the harder evidence: incrementality tests, the eBay branded-search experiment, the Lewis and Rao field-experiment review, and P&G's spend cut, all of which find platform numbers overstate true causal lift. The reader's takeaway is a working posture rather than a grievance: treat platform dashboards as a seller's pitch useful for fast iteration, never sum them into a number you trust, and hold budget decisions to incrementality testing and marketing mix modeling, the methods with no stake in the answer. The framing stays honest throughout: the platforms are not lying, they are simply marking their own exam, and the fix is to bring an independent examiner.

Walled Gardens Will Not Grade Their Own Homework Honestly
