Vol. XII · No. 04 · Apr 2026
Jake Cuth.

Eight channels, one budget,
and an honest answer.

A Bayesian Marketing Mix Model for a creator-driven D2C brand. Eight channels, 104 weeks of synthetic-but-realistic weekly data, full posterior credible intervals on every channel coefficient. Ground-truth coverage lands at the expected ~90% rate; the channel-by-channel breakdown is in § VI.

A creator partnership runs in March. Sales lift in April. Was it the partnership? Was it the email campaign that ran simultaneously? Was it spring seasonality? This is the question that haunts every brand measurement conversation. Marketing Mix Modeling is the most credible answer the industry has produced.

Data is synthetic. Methodology is real. Ground-truth coefficients documented in § VII.


104 weeks of weekly observations for a fictional mid-market beauty brand selling on its own DTC site, Walmart, and Sephora. Eight marketing channels, weekly revenue, and four control variables (seasonality, holidays, competitor pressure, price promotions). All numbers are generated from documented ground-truth coefficients and a fixed seed, so the entire simulation is reproducible.

FIG. 14.1 · Weekly revenue, 2024-W01 → 2025-W52
FIG. 14.2 · Channel total spend, 2-yr panel
FIG. 14.3 · Representative SQL — base data pull
BigQuery · imm_lab.weekly_panel
-- Pull 104 weeks of channel spend + revenue for the modeling dataset.
-- Weekly grain, Monday-anchored ISO weeks. All channels left-joined so
-- weeks with zero spend on a channel still appear (zero-spend weeks
-- are signal, not noise).
SELECT
  d.iso_week,
  d.year_week,
  rev.gross_revenue,
  rev.units_sold,
  SUM(IF(s.channel = 'tiktok_creator',    s.spend, 0)) AS tiktok_creator_spend,
  SUM(IF(s.channel = 'instagram_creator', s.spend, 0)) AS instagram_creator_spend,
  SUM(IF(s.channel = 'youtube_creator',   s.spend, 0)) AS youtube_creator_spend,
  SUM(IF(s.channel = 'meta_paid',         s.spend, 0)) AS meta_paid_spend,
  SUM(IF(s.channel = 'tiktok_paid',       s.spend, 0)) AS tiktok_paid_spend,
  SUM(IF(s.channel = 'paid_search',       s.spend, 0)) AS paid_search_spend,
  SUM(IF(s.channel = 'programmatic',      s.spend, 0)) AS programmatic_spend,
  SUM(IF(s.channel = 'retail_media',      s.spend, 0)) AS retail_media_spend,
  ctrl.competitor_idx,
  ctrl.price_discount_active,
  ctrl.holiday_label
FROM      `imm_lab.dim_week`              d
LEFT JOIN `imm_lab.fct_revenue_weekly`    rev   USING (iso_week)
LEFT JOIN `imm_lab.fct_channel_spend_wk`  s     USING (iso_week)
LEFT JOIN `imm_lab.dim_controls`          ctrl  USING (iso_week)
WHERE     d.iso_week BETWEEN '2024-01-01' AND '2025-12-28'
GROUP BY  d.iso_week, d.year_week, rev.gross_revenue, rev.units_sold,
          ctrl.competitor_idx, ctrl.price_discount_active, ctrl.holiday_label
ORDER BY  d.iso_week;
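
Landing that panel in pandas is a couple of lines with the BigQuery client. A minimal sketch; the project id and the SQL file path are placeholders, not part of the published notebook.

from google.cloud import bigquery

# Hypothetical path holding the FIG. 14.3 statement; swap in your own project id.
PANEL_SQL = open("sql/weekly_panel.sql").read()

client = bigquery.Client(project="my-gcp-project")
panel = client.query(PANEL_SQL).to_dataframe()

# Sanity checks before modeling: one row per ISO week, revenue present, and
# zero-spend weeks carrying 0 rather than NULL in the spend columns.
assert len(panel) == 104
assert panel["gross_revenue"].notna().all()
spend_cols = [c for c in panel.columns if c.endswith("_spend")]
panel[spend_cols] = panel[spend_cols].fillna(0.0)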

Marketing Mix Models decompose weekly revenue into channel contributions, baseline demand, and external factors. The two transformations that make the math work for marketing data are adstock (carryover effects: a creator post on Tuesday still drives sales on Friday) and saturation (each additional dollar buys less response than the dollar before it). A Bayesian fit gives credible intervals on every coefficient, which is what budget decisions actually need.

Model equation
y_t = β_0 + Σ_c α_c · Hill(Adstock(x_{c,t}; λ_c); κ_c, s_c) + Z_tᵀ γ + ε_t
Adstock(x; λ)_t = x_t + λ · Adstock(x; λ)_{t−1}
Hill(z; κ, s) = z^s / (κ^s + z^s)
α_c: max contribution · κ_c: half-saturation spend · s_c: Hill shape · λ_c: adstock decay · Z_t: control vector (seasonality, holidays, competitor, promo)
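
As a concreteness check, a minimal NumPy sketch of those two transforms; the spend series and the λ, κ, s values are illustrative, not the documented ground truth.

import numpy as np

def geometric_adstock(x, lam):
    """Carryover: each week keeps lam of the previous week's adstocked spend."""
    out = np.empty_like(x, dtype=float)
    carry = 0.0
    for t, spend in enumerate(x):
        carry = spend + lam * carry
        out[t] = carry
    return out

def hill(z, kappa, s):
    """Diminishing returns: 0 at zero spend, 0.5 at z = kappa, asymptotes to 1."""
    z = np.asarray(z, dtype=float)
    return z**s / (kappa**s + z**s)

# Illustrative only: $10k/week with one dark week; lam, kappa, s are placeholders.
weekly_spend = np.array([10_000.0] * 6 + [0.0] + [10_000.0] * 5)
response = hill(geometric_adstock(weekly_spend, lam=0.4), kappa=12_000.0, s=1.5)
print(response.round(3))  # week 7 stays nonzero: carryover from prior weeks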
Library

Fit with Google Meridian (Bayesian MMM, NUTS sampler; 4 chains, 1,000 warmup and 1,000 posterior draws each). Meta's lightweight_mmm and a hand-rolled PyMC implementation both ship as fallbacks in the notebook.
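
The hand-rolled fallback is roughly this shape. A minimal PyMC sketch with placeholder data and priors, fixing the adstock decays at a point value to keep it short (the notebook version samples them as parameters too); nothing here is Meridian's API.

import numpy as np
import pymc as pm
import pytensor.tensor as pt

rng = np.random.default_rng(7)
T, C, K = 104, 8, 4                         # weeks, channels, controls
X = rng.gamma(2.0, 1.0, size=(T, C))        # placeholder spend (scaled units)
Z = rng.normal(size=(T, K))                 # placeholder controls
y = rng.normal(1.0, 0.3, size=T)            # placeholder revenue (scaled)

def geometric_adstock(x, lam):
    out, carry = np.empty_like(x), 0.0
    for t, v in enumerate(x):
        carry = v + lam * carry
        out[t] = carry
    return out

# Simplification: decays fixed at 0.4 so the model stays plain linear algebra.
X_ad = np.column_stack([geometric_adstock(X[:, c], 0.4) for c in range(C)])
X_ad = X_ad / X_ad.max(axis=0)              # scale each channel to [0, 1]
X_ad = pt.as_tensor_variable(X_ad)

with pm.Model() as mmm:
    alpha = pm.HalfNormal("alpha", sigma=1.0, shape=C)    # max contribution
    kappa = pm.HalfNormal("kappa", sigma=0.5, shape=C)    # half-saturation spend
    s     = pm.Gamma("s", alpha=3.0, beta=2.0, shape=C)   # Hill shape
    gamma = pm.Normal("gamma", 0.0, 1.0, shape=K)         # control effects
    beta0 = pm.Normal("beta0", 0.0, 1.0)
    sigma = pm.HalfNormal("sigma", 0.5)

    saturated = X_ad**s / (kappa**s + X_ad**s)            # (T, C)
    mu = beta0 + (saturated * alpha).sum(axis=1) + pm.math.dot(Z, gamma)
    pm.Normal("revenue", mu=mu, sigma=sigma, observed=y)

    idata = pm.sample(draws=1000, tune=1000, chains=4)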

FIG. 14.4 · Sampling diagnostics

Each channel's share of total revenue across the 2-year panel, with 90% credible intervals. Total ROAS tells you the average dollar back per dollar in. Marginal ROAS (mROAS) tells you the next dollar back per next dollar in — which is what budget reallocation actually depends on.
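
To make the distinction concrete, a sketch of how both numbers fall out of the posterior. Array names and shapes are assumptions about how the samples are stored, not the published model.json schema; for a steady weekly spend, geometric adstock converges to spend / (1 − λ), which is what the sketch uses.

import numpy as np

# Assumed shapes: alpha, kappa, s, lam are (S, C) posterior sample arrays and
# spend is the (C,) current weekly spend in dollars.
def hill(z, kappa, s):
    return z**s / (kappa**s + z**s)

def roas(alpha, kappa, s, lam, spend, eps=1.0):
    """Total and marginal ROAS per posterior sample, 5/50/95 percentiles."""
    base   = alpha * hill(spend / (1.0 - lam), kappa, s)          # (S, C) weekly $
    bumped = alpha * hill((spend + eps) / (1.0 - lam), kappa, s)
    total_roas    = base / spend                                  # avg $ back per $ in
    marginal_roas = (bumped - base) / eps                         # next $ back per next $ in
    pct = [5, 50, 95]
    return np.percentile(total_roas, pct, axis=0), np.percentile(marginal_roas, pct, axis=0)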

FIG. 14.5 · Channel contribution to total revenue ($M, 2-yr)

For each channel, the spend → response curve with 90% credible bands. The vertical line marks current weekly spend. Channels where the line sits deep on the plateau are saturated: the next dollar buys very little. Channels where the line sits on the steep part still have headroom and reward incremental spend.
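
A numerical companion to the curves: the saturation fraction Hill(adstocked current spend), which sits near 1 for plateaued channels. This reuses hill, spend, and the posterior arrays from the sketch above; channel_names and the 0.8 / 0.5 cutoffs are illustrative choices, not the model's.

# Saturation fraction at current spend, with a 90% credible interval per channel.
sat = hill(spend / (1.0 - lam), kappa, s)                 # (S, C), values in (0, 1)
lo, hi = np.percentile(sat, [5, 95], axis=0)
for c, name in enumerate(channel_names):
    zone = "plateau" if lo[c] > 0.8 else "headroom" if hi[c] < 0.5 else "mid-curve"
    print(f"{name:<24s} saturation {lo[c]:.2f}-{hi[c]:.2f}  [{zone}]")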


Move budget between channels. The model recomputes weekly revenue under the new mix using each posterior sample, returning a 90% credible interval on the predicted lift versus the current allocation. Hold the total budget constant or let it flex.
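
A sketch of what the widget computes, reusing the hill helper and posterior arrays above; new_spend is whatever mix the sliders produce, and the baseline and control terms drop out of the comparison.

def media_revenue(alpha, kappa, s, lam, spend):
    # Weekly media-driven revenue per posterior sample, treating spend as a
    # steady weekly level (steady-state adstock = spend / (1 - lam)).
    return (alpha * hill(spend / (1.0 - lam), kappa, s)).sum(axis=1)   # (S,)

def reallocation_lift(alpha, kappa, s, lam, current_spend, new_spend):
    lift = media_revenue(alpha, kappa, s, lam, new_spend) \
         - media_revenue(alpha, kappa, s, lam, current_spend)
    lo, mid, hi = np.percentile(lift, [5, 50, 95])
    return mid, (lo, hi)                                               # median lift + 90% CI

# Example: shift $2,000/week from programmatic into tiktok_creator, total held flat
# (indices follow the SQL column order in FIG. 14.3).
new_spend = current_spend.copy()
new_spend[[0, 6]] += [2_000.0, -2_000.0]
mid, (lo, hi) = reallocation_lift(alpha, kappa, s, lam, current_spend, new_spend)
print(f"predicted weekly lift: ${mid:,.0f}  [90% CI ${lo:,.0f} to ${hi:,.0f}]")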


Because the data is synthetic, the channel coefficients used to generate it are known. At a nominal 90% credible interval, expected coverage on 8 channels is about 7 of 8 (0.9 × 8 = 7.2); hitting exactly 8/8 would itself hint at over-coverage, i.e. priors too wide or intervals over-inflated. Below, each row shows the ground-truth α (max contribution), the posterior mean, and the 90% credible interval. The recovery count is rendered live from the model.json that ships with the page; if you re-run the notebook, the count below updates automatically.
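
The check itself is a few lines. This sketch assumes a model.json layout with per-channel truth and interval fields, which is a guess at the schema rather than the shipped format.

import json

# Assumed (not documented) schema: one record per channel with the ground-truth
# alpha and the posterior 5th/95th percentiles.
with open("assets/data/imm-lab/model.json") as f:
    channels = json.load(f)["channels"]

covered = sum(ch["alpha_q05"] <= ch["alpha_true"] <= ch["alpha_q95"] for ch in channels)
print(f"coverage: {covered}/{len(channels)} ground-truth alphas inside the 90% CI "
      f"(expected ≈ {0.9 * len(channels):.1f})")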

Recovery summary



Synthetic-data ground truth

For each channel c: Hill saturation parameters (αc, κc, sc) and geometric adstock decay λc are documented constants used to generate the panel. The model is then asked to recover them from revenue alone. At a nominal 90% credible interval, expected coverage on 8 channels is 7.2 — observed coverage is rendered live in § VI. Specific values: see notebooks/imm/imm_lab.py and notebooks/imm/generate_synth.js.
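
For orientation, the generator is roughly this loop; the parameter values below are illustrative placeholders, not the documented constants, which live in notebooks/imm/imm_lab.py.

import numpy as np

rng = np.random.default_rng(42)                       # fixed seed: panel is reproducible
T, C = 104, 8

# Illustrative placeholders only, NOT the documented ground truth.
alpha = rng.uniform(5_000, 40_000, C)                 # max weekly $ contribution
kappa = rng.uniform(3_000, 20_000, C)                 # half-saturation spend
shape = rng.uniform(1.0, 2.5, C)                      # Hill shape s
lam   = rng.uniform(0.1, 0.6, C)                      # geometric adstock decay

spend    = rng.gamma(2.0, 4_000.0, size=(T, C))       # weekly spend per channel
baseline = 20_000 * (1 + 0.3 * np.sin(2 * np.pi * np.arange(T) / 52))  # seasonality

revenue = baseline.copy()
for c in range(C):
    ad = geometric_adstock(spend[:, c], lam[c])       # helpers from the transform sketch above
    revenue += alpha[c] * hill(ad, kappa[c], shape[c])
revenue += rng.normal(0, 2_000, T)                    # observation noise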

Inference

Meridian's NUTS sampler: 4 chains, 1,000 warmup and 1,000 posterior draws each. R-hat < 1.02 on every parameter; minimum ESS > 400. Posterior samples for the frontend are precomputed and ship as /assets/data/imm-lab/model.json.
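
A sketch of the diagnostic gate and the export step with ArviZ; the thresholds match the ones quoted above, while the parameter name, channel_names, and the JSON layout are the same assumptions as in the coverage check.

import json
import arviz as az

summary = az.summary(idata)                        # idata: the fitted InferenceData
assert (summary["r_hat"] < 1.02).all(), "chains have not mixed"
assert (summary["ess_bulk"] > 400).all(), "effective sample size too low"

# Export per-channel quantiles for the static frontend (assumed schema).
post = idata.posterior["alpha"].stack(sample=("chain", "draw"))
channels = [
    {"channel": name,
     "alpha_mean": float(post[c].mean()),
     "alpha_q05": float(post[c].quantile(0.05)),
     "alpha_q95": float(post[c].quantile(0.95))}
    for c, name in enumerate(channel_names)
]
with open("model.json", "w") as f:
    json.dump({"channels": channels}, f)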

GCP architecture

Synthetic data → BigQuery (dataset imm_lab, 4 tables, ~50 KB) → Vertex Workbench Python notebook (Meridian) → trained model artifact in Cloud Storage → FastAPI predictor on Cloud Run (min 0, max 1) → static Cloudflare frontend. Hard-capped at $5/month with billing alerts; current monthly cost ~$0.
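
The Cloud Run piece is small. A minimal sketch of the predictor under assumed request and response shapes, not the deployed contract; weekly_lift_samples is a hypothetical helper wrapping the reallocation math sketched earlier.

import json
import numpy as np
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
POSTERIOR = json.load(open("model.json"))          # precomputed samples from Cloud Storage

class SpendMix(BaseModel):
    spend: dict[str, float]                        # channel -> proposed weekly $

@app.post("/predict")
def predict(mix: SpendMix):
    # weekly_lift_samples: hypothetical helper returning per-sample lift vs current mix.
    lift = weekly_lift_samples(POSTERIOR, mix.spend)
    lo, mid, hi = np.percentile(lift, [5, 50, 95])
    return {"lift_mid": float(mid), "lift_ci90": [float(lo), float(hi)]}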

Stack

Python 3.11, Meridian 1.0.5, JAX, Pandas, NumPy, BigQuery client, FastAPI, Docker, gcloud. Frontend is vanilla JS + SVG, no build step, no analytics.

FIG. 14.6 · Receipts
Honest caveats
  • The data is synthetic. The methodology is real. Real-world MMM validation requires incrementality holdouts on top of ground-truth recovery.
  • Bayesian intervals reflect model uncertainty — not omitted-variable bias, not channel-coding errors, not promotion cannibalization that wasn't measured.
  • The 8-channel grouping is itself a modeling choice. Real beauty brands often disagree about whether to model creators in aggregate or per-creator.
  • Influencer measurement carries complications a synthetic panel cannot capture: creator-specific quality variance, audience overlap, brand-safety incidents, gifting vs paid distinctions.
  • Hill and geometric adstock are convenient, not gospel. Both are defensible defaults; both can be wrong.