58 Technology Analytics: User Behaviour and A/B Testing
58.1 Why Technology Analytics Matters
A consumer-tech product ships a new feature on Monday and knows by Wednesday whether it works — analytics is the only feedback loop short enough to keep up.
Technology — software, SaaS, consumer apps, internet-platform businesses — is the industry where analytics moved from reporting to operating system. Every click, scroll, tap, and swipe generates a row in an event stream. Product managers run dozens of experiments in parallel. Engineering deploys code multiple times a day. Marketing optimises acquisition funnels in hours, not quarters. The role of the BI analyst here is unusual — embedded in product squads, building self-serve dashboards, designing experiments before features ship, and adjudicating between competing interpretations of the same A/B test.
For a BI analyst, technology clusters into three jobs that close out the industry-applications module of this book. User-behaviour analytics answers what are people doing in the product, and where are they getting stuck? — funnels, retention curves, feature adoption, session-recording analytics. Experimentation analytics answers did the change actually work, and at what cost or risk? — A/B testing, holdout analyses, sequential testing, guardrail metrics. Growth and revenue analytics answers is the business model healthy? — North-Star metric movement, AARRR pirate funnel, cohort LTV, unit economics. Ron Kohavi et al. (2020) is the definitive reference on online controlled experiments — the visualisation idioms (the lift confidence interval, the cumulative-sample power curve, the segment-level forest plot) come from this lineage. Alistair Croll & Benjamin Yoskovitz (2013) frames lean analytics around the discipline of one metric that matters at a time, and the BI dashboard is the artefact that prevents the team from drifting away from it.
Three rules separate technology dashboards from every other kind:
- Self-serve over delivery. The PM, designer, and engineer must be able to slice the data themselves. A dashboard that requires the BI team for every cut becomes a bottleneck.
- Event-grain by default. Aggregate dashboards lose the per-user, per-session detail that product investigations need. Build the event-grain spine and aggregate as a view.
- Experiment hygiene over speed. A wrong A/B-test conclusion shipped to production is more expensive than a slow one. Pre-register, define guardrails, and respect statistical power.
58.2 User-Behaviour Analytics
User-behaviour analytics tracks the actions a user takes inside a product over time, at the grain of individual events tagged with user, session, timestamp, and properties. The dashboards built on this stream answer three recurring product questions: who is using us, what are they doing, and what is keeping them or losing them.
The product funnel is the same idea as the marketing funnel (Chapter 49) and the talent funnel (Chapter 52), at higher resolution. A signup funnel might be Landing page view → Email entered → Verification clicked → Profile completed → First action taken → Paid conversion. The visualisation is sequenced bars with stage-to-stage conversion percentages labelled.
The product variant adds two things — segment overlay and time-cohort comparison. The same funnel by acquisition channel, device class, geography, or signup cohort tells very different stories. An overall first-week activation of 41 percent can hide 62 percent for the web cohort and 18 percent for the mobile cohort; the dashboard must show both.
The product retention curve plots, for each cohort (defined by signup week), the percentage of users who returned in week 1, week 2, week 3, and so on. The shape of the curve tells the product team whether they have product-market fit:
- Decay to zero — most users drop off and never come back. The product has not yet found its hook.
- Decay to a flat line — users who stick form a stable base. Better than decay-to-zero; this is retained user shape.
- Smile (curve up after a dip) — the rare and beloved shape where dormant users return weeks or months later, often after a feature change or seasonal event. The hallmark of platforms that have crossed product-market fit.
flowchart LR
A[Cohort week 0<br/>100%] --> B[Week 1<br/>retention]
B --> C[Week 4<br/>retention]
C --> D{Curve shape}
D -->|Decay to zero| E[Product-market<br/>fit not found]
D -->|Flat plateau| F[Retained user<br/>base exists]
D -->|Smile up| G[Strong product-market<br/>fit, network effects]
style E fill:#FCE8E6,stroke:#D93025
style F fill:#FFF7E6,stroke:#F4B400
style G fill:#E6F4EA,stroke:#137333
After a feature ships, the dashboard tracks adoption (percent of eligible users who tried it), retention of feature users (percent who came back to the feature), and deepening (frequency and breadth of usage). The visualisation idiom is a 2x2 of Adoption percent × Retention percent with each feature as a point, sized by user reach. Features in the top-right quadrant are core; bottom-left features are candidates for sunset.
Session analytics layers behavioural data onto product screens. Click heatmaps, scroll-depth heatmaps, and rage-click maps (multiple frustrated clicks on a non-interactive element) reveal where users try to interact with what was not designed to be clicked. Pair the heatmap with a user-flow Sankey — actual paths through the product, with link width = session count — and the dashboard answers both where they tried to do something and what they did instead.
Page views, registrations, downloads, app installs — these grow with marketing spend and tell the product team almost nothing about whether the product is working. Alistair Croll & Benjamin Yoskovitz (2013) argue that the only metrics worth a dashboard tile are actionable (you can change them), accessible (you can interpret them), auditable (you can verify them), and comparable (you can benchmark them). Apply the four-question filter to every tile before promoting it to the headline.
58.3 Experimentation Analytics: A/B Testing Done Right
The defining capability of modern technology companies is the ability to ship features as experiments — randomly assigning users to variants, measuring the difference in outcome metrics, and deciding whether to ship, iterate, or kill. Ron Kohavi et al. (2020) distil the hard-won lessons of running tens of thousands of experiments at Microsoft, Google, and LinkedIn into a discipline; the BI dashboard is what makes that discipline visible to product teams.
flowchart LR
A[Hypothesis<br/>and design] --> B[Power analysis<br/>required N]
B --> C[Random assignment<br/>treatment / control]
C --> D[Run experiment<br/>collect data]
D --> E[Analyse<br/>primary + guardrails]
E --> F{Significant<br/>and aligned?}
F -->|Yes| G[Ship to 100%]
F -->|Mixed| H[Iterate or<br/>investigate]
F -->|No| I[Kill or learn]
style G fill:#E6F4EA,stroke:#137333
style H fill:#FFF7E6,stroke:#F4B400
style I fill:#FCE8E6,stroke:#D93025
The dashboard supports each step: pre-test power calculator, real-time enrolment monitor, primary-metric lift plot with confidence interval, guardrail-metric panel, segment forest plot, and the ship/iterate/kill decision panel.
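A rule of thumb from Ron Kohavi et al. (2020) sizes the pre-test power calculator: for 80 percent power at a 5 percent significance level, each variant needs roughly n = 16σ²/δ² users, where σ² is the metric variance and δ the minimum detectable effect. For a conversion metric with a 10 percent baseline (σ² = 0.1 × 0.9 = 0.09) and a minimum detectable absolute lift of 0.5 percentage points, that works out to 16 × 0.09 / 0.005² = 57,600 users per arm, which is the required N the enrolment monitor tracks.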
The headline visual of any A/B test is the lift CI chart: treatment-minus-control as a horizontal bar centred on zero, with the 95 percent confidence interval drawn around it. If the interval crosses zero, the result is not statistically distinguishable from zero. If the interval is entirely above zero, the lift is positive at conventional significance.
The chart usually carries three numbers — observed lift (e.g., +2.4 percent), confidence interval ([+0.8 percent, +4.0 percent]), and p-value or sample-ratio mismatch flag. Ron Kohavi et al. (2020) stress that the interval matters more than the point estimate; a 0.5 percent lift with a [-2 percent, +3 percent] CI is no result at all, while a +0.3 percent lift with a tight [+0.2, +0.4] CI is a genuine — if small — improvement.
A primary metric improvement that destroys a guardrail metric is not a win. Guardrails typically include latency, error rate, retention, and revenue per user. The dashboard view is a multi-metric panel — primary metric plus 4-6 guardrails — each rendered as a lift CI chart, with red flags on any guardrail confidence interval crossing into the danger zone.
A common product mistake is celebrating a click-through-rate lift while page-load time and bounce rate quietly worsen. The multi-metric panel makes the trade-off explicit; without it, the team optimises one number and breaks the product.
A flat overall result often hides offsetting segment effects — a feature that helps power users and hurts new users may show zero average lift. The forest plot stacks per-segment lift CI bars on a single axis: rows are segments (new users, returning users, mobile, web, India, US, etc.), the x-axis is lift, and each row’s confidence interval is drawn around its point estimate. Heterogeneity is visible immediately.
Ron Kohavi et al. (2020) caution against the multiple-comparisons trap — the more segments you slice, the more likely some will show spurious significance. The dashboard should pre-register the segments of interest and use Bonferroni or FDR adjustments to keep the inference honest.
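For example, with ten pre-registered segments tested against an overall α of 0.05, the Bonferroni-adjusted threshold is 0.05 / 10 = 0.005 per segment; a segment p-value of 0.02 that looks significant in isolation does not survive the adjustment.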
Before any conclusion is read off an A/B-test dashboard, the sample-ratio mismatch (SRM) check verifies that the actual treatment/control split matches the intended one (50/50, 80/20, etc.) within statistical noise. SRM failures indicate broken experimentation infrastructure — biased randomisation, inconsistent assignment, attribution bugs — and any analysis on top is invalid until the SRM is fixed. The dashboard runs the chi-squared SRM check automatically and refuses to display lift numbers when it fails.
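A worked example: an intended 50/50 split that lands at 50,600 treatment versus 49,400 control users has an expected count of 50,000 per arm, so χ² = (600² + 600²) / 50,000 = 14.4 on one degree of freedom and p ≈ 0.00015. That is far below the conventional 0.001 SRM threshold, and the readout is blocked.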
58.4 Growth and Revenue Analytics
The third leg of technology analytics rolls user behaviour and experiment outcomes into the business-model frame. Two structures dominate.
Dave McClure’s AARRR (the pirate funnel) is the most-cited framework in startup analytics:
- Acquisition — visitors arriving from any channel.
- Activation — users completing a first meaningful action.
- Retention — returning to use the product over time.
- Referral — bringing other users in.
- Revenue — converting to paying.
The dashboard is a series of funnel stages, but the more useful view is a cohort matrix with rows = cohort week and columns = AARRR stage, cell value = percent reaching that stage. The diagonal pattern reveals which cohort is improving on which dimension.
A North-Star metric (NSM) is the single number that captures whether the product is delivering value to users at scale. Examples: daily active users (social), nights booked (Airbnb), messages sent (messaging apps). The dashboard tracks the NSM trended over time, plus a tree of input metrics that drive it (visits → activation rate → retention rate → frequency = NSM). When the NSM moves, the input tree shows which lever moved.
flowchart LR
A[Visits] --> B[Activation rate]
B --> C[Activated users]
C --> D[Retention rate]
D --> E[Retained users]
E --> F[Frequency<br/>per user]
F --> G[North-Star Metric<br/>e.g. weekly active users]
style G fill:#E6F4EA,stroke:#137333
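The tree is multiplicative, which is what makes the decomposition diagnostic. Illustratively, 1,000,000 weekly visits at a 40 percent activation rate and a 30 percent retention rate yield 120,000 weekly active users; because the inputs multiply, a 10 percent relative drop in any one rate moves the NSM by the same 10 percent, so the tree shows directly which lever moved.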
Subscription and SaaS businesses live and die by the relationship between Customer Lifetime Value (LTV) and Customer Acquisition Cost (CAC). The standard tile is the LTV/CAC ratio — above 3 is healthy for SaaS — alongside the payback period (months to recover CAC). The visualisation is a cohort matrix of cumulative gross profit by month-since-acquisition, divided by month-of-acquisition CAC, with the breakeven line crossed when cumulative profit equals CAC.
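To make the arithmetic concrete: a cohort acquired at a CAC of $120 per user that generates $10 of gross profit per user per month crosses breakeven at month 12 (the payback period) and reaches an LTV/CAC of 3 at month 36, assuming retention holds; the cohort matrix shows where each acquisition month sits on that path.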
Growth dashboards routinely report 7-day moving averages to four decimal places. The illusion of precision encourages over-reaction to noise. Alistair Croll & Benjamin Yoskovitz (2013) emphasise that growth metrics deserve confidence bands, smoothing, and explicit tolerance ranges — not pixel-perfect daily numbers that fluctuate within their natural variance. Tighten the dashboard to the precision the data supports.
58.5 Common Pitfalls
- Vanity metrics on the headline tile. Page views, signups, downloads. Replace with actionable, accessible, auditable, comparable metrics.
- A/B-test dashboards that show point estimates without confidence intervals. A 2.4 percent lift with [-1, +6] CI is not a 2.4 percent improvement.
- Ignoring sample-ratio mismatch. The most common silent killer of experimentation results.
- Cherry-picking segments. Running 30 segment analyses and reporting the two that are significant is statistical malpractice; pre-register segments and adjust for multiple comparisons.
- Funnels without cohorts. A funnel is a snapshot; the funnel-by-cohort matrix is the diagnostic.
- Retention curves cut from arbitrary windows. The first-action-anchored cohort is more useful than calendar-week cohorts for engagement work.
- Heatmaps without context. A click heatmap on a screen the BI team does not understand is just visual noise.
- Single-metric optimisation without guardrails. CTR up, latency up, retention down — a net loss being celebrated as a win.
- Dashboards rebuilt for every experiment. A standardised experiment-readout template applied across all tests is what creates the discipline Ron Kohavi et al. (2020) call trustworthy experimentation.
- Self-serve gone too far. When everyone defines metrics differently, the same dashboard tile means different things in different rooms. A governed semantic layer (Chapter 39) is the corrective.
58.6 Illustrative Cases
Yuvijen Apps onboarding A/B test. Product analytics team runs a 14-day A/B test on a new onboarding flow for the company’s consumer app. Headline activation lift is +2.1 percent with a 95 percent CI of [+0.6 percent, +3.7 percent] — significant. The guardrail panel reveals that 7-day retention rises by +1.2 percent (positive) but session length drops by -8 percent (negative), driven by users skipping a now-redundant tutorial. Segment forest plot shows the lift concentrated in returning users; new users show flat activation. The team ships the change but adds an investigation into the session-length regression and an optional tutorial path for first-time users.
Yuvijen SaaS retention smile. Lifecycle analytics team builds a Tableau retention-curve dashboard showing 24 weekly cohorts. The recent six cohorts trace a smile shape — week 8 retention is higher than week 4 — driven by a recently launched collaboration feature that re-activates dormant accounts. The chart wins board approval for a doubled investment in the feature; the smile is itself the artefact of product-market fit the founder uses in the next funding round.
Yuvijen Telecom growth-engine North-Star tree. Digital channel team builds a Power BI North-Star dashboard with weekly-active-recharger as the headline metric and a tree of inputs (app installs → registration rate → first-recharge rate → recharge frequency). When weekly-active-rechargers stagnate for three weeks, the input tree reveals that registration rate has held but first-recharge rate has fallen — payment-flow regression after a code release. Engineering rolls back the offending change; the metric recovers within 10 days. The dashboard is what cut the diagnosis time from weeks to hours.
58.7 Hands-On Exercise: Build a Product Analytics and Experimentation Dashboard
Aim. Build a three-page technology-analytics dashboard in Power BI that ties user-behaviour analytics, A/B-test readouts, and growth metrics together, with the self-serve, event-grain, and experiment-hygiene discipline the function requires. Tableau equivalents are noted.
Scenario. You are the BI lead in product analytics at Yuvijen Apps (the consumer-app subsidiary of the Yuvijen group). The Chief Product Officer has asked for a dashboard that lets her see, in the Monday product-review meeting, what users are doing, which experiments shipped or got killed last week, and whether the North-Star metric is on track.
Deliverable. A three-page Power BI report — Behaviour, Experiments, Growth — with self-serve filters, an event-grain semantic layer, an experiment-readout template page, and a CPO summary that consolidates the highest-impact items.
58.7.1 Step 1 — Load and model the event-grain spine
Use Get Data in Power BI to load four event-stream extracts (or connect via DirectQuery to your event warehouse — Snowflake, BigQuery, Databricks):
- events.csv — EventID, UserID, SessionID, EventName, EventProperties (JSON), Timestamp, AppVersion, Platform, Country.
- users.csv — UserID, SignupDate, FirstActionDate, AcquisitionChannel, PlanTier.
- experiments.csv — ExperimentID, UserID, Variant (Control / Treatment), AssignedAt.
- revenue_events.csv — UserID, Timestamp, Amount, Product.
Build a DimDate calendar; mark it. Build a DimEvent table with semantic event groupings (Onboarding, Engagement, Conversion, Churn). Build a DimExperiment table with ExperimentID, Hypothesis, PrimaryMetric, Guardrails, PlannedN, StartDate, EndDate. Crucially, model the event grain as the fact table — most BI dashboards aggregate too early.
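A minimal calculated-table sketch for the calendar; CALENDARAUTO infers the date range from the model, and the Monday-anchored week-start column (an added convenience) feeds the weekly cohort logic used later:
DimDate =
ADDCOLUMNS(
    CALENDARAUTO(),
    "Year", YEAR([Date]),
    "WeekStart", [Date] - WEEKDAY([Date], 2) + 1    // Monday of the date's week
)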
58.7.2 Step 2 — Page 1: User behaviour
Build five visuals.
Activation funnel. Funnel visual showing Signup → Email Verified → Profile Complete → First Action → Day-7 Active. Slicer for AcquisitionChannel and Platform enables segment cuts.
Retention cohort matrix. Matrix with SignupCohort on rows, WeeksSinceSignup on columns, cell value = retention percent. Conditional formatting: green plateau, yellow decay, red rapid drop.
Feature engagement matrix. Scatter with AdoptionPct on x-axis, RetentionPct on y-axis, points = features sized by user count. Quadrant lines at 30 percent and 50 percent.
User-flow Sankey. Power BI Sankey of top 20 paths through the first session: Landing → Screen 1 → Screen 2 → Screen 3. Filterable by signup cohort.
Click heatmap link. A drill-through to a separate page that overlays click density on screen wireframes (typically rendered as a custom visual with image background).
DAX measures:
RetentionWeekN =
// N is the week offset; hard-code it per measure (e.g. RetentionWeek4) or drive it from a what-if parameter
VAR N = 4
VAR CohortStart = MIN(users[SignupDate])
RETURN
DIVIDE(
    CALCULATE(
        DISTINCTCOUNT(events[UserID]),
        FILTER(
            events,
            events[Timestamp] >= CohortStart + (N * 7)
                && events[Timestamp] < CohortStart + ((N + 1) * 7)
                && events[EventName] = "session_start"
        )
    ),
    DISTINCTCOUNT(users[UserID])
)
ActivationRate =
DIVIDE(
    CALCULATE(
        DISTINCTCOUNT(users[UserID]),
        users[FirstActionDate] <> BLANK()
    ),
    DISTINCTCOUNT(users[UserID])
)
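For the feature engagement matrix, a sketch of the adoption measure; the event name is a hypothetical stand-in for whichever event marks first use of a feature:
AdoptionPct =
DIVIDE(
    CALCULATE(
        DISTINCTCOUNT(events[UserID]),
        events[EventName] = "collab_share"    // hypothetical feature event
    ),
    DISTINCTCOUNT(users[UserID])    // eligible users in the current filter context
)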
Tableau alternative: funnel as sorted bar; cohort matrix native; scatter native; Sankey via extension; heatmap via image-overlay.
58.7.3 Step 3 — Page 2: Experiment readout (templated)
Build the canonical experiment readout layout, used identically across every test:
Header strip. Hypothesis, primary metric, guardrails, planned vs actual N, days running, SRM check status. SRM in red kills the rest of the page.
Lift CI chart for primary metric. Horizontal bar centred on zero with 95 percent CI bracket. Numbers labelled: lift, CI bounds, p-value.
Guardrail panel. 4-6 mini lift CI charts in a row — latency, error rate, retention, revenue per user. Conditional red flag when CI crosses into the danger zone.
Segment forest plot. Stacked horizontal bars showing per-segment lift CI for pre-registered segments (new vs returning, mobile vs web, geographic regions). Heterogeneity flag when ranges do not overlap.
Power and enrolment trend. Line chart of cumulative sample size against the pre-calculated required N for the chosen MDE; reveals whether the test has hit power.
Decision panel. Three-button mock-up showing Ship / Iterate / Kill with the analyst’s recommendation and reasoning. The actual decision is recorded for institutional memory.
DAX measures:
PrimaryLift =
VAR treat = CALCULATE([PrimaryMetric], experiments[Variant] = "Treatment")
VAR ctrl = CALCULATE([PrimaryMetric], experiments[Variant] = "Control")
RETURN DIVIDE(treat - ctrl, ctrl)
SRM_Pvalue =
// assumes a 50/50 intended split; for other ratios, scale the expected counts accordingly
VAR n_t = CALCULATE(DISTINCTCOUNT(experiments[UserID]), experiments[Variant] = "Treatment")
VAR n_c = CALCULATE(DISTINCTCOUNT(experiments[UserID]), experiments[Variant] = "Control")
VAR expected = (n_t + n_c) / 2
VAR chisq = ((n_t - expected)^2 + (n_c - expected)^2) / expected
RETURN CHISQ.DIST.RT(chisq, 1)
SRM_Flag = IF([SRM_Pvalue] < 0.001, "FAILED", "OK")
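The confidence-interval bounds behind the lift CI chart can also live in the model. A sketch for the absolute (percentage-point) lift on a conversion metric using the normal approximation; the [Treatment Conversions], [Treatment Users], [Control Conversions], and [Control Users] measures are hypothetical names for per-variant counts, and a relative-lift CI would additionally need the delta method:
AbsoluteLift =
DIVIDE([Treatment Conversions], [Treatment Users])
    - DIVIDE([Control Conversions], [Control Users])
LiftSE =
VAR p_t = DIVIDE([Treatment Conversions], [Treatment Users])
VAR p_c = DIVIDE([Control Conversions], [Control Users])
RETURN
SQRT(
    DIVIDE(p_t * (1 - p_t), [Treatment Users])
        + DIVIDE(p_c * (1 - p_c), [Control Users])
)
LiftCI_Lower = [AbsoluteLift] - 1.96 * [LiftSE]    // 95 percent, two-sided
LiftCI_Upper = [AbsoluteLift] + 1.96 * [LiftSE]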
Tableau alternative: the same template with calculated fields for CI bounds and an SRM control chart.
58.7.4 Step 4 — Page 3: Growth and North-Star
Build four visuals.
North-Star trend. Line chart of weekly NSM (e.g., weekly active users for the consumer app) over 52 weeks, with prior-year overlay and target band.
Input-tree decomposition. Cards in tree layout: visits, activation rate, activated users, retention rate, retained users, frequency, NSM. Conditional formatting on prior-period delta.
AARRR cohort matrix. Matrix with SignupCohort on rows and AARRR stage on columns, cell value = percent reaching that stage.
LTV/CAC dashboard. Cohort matrix of cumulative gross profit per acquired user by month-since-acquisition, with breakeven-month tile and LTV/CAC ratio card. Conditional flag when ratio falls below 3.
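A sketch of the supporting measures; the gross-margin constant is an assumption to replace with actuals, and [BlendedCAC] is a hypothetical measure over a marketing-spend table not included in the four extracts:
CumulativeGrossProfitPerUser =
VAR GrossMargin = 0.8    // assumed margin
RETURN
DIVIDE(
    CALCULATE(
        SUM(revenue_events[Amount]) * GrossMargin,
        FILTER(ALL(DimDate), DimDate[Date] <= MAX(DimDate[Date]))    // cumulative to date
    ),
    CALCULATE(DISTINCTCOUNT(users[UserID]), ALL(DimDate))    // per acquired user
)
LTVtoCAC = DIVIDE([CumulativeGrossProfitPerUser], [BlendedCAC])    // flag when below 3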
Tableau alternative: line chart with reference band; cards in container; matrix native; cohort matrix with calculated cumulative.
58.7.5 Step 5 — Self-serve filter pane and semantic layer
Build a synced slicer pane that travels across all three pages: Date, Platform, Acquisition Channel, Plan Tier, Country. Document the metric definitions in a separate metadata page so that activation rate means the same thing in every dashboard the company builds. This is the governed semantic layer — without it, two PMs arguing about retention may be using two different definitions and never realise.
58.7.6 Step 6 — Self-serve enablement
Publish to a Power BI workspace as an app. Embed self-serve query builders (Field Parameters in Power BI) that let a PM create their own funnels and retention curves without touching DAX. Time-box this enablement: a 30-minute onboarding for new PMs covers the dashboard, the semantic layer, and the experiment-readout template.
58.7.7 Step 7 — Experiment hygiene controls
Configure the workspace so that experiment readouts cannot be marked shipped without:
- SRM check passing.
- Power achieved (cumulative N ≥ planned N).
- All guardrails reviewed.
- Decision recorded with reasoning.
This is enforced via a Power BI deployment pipeline with a checklist app. Without these gates the dashboard becomes a tool for confirmation bias; with them it becomes the discipline Ron Kohavi et al. (2020) call trustworthy experimentation.
Technology analytics closes the industry tour and the book. Funnels (Chapter 49 marketing, Chapter 52 HR, Chapter 53 sales, Chapter 54 retail) carry the activation flow. Cohort retention curves (Chapter 52 HR, Chapter 54 retail) carry product retention. Confidence intervals and forest plots (Chapter 21 statistics, Chapter 28 correlation, Chapter 50 finance, Chapter 55 healthcare funnel plots) carry the experiment readout. Sankey diagrams (Chapter 49 marketing, Chapter 51 operations, Chapter 54 retail) carry user flow. Heatmaps (Chapter 12, throughout) carry click density and retention matrices. Mobile design (Chapter 47) means the PM checks the experiment status on a phone in a meeting. The storytelling discipline (Chapter 48) is what turns a 2.4 percent lift with a wide CI into a defensible product decision. The visualisation grammar this book has built — chart types, perceptual rules, accessibility, dashboard design, BI tooling, statistics, and storytelling — comes together in technology analytics because the feedback loop is the shortest, the data the richest, and the audience the most demanding. The same grammar applies in every other chapter and every other industry; the tools change, the questions change, but the visualisation layer stays the bridge between data and decision.
Power BI three-page product analytics dashboard with experiment-readout template (yuvijen-apps-product.pbix), Tableau equivalent (yuvijen-apps-product.twbx), event-grain workshop dataset (yuvijen-apps-events.xlsx), CPO summary export (yuvijen-apps-cpo-summary.pdf), and a screen recording of the dashboard tour and a sample experiment readout (yuvijen-apps-walkthrough.mp4) will be embedded here.
Summary
| Concept | Description |
|---|---|
| Technology-Dashboard Contract | |
| Self-Serve over Delivery | PM, designer, engineer slice the data themselves; BI team is not the bottleneck |
| Event-Grain by Default | Aggregate dashboards lose per-user, per-session detail product investigations need |
| Experiment Hygiene over Speed | Wrong A/B-test conclusion shipped is more expensive than a slow conclusion |
| Three Technology Jobs | |
| User-Behaviour Analytics | What are people doing in the product, and where are they getting stuck? |
| Experimentation Analytics | Did the change actually work, and at what cost or risk? |
| Growth and Revenue Analytics | Is the business model healthy? |
| User-Behaviour Visuals | |
| Product Funnel | Sequenced bars from landing through paid conversion with stage conversions labelled |
| Segment Overlay | Same funnel by channel, device, geography tells very different stories |
| Cohort Retention Curve | Per-cohort percentage returning at week 1, 2, 3 over time |
| Decay-to-Zero Shape | Most users drop off and never come back; product-market fit not found |
| Flat-Plateau Shape | Stable retained user base; better than decay but not yet network effect |
| Smile Curve | Curve up after a dip; rare and beloved shape of platforms with PMF |
| Feature Engagement Matrix | Adoption percent vs retention percent scatter with quadrant lines |
| Click and Scroll Heatmaps | Reveal where users tried to interact with what was not designed clickable |
| User-Flow Sankey | Top paths through a product session with link width as session count |
| Vanity-Metric Trap | Page views and signups grow with marketing spend and say little about whether the product works |
| Experimentation Toolkit | |
| Lift Confidence-Interval Chart | Treatment-minus-control horizontal bar with 95 percent CI bracket |
| Guardrail Multi-Metric Panel | Primary metric plus 4-6 guardrails as small lift CI charts in a row |
| Segment Forest Plot | Per-segment lift CIs stacked on a single axis to surface heterogeneity |
| Multiple-Comparisons Adjustment | Bonferroni or FDR adjustment when many segments are sliced |
| Sample-Ratio Mismatch (SRM) | Chi-squared check that actual treatment-control split matches intended |
| Pre-Registered Segments | Pre-register segments of interest to avoid post-hoc fishing |
| Power and Enrolment Trend | Cumulative N versus required N reveals whether test has reached power |
| Ship or Iterate or Kill Decision | Three-button decision panel with reasoning recorded for institutional memory |
| Growth and Revenue | |
| AARRR Pirate Funnel | Acquisition, Activation, Retention, Referral, Revenue from McClure |
| North-Star Metric | Single number capturing whether product delivers value to users at scale |
| Input-Output Tree | Tree of input metrics that drive the North-Star — visits, activation, retention, frequency |
| AARRR Cohort Matrix | Cohort weeks by AARRR stage with cell value as percent reaching that stage |
| LTV and CAC | Customer Lifetime Value over Customer Acquisition Cost; above 3 is healthy for SaaS |
| Payback Period | Months to recover CAC; breakeven crossing in cohort cumulative-profit matrix |
| Common Pitfalls | |
| Pitfall: Vanity Headline | Page views, signups, downloads on the headline are weak product signals |
| Pitfall: Point Estimate Only | A 2.4 percent lift with [-1, +6] CI is not a 2.4 percent improvement |
| Pitfall: Ignored SRM | Most common silent killer of A/B-test conclusions |
| Pitfall: Cherry-Picked Segments | Reporting only the segment cuts that came out significant is malpractice |
| Pitfall: Funnels Without Cohorts | A funnel is a snapshot; the funnel-by-cohort matrix is the diagnostic |
| Pitfall: Arbitrary Retention Window | First-action-anchored cohorts often more useful than calendar-week cohorts |
| Pitfall: Heatmap Without Context | Click heatmap on a screen the audience does not understand is just noise |
| Pitfall: Single-Metric Optimisation | CTR up, latency up, retention down — net loss celebrated as win |
| Pitfall: Bespoke Experiment Readouts | Rebuilding the readout for every test; the fix is one standardised template applied identically across all tests |
| Pitfall: Ungoverned Self-Serve | When everyone defines metrics differently, dashboard tiles mean different things |
| Hands-On Product Dashboard | |
| Page 1 — Behaviour | Activation funnel, retention cohort, feature matrix, user-flow Sankey, click drill-through |
| Page 2 — Experiment Readout | Lift CI, guardrail panel, forest plot, power trend, ship-iterate-kill panel |
| Page 3 — Growth | North-Star trend, input tree, AARRR cohort matrix, LTV-over-CAC dashboard |
| Self-Serve Filter Pane | Synced slicers across pages with documented metric definitions |
| Governed Semantic Layer | Activation rate means the same thing in every dashboard the company builds |
| Experiment Hygiene Gates | Cannot mark experiment shipped without SRM, power, guardrails, decision recorded |