Split Test Street
Real tests run by real operators this week — what we shipped, what won, what flopped, and the exact change you can steal today.
I kept a 5% holdout. It exposed my fake gains.

This is the move that changed how I trust my own dashboard.

All year I shipped 'winners.' Dashboard said I was up huge, cumulatively. Felt like a genius.

Then I checked my 5% holdout — a slice of traffic that NEVER got any of my changes, frozen on the original page.

My 'winners' vs the holdout? The real cumulative lift was about 40% of what the individual tests claimed. Wins overlap, decay, and interact. The dashboard double-counted.

The holdout is the only honest mirror you've got. It tells you what your work was ACTUALLY worth.

— Carve out a small permanent holdout that gets zero changes
— Compare your live page to it quarterly
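Here is a minimal sketch of how a permanent holdout could be carved out and read, assuming you can bucket visitors by a stable ID. The salt, the function names, and the idea of hashing a user ID are illustrative assumptions, not the setup from the post.

```python
import hashlib

HOLDOUT_SHARE = 0.05            # the 5% slice from the post
SALT = "permanent-holdout-v1"   # hypothetical; changing it reshuffles everyone


def in_holdout(user_id: str, share: float = HOLDOUT_SHARE) -> bool:
    """Deterministically place a user in the holdout via a salted hash."""
    digest = hashlib.sha256(f"{SALT}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # uniform value in [0, 1)
    return bucket < share


def cumulative_lift(live_rate: float, holdout_rate: float) -> float:
    """Relative lift of the live page over the frozen holdout."""
    return live_rate / holdout_rate - 1.0
```

Because the assignment depends only on the ID and the salt, the same visitor lands in the same group on every visit, which is what keeps the holdout frozen. `cumulative_lift` is the number to put next to the sum of your individual test results.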

Go set up a holdout group today. Brace yourself. Report back.
I optimized clicks and accidentally tanked revenue.

Product grid test. New layout pushed cheaper items up top. Add-to-carts jumped 14%. I cheered.

Guardrail metric: average order value. Down 22%. People bought, just bought the cheap junk I'd promoted to the top.

Net effect on revenue per visitor: negative. My "win" lost money.

Now every test has one primary metric AND a guardrail it's not allowed to crater.

— Pick your one true north metric (usually revenue/visitor)
— Set guardrails: AOV, refund rate, churn, support tickets
— A primary win that breaks a guardrail is a loss
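The rule above can be sketched in a few lines. This assumes purchases moved in step with add-to-carts (the post doesn't say), that every guardrail is "higher is better" (flip the sign for refund rate, churn, or tickets), and that a 5% allowed drop is a placeholder you would replace with your own limit.

```python
def revenue_per_visitor_change(orders_change: float, aov_change: float) -> float:
    """Relative change in revenue/visitor from relative changes in orders/visitor and AOV."""
    return (1 + orders_change) * (1 + aov_change) - 1


def verdict(primary_change: float, guardrail_changes: dict, max_drop: dict) -> str:
    """A primary win that breaks any guardrail counts as a loss."""
    broken = [name for name, change in guardrail_changes.items()
              if change < -max_drop[name]]
    if broken:
        return "loss: broke " + ", ".join(broken)
    return "win" if primary_change > 0 else "no win"


# Numbers from the post: add-to-carts +14%, AOV -22%.
rpv_change = revenue_per_visitor_change(0.14, -0.22)   # 1.14 * 0.78 - 1, about -11%
result = verdict(0.14, {"aov": -0.22}, {"aov": 0.05})  # 0.05 is a hypothetical limit
```

Under those assumptions the grid test nets out to roughly 11% less revenue per visitor, and the guardrail check flags it as a loss even though the primary metric went up.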

Go add a revenue guardrail to your running test before you call it. Report back.
Day 2 of the test and I peeked. Big mistake.

Variant B was crushing. +22%. I texted my partner 'we found it.'

Day 5: +3%. Day 8: dead even.

That early spike was noise. Small samples swing wild. If you call it on day 2 you're just gambling on randomness and calling it skill.

The fix that saved me: I now set a fixed sample size BEFORE the test starts. No reading results until I hit it. I literally hide the dashboard.

If you must peek for sanity, use a sequential test (always-valid p-values) so early looks don't inflate your false positives.

— Decide your sample size first
— Don't call a winner before you hit it
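If you want a starting point for that stop number, here is a rough sketch using the standard normal-approximation formula for comparing two conversion rates. The defaults (5% significance, 80% power) are common conventions, not numbers from the post, and the baseline and lift you plug in are your own assumptions.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(baseline: float, relative_lift: float,
                        alpha: float = 0.05, power: float = 0.80) -> int:
    """Visitors needed in EACH arm to detect the given relative lift (two-sided test)."""
    p1 = baseline
    p2 = baseline * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_beta = NormalDist().inv_cdf(power)           # critical value for the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


# Example: 5% baseline conversion, hoping to detect a 10% relative lift.
n = sample_size_per_arm(0.05, 0.10)
```

For that example the answer comes out at roughly 31,000 visitors per arm, which is why a day-2 read on a few hundred visitors tells you almost nothing.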

Go set a stop number on your running test. Then close the tab.
Testing a pricing page without nuking revenue

Pricing tests scare people because you can torch real money. I run them in a safe order, lowest risk to highest:

— Start with layout, not price. 3 tiers vs 2. Toggle position. Zero revenue risk.
— Test the 'most popular' badge placement. Anchoring is free money.
— Test annual-vs-monthly default toggle. Defaulting to annual lifted my AOV without changing a price.
— Test feature framing and ordering inside each tier.
— Test the order of tiers (high-to-low anchors differently than low-to-high).
— ONLY then touch actual numbers, and cap exposure to 50% of traffic.

Never change price AND layout in one test. You won't know what moved.

Go test your 'most popular' badge first. It's free, it's fast, it anchors.


There's plenty more like this on CBO vs ABO scaling over at @ScaleOrStall
I almost shipped a fake winner

This week I ran a hero test. Variant B up 14%. I was reaching for the ship button.

Then I checked the split. 52/48. Supposed to be 50/50.

That gap is sample ratio mismatch. Means something upstream broke the randomization — a redirect, a cache, a bot bucket. The 14% was probably an artifact, not a win.

Killed the test. Found a cached page serving B to returning users only. Of course B looked better. It was talking to warm traffic.

Day 1 lesson I relearn every quarter: before you read the lift, read the counts.

— Pull your variant traffic numbers right now
— If the split is off by more than ~1% at real volume, treat your result as garbage until you find the cause
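A percentage gap means different things at different volumes, so a common way to check SRM is a chi-square test on the raw counts. Below is a minimal standard-library sketch for a two-arm test. The visitor counts are hypothetical (the post only gives the 52/48 ratio), and the p < 0.001 alarm threshold is a widely used convention rather than something from the post.

```python
from math import erfc, sqrt


def srm_p_value(control_n: int, variant_n: int,
                expected_control_share: float = 0.5) -> float:
    """Chi-square goodness-of-fit p-value (1 degree of freedom) for a two-arm split."""
    total = control_n + variant_n
    expected_control = total * expected_control_share
    expected_variant = total - expected_control
    chi2 = ((control_n - expected_control) ** 2 / expected_control
            + (variant_n - expected_variant) ** 2 / expected_variant)
    return erfc(sqrt(chi2 / 2))  # chi-square survival function for 1 df


# Same 52/48 ratio, two hypothetical volumes.
big = srm_p_value(5200, 4800)   # 10,000 visitors
small = srm_p_value(52, 48)     # 100 visitors
srm_detected = big < 0.001
```

At 10,000 visitors a 52/48 split is far too lopsided to be chance, while at 100 visitors the same ratio is ordinary noise. That is the reason to read counts, not just percentages.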

Go check the SRM on your live test. Report back.
This week: button color vs button verb

Everyone wants to test green vs orange. Snooze.

I tested the VERB instead. Same button, same color. Just changed 'Get Started' to 'Get My Free Audit.'

Setup: landing page for a CPA offer, ~6k clicks split over 9 days.

Result: the specific-value verb pulled +19% clicks to the form. The color test I ran last month? Flat. Couldn't tell them apart with a microscope.

The lesson: color is decoration. The verb is the promise. 'Get My Free Audit' tells them what they walk away with. 'Get Started' tells them about work they have to do.

— Open your hero CTA
— Swap the generic verb for one that names the payoff
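Before trusting a verb swap, it is worth checking that the gap is bigger than noise. Here is a sketch of a two-proportion z-test using only the standard library. The counts below are hypothetical, picked to roughly match a +19% lift on about 6k clicks at an assumed 10% baseline; the post doesn't give the real counts.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-sided p-value for the difference between two conversion rates."""
    rate_a, rate_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)             # shared rate under "no difference"
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical: 3,000 clicks per arm, 300 vs 357 form visits (+19%).
p_full = two_proportion_p_value(300, 3000, 357, 3000)
# Same kind of lift on a tenth of the traffic.
p_tiny = two_proportion_p_value(30, 300, 36, 300)
```

With the hypothetical full-size counts the difference clears the usual 0.05 bar. On a tenth of the traffic, a similar lift is indistinguishable from chance.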

Go change your CTA verb to claim the reward, not start the chore. Report back.