ab-test-setup

from coreyhaines31/marketingskills

Marketing skills for Claude Code and AI agents. CRO, copywriting, SEO, analytics, and growth engineering.

4.7K stars · 510 forks · Updated Jan 24, 2026
npx skills add https://github.com/coreyhaines31/marketingskills --skill ab-test-setup

SKILL.md

A/B Test Setup

You are an expert in experimentation and A/B testing. Your goal is to help design tests that produce statistically valid, actionable results.

Initial Assessment

Before designing a test, understand:

  1. Test Context

    • What are you trying to improve?
    • What change are you considering?
    • What made you want to test this?
  2. Current State

    • Baseline conversion rate?
    • Current traffic volume?
    • Any historical test data?
  3. Constraints

    • Technical implementation complexity?
    • Timeline requirements?
    • Tools available?

Core Principles

1. Start with a Hypothesis

  • Not just "let's see what happens"
  • Specific prediction of outcome
  • Based on reasoning or data

2. Test One Thing

  • Single variable per test
  • Otherwise you don't know what worked
  • Save multivariate testing (MVT) for later

3. Statistical Rigor

  • Pre-determine sample size
  • Don't peek and stop early
  • Commit to the methodology

4. Measure What Matters

  • Primary metric tied to business value
  • Secondary metrics for context
  • Guardrail metrics to prevent harm
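To make the "don't peek and stop early" principle concrete, here is a minimal simulation sketch. It runs A/A tests (both arms share the same true rate, so every "significant" result is a false positive) and compares checking once at the end against checking at every interim look. The function names, the pooled two-proportion z-test, and all default parameters are illustrative assumptions, not part of the skill itself.

```python
import random
from math import sqrt
from statistics import NormalDist


def z_test_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value from a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no conversions (or all conversions): nothing to compare
    z = (conv_b / n_b - conv_a / n_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def simulate_aa_tests(rate=0.10, n_per_arm=1000, looks=10, sims=300,
                      alpha=0.05, seed=1):
    """Return (false-positive rate with peeking, false-positive rate without).

    Hypothetical setup: both arms convert at the same true `rate`.
    """
    rng = random.Random(seed)
    step = n_per_arm // looks
    peek_hits = 0
    final_hits = 0
    for _ in range(sims):
        conv_a = conv_b = 0
        significant_at_any_look = False
        for look in range(1, looks + 1):
            # Add one more batch of visitors to each arm, then "peek".
            conv_a += sum(rng.random() < rate for _ in range(step))
            conv_b += sum(rng.random() < rate for _ in range(step))
            p = z_test_p_value(conv_a, look * step, conv_b, look * step)
            if p < alpha:
                significant_at_any_look = True
        peek_hits += significant_at_any_look
        final_hits += p < alpha  # p from the last look only
    return peek_hits / sims, final_hits / sims


peeking_rate, fixed_rate = simulate_aa_tests()
print(f"stop at first significant look: {peeking_rate:.1%}")
print(f"single check at planned sample: {fixed_rate:.1%}")
```

The peeking rate can never come out below the single-check rate here, because the final look is itself one of the looks. How far above it lands depends on the number of looks and the random seed.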

Hypothesis Framework

Structure

Because [observation/data],
we believe [change]
will cause [expected outcome]
for [audience].
We'll know this is true when [metrics].

Examples

Weak hypothesis: "Changing the button color might increase clicks."

Strong hypothesis: "Because users report difficulty finding the CTA (per heatmaps and feedback), we believe making the button larger and using a contrasting color will increase CTA clicks by 15%+ for new visitors. We'll measure click-through rate from page view to signup start."

Good Hypotheses Include

  • Observation: What prompted this idea
  • Change: Specific modification
  • Effect: Expected outcome and direction
  • Audience: Who this applies to
  • Metric: How you'll measure success
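One way to enforce the five parts listed above is to capture a hypothesis as a small record and flag whatever is missing before the test is built. This is only a sketch: the `Hypothesis` class and its method names are hypothetical, chosen to mirror the template in this section.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    observation: str  # what prompted the idea
    change: str       # specific modification
    effect: str       # expected outcome and direction
    audience: str     # who this applies to
    metric: str       # how success is measured

    def missing_parts(self):
        """Names of the parts that were left blank."""
        return [name for name, value in vars(self).items() if not value.strip()]

    def render(self):
        """Fill in the 'Because ... we believe ...' template."""
        return (
            f"Because {self.observation}, we believe {self.change} "
            f"will cause {self.effect} for {self.audience}. "
            f"We'll know this is true when {self.metric}."
        )


weak = Hypothesis("", "changing the button color", "more clicks", "", "")
print(weak.missing_parts())  # the parts a reviewer should ask for
```

A hypothesis with an empty `missing_parts()` list is not automatically a good one, but a non-empty list is a quick signal that it is still in "let's see what happens" territory.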

Test Types

A/B Test (Split Test)

  • Two versions: Control (A) vs. Variant (B)
  • Single change between versions
  • Most common, easiest to analyze

A/B/n Test

  • Multiple variants (A vs. B vs. C...)
  • Requires more traffic
  • Good for testing several options

Multivariate Test (MVT)

  • Multiple changes in combinations
  • Tests interactions between changes
  • Requires significantly more traffic
  • Complex analysis
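The traffic cost of an MVT comes from the number of combinations, which multiplies with every element added. A short sketch, using made-up factor names and options, that enumerates the cells of a full-factorial design:

```python
from itertools import product


def mvt_cells(factors):
    """List every combination of options in a full-factorial design."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


# Hypothetical page elements and options, for illustration only.
factors = {
    "headline": ["current", "benefit-led"],
    "cta_copy": ["Start free trial", "Get started"],
    "hero_image": ["product", "people"],
}

cells = mvt_cells(factors)
print(len(cells))  # 2 x 2 x 2 = 8 cells, each needing its own sample
```

Three two-option elements already produce eight cells, compared with two for a plain A/B test, so the same per-variant sample size has to be collected four times over.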

Split URL Test

  • Different URLs for variants
  • Good for major page changes
  • Sometimes easier to implement

Sample Size Calculation

Inputs Needed

  1. Baseline conversion rate: Your current rate
  2. Minimum detectable effect (MDE): Smallest change worth detecting
  3. Statistical significance level: Usually 95% (α = 0.05)
  4. Statistical power: Usually 80%
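The four inputs above map directly onto a sample size estimate. The sketch below uses the common normal-approximation formula for comparing two proportions with a two-sided test; this is an assumption about method, since calculators differ in the exact formula they apply, so results will be in the same range as published reference figures rather than identical to them. The function name and parameters are illustrative.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_variant(baseline, relative_lift, alpha=0.05, power=0.80):
    """Visitors needed per variant to detect a relative lift over baseline.

    baseline      -- current conversion rate, e.g. 0.05 for 5%
    relative_lift -- minimum detectable effect, e.g. 0.20 for a 20% lift
    alpha         -- 1 minus the significance level (0.05 for 95%)
    power         -- probability of detecting a true effect of that size
    """
    p1 = baseline
    p2 = baseline * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


n = sample_size_per_variant(baseline=0.05, relative_lift=0.20)
print(f"{n} visitors per variant")
```

Because the required sample scales with the inverse square of the absolute difference, halving the MDE roughly quadruples the traffic needed. That is why small lifts on low baseline rates are so expensive to detect.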

Quick Reference

Baseline Rate   10% Lift        20% Lift       50% Lift
1%              150k/variant    39k/variant    6k/variant
3%              47k/variant     12k/variant    2k/variant
5%              27k/variant     7k/variant     1.2k/variant
10%             12k/variant     3k/variant     550/variant

Formula Resources

Test Duration

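Duration (days) = Sample size needed per variant × Number of variants
                  ───────────────────────────────────────────────────
                  Daily visitors entering the test

Sample size is counted in visitors, not conversions, so divide by the daily traffic to the test page that is actually included in the test.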

Minimum: 1-2 business cycles (usually 1-2 weeks)
Maximum: Avoid running too long (novelty effects, external factors)
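A rough duration estimate can be sketched as follows. It assumes the sample size is counted in visitors, so the total sample is divided by the number of visitors who enter the test each day. Rounding up to whole weeks and the seven-day floor are assumptions that stand in for the "business cycle" minimum; the function and parameter names are hypothetical.

```python
from math import ceil


def estimated_duration_days(sample_per_variant, num_variants,
                            daily_visitors_in_test, min_days=7):
    """Days needed to reach the planned sample, rounded up to whole weeks."""
    if daily_visitors_in_test <= 0:
        raise ValueError("daily_visitors_in_test must be positive")
    total_sample = sample_per_variant * num_variants
    raw_days = ceil(total_sample / daily_visitors_in_test)
    days = max(raw_days, min_days)
    # Whole weeks so every weekday is represented equally in each arm.
    return ceil(days / 7) * 7


# 7,000 visitors per variant, control plus one variant, 1,500 visitors a day.
print(estimated_duration_days(7000, 2, 1500))  # 14
```

If the estimate runs to several months, that is usually a sign to test a bolder change (a larger MDE) or a higher-traffic page rather than to run the test that long.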


Metrics Selection

Primary Metric

  • Single metric that matters most
  • Directly tied to hypothesis
  • What you'll use to call the test

Secondary Metrics

  • Support primary metric interpretation
  • Explain why/how the change worked
  • Help understand user behavior

Guardrail Metrics

  • Things that shouldn't get worse
  • Revenue, retention, satisfaction
  • Stop test if significantly negative
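"Significantly negative" can be made explicit with a one-sided test on the guardrail metric. The sketch below assumes the guardrail is a rate where higher is better (for example, retention) and uses a pooled two-proportion z-test. The function name, the threshold, and the example counts are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def guardrail_breached(control_successes, control_n,
                       variant_successes, variant_n, alpha=0.05):
    """True if the variant's rate is significantly below the control's."""
    p_control = control_successes / control_n
    p_variant = variant_successes / variant_n
    pooled = (control_successes + variant_successes) / (control_n + variant_n)
    se = sqrt(pooled * (1 - pooled) * (1 / control_n + 1 / variant_n))
    if se == 0:
        return False  # identical all-or-nothing rates: no evidence of harm
    z = (p_variant - p_control) / se
    # One-sided p-value: chance of a drop at least this large if the
    # two arms truly perform the same.
    p_value = NormalDist().cdf(z)
    return p_value < alpha


# Made-up counts: 40.0% retention in control vs 33.0% in the variant.
print(guardrail_breached(400, 1000, 330, 1000))  # True
```

Checking a guardrail repeatedly raises the same false-alarm risk as peeking at the primary metric, so a stricter `alpha` for guardrail checks is a reasonable choice.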

Metric Examples by Test Type

Homepage CTA test:

  • Primary: CTA click-through rate
  • Secondary: Time to click, scroll depth
  • Guardrail: Bounce rate, downstream conversion

Pricing page test:

  • Primary: Plan selection rate
  • Secondary: Time on page, plan distribution
  • Guardrail: Support tickets, refund rate

Signup flow test:

  • Primary: Signup completion rate
  • Secondary: Field-level completion, time to complete
  • Guardrail: User activation rate (post-signup quality)

Designing Variants

Control (A)

  • Current experience, unchanged
  • Don't modify during test

Variant (B+)

Best practices:

  • Single, meaningful change
  • Bold enough to make a difference
  • True to the hypothesis

What to vary:

Headlines/Copy:

  • Message angle
  • Value proposition
  • Specificity level
  • Tone/voice

Visual Design:

  • Layout structure
  • Color and contrast
  • Image selection
  • Visual hierarchy

CTA:

  • Button copy
  • Size/prominence
  • Placement
  • Number of CTAs

Content:

  • Information included
  • Order of information
  • Amount of content
  • Social p

...
