Production Parity Test Designer

Design test hierarchies that catch production failures before production.

No API Required

Table of Contents

Overview
When to Use
Workflows
Key Outputs
Resources
Best Practices
Related Skills

Overview

Production Parity Test Designer helps catch production failures before they reach production. Rather than increasing test count, this skill decides which test tier should cover which production gap, removes proxy metrics (tests that pass but miss real failures), and structures a layered defense from PR CI through release packaging.

Core philosophy: Tests exist to reproduce production failure modes. If a test cannot fail in the same way production fails, it provides false confidence.

When to Use

PR CI is too lightweight to detect production divergence
DB dialect differences exist (e.g., SQLite vs PostgreSQL)
UI shows success but data is not persisted to the database
Mocks hide runtime import errors
Timezone, locale, OS, or dependency differences surface only in production
The boundary between unit tests and smoke tests is unclear
Past critical defects need to be structured as regression prevention tests
Packaging or container build integrity needs to be guaranteed

Workflows

Eight steps drive the test hierarchy design:

Step 1: Production Gap Inventory

Enumerate all differences between dev/CI and production environments across categories: DB dialect, OS/container, dependency installation, environment variables, timezone/locale, real vs mock, serialization, and packaging/deployment.
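One way to start the inventory is to record a small fingerprint of each environment and diff the two. The sketch below is illustrative only: the function names, the fingerprint keys, and the sample CI/production values are all assumptions, and a real inventory would cover every category listed above.

```python
import platform
import sqlite3
import sys
import time


def environment_fingerprint() -> dict:
    """Collect a few facts about the current runtime (illustrative subset)."""
    return {
        "os": platform.system(),
        "python": f"{sys.version_info.major}.{sys.version_info.minor}",
        "timezone": time.tzname[0],
        "sqlite": sqlite3.sqlite_version,
    }


def diff_fingerprints(ci: dict, prod: dict) -> dict:
    """Return {key: (ci_value, prod_value)} for every key whose values differ."""
    keys = sorted(set(ci) | set(prod))
    return {k: (ci.get(k), prod.get(k)) for k in keys if ci.get(k) != prod.get(k)}


# Hypothetical fingerprints; in practice each side would come from running
# environment_fingerprint() in that environment and saving the result.
ci_env = {"os": "Linux", "db": "sqlite", "timezone": "UTC"}
prod_env = {"os": "Linux", "db": "postgresql", "timezone": "JST"}
gaps = diff_fingerprints(ci_env, prod_env)
```

Each key in `gaps` becomes one row of the Production Gap Inventory; keys that match (here `os`) drop out.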

Step 2: Failure Mode Enumeration

For each gap, define concrete failure modes. Examples: SQLite passes but PostgreSQL throws a syntax error on UPSERT; the UI shows a success toast but the INSERT silently fails; `import cv2` works in dev but fails in the production container. Classify each failure mode by visibility (silent/loud), blast radius, and detection difficulty.
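The classification can be captured as a small record and used to rank failure modes. This is a minimal sketch: the 1 to 3 scales and the scoring rule (silent failures count double) are hypothetical choices, not part of the skill.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    description: str
    silent: bool               # True if the failure raises no error or alert
    blast_radius: int          # 1 (single user) .. 3 (all users); assumed scale
    detection_difficulty: int  # 1 (obvious) .. 3 (hard to notice); assumed scale


def triage_score(mode: FailureMode) -> int:
    """Hypothetical weighting: silent failures count double."""
    base = mode.blast_radius * mode.detection_difficulty
    return base * 2 if mode.silent else base


modes = [
    FailureMode("PostgreSQL rejects UPSERT syntax that SQLite accepts", False, 3, 1),
    FailureMode("UI shows success toast but INSERT silently fails", True, 3, 3),
    FailureMode("import cv2 fails in the production container", False, 3, 1),
]

# Highest score first: these are the failure modes to cover earliest.
ranked = sorted(modes, key=triage_score, reverse=True)
```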

Step 3: Test Tier Allocation

Assign each failure mode to the optimal test tier:

Tier	Scope
Unit	Pure logic, boundary values, input validation
Integration	Real DB operations, repository operations, multi-component
E2E	UI action -> persistence verification -> business-visible outcome
Smoke	Minimum parity checks in every PR
Packaging	Install, import, build, container image integrity
Nightly / Heavy	Full parity suite, performance baselines
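The allocation matrix can be kept as plain data and checked for gaps. In this sketch the tier names come from the table above, while the example allocation and the helper names `by_tier` and `unallocated` are assumptions made for illustration.

```python
from collections import defaultdict
from enum import Enum


class Tier(Enum):
    UNIT = "Unit"
    INTEGRATION = "Integration"
    E2E = "E2E"
    SMOKE = "Smoke"
    PACKAGING = "Packaging"
    NIGHTLY = "Nightly / Heavy"


# Hypothetical allocation of failure modes to tiers.
allocation = {
    "UPSERT syntax differs between SQLite and PostgreSQL": Tier.INTEGRATION,
    "Success toast shown but row not persisted": Tier.E2E,
    "import cv2 fails in production container": Tier.PACKAGING,
}


def by_tier(allocation: dict) -> dict:
    """Group failure modes under the tier that is responsible for them."""
    grouped = defaultdict(list)
    for failure_mode, tier in allocation.items():
        grouped[tier].append(failure_mode)
    return dict(grouped)


def unallocated(failure_modes: list, allocation: dict) -> list:
    """Return failure modes that no tier covers yet."""
    return [mode for mode in failure_modes if mode not in allocation]
```

Running `unallocated` against the full Step 2 list shows which failure modes still have no owning tier.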

Step 4: Proxy Metric Elimination

Identify and remediate tests that provide false confidence: UI-only verification (never queries the DB), mock-only coverage (never tests real dependencies), coverage theater (high line coverage but no boundary testing), and happy-path bias (only success paths are exercised).
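The difference between a proxy check and a persistence check can be shown with a deliberately buggy save function. This is a sketch under assumptions: the `orders` table and function names are hypothetical, and SQLite is used only to keep the example self-contained (a real suite would run against the production DB engine).

```python
import os
import sqlite3
import tempfile


def init_db(path: str) -> None:
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY)")
    conn.commit()
    conn.close()


def save_order_buggy(path: str, order_id: int) -> dict:
    conn = sqlite3.connect(path)
    conn.execute("INSERT INTO orders (id) VALUES (?)", (order_id,))
    conn.close()  # bug: no commit, so the pending INSERT is discarded
    return {"status": "success"}


def save_order_fixed(path: str, order_id: int) -> dict:
    conn = sqlite3.connect(path)
    conn.execute("INSERT INTO orders (id) VALUES (?)", (order_id,))
    conn.commit()
    conn.close()
    return {"status": "success"}


def order_exists(path: str, order_id: int) -> bool:
    """Verify persistence through a fresh connection, not the writer's."""
    conn = sqlite3.connect(path)
    row = conn.execute("SELECT 1 FROM orders WHERE id = ?", (order_id,)).fetchone()
    conn.close()
    return row is not None


with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "app.db")
    init_db(db_path)
    buggy_response = save_order_buggy(db_path, 1)
    save_order_fixed(db_path, 2)
    proxy_check_passes = buggy_response["status"] == "success"
    buggy_persisted = order_exists(db_path, 1)
    fixed_persisted = order_exists(db_path, 2)
```

The proxy check passes for the buggy function because it only inspects the response; only the query through a fresh connection exposes the missing row.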

Step 5: PR Smoke Suite Definition

Define the minimum production parity checks for every PR within a 2-5 minute runtime budget: DB dialect smoke, import smoke, persistence smoke, timezone smoke, and serialization smoke.
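A smoke suite along these lines can be sketched as a set of small checks plus a runner that records whether the budget was met. The check bodies and the `run_smoke` helper are assumptions for illustration; DB dialect, import, and persistence smoke checks would plug into the same runner.

```python
import json
import time
from datetime import datetime, timedelta, timezone


def timezone_smoke() -> bool:
    """An aware datetime must survive an ISO 8601 round trip unchanged."""
    local = datetime(2024, 1, 1, 0, 30, tzinfo=timezone(timedelta(hours=9)))
    restored = datetime.fromisoformat(local.isoformat())
    # 00:30 at UTC+9 is 15:30 on the previous day (Dec 31) in UTC.
    return restored == local and restored.astimezone(timezone.utc).day == 31


def serialization_smoke() -> bool:
    """A payload with non-ASCII text must survive a JSON round trip."""
    payload = {"name": "東京", "amount": 1200, "tags": ["a", "b"]}
    return json.loads(json.dumps(payload)) == payload


def run_smoke(checks: dict, budget_seconds: float) -> dict:
    """Run every check and report results plus whether the budget was met."""
    started = time.monotonic()
    results = {name: bool(check()) for name, check in checks.items()}
    elapsed = time.monotonic() - started
    return {
        "results": results,
        "elapsed": elapsed,
        "within_budget": elapsed <= budget_seconds,
    }


report = run_smoke(
    {"timezone": timezone_smoke, "serialization": serialization_smoke},
    budget_seconds=5 * 60,  # upper end of the 2-5 minute budget
)
```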

Step 6: Adversarial Regression Backlog

Create regression tests from past defects and attack patterns. For each incident, define the exploit/failure pattern, minimal reproducible scenario, expected protected behavior, and regression scope.
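A backlog entry can carry exactly the four items named above, with a check that none is left blank. The class name, field names, and the sample incident are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass
class RegressionEntry:
    incident: str
    failure_pattern: str
    minimal_scenario: str
    expected_behavior: str
    regression_scope: str


def missing_fields(entry: RegressionEntry) -> list:
    """Return the names of fields that are empty or whitespace only."""
    return [f.name for f in fields(entry) if not getattr(entry, f.name).strip()]


# Hypothetical incident used only to illustrate the shape of an entry.
entry = RegressionEntry(
    incident="Order confirmation shown but no row written",
    failure_pattern="INSERT fails silently while the UI reports success",
    minimal_scenario="Submit one order, then query the orders table directly",
    expected_behavior="Row exists after success; otherwise the UI shows an error",
    regression_scope="",  # left blank to show the completeness check
)
incomplete = missing_fields(entry)
```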

Step 7: Packaging / Dependency Integrity Checklist

Verify that the application builds, installs, and imports correctly in a production-equivalent environment: lockfile alignment, clean install, all top-level imports succeed, main entry point starts, and container image matches production.
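Two of these checks, import success and lockfile alignment, can be sketched with the standard library. The helper names, package names, and version strings below are hypothetical; in a real check the installed versions could be read with `importlib.metadata.version`, and the script would run inside the production-equivalent image.

```python
import importlib


def import_smoke(module_names: list) -> dict:
    """Try to import each module; return {name: error} for the failures."""
    failures = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as exc:  # ImportError, or errors raised at import time
            failures[name] = f"{type(exc).__name__}: {exc}"
    return failures


def pin_mismatches(pinned: dict, installed: dict) -> dict:
    """Return {package: (pinned, installed)} where the versions disagree."""
    return {
        name: (want, installed.get(name))
        for name, want in pinned.items()
        if installed.get(name) != want
    }


# "no_such_module_for_demo" is intentionally missing to show a failure.
failures = import_smoke(["json", "datetime", "no_such_module_for_demo"])

# Hypothetical lockfile pins versus hypothetical installed versions.
mismatches = pin_mismatches(
    {"libfoo": "2.1.0", "libbar": "1.4.2"},
    {"libfoo": "2.1.0", "libbar": "1.4.0"},
)
```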

Step 8: Standard Command Map

Define named test commands for each execution context: local fast, PR CI required, nightly parity, staging E2E, and release packaging.
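A command map can be as simple as a dictionary from execution context to command. This sketch assumes pytest with custom markers; the context keys and the marker names (`unit`, `smoke`, `integration`, `parity`, `e2e`, `packaging`) are hypothetical and would need to match the project's own configuration.

```python
import shlex

# Hypothetical mapping from execution context to test command.
COMMAND_MAP = {
    "local-fast": ["pytest", "-m", "unit", "-x"],
    "pr-ci": ["pytest", "-m", "unit or smoke"],
    "nightly-parity": ["pytest", "-m", "integration or parity"],
    "staging-e2e": ["pytest", "-m", "e2e"],
    "release-packaging": ["pytest", "-m", "packaging"],
}


def command_for(context: str) -> list:
    """Look up the command for a context, failing loudly on unknown names."""
    try:
        return COMMAND_MAP[context]
    except KeyError:
        known = ", ".join(sorted(COMMAND_MAP))
        raise ValueError(f"unknown context {context!r}; expected one of: {known}") from None


# Shell-safe rendering of the PR CI command for logs or documentation.
pr_command = shlex.join(command_for("pr-ci"))
```

Keeping the map in one place means local runs and CI jobs invoke the same named command rather than drifting apart.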

Key Outputs

Deliverable	Content
Production Gap Inventory	Dev/CI vs production difference list
Test Tier Allocation Matrix	Failure modes mapped to optimal test tiers
PR Smoke Suite Proposal	Minimum parity checks for every PR
Adversarial Regression Backlog	Regression prevention tests from past defects
Packaging / Dependency Integrity Checklist	Install, import, build verification
Standard Test Command Map	Named commands per execution context

Resources

Resource	Type	Purpose
`references/production_gap_catalog.md`	Reference	Production gap taxonomy
`references/test_tier_strategy.md`	Reference	Tier responsibilities and tradeoffs
`references/adversarial_test_patterns.md`	Reference	Attack and failure pattern catalog
`references/persistence_verification_guide.md`	Reference	Persistence verification patterns
`references/packaging_integrity_guide.md`	Reference	Packaging and dependency integrity
`references/timezone_dialect_boundary_guide.md`	Reference	Timezone, DB dialect, locale
`assets/test_tier_matrix_template.md`	Template	Test tier allocation table
`assets/smoke_suite_template.md`	Template	Smoke suite specification
`assets/adversarial_regression_template.md`	Template	Regression backlog
`assets/packaging_checklist_template.md`	Template	Packaging checklist
`assets/command_map_template.md`	Template	Command map

Best Practices

Production gaps first, test count second – start by asking “what production failures are invisible to our current tests?” not “how many tests do we need?”
Persistence over presentation – every E2E test checking UI display must also verify the underlying data store. “Success toast appeared” is not a valid assertion.
Runtime budget discipline – PR smoke suites must have a 2-5 minute budget. Tests over budget get demoted to nightly, not skipped.
Mock minimization – do not mock your own database, file storage, or message queue. Every mock should have a corresponding integration test with the real dependency.
Environment parity in CI – CI should match production: same DB engine, same OS family, same timezone config. Use service containers, not in-memory substitutes.
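The runtime budget rule can be sketched as a partition of tests into a PR suite and a nightly suite. The greedy fastest-first policy, the function name, and the sample durations are assumptions; the point is only that every test lands in one of the two suites and none is dropped.

```python
def split_by_budget(durations: dict, budget_seconds: float) -> tuple:
    """Keep the fastest tests in the PR suite until the budget is used up.

    Tests that do not fit are demoted to nightly, never discarded.
    """
    pr_suite, nightly, used = [], [], 0.0
    for name, seconds in sorted(durations.items(), key=lambda item: item[1]):
        if used + seconds <= budget_seconds:
            pr_suite.append(name)
            used += seconds
        else:
            nightly.append(name)
    return pr_suite, nightly


# Hypothetical measured durations in seconds.
durations = {
    "db_dialect_smoke": 40,
    "import_smoke": 5,
    "persistence_smoke": 60,
    "full_migration_parity": 400,
}
pr_suite, nightly = split_by_budget(durations, budget_seconds=5 * 60)
```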

Related Skills

TDD Developer – implement test code
Completion Quality Gate Designer – design quality gates