Synthetic-Data Fidelity Self-Assessment
Fidelity is the property that lets a synthetic corpus stand in for real data. It has four legs: the math is internally consistent, the demographics are plausible, the trajectories are coherent across time, and the scenarios actually exercise the code paths the buyer cares about. Score each leg honestly; the radar chart will tell you where the corpus is brittle.
What you walk away with
~8 min · 4 categories · 16 items
- A fidelity index across arithmetic, demographic, longitudinal, and scenario dimensions.
- A radar shape that exposes asymmetric corpora (good math but flat demographics, or rich demographics but no longitudinal evolution).
- A ranked remediation list with cross-walks to the Fidelity QA checklist.
- Calibration against the WealthSynth fidelity bar — the same bar applied to the shipped corpus.
Answer every item to lock in a banded score and unlock the remediation roadmap. Live category scores update as you go.
Arithmetic invariants
The non-negotiable math. If balances don't reconcile, no other fidelity property matters because every consumer will hit a contradiction first.
- Account balances reconcile to transaction history
Each account's ending balance equals starting balance plus net transactions, including reinvested distributions and fees, with rounding tolerances documented.
- Total household income equals the sum of source-level income
W-2, 1099-INT/DIV/B, K-1, Schedule C, rental, retirement distribution, and Social Security all roll up to the total without gaps or overlaps.
- Tax liability matches AGI / brackets / credits / withholding math
For every tax year, taxable income, regular tax, AMT, credits, and withholding produce a tax-due figure that matches the Form 1040 calculation chain to the cent.
- Net worth equals assets minus liabilities at every snapshot
Across all snapshots and across the longitudinal trajectory, the net worth identity holds. Negative-equity scenarios are flagged but allowed where realistic.
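Two of the invariants above, balance reconciliation and the net worth identity, reduce to simple checks. The following is a minimal sketch rather than the WealthSynth validator: the function names, the signed-amount convention, and the one-cent tolerance are all assumptions made for illustration.

```python
from decimal import Decimal

# Hypothetical tolerance of one cent. The real value should come from the
# corpus's documented rounding policy.
TOLERANCE = Decimal("0.01")


def balance_reconciles(starting, transactions, ending, tolerance=TOLERANCE):
    """True if ending == starting + net transactions, within tolerance.

    Assumes `transactions` holds signed amounts: deposits and reinvested
    distributions are positive, withdrawals and fees are negative.
    """
    expected = starting + sum(transactions, Decimal("0"))
    return abs(expected - ending) <= tolerance


def net_worth_holds(assets, liabilities, reported_net_worth, tolerance=TOLERANCE):
    """True if reported net worth equals assets minus liabilities."""
    return abs((assets - liabilities) - reported_net_worth) <= tolerance


# Contribution, reinvested dividend, and an advisory fee (illustrative values).
txns = [Decimal("250.00"), Decimal("12.34"), Decimal("-5.00")]
ok = balance_reconciles(Decimal("1000.00"), txns, Decimal("1257.34"))
bad = balance_reconciles(Decimal("1000.00"), txns, Decimal("1260.00"))

# A negative-equity household still satisfies the identity.
underwater = net_worth_holds(Decimal("200000"), Decimal("250000"), Decimal("-50000"))
```

Decimal arithmetic is used here so that cent-level comparisons are not disturbed by binary floating-point error.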
Demographic plausibility
Whether the joint distribution of demographics, income, and assets resembles a real wealth-management book rather than a uniform random sample.
- Joint distributions match a calibration source
Age × income × asset distributions match a documented benchmark (Survey of Consumer Finances, peer book, internal anchor) within stated tolerances.
- Right and left tails are deliberately shaped
UHNW outliers and low-asset edge cases exist in the corpus in deliberate proportions — not absent, not over-represented.
- Geographic distribution is intentional
State-level distributions reflect the firm's actual or target footprint. Multi-state filers exist where appropriate.
- Household compositions are diverse and realistic
Single-earner, dual-earner, blended-family, multi-generational, and single-parent households exist in proportions that match the target book.
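One way to test a joint distribution against a benchmark "within stated tolerances" is to bin both and compare cell shares using total variation distance. This is a sketch under assumptions: the band labels, the benchmark shares, and the 0.15 tolerance are invented for illustration and are not Survey of Consumer Finances figures.

```python
from collections import Counter


def joint_shares(households, keys):
    """Share of households in each cell of the joint distribution over `keys`."""
    counts = Counter(tuple(h[k] for k in keys) for h in households)
    total = sum(counts.values())
    return {cell: n / total for cell, n in counts.items()}


def total_variation_distance(p, q):
    """Half the L1 distance between two discrete distributions."""
    cells = set(p) | set(q)
    return 0.5 * sum(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in cells)


# Illustrative corpus and benchmark; every value here is made up.
households = [
    {"age_band": "35-54", "income_band": "mid"},
    {"age_band": "35-54", "income_band": "mid"},
    {"age_band": "55+", "income_band": "high"},
    {"age_band": "55+", "income_band": "mid"},
]
benchmark = {
    ("35-54", "mid"): 0.4,
    ("55+", "high"): 0.3,
    ("55+", "mid"): 0.3,
}

TOLERANCE = 0.15  # hypothetical stated tolerance
tvd = total_variation_distance(joint_shares(households, ["age_band", "income_band"]), benchmark)
within_tolerance = tvd <= TOLERANCE
```

A single distance hides where the mismatch sits, so in practice the per-cell differences are worth reporting alongside it, especially for the tails.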
Longitudinal coherence
Whether the corpus moves through time the way real households do: life events change cash flow, market shocks change behavior, and ages advance consistently.
- Each household has at least 24 months of trajectory
Sufficient trajectory exists to exercise time-dependent logic: RMD, IRMAA brackets, tax years, multi-year carryforwards.
- Life events propagate coherently
When a household member retires, dies, divorces, or has a child, downstream cash flows, account ownership, and beneficiary designations all update consistently.
- Market exposure produces coherent return sequences
Account values move with market exposure; allocation drift and rebalancing events are realistic; dividend yields are plausible.
- Behavioral signals are time-coherent
Risk-tolerance changes follow plausible triggers (market shock, life event), not random noise. Contribution patterns match income trajectory.
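The two most mechanical longitudinal checks, trajectory length and consistent aging, can be sketched as follows. The snapshot shape (`as_of`, `age`) and the way months are counted (calendar-month difference between first and last snapshot) are assumptions for illustration, not a documented schema.

```python
from datetime import date

MIN_MONTHS = 24  # the minimum trajectory length named in the checklist


def months_spanned(snapshots):
    """Calendar months between the earliest and latest snapshot dates."""
    first = min(s["as_of"] for s in snapshots)
    last = max(s["as_of"] for s in snapshots)
    return (last.year - first.year) * 12 + (last.month - first.month)


def implied_age(birth_date, as_of):
    """Age in whole years on `as_of`."""
    had_birthday = (as_of.month, as_of.day) >= (birth_date.month, birth_date.day)
    return as_of.year - birth_date.year - (0 if had_birthday else 1)


def ages_advance_consistently(snapshots, birth_date):
    """True if every recorded age matches the age implied by the birth date."""
    return all(s["age"] == implied_age(birth_date, s["as_of"]) for s in snapshots)


# Illustrative household member born mid-1960, with three yearly snapshots.
birth = date(1960, 6, 15)
snapshots = [
    {"as_of": date(2022, 1, 31), "age": 61},
    {"as_of": date(2023, 1, 31), "age": 62},
    {"as_of": date(2024, 1, 31), "age": 63},
]

long_enough = months_spanned(snapshots) >= MIN_MONTHS
ages_ok = ages_advance_consistently(snapshots, birth)
```

Life-event propagation and behavioral coherence need richer, domain-specific rules; these two checks only catch the cheapest failures.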
Scenario coverage
Whether the corpus actually exercises the structurally tricky cases. Coverage is what turns fidelity into business value.
- Tax scenarios exercise wash-sale, AMT, NUA, QSBS, multi-state
Each tax-edge case has at least one household where the case fires within the trajectory, with documented expected outcomes.
- Retirement scenarios exercise RMD, Roth conversion, IRMAA, NUA
Each retirement-edge case has at least one household where the case fires.
- Compliance scenarios exercise Reg BI, AML typology, GLBA NPI
Each compliance-relevant scenario has a documented synthetic case that fires it.
- Scenario coverage is documented per archetype
For every archetype in the corpus, it's clear which scenarios that archetype is included to exercise.
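Per-archetype documentation can double as a machine-checkable coverage map. The sketch below takes the scenario names from the items above; the archetype names and the `(family, scenario)` tagging scheme are hypothetical. Tags carry the family because NUA appears under both tax and retirement.

```python
REQUIRED_SCENARIOS = {
    "tax": ["wash-sale", "AMT", "NUA", "QSBS", "multi-state"],
    "retirement": ["RMD", "Roth conversion", "IRMAA", "NUA"],
    "compliance": ["Reg BI", "AML typology", "GLBA NPI"],
}


def coverage_gaps(archetypes, required=REQUIRED_SCENARIOS):
    """Return {family: [scenarios no archetype is tagged to exercise]}."""
    covered = set()
    for tags in archetypes.values():
        covered.update(tags)
    gaps = {}
    for family, scenarios in required.items():
        missing = [s for s in scenarios if (family, s) not in covered]
        if missing:
            gaps[family] = missing
    return gaps


# Hypothetical archetypes, each tagged with the scenarios it exists to exercise.
archetypes = {
    "executive_with_equity_comp": {
        ("tax", "AMT"), ("tax", "NUA"), ("tax", "QSBS"), ("retirement", "NUA"),
    },
    "retiree_multi_state": {
        ("tax", "multi-state"), ("retirement", "RMD"),
        ("retirement", "IRMAA"), ("retirement", "Roth conversion"),
    },
    "active_trader": {("tax", "wash-sale"), ("compliance", "Reg BI")},
}

gaps = coverage_gaps(archetypes)
# gaps == {"compliance": ["AML typology", "GLBA NPI"]}
```

A tag only records intent. Confirming that the case actually fires within a household's trajectory still requires running the consuming code against the documented expected outcome.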
Banded score reference
Brittle
0–30%: The corpus has serious fidelity gaps. It will fail under any rigorous downstream consumer (validator, regulator, demanding pilot customer).
Next step: Fix arithmetic invariants first; nothing downstream matters until the math is internally consistent.
Plausible
30–55%: Math holds and demographics are reasonable, but longitudinal coherence or scenario coverage is uneven.
Next step: Add longitudinal trajectories and close scenario gaps in the highest-leverage code paths first.
Defensible
55–80%: All four legs are present at production grade. The corpus stands up to a structured fidelity review.
Next step: Tighten the calibration source citation and rehearse a fidelity walkthrough.
Audit-Grade
80–100%: The corpus meets the WealthSynth audit-grade fidelity bar (version-pinned, calibrated, fully covered, fully documented).
Next step: Operate in steady state; re-run the assessment after every corpus refresh.
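The band lookup can be sketched as below. Two things are assumptions, not taken from the assessment: each item is scored on a 0 to 3 scale, and a score that lands exactly on a shared boundary (30, 55, or 80) falls into the higher band.

```python
BANDS = [  # (lower bound in percent, band name), highest band first
    (80, "Audit-Grade"),
    (55, "Defensible"),
    (30, "Plausible"),
    (0, "Brittle"),
]

MAX_PER_ITEM = 3  # hypothetical item scale; the assessment does not state one


def percent_of_max(item_scores, max_per_item=MAX_PER_ITEM):
    """Score as a percentage of the maximum attainable for these items."""
    return 100.0 * sum(item_scores) / (max_per_item * len(item_scores))


def band_for(percent):
    """Map a 0-100 score to its band; boundary scores go to the higher band."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    for lower, name in BANDS:
        if percent >= lower:
            return name


# Illustrative answers: strong math, weak longitudinal coherence.
scores = {
    "arithmetic": [3, 3, 3, 3],
    "demographic": [2, 2, 1, 2],
    "longitudinal": [1, 0, 1, 1],
    "scenario": [2, 1, 1, 0],
}
by_category = {name: percent_of_max(items) for name, items in scores.items()}
overall = percent_of_max([s for items in scores.values() for s in items])
overall_band = band_for(overall)
```

In this example the overall score sits in the Plausible band even though arithmetic is at 100%, which is the kind of asymmetry the radar shape is meant to expose.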
Key takeaways
- Arithmetic is the floor. A corpus with broken math has no value to a wealth-tech buyer regardless of how rich the demographics look.
- Longitudinal coherence is the leg most teams skip. Point-in-time corpora are quick to build and useless for retirement, tax, and trajectory testing.
- Scenario coverage without documentation is invisible. Tag each archetype with the scenarios it exists to exercise.
- Fidelity decays without a refresh cadence. Calibrate, document, and re-validate on every release.
FAQ
Why does WealthSynth use a strict validation gate?
Because anything looser produces a corpus in which a small subset of households is silently broken. Buyers can't tell which households until a code path exposes the gap. Strict gating keeps the corpus shippable as a coherent whole.
Can I score an existing in-house corpus with this?
Yes. The four categories are vendor-agnostic. If you score below 'Defensible' in more than one category, the build-vs-buy comparison is likely to be a short one.
How does this differ from generic synthetic-data quality metrics?
Generic synthetic-data metrics (TVD, k-anonymity, marginal-utility) measure statistical similarity or privacy. This assessment measures usefulness for wealth-tech testing, which depends on arithmetic correctness, scenario coverage, and longitudinal coherence as much as on distributional similarity.