WealthSynth is the synthetic-household-data product behind every Wealth Data Set on this site. This page is the public methodology index — written for compliance teams evaluating the data for examiner use, for engineering teams integrating the JSON, and for academic researchers citing the corpus.
The corpus holds itself to a strict internal-consistency contract: if a generated record disagrees with itself in any way — arithmetic, schema, narrative — it is rejected and regenerated. The result is a dataset your data team can trust without spot-checking.
These commitments are codified in the generation pipeline and tested on every refresh. They are the answer to the only question that matters when buyers evaluate synthetic data: can I trust this?
Every household is generated from a single canonical Zod schema. If a record fails the schema, it is discarded and regenerated, so invalid data never touches disk.
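As a rough illustration of the discard-and-regenerate rule, here is a minimal TypeScript sketch. It is not the WealthSynth pipeline: the `Household` fields, the `generateValid` helper, the retry limit, and the hand-written `householdValidator` are all hypothetical. The validator uses the same `{ success, data }` / `{ success, error }` result shape that a Zod schema's `safeParse` returns, so a real Zod schema could stand in for it.

```typescript
// Hypothetical, minimal household shape; the real canonical schema is not shown on this page.
interface Household {
  id: string;
  assets: number;
  liabilities: number;
  netWorth: number;
}

// Same result shape that Zod's `safeParse` produces.
type ParseResult<T> =
  | { success: true; data: T }
  | { success: false; error: string };

interface Validator<T> {
  safeParse(input: unknown): ParseResult<T>;
}

// Hand-written stand-in for the canonical schema (assumption, for illustration only).
const householdValidator: Validator<Household> = {
  safeParse(input: unknown): ParseResult<Household> {
    if (typeof input !== "object" || input === null) {
      return { success: false, error: "record must be an object" };
    }
    const { id, assets, liabilities, netWorth } = input as Record<string, unknown>;
    if (typeof id !== "string" || id.length === 0) {
      return { success: false, error: "id must be a non-empty string" };
    }
    if (
      typeof assets !== "number" ||
      typeof liabilities !== "number" ||
      typeof netWorth !== "number"
    ) {
      return { success: false, error: "money fields must be numbers" };
    }
    if (assets < 0 || liabilities < 0) {
      return { success: false, error: "assets and liabilities must be non-negative" };
    }
    return { success: true, data: { id, assets, liabilities, netWorth } };
  },
};

// Generate, validate, and retry. Only a record that passes validation is returned,
// so a caller never has an invalid record to persist.
function generateValid<T>(
  generate: (attempt: number) => unknown,
  validator: Validator<T>,
  maxAttempts = 5, // hypothetical limit
): { record: T; attempts: number } {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = validator.safeParse(generate(attempt));
    if (result.success) {
      return { record: result.data, attempts: attempt };
    }
    // Failed records are simply dropped; nothing is written anywhere.
  }
  throw new Error(`no valid record after ${maxAttempts} attempts`);
}

// Toy generator: the first attempt is invalid, the second is valid.
const flakyGenerator = (attempt: number): unknown =>
  attempt === 1
    ? { id: "hh-001", assets: -1, liabilities: 0, netWorth: -1 }
    : { id: "hh-001", assets: 500000, liabilities: 120000, netWorth: 380000 };

const outcome = generateValid(flakyGenerator, householdValidator);
```

Here `outcome.record` holds the second, valid record, and the invalid first attempt is never exposed to the caller.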
All downstream artifacts — overlays, longitudinal trajectories, tax calculations — are pure projections of the canonical household JSON. No secondary math, no renderer drift.
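One way to read "pure projection" is as a deterministic function whose only input is the canonical record, with no stored state and no other data source. The sketch below assumes that reading; the `CanonicalHousehold` fields and the `projectBalanceSheet` function are hypothetical names, not the product's actual schema or API.

```typescript
// Hypothetical slice of the canonical household JSON.
interface CanonicalHousehold {
  readonly id: string;
  readonly wages: number;
  readonly investmentIncome: number;
  readonly assets: number;
  readonly liabilities: number;
}

// Hypothetical downstream artifact.
interface BalanceSheetView {
  readonly householdId: string;
  readonly totalIncome: number;
  readonly netWorth: number;
}

// Every output value is derived from canonical fields alone. The function reads
// nothing else and mutates nothing, so the same input always gives the same view.
function projectBalanceSheet(h: CanonicalHousehold): BalanceSheetView {
  return {
    householdId: h.id,
    totalIncome: h.wages + h.investmentIncome,
    netWorth: h.assets - h.liabilities,
  };
}

// Freezing the input makes accidental mutation inside a projection throw in strict mode.
const canonical: CanonicalHousehold = Object.freeze({
  id: "hh-001",
  wages: 140000,
  investmentIncome: 10000,
  assets: 500000,
  liabilities: 120000,
});

const firstView = projectBalanceSheet(canonical);
const secondView = projectBalanceSheet(canonical);
```

Because the view is recomputed from the canonical record each time, `firstView` and `secondView` are structurally identical, and there is no separate copy of the numbers that could drift.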
Two-pass validation: deterministic checks for arithmetic and schema, then LLM-assisted review for narrative coherence. Any warning fails the household — no soft passes.
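The two-pass, zero-warning rule can be sketched as follows. This is an illustrative sketch under stated assumptions: the record fields, both check functions, and `validateTwoPass` are hypothetical, and the second pass here is a simple keyword rule standing in for the LLM-assisted review (no model is called). Skipping the second pass when the first already failed is also an assumption, not something this page specifies.

```typescript
// Hypothetical record shape for illustration.
interface HouseholdRecord {
  id: string;
  assets: number;
  liabilities: number;
  netWorth: number;
  narrative: string;
}

// A check returns zero or more warnings; an empty list means the pass is clean.
type Check = (h: HouseholdRecord) => string[];

// Pass 1: deterministic arithmetic checks.
const deterministicChecks: Check = (h) => {
  const warnings: string[] = [];
  if (h.assets < 0 || h.liabilities < 0) {
    warnings.push("assets and liabilities must be non-negative");
  }
  if (h.assets - h.liabilities !== h.netWorth) {
    warnings.push("netWorth does not equal assets minus liabilities");
  }
  return warnings;
};

// Pass 2 stand-in: a keyword rule in place of the LLM-assisted narrative review.
const narrativeReviewStub: Check = (h) =>
  h.liabilities > 0 && /debt-free/i.test(h.narrative)
    ? ["narrative says debt-free but liabilities are non-zero"]
    : [];

// Any warning from either pass fails the household; there is no soft pass.
function validateTwoPass(
  h: HouseholdRecord,
  narrativeReview: Check,
): { passed: boolean; warnings: string[] } {
  const firstPass = deterministicChecks(h);
  // Assumption: run the second pass only when the first pass is clean.
  const warnings = firstPass.length > 0 ? firstPass : narrativeReview(h);
  return { passed: warnings.length === 0, warnings };
}

const goodRecord: HouseholdRecord = {
  id: "hh-001",
  assets: 500000,
  liabilities: 120000,
  netWorth: 380000,
  narrative: "Dual-income couple paying down a mortgage.",
};
const badMath: HouseholdRecord = { ...goodRecord, netWorth: 400000 };
const badStory: HouseholdRecord = { ...goodRecord, narrative: "A debt-free retiree." };

const goodResult = validateTwoPass(goodRecord, narrativeReviewStub);
const mathResult = validateTwoPass(badMath, narrativeReviewStub);
const storyResult = validateTwoPass(badStory, narrativeReviewStub);
```

In this sketch `goodResult.passed` is `true`, while `mathResult` fails in the first pass and `storyResult` fails in the second. A real review step would likely be asynchronous; it is kept synchronous here for brevity.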
Every field carries a documented type, range, and derivation logic. Methodology PDFs ship with every Data Set so your data team can audit any number end-to-end.
Tax law changes, market shifts, and new archetypes flow through the same pipeline. Refreshed corpus versions ship with a changelog showing exactly what moved and why. Cadence and pricing are still being defined.
No real individuals, no GDPR exposure, no data use agreements. Sensitive overlays (race/ethnicity, religion) appear only on the bundles that explicitly require them.
Long-form references covering the full generation pipeline, validation logic, and refresh process. Every bundle ships with a per-bundle Methodology PDF documenting field derivations specific to that Data Set.
Every Wealth Data Set ships with a Methodology PDF describing the field derivations, eligibility rules, and statistical calibration for that specific bundle. Browse the catalog to see which bundle fits your use case.