The friction in RIA onboarding isn't on the happy path. It's on the long tail: the H-1B prospect whose visa status the intake form doesn't capture, the gig worker whose income doesn't fit the W-2 dropdown, the divorcing client whose accounts are in the middle of being retitled, the artist whose royalty income confuses the cash-flow modeler. The RIA Onboarding Stress Test Pack is 400 simulated prospects engineered to exercise every part of your firm's intake process — KYC fields, goal capture, initial recommendation logic, CRM mapping — against the full variety of clients an RIA actually sees.
Most onboarding workflows are designed for the modal client and tested against three or four hand-built test fixtures. The result is friction that surfaces only in production: the form that requires a Social Security Number when an ITIN is what the prospect has; the goal-capture flow that doesn't have a category for caregiver-of-aging-parent; the initial-recommendation engine that breaks when the prospect's income is a six-month-trailing-average rather than a current monthly figure.
These aren't edge cases — they're the modal cases for important RIA growth segments. Cross-border professionals, gig-economy earners, recent immigrants, and life-transition clients are the highest-LTV prospects most firms underserve because the onboarding system rejects or misroutes them. Firms find out only after the fact, when win rates from those segments are quietly half what they should be.
This Data Set surfaces the friction in advance. The 400 prospects span 12 archetypes covering every wealth tier, life stage, employment type, and family structure your intake system needs to handle. That is deliberately the broadest archetype coverage of any single bundle in the catalog.
Runs the firm's onboarding intake against all 400 prospects in a test environment to find every form field, validation rule, and routing decision that breaks on a non-modal prospect. The friction points get prioritised before launch or before a redesign goes live.
Demos the product's ability to handle complex prospects to RIA buyers using realistic households spanning H-1B tech workers, gig artists, military officers, and dual-residency UHNW couples — without exposing any prospect's real data.
Tests the firm's CRM data mapping (Salesforce / Wealthbox / Redtail) by ingesting all 400 prospects and verifying every field round-trips correctly, including custom fields for goal categories and KYC documentation status.
Walks the SEC examiner through the firm's KYC documentation process using realistic prospects covering CIP, beneficial ownership, and identity-verification edge cases — demonstrating controls without using actual client data.
Validates the initial-plan generation logic across all archetype types, ensuring the engine produces sensible recommendations for prospects whose financial situation doesn't fit a textbook 401k-saver-with-mortgage profile.
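The first use case above (run intake against every prospect, then rank what breaks) can be sketched as a small harness. This is a minimal sketch, not tooling that ships with the pack: `validateIntake` is a hypothetical stand-in for your firm's own form validation, the `'W-2'` value for `primary_source_type` is an assumption, and the three inline prospects stand in for the real corpus.

```javascript
// Hypothetical intake validator: returns a list of { field, reason } failures.
// Replace the body with a call into your real intake form or API.
function validateIntake(prospect) {
  const failures = [];
  const idv = prospect.kyc?.identity_verification_fields ?? {};
  // Modal-client assumption #1: the form insists on an SSN.
  if (idv.itin_filer) {
    failures.push({ field: 'ssn', reason: 'ITIN filer has no SSN' });
  }
  // Modal-client assumption #2: the income dropdown only knows W-2.
  // ('W-2' as a stored value is an assumption for this sketch.)
  if (prospect.income?.primary_source_type !== 'W-2') {
    failures.push({ field: 'income_type', reason: 'non-W-2 income source' });
  }
  return failures;
}

// Run every prospect through the validator and tally failures per field,
// so the most common friction points sort to the top.
function rankFriction(prospects, validate) {
  const counts = new Map();
  for (const p of prospects) {
    for (const { field } of validate(p)) {
      counts.set(field, (counts.get(field) ?? 0) + 1);
    }
  }
  return [...counts.entries()]
    .map(([field, failures]) => ({ field, failures }))
    .sort((a, b) => b.failures - a.failures);
}

// Tiny inline sample standing in for the 400-prospect corpus.
const sampleProspects = [
  { id: 'p1', kyc: { identity_verification_fields: { itin_filer: true } }, income: { primary_source_type: '1099' } },
  { id: 'p2', kyc: { identity_verification_fields: {} }, income: { primary_source_type: 'royalty' } },
  { id: 'p3', kyc: { identity_verification_fields: {} }, income: { primary_source_type: 'W-2' } },
];

const frictionReport = rankFriction(sampleProspects, validateIntake);
console.log(frictionReport);
```

Tallying by field rather than by prospect keeps the output aligned with what gets fixed: a form field or validation rule, not an individual record.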
The 400 prospects span 12 archetypes — the widest archetype coverage of any single Wealth Data Set: from F-01 (New Graduate Tech Worker) and AR-01 (Artist with Royalty Income) on the formation end, through dual-income professionals and small business owners in mid-career, to retirees and complex life-transition cases. Wealth tiers span $50K to $30M+ so every advisory engagement model is represented.
Every prospect has a complete KYC record: identity verification fields (with realistic edge cases — ITIN filers, dual citizens, recent name changes), beneficial ownership for entity-owning prospects, source-of-wealth narrative, and risk-tolerance questionnaire results. Goal capture covers the full goal taxonomy (retirement, college, home purchase, business sale, charitable, transition) with structured priority rankings and time horizons. Initial recommendation outputs include the canonical first-meeting deliverables: asset allocation suggestion, account-type recommendation (which bucket to fund first), insurance gap analysis, and estate planning readiness flags (will/POA/HCD presence, beneficiary completeness).
Field names follow CRM-compatible conventions so direct ingestion into Salesforce, Wealthbox, or Redtail requires minimal transformation. The Data Set ships as JSON (one file per prospect plus a manifest) and CSV (long-format with normalized account/goal/recommendation tables for SQL ingestion), accompanied by the WealthSynth Methodology PDF — covering the field schema, the CRM-mapping appendix, and the calibration source for each archetype's KYC/goal/recommendation defaults.
Below is a redacted summary of one household from this Data Set. Names, employers, exact balances, and metro area are stripped; ages are bucketed; income and net worth are reported as bands. The full record (and all 400 like it) ships in the ZIP.
{
  "demographics.household_profile": <value>,
  "accounts.summary_balances": <value>,
  "goals.primary_financial_goals": <value>,
  "kyc.identity_verification_fields": <value>,
  "planning.initial_recommendations": <value>
}
Returns every prospect whose primary income source is non-W-2 (1099 contractor, K-1 distribution, royalty, business owner draw): the population most likely to expose friction in income-capture forms.
prospects.filter(p =>
  ['1099', 'K-1', 'royalty', 'business_draw'].includes(
    p.income.primary_source_type
  )
)
Filters prospects whose identity verification path is non-standard: ITIN holders, dual citizens, recent name changes, or thin-file credit history. The intake flow that handles these well wins clients competitors lose.
prospects.filter(p => {
  const idv = p.kyc.identity_verification_fields;
  // Optional chaining guards prospects that carry no credit block.
  return idv.itin_filer
    || idv.dual_citizen
    || idv.recent_name_change
    || p.credit?.thin_file_flag;
})
Returns prospects whose financial profile suggests a specific advisor specialty (equity-comp, business-owner, estate-heavy, with a general fallback), which is useful for routing logic that pairs the right advisor with the right prospect at intake.
prospects.map(p => ({
  id: p.id,
  // Optional chaining on each parent object avoids a TypeError when a
  // prospect has no equity_comp, business, or estate block at all.
  routing_specialty: p.equity_comp?.grants?.length > 0 ? 'equity_comp'
    : p.business?.entity_type ? 'business_owner'
    : p.estate?.trust_structures?.length > 0 ? 'estate'
    : 'general'
}))
For each prospect, returns the share (as a fraction from 0 to 1) of the canonical initial-plan checklist that the firm has data to complete from intake alone, which surfaces where additional data collection is needed before the first meeting.
prospects.map(p => {
  const required = ['risk_tolerance', 'goals',
    'income', 'expenses', 'assets', 'liabilities',
    'beneficiaries', 'insurance'];
  const present = required.filter(k => p[k] != null);
  return { id: p.id, completeness: present.length / required.length };
})
Each prospect is generated against a randomly weighted draw from 12 archetypes, with the weighting deliberately equalising across wealth tiers and life stages so no single segment dominates the 400-prospect corpus. KYC fields draw from realistic distributions of identity-verification scenarios (about 4% ITIN filers, 6% dual citizens, 2% recent name changes, calibrated against Pew demographic data on the financial-services-customer population). Goal-capture and risk-tolerance fields use FINRA Rule 2090–compliant structures. Initial recommendation outputs are produced by the same recommendation logic that generates the WealthSynth canonical onboarding outputs, so the recommendations exhibit realistic variation across archetypes. The full corpus passes the WealthSynth consistency validator (KYC fields reconcile, goal priorities sum correctly, recommendation logic is deterministic given inputs) and the LLM-as-judge gate. Annual refresh re-runs against current FINRA interpretive guidance and CFP Board CIP standards.
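The stated identity-verification shares can be spot-checked against whatever corpus you load. The following is a minimal sketch, assuming the boolean flag names used in the filter examples earlier (`itin_filer`, `dual_citizen`, `recent_name_change`). The tolerance value and the synthetic `demoCorpus` are illustrative choices, not part of the WealthSynth methodology or its validator.

```javascript
// Target shares as stated in the methodology text (approximate figures).
const targetShares = { itin_filer: 0.04, dual_citizen: 0.06, recent_name_change: 0.02 };

// Compute the observed share of prospects with each flag set to true.
function observedShares(prospects, flags) {
  const shares = {};
  for (const flag of flags) {
    const hits = prospects.filter(
      p => p.kyc?.identity_verification_fields?.[flag] === true
    ).length;
    shares[flag] = prospects.length ? hits / prospects.length : 0;
  }
  return shares;
}

// Report flags whose observed share is more than `tolerance` away from target.
// The default tolerance is an arbitrary choice for this sketch.
function driftReport(prospects, targets, tolerance = 0.02) {
  const observed = observedShares(prospects, Object.keys(targets));
  return Object.entries(targets)
    .filter(([flag, target]) => Math.abs(observed[flag] - target) > tolerance)
    .map(([flag, target]) => ({ flag, target, observed: observed[flag] }));
}

// Synthetic stand-in corpus: 100 prospects with 4 ITIN filers, 6 dual citizens,
// and a deliberately inflated 10 recent name changes so one flag drifts.
const demoCorpus = Array.from({ length: 100 }, (_, i) => ({
  id: `demo-${i}`,
  kyc: {
    identity_verification_fields: {
      itin_filer: i < 4,
      dual_citizen: i >= 10 && i < 16,
      recent_name_change: i >= 20 && i < 30,
    },
  },
}));

const drift = driftReport(demoCorpus, targetShares);
console.log(drift);
```

A check like this is most useful after subsetting the corpus, since a small subset can easily lose the rare identity-verification cases altogether.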
The 'stress test' framing is intentional. Most firms onboard the modal prospect cleanly; the friction emerges with the long-tail prospects whose data doesn't fit the form. Running 400 simulated prospects through the firm's intake stress-tests the workflow in the same way a backend load test stresses an API — surfacing the failures before real clients hit them.
Field names follow conventions compatible with Salesforce Financial Services Cloud, Wealthbox, and Redtail. The Methodology PDF includes an explicit field-mapping appendix for the four most common RIA CRMs. For systems with custom field schemas, the JSON nested structure is straightforward to remap.
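For a custom schema, the remap and a round-trip check might look like the sketch below. The CRM field names in `fieldMap` (for example `Primary_Income_Source__c`) are hypothetical placeholders, not names taken from the Methodology PDF or from any CRM; the source paths reuse fields that appear in the filter examples on this page.

```javascript
// Hypothetical mapping from dotted source paths to CRM field names.
// Real target names come from your CRM schema and the mapping appendix.
const fieldMap = {
  'income.primary_source_type': 'Primary_Income_Source__c',
  'kyc.identity_verification_fields.itin_filer': 'ITIN_Filer__c',
  'kyc.entity_owned': 'Entity_Owned__c',
};

// Read a nested value by dotted path; undefined if any segment is missing.
function getPath(obj, path) {
  return path.split('.').reduce((node, key) => node?.[key], obj);
}

// Write a nested value by dotted path, creating intermediate objects as needed.
function setPath(obj, path, value) {
  const keys = path.split('.');
  const last = keys.pop();
  let node = obj;
  for (const key of keys) {
    if (node[key] == null) node[key] = {};
    node = node[key];
  }
  node[last] = value;
}

// Flatten a nested prospect into a CRM-shaped record.
function toCrmRecord(prospect, map) {
  const record = {};
  for (const [path, crmField] of Object.entries(map)) {
    const value = getPath(prospect, path);
    if (value !== undefined) record[crmField] = value;
  }
  return record;
}

// Rebuild the nested structure from a CRM-shaped record.
function fromCrmRecord(record, map) {
  const prospect = {};
  for (const [path, crmField] of Object.entries(map)) {
    if (crmField in record) setPath(prospect, path, record[crmField]);
  }
  return prospect;
}

// Round-trip check: list every mapped path that does not survive export and re-import.
function roundTripFailures(prospect, map) {
  const restored = fromCrmRecord(toCrmRecord(prospect, map), map);
  return Object.keys(map).filter(
    path => JSON.stringify(getPath(prospect, path)) !== JSON.stringify(getPath(restored, path))
  );
}

const sampleProspect = {
  id: 'p1',
  income: { primary_source_type: 'K-1' },
  kyc: { entity_owned: true, identity_verification_fields: { itin_filer: false } },
};
const crmRecord = toCrmRecord(sampleProspect, fieldMap);
const roundTripResult = roundTripFailures(sampleProspect, fieldMap);
console.log(crmRecord, roundTripResult);
```

In a real test the middle step would be an actual write to and read from the CRM rather than an in-memory conversion; the comparison logic stays the same.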
Yes. About 18% of the prospects in the corpus bring entity-owned accounts (LLC, trust, family LP). These carry the additional KYC fields (beneficial ownership, entity formation documents, signing-authority structures) that the entity-onboarding path requires. Entity-only test fixtures can be filtered with `prospects.filter(p => p.kyc.entity_owned)`.
Yes. The Data License explicitly permits demonstration use, including in product demos, conference presentations, and sales calls. Many platform vendors use this Data Set as a 'show, don't tell' way of demonstrating onboarding capability to RIA buyers.
The KYC, goal-capture, and risk-tolerance fields align with FINRA Rule 2090 (KYC), Reg S-P (privacy), CFP Board CIP standards, and current examination focus areas of the SEC Division of Examinations (formerly OCIE). The annual refresh incorporates any new SEC interpretive guidance from the prior 12 months.
For workflow QA, the full 400 is recommended, because the long-tail edge cases that surface friction are sparsely distributed. For initial demos or smaller integration tests, a roughly 50-prospect subset stratified across archetypes works (filter to about four prospects per archetype across the 12 archetypes). The Data Set isn't subdivided for sale; the full 400 is the only purchase option.
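A stratified subset of this kind can be built with a per-archetype cap. This is a minimal sketch, assuming each prospect carries its archetype label in a field called `archetype`; that field name and the `A-01` style labels in the stand-in corpus are hypothetical, so adjust the key function to whatever the manifest uses. With 12 archetypes, a cap of four yields 48 prospects.

```javascript
// Take up to `perGroup` prospects from each archetype, preserving corpus order.
// `archetype` is an assumed field name; pass a different keyOf if needed.
function stratifiedSubset(prospects, perGroup, keyOf = p => p.archetype) {
  const taken = new Map();
  return prospects.filter(p => {
    const key = keyOf(p);
    const n = taken.get(key) ?? 0;
    if (n >= perGroup) return false;
    taken.set(key, n + 1);
    return true;
  });
}

// Stand-in corpus: 12 made-up archetype labels, 400 prospects assigned round-robin.
const archetypes = Array.from({ length: 12 }, (_, i) => `A-${String(i + 1).padStart(2, '0')}`);
const corpus = Array.from({ length: 400 }, (_, i) => ({
  id: `p${i}`,
  archetype: archetypes[i % archetypes.length],
}));

const subset = stratifiedSubset(corpus, 4);
console.log(subset.length);
```

The sketch takes the first prospects it meets in each archetype, so the result is reproducible. Shuffling the corpus first would give a random stratified sample instead.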
The recommendations are generated by deterministic logic against each prospect's structured data, so they exhibit realistic variation (different age + income + goals → different allocation). They aren't intended as 'gold-standard' recommendations for benchmarking; they're a structurally valid set of outputs your platform can use to test recommendation-display logic, audit trail, and client-facing report generation.
B14 is a curated 400-prospect subset focused specifically on onboarding workflow testing — KYC, goals, initial recommendations. B31 is the full 1,451-household structured-JSON corpus with all 30 bundle overlays applied where eligible and 96 monthly longitudinal snapshots per household. If you only need onboarding testing, B14 is the right buy. If you're building multi-product platform infrastructure, B31 is more efficient than buying ten bundles individually.
130 synthetic households tuned for Reg BI suitability testing — concentrated holdings, age 75+, recent inheritance, cognitive decline markers, and risk-mismatch flags. Each record carries the eligibility triggers required to exercise broker-dealer supervisory workflows end to end.
130 affluent and HNW households with detailed fee structures: AUM-based advisory fees, tiered breakpoints, fund expense ratios, transaction costs, and tax-drag estimates. Includes complex fee arrangements (multi-firm, family-office, performance-based).
220 households mid-transition: divorce in progress, post-bankruptcy recovery, medical-debt crisis, sandwich-generation caregivers, recent widowhood, sudden wealth, distressed mortgages, and blended-family formation. High behavioral-event density and asset reshuffling.
Purchases are for internal use only. Redistribution or resale of data is prohibited under the WealthSchema Data License.