Household facing significant medical debt from catastrophic illness, insurance gaps, financial stress.
S-03 models the working household carrying significant unpaid medical debt — the segment that the CFPB's 2023 medical-debt rulemaking and the major credit bureaus' coordinated removals of under-$500 medical tradelines specifically target. It is the cleanest test for products that must handle medical collections differently from other consumer debt.
S-03 exists because medical debt is structurally unlike credit-card or auto debt, and the rules governing how it can appear on credit reports, in collections, and in underwriting models changed materially in 2022–2024. The corpus surfaces the household that has to work through insurance gaps, surprise out-of-network billing under the No Surprises Act, and post-acute-care payment-plan negotiation. Every household carries credit-card debt, which often represents deferred medical balances; HSA balances appear where the household had an HDHP at the time of the event. Underwriting and KYC products need realistic data for the 'high income, high debt, low credit score' profile that medical debt produces: a profile that conventional risk models score as high-default but that actually has near-zero recidivism risk because the underlying spend was not discretionary.
Cash flow tells a specific story: median gross income of $69,122 with a tight p25–p75 range ($65k–$80k), but liquid net worth of only $142k against a net-worth median of $329k (roughly 43%). The lower-than-typical liquid ratio reflects assets stretched against unpaid bills. Nine of the 17 households carry an active home-purchase goal that competes with debt payoff for the same savings dollars. The age range (32–58, median 51) skews older than the S-tier average because catastrophic illness clusters in the late-40s-and-up bracket. Insurance gaps appear in roughly half the corpus: a gap between jobs, ACA marketplace cost-sharing failures, or a high-deductible plan where the deductible wasn't bridged.
What separates S-03 from S-02 (post-bankruptcy) is that the household has not yet defaulted in a way that triggers bankruptcy proceedings. The debt is in collections, on payment plans, or in pre-collection hospital-financial-assistance review. That distinction is the entire diagnostic value: a fintech product testing 'medical-debt-aware' underwriting needs households where the medical balance is large and active, not households where it has been discharged. HC-02 (SSDI claimant) and HC-03 (COBRA gap) are adjacent but cover different surfaces: disability-driven loss of income and continuous-coverage decisions, respectively. S-03 covers the unpaid-bill aftermath itself.
Aggregated across the 17 S-03 households in the shipped v3 corpus. Numbers describe the corpus and are not population claims.
Jessica is the headline S-03 case: $69k income, $30k liquid, and barely positive net worth at $27k against $52,530 of total liabilities. Of that total, $32,000 is a single `other_liabilities` line coded as medical_debt, an emergency-hospitalization balance carrying a 5.55% rate and a $640 monthly payment. That medical-debt line is the entire archetype in one row: it is more than 60% of her liabilities and dwarfs the $7.3k student loan, $10.9k auto loan, and $2.4k credit-card balance combined. The diagnostic feature for software is the credit-report and underwriting branch. Under the post-2023 bureau rules a medical collection of this size is reportable but should be treated differently from a $32k credit-card tradeline, and the home-purchase goal is materially off-track precisely because most AUS overlays still don't apply the medical-tradeline carve-out correctly. Debt payoff (the student loan) is on track; retirement is off-track against a $1.31M target with $27k accumulated.
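The liability figures quoted for Jessica can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated above; the dictionary keys are illustrative labels chosen for this example, not confirmed schema field names, and the three smaller balances are the rounded values from the text.

```python
# Figures as quoted in the Jessica case. Keys are illustrative labels,
# not confirmed field names from the shipped schema.
liabilities = {
    "medical_debt": 32_000,
    "student_loan": 7_300,   # rounded in the text ($7.3k)
    "auto_loan": 10_900,     # rounded in the text ($10.9k)
    "credit_card": 2_400,    # rounded in the text ($2.4k)
}
total_liabilities = 52_530   # exact total as stated
monthly_medical_payment = 640

# Share of total liabilities held in the single medical-debt line.
medical_share = liabilities["medical_debt"] / total_liabilities

# Everything that is not the medical-debt line, summed from the rounded values.
other_debt = sum(v for k, v in liabilities.items() if k != "medical_debt")

annual_medical_payments = monthly_medical_payment * 12

print(f"medical share of liabilities: {medical_share:.1%}")
print(f"other debt combined:          ${other_debt:,}")
print(f"annual medical-debt payments: ${annual_medical_payments:,}")
```

The rounded component balances sum to slightly more than the stated total, which is expected given the rounding in the text.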
Every S-03 household ships with — at minimum — these JSON fields populated. The full schema is documented in the data set you purchase.
Three buyer profiles drive S-03 demand. Healthcare-fintech teams (medical-bill negotiation platforms, hospital financial-assistance workflow products, HSA administrators) use the corpus to test scenarios where the balance source matters — original billed amount, negotiated discount, charity-care eligibility threshold under §501(r). Credit-bureau and credit-scoring teams test the post-2022 medical-tradeline treatment: tradelines under $500 suppressed, paid medical collections removed from reports, and the one-year reporting delay for unpaid medical collections. Fair-lending compliance teams at community banks and CDFIs use it to validate that mortgage and personal-loan underwriting does not penalize the medical-debt-driven credit-score depression in a way that produces ECOA disparate-impact concerns.
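The three bureau rules named above (under-$500 suppression, removal of paid medical collections, and the one-year delay for unpaid ones) can be expressed as a small predicate. This is a minimal sketch for test-harness purposes, not the bureaus' actual logic: the `MedicalCollection` class, its field names, the use of a placement date as the clock start, and the 365-day approximation of "one year" are all assumptions made for illustration. The corpus itself does not encode credit-report state.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumed constants for this sketch.
REPORTING_THRESHOLD = 500.00            # balances under this are suppressed
REPORTING_DELAY = timedelta(days=365)   # "one year", approximated as 365 days


@dataclass
class MedicalCollection:
    """Hypothetical record shape; not a field set from the shipped schema."""
    balance: float
    paid: bool
    placed_on: date  # assumed start of the one-year clock


def appears_on_report(item: MedicalCollection, as_of: date) -> bool:
    """Apply the three rules in order; return True if the item would be reported."""
    if item.paid:
        return False  # paid medical collections are removed
    if item.balance < REPORTING_THRESHOLD:
        return False  # under-$500 tradelines are suppressed
    # Unpaid collections are reported only after the delay has elapsed.
    return as_of - item.placed_on >= REPORTING_DELAY


as_of = date(2024, 1, 15)
large_unpaid = MedicalCollection(32_000, paid=False, placed_on=date(2022, 1, 15))
small_unpaid = MedicalCollection(450, paid=False, placed_on=date(2022, 1, 15))
large_paid = MedicalCollection(32_000, paid=True, placed_on=date(2022, 1, 15))
recent_unpaid = MedicalCollection(32_000, paid=False, placed_on=date(2023, 10, 1))

for label, item in [("large unpaid", large_unpaid), ("small unpaid", small_unpaid),
                    ("large paid", large_paid), ("recent unpaid", recent_unpaid)]:
    print(label, appears_on_report(item, as_of))
```

A buyer would derive inputs like these from the liability and income fields, then compare the predicate's result against how their own scoring or underwriting path treats the tradeline.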
S-03 deliberately excludes households where medical debt has already driven a Chapter 7 filing — those move to S-02 with a medical-debt cause flag. SSDI-receiving households where the disability has materially replaced wage income are HC-02; S-03 is the working household still earning W-2 or partial-disability wages. Long-term-care expenses for an aging family member belong in S-04 (caregiver), not here — the patient in S-03 is the household member, not an external relative. UHNW households that incur catastrophic medical bills without financial distress are not modeled; medical debt at H-02 or H-03 wealth tiers is a planning topic, not a crisis. Finally, dental and elective-procedure debt is included only where it co-occurs with a major medical event; pure-elective debt belongs in B-03 (lifestyle inflation) instead.
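The exclusion rules above amount to a routing decision, which can be sketched as a single function. The flag names below are hypothetical and exist only for this example; they are not fields in the shipped schema. The order in which the checks run is also an assumption, since the text does not state a precedence when several conditions hold at once.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HouseholdFlags:
    """Hypothetical flags for illustration; not schema fields."""
    chapter7_filed_for_medical_debt: bool = False
    disability_income_replaced_wages: bool = False
    patient_is_external_relative: bool = False
    uhnw_without_financial_distress: bool = False
    debt_is_purely_elective: bool = False


def route_archetype(h: HouseholdFlags) -> Optional[str]:
    """Return the archetype code for a medical-debt household, or None if not modeled."""
    if h.chapter7_filed_for_medical_debt:
        return "S-02"   # post-bankruptcy, with a medical-debt cause flag
    if h.disability_income_replaced_wages:
        return "HC-02"  # SSDI claimant
    if h.patient_is_external_relative:
        return "S-04"   # caregiver
    if h.uhnw_without_financial_distress:
        return None     # planning topic, not modeled as a crisis
    if h.debt_is_purely_elective:
        return "B-03"   # lifestyle inflation
    return "S-03"       # working household with active medical debt


print(route_archetype(HouseholdFlags()))
print(route_archetype(HouseholdFlags(chapter7_filed_for_medical_debt=True)))
print(route_archetype(HouseholdFlags(debt_is_purely_elective=True)))
```

Dental or elective debt that co-occurs with a major medical event stays in S-03 under this sketch, because `debt_is_purely_elective` would be false for that household.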
Income and net-worth bands were anchored during v3 synthesis to the segment of CFPB consumer-credit-panel data tagged with active medical collections, with the insurance-gap structure informed by Kaiser Family Foundation employer-coverage and ACA marketplace continuity surveys. The corpus does not encode the specific medical condition (oncology, cardiac event, ICU stay) — that level of clinical detail is out of scope and would risk over-fitting to a particular cohort. Per CLAUDE.md §9, the v3 corpus is frozen and not regenerable; calibration claims describe synthesis intent rather than auditable distribution fits.
HC-02 (SSDI / LTD claimant) is the case where disability has materially replaced wage income. S-03 keeps the wage income intact and surfaces the unpaid-bill aftermath instead.
HC-03 (COBRA / benefits gap) covers the active coverage-decision phase. S-03 picks up after the coverage lapse has produced an actual bill.
S-02 is where S-03 ends up when the medical balance triggers a Chapter 7 filing. Use S-02 once the discharge is complete; use S-03 while the debt is still active.
S-04 (caregiver) covers eldercare cost flows for an aging parent. S-03 is the household where the patient is a household member, not an external relative.
S-03 — Medical Debt Crisis represents the working household carrying significant unpaid medical debt from a catastrophic illness, surgical event, or accumulated chronic-care charges. The corpus models the post-event, pre-resolution phase: collections, payment plans, hospital-financial-assistance review, and credit-report impact under the 2022–2024 bureau policy regime.
The data was synthesized to be useful against that regime. The corpus does not encode the credit-report state directly, but income, net-worth, and liability structures are intended to plausibly exercise the under-$500-suppressed, paid-medical-removed, and one-year-reporting-delay rules now in effect.
No. The corpus tags households as having medical debt but does not record the specific clinical condition. That granularity is out of scope and would risk over-fitting to a cohort; buyers needing condition-specific data should layer that on their own.
HC-02 models the household whose disability income now dominates the cash-flow story — wage income has materially dropped. S-03 keeps the W-2 or partial-disability income intact and focuses on the unpaid-bill aftermath. The two can overlap in the real world but the corpus separates them to give buyers cleaner testing surfaces.
S-03 is tagged for six bundles — B04, B10, B14, B15, B18, and B27 — covering cash-flow stress, family-coverage edge cases, behavioral finance, healthcare planning, insurance transitions, and life-event coverage.
No. The shipped v3 corpus is frozen and not regenerable from current code (CLAUDE.md §9). Sampler improvements land in a future v4 release with per-archetype golden fixtures in CI to prevent silent drift.
Download households matching this archetype as part of a Wealth Data Set.