
Mortgage Stress Test Pack

Mortgage underwriting is where institutional risk concentrates. A 30-year loan with a 20% down payment is a long-duration, leveraged commitment whose success depends on factors the underwriter can only partially assess at origination — the borrower's career trajectory, the local housing market, the macroeconomic regime — and one factor they can fully assess at origination but often don't: the household's ability to absorb a stress event without falling into default. The Mortgage Stress Test Pack is 90 households spanning the mortgage lifecycle, built for the lenders, servicers, and risk modelers who need realistic test data covering both the smooth cases and the cases that go sideways.

Households: 90
Archetypes: 4
Formats: JSON, CSV
Deviation: High

Why this Data Set exists

Mortgage origination tooling is well-developed for the modal case: prime credit, stable W-2 income, conforming loan, 30-year fixed. Outside that lane the tooling breaks down in instructive ways: the gig-worker borrower whose income is structured as 1099 receipts and deposit history; the small-business-owner borrower with K-1 distributions and depreciation add-backs; the post-divorce borrower whose income just changed and whose DTI calculation depends on which look-back window you apply; the underwater homeowner evaluating modification eligibility under FNMA flex modification or VA partial claim.

These non-modal cases dominate the actual loan-modification, default-risk, and fair-lending workstreams that consume servicer resources. They're also the cases most poorly represented in fixture data. Production loan-tape data captures the originated loans but not the rejected applications or the modification candidates; servicing data captures ongoing performance but not the structural details of the original underwriting decision; and the rare integrated views are heavily customer-controlled.

This Data Set provides 90 households where the full mortgage lifecycle is structured as JSON fields: the application surface (income, assets, employment history, credit profile); current LTV computed from actual property valuation and outstanding balance; DTI calculation pre-computed with the realistic income-source breakdown; forbearance and modification flags where applicable; and the mortgage-origination edge cases (DPA programs, FHA, VA, USDA loans, ITIN mortgages) that test the breadth of underwriting logic.

Use Cases

Mortgage origination QA
Loan modification eligibility scoring
Default risk model training
DTI / LTV calculation validation

Who uses this Data Set

Mortgage Origination Software Engineer

Validates the firm's automated underwriting system across 90 households whose application profiles span the realistic range — including the sub-prime, near-prime, and gig-economy cases where naive AUS logic produces incorrect approval recommendations or DTI calculations.

Servicing Operations Lead

Tests the firm's loan-modification eligibility scoring against realistic distressed-borrower profiles, ensuring the eligibility logic correctly handles the FNMA flex modification, VA partial claim, and FHA-HAMP scenarios — including the cases where the borrower qualifies for one program but not another.

Default Risk Modeler

Trains and validates default-risk models on a corpus structured for the cases that actually drive default — high DTI at origination, recent income volatility, distressed-mortgage modification history, underwater equity. Production loan-tape data under-represents these cases relative to their default-risk weight.

Fair-Lending Audit Analyst

Tests the firm's mortgage-decisioning system for disparate-impact concerns using a corpus where the protected-class indicators are not present in the household record (per the privacy contract) but the structural-application data (DTI, LTV, credit profile) is fully populated for fair-lending analysis.

FinTech Builder Targeting ITIN Mortgage Market

Validates the platform's ability to underwrite ITIN-filer mortgage applications with the alternative-documentation field set (deposit history, 1099 income, employer-letter income flags) — surfacing the friction points before launch.

What's inside

The 90 households span four mortgage-relevant archetypes: young dual-income couples with first-mortgage applications (F-03), young families with new mortgages already in place (A-01), first-time homebuyers with DTI stretching (MB-01), and distressed-mortgage households evaluating modification (MB-02). The mix is intentional — about 35% are at the application stage (testing origination logic), 30% are recent originations within the first 24 months (testing servicing-onboarding logic), 25% are distressed or in forbearance (testing modification logic), and 10% are underwater (testing strategic-default and short-sale workflows).

Every household carries the structured fields a mortgage application requires: W-2 income, 1099 income, deposit/seasoning balances, paystub-equivalent year-to-date earnings, employment history (at least 2 years of continuous tenure or a structured gap explanation), credit tradeline records, and property appraisal data. Current state includes outstanding balance, current LTV (computed from current appraisal value rather than original purchase price), DTI ratio with the income-source breakdown, escrow account balance, and any forbearance / modification flags. Forbearance / modification scenarios include the structured eligibility-determination data needed for FNMA flex, VA partial claim, FHA-HAMP, and other modification programs.

The Data Set ships as JSON and CSV, accompanied by the WealthSynth Methodology PDF. The methodology covers the application-field schema, the DTI calculation methodology (with the realistic income-source-specific add-back rules for K-1 distributions, depreciation, capital gains, and other non-W-2 income), the modification-program eligibility decision trees, and the calibration source for typical loan-program distributions (FHA, VA, USDA, conforming, jumbo, ITIN). No rendered documents (W-2 PDFs, paystubs, bank statements, etc.) are shipped — every value is in the JSON, ready for buyers to render into their own document templates if needed.

Preview a sample household

A redacted summary of one household from this Data Set: names, employers, exact balances, and metro area are stripped; ages are bucketed; income and net worth are reported as bands. The full record (and all 90 like it) ships in the ZIP.

F-03 · Young Dual-Income Couple (No Kids)
Representative archetype household
Household: Married Separate
State: FL
Gross income (band): $100k–$200k
Net worth (band):
Dependents: 0
Income source types: w2 salary, w2 bonus
Members (2)
primary: Age 30–34, professional services
spouse: Age 25–29, retail

Technical Highlights

Complete mortgage application package
Current LTV per household
Forbearance & modification flags
DTI calculation pre-computed

Sample Schema Fields

sample_record.json
{
  "liabilities.mortgage.balance": <value>,
  "liabilities.mortgage.original_amount": <value>,
  "real_estate.current_appraisal": <value>,
  "credit.dti_ratio": <value>,
  "real_estate.ltv_pct": <value>
}
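
The dotted keys above read as paths into a nested household record. The following is a minimal sketch of how a buyer might resolve those paths and re-derive current LTV as a DTI / LTV validation check. It assumes the JSON nests along the dots and that `ltv_pct` is stored as a fraction (1.0 = 100%), which is how the sample queries compare it. `getPath`, `checkLtv`, the tolerance, and the example values are hypothetical, not part of the shipped Data Set.

```javascript
// Resolve a dotted path such as 'liabilities.mortgage.balance' against a
// nested record. Returns undefined if any segment is missing.
function getPath(record, path) {
  return path.split('.').reduce(
    (node, key) => (node == null ? undefined : node[key]),
    record
  );
}

// Recompute current LTV = outstanding balance / current appraisal and compare
// it with the stored value. ASSUMPTION: ltv_pct is a fraction (1.0 = 100%).
function checkLtv(record, tolerance = 0.005) {
  const balance = getPath(record, 'liabilities.mortgage.balance');
  const appraisal = getPath(record, 'real_estate.current_appraisal');
  const stored = getPath(record, 'real_estate.ltv_pct');
  const inputs = [balance, appraisal, stored];
  if (!inputs.every(v => Number.isFinite(v)) || appraisal <= 0) {
    return { ok: false, reason: 'missing or invalid inputs' };
  }
  const recomputed = balance / appraisal;
  return { ok: Math.abs(recomputed - stored) <= tolerance, recomputed, stored };
}

// Illustrative values only; not taken from the Data Set.
const example = {
  liabilities: { mortgage: { balance: 270000, original_amount: 300000 } },
  real_estate: { current_appraisal: 360000, ltv_pct: 0.75 },
};

console.log(checkLtv(example));
```

Note that the check divides by the current appraisal, not `original_amount` or the purchase price, matching the description of current LTV elsewhere on this page.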

Sample queries

Find DTI-stretched applications near approval boundary

Returns mortgage applications where DTI is between 43% and 50% — the band where automated underwriting decisions become marginal and manual underwriting overrides matter. Useful for testing the firm's manual-underwriting decision documentation.

households.filter(h =>
  h.events.life_events.some(e =>
    e.type === 'mortgage_application' &&
    e.status === 'pending') &&
  h.credit.dti_ratio >= 0.43 &&
  h.credit.dti_ratio < 0.50
)

Identify modification-eligible distressed borrowers

Returns households currently behind on mortgage payments AND meeting the eligibility criteria for at least one modification program (FNMA flex, FHA-HAMP, VA partial claim) — the work queue for servicer outreach.

households.filter(h =>
  h.liabilities.mortgage.payments_past_due >= 1 &&
  h.liabilities.mortgage.modification_eligibility_flags
    .some(f => f === 'FNMA_flex' || f === 'FHA_HAMP' ||
               f === 'VA_partial_claim')
)

Surface underwater households for short-sale outreach

Returns households whose current LTV exceeds 100% (the underwater threshold) — the population for whom short-sale outreach, principal-reduction modification, or strategic-default counseling is appropriate.

households.filter(h =>
  h.real_estate.ltv_pct > 1.0 &&
  h.real_estate.ownership_status === 'primary'
)

Identify gig-worker income-documentation cases

Returns mortgage applicants whose primary income is non-W-2 (1099 contractor, self-employed, K-1 distribution, royalty) — the documentation cases that require alternative-income-verification logic.

households.filter(h =>
  h.events.life_events.some(e =>
    e.type === 'mortgage_application') &&
  ['1099', 'self_employed', 'K-1', 'royalty']
    .includes(h.income.primary_source_type)
)
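
The queries above assume a `households` array is already in scope. As a rough sketch of how they compose, the snippet below runs the DTI-band query and then narrows it to non-W-2 applicants against a tiny stand-in fixture. The fixture, its ids, the `'w2'` source-type value, and the `hasApplication` helper are invented for illustration; only the field paths come from the queries above.

```javascript
// Hypothetical fixture; values are invented, field paths follow the sample queries.
const households = [
  {
    id: 'demo-1',
    events: { life_events: [{ type: 'mortgage_application', status: 'pending' }] },
    credit: { dti_ratio: 0.46 },
    income: { primary_source_type: '1099' },
  },
  {
    id: 'demo-2',
    events: { life_events: [{ type: 'mortgage_application', status: 'pending' }] },
    credit: { dti_ratio: 0.44 },
    income: { primary_source_type: 'w2' }, // assumed label for W-2 income
  },
  {
    id: 'demo-3',
    events: { life_events: [{ type: 'job_change', status: 'complete' }] },
    credit: { dti_ratio: 0.47 },
    income: { primary_source_type: 'K-1' },
  },
];

// True if the household has a mortgage application, optionally with a given status.
const hasApplication = (h, status) =>
  h.events.life_events.some(e =>
    e.type === 'mortgage_application' &&
    (status === undefined || e.status === status));

const NON_W2 = ['1099', 'self_employed', 'K-1', 'royalty'];

// Pending applications with DTI in [0.43, 0.50).
const dtiStretched = households.filter(h =>
  hasApplication(h, 'pending') &&
  h.credit.dti_ratio >= 0.43 &&
  h.credit.dti_ratio < 0.50);

// Of those, the applicants whose primary income is non-W-2.
const stretchedAndNonW2 = dtiStretched.filter(h =>
  NON_W2.includes(h.income.primary_source_type));

console.log(dtiStretched.map(h => h.id));      // [ 'demo-1', 'demo-2' ]
console.log(stretchedAndNonW2.map(h => h.id)); // [ 'demo-1' ]
```

Because each query is an ordinary `Array.prototype.filter` predicate, chaining them this way is one option; combining the predicates into a single `filter` call would work equally well.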

Methodology

Each household's mortgage profile is generated against archetype-specific patterns. Young dual-income couples are weighted toward conforming conventional loans with 15-20% down. First-time homebuyers (MB-01) carry realistic FHA, VA, USDA, or DPA-program structures with the higher LTV and DTI typical of first-time-buyer programs. Distressed mortgages (MB-02) have realistic delinquency patterns and modification-eligibility profiles. Underwater scenarios are seeded at realistic frequencies for declining housing-market regimes.

DTI calculations use the actual income-source-specific add-back rules: depreciation add-back for self-employed Schedule C income; K-1 distribution treatment that varies by guarantor type; income-volatility-adjusted documentation for 1099 contractors.

The corpus passes the WealthSynth consistency validator (DTI/LTV math is correct; modification-eligibility flags fire when underlying conditions are met; income-documentation completeness is internally consistent) and the LLM-as-judge gate. Annual refresh tracks FNMA / FHLMC eligibility changes, FHA / VA / USDA loan-limit updates, and any major-modification-program statutory changes.
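
To make the depreciation add-back concrete, here is a simplified sketch of how it can move a self-employed borrower's DTI. It assumes a plain two-year average of Schedule C net profit plus depreciation. The actual rule set is the one documented in the Methodology PDF, and the function names, field names, and dollar figures below are hypothetical.

```javascript
// Qualifying monthly income from Schedule C: net profit plus depreciation
// (a non-cash expense), averaged over the years supplied.
// ASSUMPTION: simple average; real guidelines add further adjustments.
function scheduleCMonthlyIncome(years) {
  if (years.length === 0) throw new RangeError('at least one tax year is required');
  const total = years.reduce(
    (sum, y) => sum + y.net_profit + y.depreciation, 0);
  return total / years.length / 12;
}

// DTI = total monthly debt payments / monthly qualifying income.
function dtiRatio(monthlyDebt, monthlyIncome) {
  if (monthlyIncome <= 0) throw new RangeError('monthly income must be positive');
  return monthlyDebt / monthlyIncome;
}

// Illustrative figures only.
const scheduleC = [
  { net_profit: 66000, depreciation: 6000 },
  { net_profit: 62000, depreciation: 10000 },
];
const monthlyDebt = 2700;

const income = scheduleCMonthlyIncome(scheduleC);      // (72000 + 72000) / 2 / 12 = 6000
const withAddBack = dtiRatio(monthlyDebt, income);     // 2700 / 6000 = 0.45
const withoutAddBack = dtiRatio(monthlyDebt, (66000 + 62000) / 2 / 12);

console.log(withAddBack.toFixed(3));    // "0.450"
console.log(withoutAddBack.toFixed(3)); // "0.506"
```

In this example the add-back moves the same household from just above 50% DTI to 45%, which is why a calculator that skips it can produce a different approval recommendation.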

Frequently asked questions

Are loan limits current?

Yes. The corpus uses current-year FNMA / FHLMC conforming loan limits, FHA loan limits by county, VA loan entitlement, and USDA income eligibility thresholds. Annual refresh updates against the FHFA's annual conforming loan limit announcement.

How are non-W-2 income types represented?

Each non-W-2 income type has the realistic structured-field set: 1099 contractors carry 2 years of 1099 income amounts plus year-to-date 1099 receipts and reconciliation flags against deposit history; self-employed have Schedule C income, business income, and year-to-date P&L fields; K-1 recipients carry 2 years of K-1 distribution amounts plus the underlying entity's pass-through fields; royalty income carries the receipt history plus the licensing-agreement metadata. The Methodology PDF documents each pattern.

Are ITIN mortgage cases handled?

Yes. About 8% of the corpus is ITIN-mortgage applications using alternative documentation. The structured ITIN-filer mortgage data includes the borrower's ITIN, alternative-income fields (foreign tax-return amounts where applicable, employer-letter flags, deposit-history balances), and the realistic LTV / DTI / down-payment requirements typical of ITIN-mortgage programs.

How are modification programs structured?

Distressed-mortgage households have the structured modification-eligibility decision data for the major programs: FNMA flex modification, FHA-HAMP, VA partial claim, and proprietary servicer modifications. The eligibility-flag fields fire correctly based on the household's specific situation (delinquency status, income trajectory, hardship documentation).
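
One way to use these flags is to isolate borrowers who are flagged for exactly one program, the "qualifies for one but not another" cases. The sketch below assumes `modification_eligibility_flags` is an array of strings with the values used in the sample queries. The helper names and the fixture are hypothetical.

```javascript
// Program labels as they appear in the sample queries.
const PROGRAMS = ['FNMA_flex', 'FHA_HAMP', 'VA_partial_claim'];

// Which of the named programs a household is flagged eligible for.
// Missing mortgage or flag data is treated as "no programs".
function eligiblePrograms(h) {
  const flags = h.liabilities?.mortgage?.modification_eligibility_flags ?? [];
  return PROGRAMS.filter(p => flags.includes(p));
}

// Households flagged for exactly one of the named programs.
function singleProgramCases(households) {
  return households
    .map(h => ({ id: h.id, programs: eligiblePrograms(h) }))
    .filter(x => x.programs.length === 1);
}

// Hypothetical fixture for illustration.
const demoHouseholds = [
  { id: 'a', liabilities: { mortgage: { modification_eligibility_flags: ['FNMA_flex'] } } },
  { id: 'b', liabilities: { mortgage: { modification_eligibility_flags: ['FHA_HAMP', 'VA_partial_claim'] } } },
  { id: 'c', liabilities: { mortgage: { modification_eligibility_flags: [] } } },
  { id: 'd', liabilities: {} },
];

console.log(singleProgramCases(demoHouseholds));
// [ { id: 'a', programs: [ 'FNMA_flex' ] } ]
```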

Are forbearance scenarios included?

Yes. About 15% of the corpus is currently in or recently exited forbearance. Forbearance entries include the entry date, the duration, the COVID-related vs. non-COVID-related classification, and the post-forbearance resolution path (deferral, modification, repayment plan, or rolled into delinquency).
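
As a sketch of how the resolution paths might be summarized, the snippet below tallies households by post-forbearance outcome. The field names (`liabilities.mortgage.forbearance`, `resolution`, and the others in the fixture) and the `'active'` label for households still in forbearance are assumptions; the shipped schema may name them differently.

```javascript
// Count households per post-forbearance resolution path.
// ASSUMPTION: a forbearance entry without a resolution is still active.
function tallyResolutions(households) {
  const counts = {};
  for (const h of households) {
    const fb = h.liabilities?.mortgage?.forbearance;
    if (!fb) continue; // household never entered forbearance
    const key = fb.resolution ?? 'active';
    counts[key] = (counts[key] ?? 0) + 1;
  }
  return counts;
}

// Hypothetical fixture; field names and values are illustrative.
const forbearanceDemo = [
  { liabilities: { mortgage: { forbearance: { entry_date: '2020-05-01', duration_months: 6, covid_related: true, resolution: 'deferral' } } } },
  { liabilities: { mortgage: { forbearance: { entry_date: '2020-07-01', duration_months: 12, covid_related: true, resolution: 'deferral' } } } },
  { liabilities: { mortgage: { forbearance: { entry_date: '2023-02-01', duration_months: 3, covid_related: false, resolution: 'modification' } } } },
  { liabilities: { mortgage: { forbearance: { entry_date: '2024-11-01', duration_months: 6, covid_related: false } } } },
  { liabilities: { mortgage: {} } },
];

console.log(tallyResolutions(forbearanceDemo));
// { deferral: 2, modification: 1, active: 1 }
```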

How are gift funds for down payment handled?

About 22% of first-time-buyer applications include gift funds for down payment. The structured gift-funds data includes the donor relationship, the gift letter, and the donor's source-of-funds documentation — the cases that require additional underwriting scrutiny per FNMA / FHA / VA gift-fund rules.

Does the corpus include real estate investor mortgages (DSCR loans)?

About 8% of the corpus is real estate investor borrowers with DSCR (debt-service-coverage-ratio) loans on rental properties. These are non-QM products with property-cash-flow-based qualification rather than borrower-income qualification. The structured DSCR data lets your tools test the property-cash-flow underwriting logic specifically.
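
For readers new to the ratio, one common formulation divides the property's monthly rent by its monthly housing payment (principal and interest, taxes, insurance, and any association dues). Lenders differ on the exact inputs, so treat the following as an illustrative sketch. The field names and figures are hypothetical, not the Data Set's schema.

```javascript
// DSCR = monthly rental income / monthly housing payment.
// ASSUMPTION: rent-over-payment formulation; some lenders use net operating income.
function dscr({ monthly_rent, principal_interest, taxes, insurance, hoa = 0 }) {
  const payment = principal_interest + taxes + insurance + hoa;
  if (payment <= 0) throw new RangeError('monthly payment must be positive');
  return monthly_rent / payment;
}

// Illustrative rental property.
const rental = {
  monthly_rent: 2500,
  principal_interest: 1500,
  taxes: 300,
  insurance: 150,
  hoa: 50,
};

console.log(dscr(rental)); // 2500 / 2000 = 1.25
```

A ratio above 1.0 means the rent covers the payment. Because qualification rests on this property-level figure, the borrower's personal DTI does not enter the calculation.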

How does this fit alongside B04 (Cash Flow Stress Test)?

B04 is broader: household cash flow across all liability types with 96-month longitudinal trajectories. B13 (this Data Set) is mortgage-specific: origination, servicing, and modification logic for the housing-loan product surface. Mortgage tech builders typically buy B13; broader cash-flow planning engines buy B04. Many lenders buy both: B04 for the cash-flow-trajectory side, B13 for the mortgage-product specifics.

$5,000
one-time purchase
90 households (ZIP)
Methodology PDF
JSON, CSV formats
Account required to purchase

Purchases are for internal use only. Redistribution or resale of data is prohibited under the WealthSchema Data License.