Stress-testing a mortgage origination engine — synthetic borrowers, synthetic properties, real failure modes

Picture a borrower with three 1099 income streams across four state filings, buying a manufactured home in a CRA assessment area with a guaranteeing co-borrower. That's not a fringe case; it's a meaningful share of every mortgage lender's intake.

WealthSchema Staff | Lending & risk modeling | May 9, 2026 | 1 min read

A mortgage origination engine has to evaluate the borrower (income, debt, credit, employment), the property (appraisal, type, location, encumbrances), the loan structure (LTV, DTI, ratio compliance, program eligibility), and the regulatory overlay (TRID, RESPA, ECOA, HMDA, state-specific rules) — usually within a 24-72 hour timeline from application to commitment letter. The engine has to be both accurate and fast, with regulator-defensible decisions.

The typical test corpus for a mortgage engine reflects the typical borrower: W-2 income from a single employer, single-state residence and employment, single-family residence in good condition, conventional conforming loan. Real intake is much messier than that. This article is the working note on the messier scenarios — and the synthetic-data shape needed to exercise them.

What the engine has to do

A mortgage engine's decision pipeline:

  1. Stage 1: Application intake and pre-qualification
    Receive the 1003 (Uniform Residential Loan Application). Validate completeness. Compute initial DTI, LTV, and program eligibility.
  2. Stage 2: Documentation collection and verification
    Pay stubs, W-2s, bank statements, tax returns. Employment verification. Asset verification. Each document type has its own parsing and reconciliation logic.
  3. Stage 3: Property evaluation
    Appraisal ordering, review, and reconciliation. Title search. Property type classification (SFR / condo / manufactured / multi-family / mixed-use).
  4. Stage 4: Underwriting decision
    Compute final ratios, evaluate program eligibility (conventional / FHA / VA / USDA / non-QM), apply guideline overlays, and produce an approve / counter-offer / decline decision.
  5. Stage 5: Disclosure and closing
    TRID-compliant disclosures (Loan Estimate, Closing Disclosure). Tolerance compliance. Final closing instructions.
  6. Stage 6: Funding and post-closing
    Funding instructions to title/escrow. Post-closing audit, HMDA reporting, secondary market sale.

Each stage is its own engine. A complete test corpus has to exercise all of them.
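As a minimal sketch of the ratio arithmetic that Stage 1 describes, the following computes an initial back-end DTI and LTV. The `Application` fields are hypothetical simplifications, not actual 1003 field names, and real programs define qualifying income and property value more carefully than this.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Application:
    """Hypothetical, simplified intake record (field names are illustrative)."""
    monthly_gross_income: float
    monthly_debt_payments: float      # existing obligations: auto, student, cards
    proposed_housing_payment: float   # principal, interest, taxes, insurance
    loan_amount: float
    purchase_price: float
    appraised_value: Optional[float] = None  # usually unknown at intake


def initial_dti(app: Application) -> float:
    """Back-end DTI: all monthly obligations divided by gross monthly income."""
    if app.monthly_gross_income <= 0:
        raise ValueError("gross income must be positive to compute DTI")
    obligations = app.monthly_debt_payments + app.proposed_housing_payment
    return obligations / app.monthly_gross_income


def initial_ltv(app: Application) -> float:
    """LTV against the lesser of purchase price and appraised value, when known."""
    value = app.purchase_price
    if app.appraised_value is not None:
        value = min(value, app.appraised_value)
    if value <= 0:
        raise ValueError("property value must be positive to compute LTV")
    return app.loan_amount / value


app = Application(
    monthly_gross_income=8000.0,
    monthly_debt_payments=600.0,
    proposed_housing_payment=2200.0,
    loan_amount=320000.0,
    purchase_price=400000.0,
)
print(f"DTI {initial_dti(app):.1%}, LTV {initial_ltv(app):.1%}")
```

A test corpus would want applications on both sides of each program threshold, plus records with zero or missing income to confirm the engine fails loudly rather than dividing by zero.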

The borrower edge cases

The borrower scenarios that produce production failures:

  • Self-employed / 1099 with two-year averaging — income is highly variable. The engine has to qualify on the lower of the most recent year's income and the two-year average, so a recent decline reduces calculated income instead of being averaged away.
  • Gig-platform income (Uber, DoorDash, Upwork) — 1099-NEC with no employer letter possible. Verification flow differs from W-2.
  • Multiple income streams — W-2 plus 1099 plus K-1 plus rental. Each income type has its own qualifying-income calculation. Engines that sum naively over-state qualifying income.
  • Bonus / commission income — qualifying income requires 2-year history minimum. Engines that include current-year bonus without history are non-compliant with most agency guidelines.
  • ITIN-only filers — apply with Individual Taxpayer Identification Number rather than SSN. Engines that assume SSN-shaped IDs reject these. ITIN-only mortgage products exist but require specific underwriting paths.
  • Asset-only qualification — borrowers with substantial assets but limited income (retirees, between jobs). Asset-depletion calculations are program-specific.
  • Foreign nationals — non-resident aliens, foreign-employer income, foreign-currency holdings. Specialized programs exist; verification is non-trivial.
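  • Recently-bankrupt or foreclosed borrowers — waiting periods are program-specific. A Chapter 7 discharged 4+ years ago is generally eligible for conventional financing; a foreclosure typically requires 7 years for conventional loans (3 with documented extenuating circumstances) and 3 years for FHA. Engines that auto-decline on any past credit event reject borrowers the guidelines treat as eligible, which also creates fair-lending risk.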
  • First-time homebuyers in CRA areas — eligible for special programs (down-payment assistance, lower rates). Engine has to identify CRA applicability and route to the appropriate path.
  • Cosigner / non-occupant co-borrower structures — common in student-loan-pressed first-time buyer scenarios. Income aggregation rules vary by program.
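The income rules in the list above can be sketched as small per-stream functions, which is what keeps an engine from summing streams naively. This is an illustration only: the function names are hypothetical, and actual agency guidelines add conditions (trend analysis, likelihood of continuance, documentation) that are not modeled here.

```python
def self_employed_monthly_income(prior_year: float, recent_year: float) -> float:
    """Lower of the most recent year and the two-year average, per month.

    When income declined, the recent year is the lower figure, so the
    decline is not averaged away.
    """
    two_year_average = (prior_year + recent_year) / 2
    return min(recent_year, two_year_average) / 12


def bonus_monthly_income(annual_bonuses: list[float]) -> float:
    """Count bonus income only with at least two years of history.

    Assumption for this sketch: qualifying bonus is the average of the
    two most recent years; with less history it contributes nothing.
    """
    if len(annual_bonuses) < 2:
        return 0.0
    return sum(annual_bonuses[-2:]) / 2 / 12


def total_qualifying_income(
    w2_annual_salary: float,
    self_employed_years: tuple[float, float],
    annual_bonuses: list[float],
) -> float:
    """Apply each stream's own rule before summing."""
    prior_year, recent_year = self_employed_years
    return (
        w2_annual_salary / 12
        + self_employed_monthly_income(prior_year, recent_year)
        + bonus_monthly_income(annual_bonuses)
    )


# Declining 1099 income and a first-year bonus: neither is taken at face value.
monthly = total_qualifying_income(
    w2_annual_salary=60000.0,
    self_employed_years=(120000.0, 90000.0),
    annual_bonuses=[15000.0],
)
print(monthly)  # 12500.0
```

Synthetic borrowers built for this path should include rising, flat, and declining two-year histories, and bonus histories of zero, one, and two or more years.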

The property edge cases

Property scenarios that produce production failures:

  • Manufactured / modular homes — different appraisal requirements, different program eligibility (some programs exclude these).
  • Mixed-use property — owner-occupied with a commercial component (the typical bodega-with-apartment setup). Loan-to-value calculations differ, and some programs exclude these properties entirely.
  • Condo with non-warrantable status — if HOA financials, owner-occupancy ratios, or insurance don't meet agency standards, the property may not be saleable to Fannie/Freddie. Engine has to identify before commitment.
  • Properties with deed restrictions — affordable-housing covenants, age-restricted communities, ground leases. Each has program implications.
  • Properties in flood zones — flood insurance required; affordability impact. Engine has to compute total housing payment including flood insurance.
  • Non-arm's-length transactions — purchase from family, employer-mediated transactions. Different documentation requirements.
  • Rural USDA-eligible properties — the property has to pass the eligibility map check, and the borrower has to fall within the program's income limits. The engine has to test both.
  • Investment properties — different LTV / reserves requirements. Rental income contribution to qualification varies by program.
  • Multi-unit (2-4 unit) owner-occupied — eligible for primary-residence programs but requires multi-unit-specific calculations.
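To illustrate the flood-zone item above, here is a minimal sketch of a total monthly housing payment that includes flood insurance, using the standard fixed-rate amortization formula. The function and parameter names are hypothetical, and mortgage insurance is left out for brevity.

```python
def monthly_principal_and_interest(
    loan_amount: float, annual_rate: float, years: int = 30
) -> float:
    """Level payment on a fully amortizing fixed-rate loan."""
    n = years * 12
    r = annual_rate / 12
    if r == 0:
        return loan_amount / n
    return loan_amount * r / (1 - (1 + r) ** -n)


def total_housing_payment(
    loan_amount: float,
    annual_rate: float,
    annual_taxes: float,
    annual_hazard_insurance: float,
    annual_flood_insurance: float = 0.0,
    monthly_hoa: float = 0.0,
    years: int = 30,
) -> float:
    """Principal and interest plus escrowed items, including flood insurance."""
    escrow = (annual_taxes + annual_hazard_insurance + annual_flood_insurance) / 12
    return monthly_principal_and_interest(loan_amount, annual_rate, years) + escrow + monthly_hoa


without_flood = total_housing_payment(300000.0, 0.06, 4800.0, 1500.0)
with_flood = total_housing_payment(300000.0, 0.06, 4800.0, 1500.0, annual_flood_insurance=1800.0)
print(f"{without_flood:.2f} -> {with_flood:.2f}")
```

A useful synthetic pair is the same borrower and loan with and without a flood-zone flag, chosen so that the extra premium pushes the housing ratio across a program limit.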

The compliance overlay

Beyond underwriting accuracy, the engine has to comply with a thicket of regulations:

Regulation | What it requires | Common engine failure
TRID (Reg Z + RESPA combined disclosure) | Loan Estimate within 3 business days of application; Closing Disclosure at least 3 business days before consummation; tolerance compliance on cost categories | Tolerance violations in re-disclosed Loan Estimates
ECOA / Reg B | No discrimination on a prohibited basis; adverse-action notices with principal reasons; collection of the demographic monitoring information that feeds HMDA reporting | Disparate-impact patterns across geographic / ethnic dimensions
TILA / QM rules | General QM status set by a price-based test (APR vs. APOR) since the 2021 rule revision replaced the 43% DTI cap; safe harbor or rebuttable presumption depending on the rate spread; ATR (ability-to-repay) requirements | Non-QM loans extended without proper ATR documentation
RESPA Section 8 | No kickbacks for referrals; affiliated business arrangement disclosures | Marketing arrangements that drift into kickback territory
HMDA reporting | Per-application reporting of demographic, loan-amount, and decision data; LAR (loan/application register) submission | LAR data quality issues that produce HMDA fair-lending audit findings
State licensing and rate caps | State-by-state lending license; usury caps; consumer protection statutes (NY DFS, California DFPI, etc.) | Rate quotes above state caps; product offerings unauthorized in particular states

A test corpus has to include scenarios that exercise each regulatory path. An engine that underwrites correctly but violates TRID ships disclosures that draw CFPB findings.
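The TRID tolerance failure named in the table can be sketched as a comparison between Loan Estimate amounts and closing amounts. The bucket labels, the `Fee` class, and the function name are hypothetical. The sketch assumes each fee has already been assigned to a zero-tolerance, 10% cumulative, or unlimited category, and it ignores changed-circumstance re-disclosures that reset the baseline.

```python
from dataclasses import dataclass

HALF_CENT = 0.005  # absorbs floating-point noise when comparing dollar amounts


@dataclass
class Fee:
    name: str
    bucket: str       # "zero", "ten_percent", or "unlimited" (illustrative labels)
    estimated: float  # baseline amount from the Loan Estimate
    actual: float     # amount charged at closing


def tolerance_violations(fees: list[Fee]) -> list[str]:
    """Return a description of each tolerance breach, or an empty list."""
    violations = []

    # Zero-tolerance fees may not increase at all.
    for fee in fees:
        if fee.bucket == "zero" and fee.actual - fee.estimated > HALF_CENT:
            increase = fee.actual - fee.estimated
            violations.append(f"{fee.name}: up {increase:.2f} (zero tolerance)")

    # Ten-percent fees are tested in aggregate, not fee by fee.
    ten = [f for f in fees if f.bucket == "ten_percent"]
    estimated_total = sum(f.estimated for f in ten)
    actual_total = sum(f.actual for f in ten)
    if actual_total - estimated_total * 1.10 > HALF_CENT:
        violations.append(
            f"10% bucket: {actual_total:.2f} exceeds limit {estimated_total * 1.10:.2f}"
        )

    return violations


fees = [
    Fee("origination charge", "zero", 1500.0, 1500.0),
    Fee("recording fees", "ten_percent", 200.0, 260.0),
    Fee("title search", "ten_percent", 800.0, 820.0),
    Fee("prepaid interest", "unlimited", 300.0, 500.0),
]
print(tolerance_violations(fees))  # []
```

Note that the recording fee alone rose 30% without a breach, because the 10% bucket is cumulative. Adversarial corpus cases should sit right at the aggregate limit and one cent over it.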

What a working test corpus looks like

A 2,000-loan stress-test corpus, distributed roughly:

  • 60% nominal applicants spanning the realistic credit, income, and property distribution
  • 25% borrower edge cases covering the inventory above
  • 10% property edge cases covering the inventory above
  • 5% adversarial / compliance test cases specifically engineered to exercise validation gates

Each loan in the corpus has full documentation: 1003, pay stubs, W-2s, tax returns, bank statements, employment verification, appraisal, title commitment, property data. The synthetic data has to be document-grade — engine code parses documents, not abstract records, and document-level synthesis is the surface area where most bugs ship.
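The distribution above translates into per-category loan counts. The sketch below uses integer percentages and a largest-remainder rule so that the counts always sum to the corpus size, even when it does not divide evenly. The category keys and the function name are hypothetical.

```python
CORPUS_MIX_PCT = {
    "nominal": 60,
    "borrower_edge": 25,
    "property_edge": 10,
    "adversarial_compliance": 5,
}


def allocate_corpus(total: int, mix_pct: dict[str, int]) -> dict[str, int]:
    """Split `total` loans across categories by integer percentage."""
    if sum(mix_pct.values()) != 100:
        raise ValueError("percentages must sum to 100")

    # Integer arithmetic avoids floating-point rounding surprises.
    counts = {name: total * pct // 100 for name, pct in mix_pct.items()}
    remainders = {name: total * pct % 100 for name, pct in mix_pct.items()}

    # Hand out any leftover loans to the categories with the largest remainders.
    leftover = total - sum(counts.values())
    for name in sorted(remainders, key=remainders.get, reverse=True)[:leftover]:
        counts[name] += 1
    return counts


print(allocate_corpus(2000, CORPUS_MIX_PCT))
# {'nominal': 1200, 'borrower_edge': 500, 'property_edge': 200, 'adversarial_compliance': 100}
```

Within each category, a generator would still need to spread cases across the individual edge-case types so that no single scenario dominates its bucket.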

Key takeaways

  • Mortgage origination has 6 stages (intake, documentation collection, property evaluation, underwriting, disclosure and closing, funding and post-closing), each with its own engine. A complete stress test exercises all 6.
  • Borrower edge cases include self-employed two-year-averaging, multiple income streams, ITIN filers, asset-only qualification, foreign nationals, recently-bankrupt, and first-time CRA buyers. Each is 1-10% of real intake.
  • Property edge cases include manufactured homes, mixed-use, non-warrantable condos, deed-restricted, flood-zone, non-arm's-length, USDA-eligible, investment, and multi-unit. Each requires distinct underwriting paths.
  • The compliance overlay is real: TRID, ECOA, TILA QM, RESPA, HMDA, state licensing. Engines that underwrite correctly but fail compliance ship disclosures that draw CFPB findings.
  • Test corpus has to be document-grade (1003 + pay stubs + W-2s + tax returns + bank statements + appraisal) and include 5% specifically adversarial / compliance test cases to exercise validation gates.

Frequently asked questions

How does the test corpus differ for non-QM lenders vs. agency lenders?
Substantially. Non-QM lenders see more self-employed and asset-only borrowers, more investment property scenarios, more credit-event-after-bankruptcy cases. The non-QM corpus should over-represent these (40%+ of cases vs 5-10% in agency-focused). Agency lenders see more first-time homebuyers, more conforming-loan-amount cases, more CRA-area scenarios. Both corpora need the regulatory-overlay scenarios; they differ in the borrower / property mix.
What about the appraisal portion — can synthetic data realistically test appraisal review?
Partially. Synthetic appraisals can test reconciliation logic (does the engine notice when the appraisal value is below the contract price), bracketing logic (are comparable sales within reasonable range), and adjustment logic. They don't test the visual review of property photos or the field-judgment portion of underwriter review. Hybrid testing is the right call: synthetic for the structured portion, real (anonymized) for the visual portion if the engine extends there.
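The structured checks mentioned in this answer can be sketched as follows. The function names are hypothetical, and the bracketing rule here (appraised value falls between the lowest and highest adjusted comparable sale price) is a simplifying assumption; real appraisal review also brackets on features such as size and age.

```python
def appraisal_gap(contract_price: float, appraised_value: float) -> float:
    """Dollar shortfall when the appraisal comes in below the contract price."""
    return max(0.0, contract_price - appraised_value)


def value_for_ltv(contract_price: float, appraised_value: float) -> float:
    """Purchase LTV is measured against the lesser of the two figures."""
    return min(contract_price, appraised_value)


def is_bracketed(appraised_value: float, adjusted_comp_prices: list[float]) -> bool:
    """True when the appraised value lies within the range of adjusted comps."""
    if not adjusted_comp_prices:
        raise ValueError("at least one comparable sale is required")
    return min(adjusted_comp_prices) <= appraised_value <= max(adjusted_comp_prices)


comps = [370000.0, 390000.0, 402000.0]
print(appraisal_gap(400000.0, 385000.0))  # 15000.0
print(value_for_ltv(400000.0, 385000.0))  # 385000.0
print(is_bracketed(385000.0, comps))      # True
print(is_bracketed(420000.0, comps))      # False
```

Synthetic appraisals for this path should include values below, at, and above the contract price, and comp sets that do and do not bracket the subject.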
How does this scope change for a wholesale / correspondent lender vs a retail lender?
The retail vs wholesale distinction affects the intake stage and the documentation flow but not the underlying underwriting and compliance engines. A wholesale lender's test corpus needs to include third-party-originator (TPO) documentation patterns and broker-relationship compliance rules. The underwriting engine portions transfer cleanly between channels.
What about second-lien / HELOC products — same engine?
Different program rules but the same architectural approach. HELOCs have specific draw-period mechanics, variable-rate disclosures (Reg Z, 12 CFR 1026.6 and 1026.40, formerly numbered under Part 226), and lien-position considerations. Test corpora for HELOC engines need scenarios with varied draw patterns, rate-reset events, and combined LTV calculations that include the first lien.
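As a minimal sketch of the combined LTV arithmetic, the following compares a drawn-balance CLTV with a full-credit-line variant. The function names are hypothetical, and which figure a given program caps is an assumption to confirm against that program's guidelines.

```python
def combined_ltv(
    first_lien_balance: float, heloc_drawn_balance: float, property_value: float
) -> float:
    """CLTV using only the amount currently drawn on the HELOC."""
    if property_value <= 0:
        raise ValueError("property value must be positive")
    return (first_lien_balance + heloc_drawn_balance) / property_value


def full_line_cltv(
    first_lien_balance: float, heloc_credit_limit: float, property_value: float
) -> float:
    """CLTV assuming the borrower draws the entire credit line."""
    if property_value <= 0:
        raise ValueError("property value must be positive")
    return (first_lien_balance + heloc_credit_limit) / property_value


# Same borrower, same line: 20,000 drawn out of a 60,000 limit.
print(f"{combined_ltv(240000.0, 20000.0, 400000.0):.0%}")    # 65%
print(f"{full_line_cltv(240000.0, 60000.0, 400000.0):.0%}")  # 75%
```

Draw-pattern scenarios in the corpus should move the drawn balance across the period while the credit limit stays fixed, so both figures get exercised.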