The single largest hidden surface in any wealth-tech platform is the integration layer. The product UI shows positions, balances, and performance. The integration layer is what populates them — pulling from one or more aggregators (Plaid, Yodlee, Akoya, MX), one or more direct custodian feeds (Schwab, Fidelity, Pershing, BNY Mellon), or both, then reconciling the streams into a single household record. Every line of integration code is a place where realistic test data can find a bug that mock data hides.
This theme covers what good integration test data looks like: the aggregator-output shapes, the custodian-direct quirks, the ACATS transfer flow, and the reconciliation contract that the two streams have to satisfy when both feed the same account.
Why integration is its own data problem
A naive view: an aggregator returns positions and balances; the platform stores them; that's the data layer. The view falls apart as soon as a second source enters the picture. Aggregators normalize across institutions, but the normalization is lossy: Plaid's investments product has a different schema from Yodlee's wealth product, both lose information present in the underlying custodian feed, and neither captures the lot-level data needed for any tax-aware feature. Direct custodian feeds (FIX, FpML, OFX, NACHA, custodian-specific REST/SFTP) are richer but heterogeneous; reconciling Pershing's daily file with Schwab's intraday API requires hand-written translation logic per source.
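As a rough illustration of that per-source translation, here is a minimal sketch that maps two payload shapes into one internal position record. Everything in it is an assumption made for illustration: the field names (`ticker`, `qty`, `tax_lots`, and so on) are hypothetical and are not the actual Plaid, Yodlee, or custodian schemas. The point is the lossy side of normalization: the aggregator-shaped payload produces a position with no lots, while the custodian-shaped one carries them.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class Lot:
    acquired: date
    quantity: Decimal
    cost_basis: Decimal  # total basis for the lot


@dataclass(frozen=True)
class Position:
    source: str
    symbol: str
    quantity: Decimal
    lots: Optional[tuple] = None  # None means the source supplied no lot detail


def from_aggregator(payload: dict) -> Position:
    # Hypothetical aggregator shape: one row per holding, no lots.
    return Position(
        source="aggregator",
        symbol=payload["ticker"],
        quantity=Decimal(str(payload["qty"])),
    )


def from_custodian(payload: dict) -> Position:
    # Hypothetical custodian shape: a holding plus its tax lots.
    lots = tuple(
        Lot(
            acquired=date.fromisoformat(row["acquired"]),
            quantity=Decimal(row["quantity"]),
            cost_basis=Decimal(row["cost_basis"]),
        )
        for row in payload["tax_lots"]
    )
    return Position(
        source="custodian",
        symbol=payload["symbol"],
        quantity=sum((lot.quantity for lot in lots), Decimal("0")),
        lots=lots,
    )


agg = from_aggregator({"ticker": "VTI", "qty": 12.5})
cust = from_custodian({
    "symbol": "VTI",
    "tax_lots": [
        {"acquired": "2021-03-01", "quantity": "10", "cost_basis": "2000.00"},
        {"acquired": "2023-06-15", "quantity": "2.5", "cost_basis": "550.00"},
    ],
})
```

Keeping `lots` as `None` rather than an empty tuple lets downstream code distinguish "this source cannot tell us" from "this position has no lots", which matters for any tax-aware feature.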
The integration layer is where the platform decides how much of this complexity to surface to its own engine. Most platforms decide too late, then have to refactor when the first major customer asks for cross-custodian tax-loss harvesting or the first audit reveals that aggregator-fed positions don't reconcile with custodian-direct feeds for the same account.
Realistic synthetic test data is how you find this before production. The four pieces below cover the major dimensions.
| Failure class | What breaks | Where it matters |
|---|---|---|
| Aggregator-output drift | Schema differences across Plaid/Yodlee/Akoya/MX surface as silent type mismatches; account-type taxonomy is not portable | Onboarding flows, position aggregation, multi-aggregator deployments |
| ACATS edge cases | Partial transfers, in-kind lots, fractional shares, transfer-during-corporate-action all produce inconsistent state during the 5–7 business day settlement window | New-account funding, account migration, broker-to-broker transfers |
| Custodian-specific quirks | Account-number formats, statement frequency, lot-relief defaults, intra-month vs end-of-month snapshots all differ; mock data treats them as uniform | Multi-custodian platforms, RIA aggregation, family-office consolidation |
| Reconciliation breaks | Aggregator and custodian disagree on the same account; duplicate-account detection fails; the institution_id taxonomy fragments | Any platform that uses both aggregator and direct feeds, which is most institutional ones |
The four pieces under this theme
Modeling aggregator outputs
Modeling Plaid, Yodlee, Akoya, and MX outputs in synthetic households is the schema-level walkthrough of the four major aggregators' output shapes, what they share, where they diverge, and what your mock data has to look like to exercise an aggregator-driven onboarding flow honestly. Includes the FDX-conformance question and what changes when you migrate from a non-FDX aggregator to an FDX one.
ACATS modeling
ACATS modeling: partial transfers, in-kind lots, settlement-window traps is the deep dive on the Automated Customer Account Transfer Service flow. Full vs. partial transfers, how in-kind lots preserve their cost basis and acquisition date through the transfer, what happens when a corporate action lands during the settlement window, and the rejection codes your engine has to handle.
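To make the in-kind lot behavior concrete, here is a minimal sketch of a synthetic partial transfer that splits lots while preserving each lot's acquisition date and per-share basis. The `Lot` record and `transfer_in_kind` helper are hypothetical names invented for this example, and consuming lots oldest-first is an assumption of the sketch, not an ACATS rule; real transfers follow whatever the delivering firm and the transfer instruction specify.

```python
from dataclasses import dataclass, replace
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class Lot:
    symbol: str
    acquired: date
    quantity: Decimal
    cost_basis: Decimal  # total basis for the lot


def transfer_in_kind(lots, symbol, quantity):
    """Split lots for a partial in-kind transfer of `quantity` shares.

    Returns (transferred, remaining). Acquisition dates are carried over
    unchanged; basis is prorated by the share count moved.
    """
    transferred, remaining = [], []
    left = quantity
    # Assumption: oldest lots move first.
    for lot in sorted(lots, key=lambda l: l.acquired):
        if lot.symbol != symbol or left == 0:
            remaining.append(lot)
            continue
        moved = min(lot.quantity, left)
        basis_moved = (lot.cost_basis * moved / lot.quantity).quantize(Decimal("0.01"))
        transferred.append(replace(lot, quantity=moved, cost_basis=basis_moved))
        if moved < lot.quantity:
            # The stub left behind keeps the original acquisition date.
            remaining.append(replace(
                lot,
                quantity=lot.quantity - moved,
                cost_basis=lot.cost_basis - basis_moved,
            ))
        left -= moved
    if left > 0:
        raise ValueError(f"insufficient {symbol} shares: short by {left}")
    return transferred, remaining


lots = [
    Lot("AAPL", date(2020, 1, 15), Decimal("10"), Decimal("1000.00")),
    Lot("AAPL", date(2022, 7, 1), Decimal("5"), Decimal("800.00")),
]
moved, kept = transfer_in_kind(lots, "AAPL", Decimal("12.5"))
```

A useful fixture invariant falls out of this: total quantity and total basis across `moved` and `kept` should equal the pre-transfer totals, and no acquisition date should change.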
Custodian quirks
Custodian-specific data quirks: Schwab, Fidelity, Pershing, BNY Mellon covers the four custodians most wealth-tech platforms have to integrate with directly. Account-number formats, lot-relief defaults, statement-cycle gotchas, the post-Schwab/TDA reconciliation problem, and what realistic test data has to model per custodian.
Reconciliation
Reconciling aggregator output with custodian source-of-truth is the cross-stream piece. When both an aggregator and a direct custodian feed populate the same account, the two will disagree on edge cases — fractional-share rounding, intraday vs. end-of-day balances, distribution-character classification. The article covers the reconciliation contract and the test-data requirements for exercising it.
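One way to picture the quantity side of such a contract is a tolerance-based comparison. The sketch below is illustrative only: the `reconcile` function, the break labels, and the tolerance value are all hypothetical, chosen on the assumption that the aggregator rounds fractional shares to three decimal places while the custodian reports more precision.

```python
from decimal import Decimal

# Hypothetical tolerance: half a unit in the third decimal place.
QTY_TOLERANCE = Decimal("0.0005")


def reconcile(aggregator_positions, custodian_positions, tolerance=QTY_TOLERANCE):
    """Compare {symbol: quantity} maps and return a list of breaks.

    The custodian feed is treated as the source of truth, so a break is
    anything the aggregator reports that differs beyond the tolerance.
    """
    breaks = []
    for symbol in sorted(set(aggregator_positions) | set(custodian_positions)):
        agg = aggregator_positions.get(symbol)
        cust = custodian_positions.get(symbol)
        if agg is None:
            breaks.append((symbol, "missing_in_aggregator"))
        elif cust is None:
            breaks.append((symbol, "missing_in_custodian"))
        elif abs(agg - cust) > tolerance:
            breaks.append((symbol, "quantity_mismatch"))
    return breaks


aggregator = {"VTI": Decimal("12.346"), "BND": Decimal("40"), "AAPL": Decimal("3")}
custodian = {"VTI": Decimal("12.34567"), "BND": Decimal("41"), "MSFT": Decimal("1")}
breaks = reconcile(aggregator, custodian)
```

Synthetic test data for this path needs pairs that land just inside and just outside the tolerance, plus positions present in only one stream; uniformly matching fixtures never exercise the break branches.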
The methodology comparison
Aggregator API vs. direct custodian feed is the procurement-side comparison: when each approach belongs in a wealth-tech stack, where the cost-and-fidelity tradeoff lands for different use cases, and the hybrid pattern most institutional platforms converge on.
Supporting glossary terms
- ACATS — the DTCC-operated transfer service that moves brokerage accounts between firms in 5–7 business days.
- DTC — the Depository Trust Company, the central securities depository that holds the actual share certificates and processes corporate actions on aggregate.
- ACH — the Automated Clearing House network that moves cash between bank accounts and is the backbone of most non-wire money movement.
- FDX — the Financial Data Exchange, the open standard that aggregators and institutions use to exchange consumer financial data via tokenized access.
- Aggregator API — the API category covering Plaid, Yodlee, Akoya, MX, and others — third-party services that normalize data across thousands of financial institutions.
- Tokenized account access — the OAuth-style authorization model that's replacing screen-scraping for aggregator-to-institution data flow.
Where this connects
Integration testing intersects with several other content threads:
- Time-Series Fidelity in Synthetic Wealth Data — the time-series properties that aggregator and custodian feeds carry differently.
- Onboarding fintech clients with no PII exposure — the playbook for using synthetic data in pre-production integration testing.
- Migrate prod to synthetic — the migration pattern most teams adopt when their integration stack outgrows production-data testing.
- Lot-level basis tracking data model — the lot-level fidelity that aggregator outputs typically lack and direct custodian feeds carry.
- Modeling corporate actions in synthetic portfolios — the corporate-action handling that interacts with ACATS settlement windows.