Aggregator & Custodian Integration — Test Data That Survives the Round Trip

Every wealth-tech platform has an integration layer. Most ship with mock data that quietly assumes the integration always works. Production breaks on the round-trip.

Updated May 9, 2026 · 4 min read

The single largest hidden surface in any wealth-tech platform is the integration layer. The product UI shows positions, balances, and performance. The integration layer is what populates them — pulling from one or more aggregators (Plaid, Yodlee, Akoya, MX), one or more direct custodian feeds (Schwab, Fidelity, Pershing, BNY Mellon), or both, then reconciling the streams into a single household record. Every line of integration code is a place where realistic test data can find a bug that mock data hides.

This theme covers what good integration test data looks like: the aggregator-output shapes, the custodian-direct quirks, the ACATS transfer flow, and the reconciliation contract that the two streams have to satisfy when both feed the same account.

  • US households with an aggregator link: 60%+. Plaid claims ~12,000 institutions and 1 in 4 US adults; Yodlee/Akoya/MX cover the long tail.
  • ACATS transfers per year: ~9.5M. DTCC reported volume; partial transfers are the majority and the harder test case.
  • Reconciliation breaks per 1,000 accounts: 5–20. Typical aggregator-vs-custodian discrepancy rate on actively traded brokerage accounts.
  • FDX-conformant institutions: ~95% by account count, but the spec has optional fields and conformance levels that aren't always met.

Why integration is its own data problem

A naive view: an aggregator returns positions and balances; the platform stores them; that's the data layer. That view falls apart as soon as a second source enters the picture. Aggregators normalize across institutions, but the normalization is lossy: Plaid's investments product has a different schema from Yodlee's wealth product, both lose information present in the underlying custodian feed, and neither captures the lot-level data needed for any tax-aware feature. Direct custodian feeds (FIX, FpML, OFX, NACHA files, custodian-specific REST/SFTP) are richer but heterogeneous; reconciling Pershing's daily file with Schwab's intraday API requires hand-written translation logic per source.
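To make the "lossy normalization" point concrete, here is a minimal sketch of what happens when lot-level custodian data is collapsed into a position-level view. The `TaxLot` and `AggregatedPosition` shapes and the `collapse_lots` helper are hypothetical illustrations, not any vendor's actual schema.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class TaxLot:
    """One lot as a custodian-direct feed might report it (hypothetical shape)."""
    symbol: str
    quantity: Decimal
    cost_basis: Decimal  # total cost for the lot, not per share
    acquired: date


@dataclass(frozen=True)
class AggregatedPosition:
    """Position-level view, as a normalized payload might expose it (hypothetical)."""
    symbol: str
    quantity: Decimal
    cost_basis: Decimal


def collapse_lots(lots: list[TaxLot]) -> list[AggregatedPosition]:
    """Sum lots per symbol; acquisition dates and per-lot basis are dropped."""
    totals: dict[str, tuple[Decimal, Decimal]] = {}
    for lot in lots:
        qty, basis = totals.get(lot.symbol, (Decimal("0"), Decimal("0")))
        totals[lot.symbol] = (qty + lot.quantity, basis + lot.cost_basis)
    return [AggregatedPosition(s, q, b) for s, (q, b) in sorted(totals.items())]


lots = [
    TaxLot("VTI", Decimal("10"), Decimal("1800.00"), date(2019, 3, 1)),
    TaxLot("VTI", Decimal("5"), Decimal("1250.00"), date(2024, 11, 15)),
]
positions = collapse_lots(lots)
print(positions)
```

The two lots have different holding periods, but the collapsed position carries only a total quantity and total basis. Test data that starts from the position-level shape can never exercise a tax-aware code path, which is why the synthetic household should be generated at lot level and collapsed on the way out.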

The integration layer is where the platform decides how much of this complexity to surface to its own engine. Most platforms decide too late, then have to refactor when the first major customer asks for cross-custodian tax-loss harvesting or the first audit reveals that aggregator-fed positions don't reconcile with custodian-direct feeds for the same account.

Realistic synthetic test data is how you find this before production. The four pieces below cover the major dimensions.

| Failure class | What breaks | Where it matters |
| --- | --- | --- |
| Aggregator-output drift | Schema differences across Plaid/Yodlee/Akoya/MX surface as silent type mismatches; account-type taxonomy is not portable | Onboarding flows, position aggregation, multi-aggregator deployments |
| ACATS edge cases | Partial transfers, in-kind lots, fractional shares, transfer-during-corporate-action all produce inconsistent state during the 5–7 business day settlement window | New-account funding, account migration, broker-to-broker transfers |
| Custodian-specific quirks | Account-number formats, statement frequency, lot-relief defaults, intra-month vs end-of-month snapshots all differ; mock data treats them as uniform | Multi-custodian platforms, RIA aggregation, family-office consolidation |
| Reconciliation breaks | Aggregator and custodian disagree on the same account; duplicate-account detection fails; the institution_id taxonomy fragments | Any platform that uses both aggregator and direct feeds, which is most institutional ones |

The four pieces under this theme

Modeling aggregator outputs

"Modeling Plaid, Yodlee, Akoya, and MX outputs in synthetic households" is the schema-level walkthrough of the four major aggregators' output shapes: what they share, where they diverge, and what your mock data has to look like to exercise an aggregator-driven onboarding flow honestly. It also covers the FDX-conformance question and what changes when you migrate from a non-FDX aggregator to an FDX one.
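As a rough illustration of the silent type mismatches and non-portable account-type taxonomy this piece deals with, the sketch below normalizes holdings from two made-up sources. The source names `agg_a` and `agg_b`, the field names, and every account-type label are assumptions for illustration; real payloads must be taken from each vendor's documentation.

```python
from decimal import Decimal

# Hypothetical account-type labels for two hypothetical aggregators.
ACCOUNT_TYPE_MAP = {
    ("agg_a", "brokerage"): "TAXABLE_BROKERAGE",
    ("agg_a", "ira"): "TRADITIONAL_IRA",
    ("agg_b", "INVESTMENT_TAXABLE"): "TAXABLE_BROKERAGE",
    ("agg_b", "RETIREMENT_IRA"): "TRADITIONAL_IRA",
}


def normalize_holding(source: str, raw: dict) -> dict:
    """Coerce one raw holding into a canonical shape, failing loudly on drift."""
    key = (source, raw["account_type"])
    if key not in ACCOUNT_TYPE_MAP:
        # An unmapped label is an error, never a silent default.
        raise ValueError(f"unmapped account type {key!r}")
    # Go through str() so a float such as 12.5 and a string such as "12.500"
    # both become exact Decimals rather than binary-float expansions.
    quantity = Decimal(str(raw["quantity"]))
    return {
        "symbol": raw["symbol"],
        "quantity": quantity,
        "account_type": ACCOUNT_TYPE_MAP[key],
    }


# Same holding, reported as a float by one source and a string by the other.
a = normalize_holding(
    "agg_a", {"symbol": "VTI", "quantity": 12.5, "account_type": "brokerage"}
)
b = normalize_holding(
    "agg_b", {"symbol": "VTI", "quantity": "12.500", "account_type": "INVESTMENT_TAXABLE"}
)
assert a == b
```

Mock data that uses one type for `quantity` and one account-type vocabulary everywhere would pass through this function without ever touching the coercion or the error branch, so a useful synthetic household includes both variations plus at least one label that is deliberately unmapped.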

ACATS modeling

"ACATS modeling: partial transfers, in-kind lots, settlement-window traps" is the deep dive on the Automated Customer Account Transfer Service flow. It covers full vs. partial transfers, how in-kind lots preserve their cost basis and acquisition date through the transfer, what happens when a corporate action lands during the settlement window, and the rejection codes your engine has to handle.
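The in-kind invariant can be sketched as a small function that splits lots for a partial transfer while keeping each lot's acquisition date and a pro-rata share of its cost basis. The `Lot` shape, the `partial_transfer` helper, and the oldest-lots-first ordering are all assumptions of this sketch, not ACATS rules.

```python
from dataclasses import dataclass, replace
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class Lot:
    lot_id: str
    symbol: str
    quantity: Decimal
    cost_basis: Decimal  # total cost for the lot
    acquired: date


def partial_transfer(lots, symbol, quantity):
    """Move `quantity` shares of `symbol` in kind, oldest lots first.

    Returns (remaining_at_delivering_firm, arriving_at_receiving_firm).
    """
    remaining, moved = [], []
    to_move = quantity
    for lot in sorted(lots, key=lambda item: item.acquired):
        if lot.symbol != symbol or to_move == 0:
            remaining.append(lot)
            continue
        take = min(lot.quantity, to_move)
        # Pro-rata basis for the transferred slice, rounded to cents.
        basis_moved = (lot.cost_basis * take / lot.quantity).quantize(Decimal("0.01"))
        moved.append(replace(lot, quantity=take, cost_basis=basis_moved))
        if take < lot.quantity:
            # The residual keeps whatever basis was not moved, so totals still tie out.
            remaining.append(
                replace(lot, quantity=lot.quantity - take,
                        cost_basis=lot.cost_basis - basis_moved)
            )
        to_move -= take
    if to_move > 0:
        raise ValueError("transfer quantity exceeds holdings")
    return remaining, moved


lots = [
    Lot("L1", "VTI", Decimal("10"), Decimal("1800.00"), date(2019, 3, 1)),
    Lot("L2", "VTI", Decimal("5"), Decimal("1250.00"), date(2024, 11, 15)),
]
remaining, moved = partial_transfer(lots, "VTI", Decimal("12.5"))
```

Here 12.5 shares move: all of lot L1 and half of lot L2, leaving 2.5 shares and 625.00 of basis behind. The checks worth asserting in test data are that total quantity and total basis across both firms equal the pre-transfer totals, and that every moved lot still carries its original acquisition date.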

Custodian quirks

"Custodian-specific data quirks: Schwab, Fidelity, Pershing, BNY Mellon" covers the four custodians most wealth-tech platforms have to integrate with directly. Account-number formats, lot-relief defaults, statement-cycle gotchas, the post-Schwab/TDA reconciliation problem, and what realistic test data has to model per custodian.

Reconciliation

"Reconciling aggregator output with custodian source-of-truth" is the cross-stream piece. When both an aggregator and a direct custodian feed populate the same account, the two will disagree on edge cases: fractional-share rounding, intraday vs. end-of-day balances, distribution-character classification. The article covers the reconciliation contract and the test-data requirements for exercising it.
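A minimal sketch of one clause of such a contract, quantity agreement within a rounding tolerance, might look like the following. The `reconcile` function, the break labels, and the `QTY_TOLERANCE` value are hypothetical; a real contract would also cover balances, prices, and distribution character.

```python
from decimal import Decimal

# Hypothetical tolerance for fractional-share rounding; tune per asset class.
QTY_TOLERANCE = Decimal("0.0001")


def reconcile(aggregator: dict[str, Decimal], custodian: dict[str, Decimal]) -> list[dict]:
    """Compare per-symbol quantities for one account and return the breaks.

    The custodian feed is treated as the source of truth.
    """
    breaks = []
    for symbol in sorted(set(aggregator) | set(custodian)):
        agg_qty = aggregator.get(symbol)
        cus_qty = custodian.get(symbol)
        if agg_qty is None:
            kind = "missing_at_aggregator"
        elif cus_qty is None:
            kind = "missing_at_custodian"
        elif abs(agg_qty - cus_qty) <= QTY_TOLERANCE:
            continue  # within rounding tolerance: not a break
        else:
            kind = "quantity_mismatch"
        breaks.append(
            {"symbol": symbol, "kind": kind, "aggregator": agg_qty, "custodian": cus_qty}
        )
    return breaks


aggregator_view = {
    "AAPL": Decimal("3.3333"),   # rounded to 4 decimal places
    "BND": Decimal("40"),        # stale: misses a same-day purchase
    "VTI": Decimal("12.5"),
}
custodian_view = {
    "AAPL": Decimal("3.33333"),
    "BND": Decimal("41"),
    "MSFT": Decimal("2"),        # position the aggregator never reported
    "VTI": Decimal("12.5"),
}
breaks = reconcile(aggregator_view, custodian_view)
for item in breaks:
    print(item["symbol"], item["kind"])
```

With this input the AAPL rounding difference is absorbed by the tolerance, while BND is flagged as a quantity mismatch and MSFT as missing at the aggregator. Synthetic households should include at least one case on each side of the tolerance so the threshold itself is under test.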

The methodology comparison

"Aggregator API vs. direct custodian feed" is the procurement-side comparison: when each approach belongs in a wealth-tech stack, where the cost-and-fidelity tradeoff lands for different use cases, and the hybrid pattern most institutional platforms converge on.

Supporting glossary terms

  • ACATS: the transfer service, operated by DTCC's NSCC subsidiary, that moves brokerage accounts between firms, typically in 5–7 business days.
  • DTC: the Depository Trust Company, the central securities depository that holds the actual share certificates and processes corporate actions in aggregate.
  • ACH: the Automated Clearing House network that moves cash between bank accounts and is the backbone of most non-wire money movement.
  • FDX: the Financial Data Exchange, the open standard that aggregators and institutions use to exchange consumer financial data via tokenized access.
  • Aggregator API: the API category covering third-party services such as Plaid, Yodlee, Akoya, and MX that normalize data across thousands of financial institutions.
  • Tokenized account access: the OAuth-style authorization model that's replacing screen-scraping for aggregator-to-institution data flow.

Where this connects

Integration testing intersects with several other content threads: