Aggregator API vs. Direct Custodian Feed: Which Powers Your Integration?
Every wealth-tech platform has to make this decision early — and most platforms make it more than once as they scale. The choice between aggregator APIs and direct custodian feeds shapes the cost structure, the data fidelity, the customer-onboarding friction, and the engineering surface area. Get it wrong and you're either rebuilding your integration layer 18 months in (when the consumer-grade aggregator data turns out to be insufficient for tax-aware features) or paying for direct-custodian relationships you don't need (when single-aggregator coverage would have been fine for the use case). This comparison walks through the tradeoffs and the hybrid pattern most institutional platforms ultimately converge on.
The two options
Aggregator API (Plaid, Yodlee, Akoya, MX)
Third-party data services that normalize across thousands of financial institutions. The customer authorizes the aggregator to access their accounts; the aggregator returns normalized data via a single API. Coverage is broad (typically 12,000+ institutions); fidelity is constrained by the aggregator's normalization layer.
Pros:
- Broad institution coverage: Plaid, Yodlee, and MX each cover thousands of US institutions, so cross-institutional household aggregation works without per-institution integration
- Low integration cost per institution added: the aggregator handles the per-institution work; you write one integration
- Consumer-friendly onboarding: OAuth-style account linking, recognized brand trust, and widely used UX patterns
- FDX adoption is growing: tokenized access via FDX-conformant institutions improves security and reliability over screen-scraping
- Suitable for use cases where breadth matters more than depth: financial-planning aggregation, household net worth, broad budgeting
Cons:
- Lossy normalization: the aggregator's schema is portable but loses fidelity present in the underlying institution feed (lot-level basis, distribution character, full corporate-action history)
- Variable per-institution data quality: the same aggregator might surface high-fidelity data from one institution and low-fidelity data from another, and you don't always control which
- Per-account or per-call pricing: costs scale with usage and can become significant for high-frequency refresh patterns or large customer bases
- Token reliability: even with FDX, token expiry and re-auth flows are part of operational reality and add more friction than direct feeds
- No bidirectional flow: aggregators are read-only and cannot place trades or initiate transfers
Choose aggregator-first when: (1) breadth matters more than depth — your use case is household-level aggregation, financial planning, or budgeting where lot-level fidelity isn't required; (2) you need to support many institutions cheaply — direct integrations with each would be cost-prohibitive; (3) consumer-friendly onboarding is a primary requirement; (4) the customer base is heterogeneous and you can't standardize on a few institutions; or (5) your roadmap is read-only and bidirectional flow isn't needed.
Direct custodian feed
Direct integration with a specific custodian (Schwab, Fidelity, Pershing, BNY) via that custodian's own data API or batch file feed. Coverage is limited to that one custodian; fidelity is institutional-grade with lot-level basis, full corporate-action history, and tax-document data.
Pros:
- Highest fidelity: lot-level basis, distribution character, corporate-action history, and tax-document data are all available, which is the right shape for tax-aware features
- Bidirectional flow possible: most custodian APIs support trade execution, transfer initiation, and account opening (subject to relationship and approval)
- Predictable per-account economics: direct relationships typically have flat or volume-tiered fees rather than per-call pricing
- Lower latency on critical flows: trade-execution and balance-update latencies are institutional-grade, not aggregator-batched
- Operational reliability: direct feeds have institutional SLAs, and an aggregator outage affects only aggregator-sourced reads, not instructions the platform issues through the direct channel
Cons:
- Single-custodian coverage per integration: you build one integration per custodian, and most platforms need 4+ to serve real customer bases
- High integration cost per custodian: each custodian has its own API, conventions, account-opening process, and certification requirements; expect weeks to months per integration
- Custodian-relationship requirement: direct feeds typically require a clearing or custody relationship, which has its own commercial and operational complexity
- Customer-acquisition friction: onboarding a new customer whose custodian you have no direct feed for either fails or falls back to an aggregator
- Long-tail coverage gap: direct feeds cover the major custodians; smaller institutions, credit-union brokerages, and international institutions remain outside the direct-feed network
Choose direct-custodian when: (1) fidelity matters more than breadth, because your platform offers tax-aware features (tax-loss harvesting (TLH), tax-gain harvesting, charitable-giving optimization) where lot-level data is required; (2) you serve an institutional or RIA customer base concentrated on a few custodians; (3) you need bidirectional flow (trade execution, transfer initiation, account opening); (4) the per-customer revenue justifies the integration investment; or (5) your platform is the system-of-record for the customer's wealth and the aggregator's lossy normalization isn't acceptable.
Decision framework
The decision usually reduces to four questions: how broad is your customer-institution distribution, how deep does the data have to be, do you need bidirectional flow, and how do the per-customer economics work?
If customer institutions are broadly distributed (consumer fintech with users at hundreds of small banks and credit unions), aggregator-first is the only practical answer. Building 100+ direct integrations is a multi-year program; an aggregator gets you there in a quarter.
If data depth requirements are heavy — lot-level basis, distribution character, tax-document depth — aggregator-only ships with a known correctness gap on tax-aware features. You can defer the gap, but a wealth-tech roadmap that includes TLH, charitable optimization, or sophisticated retirement planning will eventually need direct custodian feeds.
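To make the correctness gap concrete, here is a minimal sketch of a holding bought in two lots. The `Lot` type, its field names, and the prices are illustrative assumptions, not any custodian's or aggregator's actual schema. It shows how a position-level view can report a net gain while one lot still carries a loss that a tax-aware feature would want to see.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lot:
    # Hypothetical field names for illustration, not a real feed schema.
    shares: float
    cost_per_share: float


def position_gain(lots, price):
    """Unrealized gain as a normalized, position-level view would report it."""
    return sum(lot.shares * (price - lot.cost_per_share) for lot in lots)


def harvestable_loss(lots, price):
    """Total unrealized loss across lots that are individually underwater."""
    return sum(
        lot.shares * (lot.cost_per_share - price)
        for lot in lots
        if lot.cost_per_share > price
    )


lots = [Lot(shares=100, cost_per_share=50.0), Lot(shares=100, cost_per_share=90.0)]
price = 75.0
print(position_gain(lots, price))     # 1000.0: the position looks like a net gain
print(harvestable_loss(lots, price))  # 1500.0: the second lot is still underwater
```

With only the first number available, a harvesting feature would conclude there is nothing to do; the second number is visible only when the feed carries lot-level basis.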
If bidirectional flow is on the roadmap (trade execution, transfer initiation, account opening), aggregators alone won't get you there. Direct custodian relationships are required.
If per-customer revenue is high (institutional, HNW, RIA) and customer institutions concentrate at a few custodians, direct feeds are economically tractable and provide the fidelity those customers expect. If per-customer revenue is low (consumer, mass-affluent) and institutions are scattered, aggregator economics are the only way the math works.
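The four questions can be sketched as a small decision function. This is a deliberately coarse reading of the framework, with boolean inputs where real decisions sit on a gradient; the `Profile` type, its field names, and the exact branching are assumptions made for illustration, with "hybrid" standing for direct feeds at the largest custodians plus an aggregator for the rest.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    # All four fields are simplifications chosen for this sketch.
    institutions_concentrated: bool   # most accounts sit at a few custodians
    needs_lot_level_data: bool        # tax-aware features are on the roadmap
    needs_bidirectional_flow: bool    # trades, transfers, account opening
    high_revenue_per_customer: bool   # institutional, HNW, RIA


def recommend(p: Profile) -> str:
    """Return 'aggregator-first', 'direct-custodian', or 'hybrid'."""
    needs_direct = p.needs_lot_level_data or p.needs_bidirectional_flow
    direct_is_tractable = p.institutions_concentrated and p.high_revenue_per_customer
    if not needs_direct:
        # Breadth over depth, read-only roadmap: an aggregator is enough.
        return "aggregator-first"
    if direct_is_tractable:
        # Depth or bidirectional flow, and the economics support it.
        return "direct-custodian"
    # Depth is needed, but institutions are scattered or revenue is thin:
    # direct feeds where they pay off, aggregator for everything else.
    return "hybrid"


consumer_budgeting = Profile(False, False, False, False)
ria_platform = Profile(True, True, True, True)
print(recommend(consumer_budgeting))  # aggregator-first
print(recommend(ria_platform))        # direct-custodian
```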
The hybrid pattern is what most institutional platforms converge on: direct feeds for the top 4–6 custodians (typically covering 70–85% of customer accounts by value), with an aggregator filling in the long tail. The platform's data model has to handle [reconciliation between the two](/articles/reconciling-aggregator-vs-source-of-truth): the same household commonly has accounts both at custodians with a direct feed and at custodians reached only through the aggregator, and the platform has to merge the streams coherently.
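As a rough illustration of merging the two streams, the sketch below prefers the direct feed wherever both sources report the same account and flags accounts where the balances disagree. It assumes account ids have already been matched across sources, which is often the hard part in practice; the function name, the record shape, and the tolerance are hypothetical.

```python
def merge_household(direct, aggregator, tolerance=0.01):
    """Merge per-account balances, preferring the direct feed where both exist.

    `direct` and `aggregator` map a platform-level account id to a balance.
    Returns (merged, conflicts); `conflicts` lists account ids where the two
    sources differ by more than `tolerance`.
    """
    merged = {}
    conflicts = []
    for account_id in sorted(direct.keys() | aggregator.keys()):
        if account_id in direct:
            merged[account_id] = {"balance": direct[account_id], "source": "direct"}
            if (
                account_id in aggregator
                and abs(direct[account_id] - aggregator[account_id]) > tolerance
            ):
                conflicts.append(account_id)
        else:
            merged[account_id] = {"balance": aggregator[account_id], "source": "aggregator"}
    return merged, conflicts


direct = {"brokerage-1": 250_000.00, "ira-1": 80_000.00}
aggregator = {"ira-1": 79_950.00, "credit-union-1": 12_500.00}
merged, conflicts = merge_household(direct, aggregator)
print(merged["credit-union-1"]["source"])  # aggregator
print(conflicts)                           # ['ira-1']
```

Treating the direct feed as the source of truth is one policy among several; a platform might instead keep both values and surface the conflict for review.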
A detail that often gets missed: the choice has knock-on effects on the synthetic test data you need. Aggregator-only platforms can be tested with aggregator-shape mock data. Direct-custodian platforms need [custodian-specific test shapes](/articles/custodian-data-quirks-test-data). Hybrid platforms need both, plus the dual-source reconciliation cases. Mock-data tools that produce only one shape force the platform to test only one path; realistic synthetic data has to cover all the paths the platform will actually serve.
Bottom line
Aggregator-first for breadth and consumer onboarding; direct custodian for depth and bidirectional flow; hybrid for institutional platforms at scale. The single most consequential question is whether the use case requires lot-level fidelity. If yes, aggregator-only will eventually be insufficient and the direct-custodian work is in your future. If no, aggregator alone is often enough. The WealthSynth catalog supports all three patterns with [aggregator-view overlays](/articles/modeling-aggregator-outputs-plaid-yodlee-akoya-mx) and [custodian-shape projections](/articles/custodian-data-quirks-test-data) per household, and dual-source reconciliation cases for the hybrid pattern.
FAQ
Can I switch from aggregator to direct custodian feeds later?
Yes — and most institutional platforms do, in stages. The migration path is typically: start aggregator-only, prove the use case, identify the top 3–5 custodians that account for the majority of customer accounts, build direct integrations there, fall back to aggregator for the long tail. The migration takes 12–24 months and requires the platform's data model to handle dual-source reconciliation throughout.
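The "identify the top custodians" step can be approximated from account data you already hold. The sketch below ranks custodians by total account value and returns the fewest that reach a target share; the function name, the input shape, and the choice to weight by value instead of account count are assumptions for illustration.

```python
from collections import defaultdict


def custodians_to_cover(accounts, target_share):
    """Pick the fewest custodians, largest first, that reach `target_share`.

    `accounts` is an iterable of (custodian_name, account_value) pairs.
    Returns (chosen_custodians, share_of_total_value_covered).
    """
    totals = defaultdict(float)
    for custodian, value in accounts:
        totals[custodian] += value
    grand_total = sum(totals.values())
    if grand_total == 0:
        return [], 0.0
    chosen, covered = [], 0.0
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    for custodian, value in ranked:
        if covered / grand_total >= target_share:
            break
        chosen.append(custodian)
        covered += value
    return chosen, covered / grand_total


accounts = [
    ("Custodian A", 500.0),
    ("Custodian B", 300.0),
    ("Custodian C", 150.0),
    ("Custodian D", 50.0),
]
print(custodians_to_cover(accounts, 0.75))  # (['Custodian A', 'Custodian B'], 0.8)
```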
Is FDX going to make the choice irrelevant?
Eventually for some use cases, not for others. FDX-conformant institutions provide tokenized access via standard schemas, which closes part of the aggregator-vs-direct fidelity gap. But FDX defines optional fields and conformance levels; even Phase 4 conformance doesn't guarantee lot-level depth. For the next several years, direct custodian feeds will continue to provide depth that aggregator-via-FDX doesn't, especially on legacy accounts and edge cases.
What about the European Open Banking model — is that coming to wealth?
Slowly. The CFPB's Section 1033 rule (finalized 2024) is the closest US analog and primarily covers consumer banking data; investment-account scope is limited. Industry direction is FDX-aligned tokenized access rather than a regulator-mandated open banking equivalent. Wealth platforms should plan around FDX evolution rather than an EU-style open banking arrival.
How do I evaluate aggregator vendors against each other?
Three dimensions: (1) institution coverage relevant to your customer base — Plaid is broad, Yodlee is wealth-deep, Akoya is FDX-native, MX is enrichment-focused; (2) data fidelity per institution — request sample data for the top 5 institutions you care about, examine lot-level depth, distribution character, corporate-action history; (3) operational characteristics — refresh cadence, token-management UX, error-handling patterns, support for the use cases you're building. Most institutional platforms end up using two aggregators in production for resilience and coverage diversity.
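For the fidelity dimension, one simple way to compare vendor samples is to measure how often each depth field is actually populated. The field names below are hypothetical placeholders; a real evaluation would map them to each vendor's own schema before scoring.

```python
# Hypothetical field names; map these to each vendor's actual schema.
DEPTH_FIELDS = ("lot_cost_basis", "distribution_character", "corporate_actions")


def field_coverage(samples, fields=DEPTH_FIELDS):
    """Per field, the share of sample records where it is present and non-empty."""
    if not samples:
        return {field: 0.0 for field in fields}
    return {
        field: sum(
            1 for record in samples if record.get(field) not in (None, "", [])
        ) / len(samples)
        for field in fields
    }


samples = [
    {"lot_cost_basis": 100.0, "distribution_character": "qualified"},
    {"lot_cost_basis": None, "corporate_actions": []},
]
print(field_coverage(samples))
# {'lot_cost_basis': 0.5, 'distribution_character': 0.5, 'corporate_actions': 0.0}
```

Running the same check per institution, not just per vendor, helps expose the per-institution variability that a vendor-wide average would hide.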
What's the per-customer cost difference?
Highly variable. Aggregator pricing is typically per active link per month or per API call, with negotiated volume discounts; mid-tier per-customer cost ranges from about $0.50 to about $5 per month depending on use case and volume. Direct custodian feeds typically carry flat platform fees (often $50K–$500K annually) plus per-account or per-asset fees that scale with AUM; the per-customer cost is much lower at scale, but the fixed cost is higher. The crossover point is typically around 10K–30K customers, depending on revenue per customer.
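A back-of-the-envelope version of the crossover can be computed from three inputs. The sketch below is a cost-only model with illustrative numbers: the $200K fixed fee and $1.00 monthly aggregator cost sit inside the ranges quoted above, while the $2.00 annual per-customer direct fee is purely an assumption. It ignores engineering cost and revenue per customer.

```python
def crossover_customers(direct_fixed_annual, direct_per_customer_annual,
                        aggregator_per_customer_monthly):
    """Customer count at which annual direct-feed cost equals aggregator cost.

    Returns None when the direct feed never becomes cheaper on these inputs.
    """
    aggregator_annual = aggregator_per_customer_monthly * 12
    saving_per_customer = aggregator_annual - direct_per_customer_annual
    if saving_per_customer <= 0:
        return None
    return direct_fixed_annual / saving_per_customer


# Illustrative inputs only: $200K fixed, $2/customer/year direct, $1/customer/month aggregator.
print(crossover_customers(200_000, 2.00, 1.00))  # 20000.0
```

Below that customer count the aggregator is cheaper on this model; above it the fixed direct-feed fee is amortized across enough customers to win.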
Can synthetic data simulate aggregator vs. direct feed disagreements?
Yes — and the realistic test corpus has to include the disagreements, not just clean parallel views. The WealthSynth institutional bundles include households with paired aggregator-and-custodian feeds for the same accounts, with deliberately calibrated disagreement rates (5–10% pending-vs-settled mismatches, fractional-share rounding gaps, distribution-character classification differences). This is the test shape the [aggregator-vs-source-of-truth reconciliation logic](/articles/reconciling-aggregator-vs-source-of-truth) actually needs.
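To give a sense of what a paired feed with injected disagreement looks like, here is a minimal generator sketch. It is not WealthSynth's implementation; the function name, record fields, and the two drift rules (settled positions reported as pending, share counts rounded to two decimals) are assumptions chosen to mirror two of the disagreement types listed above.

```python
import random


def aggregator_view(custodian_positions, mismatch_rate, seed=0):
    """Derive an aggregator-shaped copy of custodian positions with injected drift.

    Each position is a dict with 'symbol', 'shares', and 'status'. Each settled
    position is independently reported as 'pending' with probability
    `mismatch_rate`, and all share counts are rounded to 2 decimals to mimic
    fractional-share rounding gaps. A fixed seed keeps the output reproducible.
    """
    rng = random.Random(seed)
    view = []
    for position in custodian_positions:
        copy = dict(position)
        copy["shares"] = round(position["shares"], 2)
        if position["status"] == "settled" and rng.random() < mismatch_rate:
            copy["status"] = "pending"
        view.append(copy)
    return view


custodian = [
    {"symbol": "VTI", "shares": 10.12345, "status": "settled"},
    {"symbol": "BND", "shares": 4.5, "status": "settled"},
]
paired = aggregator_view(custodian, mismatch_rate=0.075, seed=42)
print(paired[0]["shares"])  # 10.12
```

Feeding both `custodian` and `paired` into the reconciliation logic then exercises the mismatch paths, and the seed makes a failing case repeatable.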