Term

Aggregator API

Published May 9, 2026

Definition

An aggregator API is a third-party service that connects to thousands of financial institutions on behalf of consuming applications, normalizes the data into a portable schema, and exposes it through a single API. The major US aggregators (Plaid, Yodlee, Akoya, MX) together cover the institutions that hold essentially all US accounts.

The aggregator-API category exists because direct integration with every relevant financial institution is prohibitively expensive for most consuming applications. An aggregator absorbs the per-institution work — credential or token management, screen-scraping or API integration, schema normalization, error handling, support — and presents a uniform interface. Building an integration that covers 12,000+ institutions in three months is essentially impossible without an aggregator; building it via aggregator is a routine engineering project.
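The "uniform interface" idea can be sketched as a small adapter layer. This is an illustration under stated assumptions, not any vendor's design: `Account`, `SourceAdapter`, `FakeBankA`, `FakeBankB`, and both raw payload shapes are hypothetical names invented for the example.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Account:
    """Normalized account record that the consuming application sees."""
    institution: str
    account_id: str
    balance_cents: int


class SourceAdapter(ABC):
    """Hypothetical per-institution adapter that hides one payload shape."""

    @abstractmethod
    def fetch_accounts(self) -> list[Account]:
        ...


class FakeBankA(SourceAdapter):
    # Invented payload: balances arrive as decimal strings in dollars.
    RAW = [{"id": "a-1", "bal": "1250.75"}]

    def fetch_accounts(self) -> list[Account]:
        return [
            Account("bank_a", r["id"], int(Decimal(r["bal"]) * 100))
            for r in self.RAW
        ]


class FakeBankB(SourceAdapter):
    # Invented payload: balances arrive as integer cents.
    RAW = [{"acct": "b-9", "cents": 990000}]

    def fetch_accounts(self) -> list[Account]:
        return [Account("bank_b", r["acct"], r["cents"]) for r in self.RAW]


def fetch_all(adapters: list[SourceAdapter]) -> list[Account]:
    """Single entry point: callers never see per-institution differences."""
    accounts: list[Account] = []
    for adapter in adapters:
        accounts.extend(adapter.fetch_accounts())
    return accounts
```

Each new institution costs one more adapter, which is the per-institution work an aggregator takes on at scale.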

The four major US aggregators occupy distinct positions. **Plaid** is the largest by consumer-app integration count, with the broadest institution coverage and the most widely adopted developer-facing API; its wealth/investments product is younger than its banking product and less mature as a result. **Yodlee** (an Envestnet subsidiary) is the longest-established US aggregator, with the deepest wealth-management feature set (better lot-level coverage, a more granular account-type taxonomy, more support for alternative investments) at the cost of higher integration complexity. **Akoya** is backed by a bank consortium (including Fidelity, U.S. Bank, and Wells Fargo), is FDX-native, and routes through institutions' own published APIs rather than screen-scraping; the result is high-fidelity data on a smaller institution footprint. **MX** is enrichment-focused, layering categorization, transaction labeling, and normalization on top of the raw aggregator data, and has a strong credit-union heritage.

For wealth-tech platforms, the aggregator decision shapes everything downstream. Aggregator-only platforms have a known fidelity ceiling on tax-aware features (no consistent lot-level basis, no consistent distribution-character classification). Direct-custodian platforms have a known coverage ceiling (the major custodians, no long tail). Hybrid platforms — aggregator for breadth, direct for depth — are where most institutional platforms ultimately land, and the integration layer for those platforms has to handle [aggregator-vs-custodian reconciliation](/articles/reconciling-aggregator-vs-source-of-truth) as a routine concern.

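| Capability | Plaid | Yodlee | Akoya | MX |
| --- | --- | --- | --- | --- |
| Position-level data | Yes | Yes | Yes | Yes |
| Lot-level basis | Inconsistent | Often | Institution-dependent | Inconsistent |
| FDX conformance | Mostly | Hybrid | Native | Aligned |
| Best fit | Consumer breadth | Wealth depth | Bank consortium | Enrichment |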

Why this matters for synthetic data

Aggregator-API-aware synthetic data needs schemas projected through each major aggregator's shape — the same canonical household JSON re-projected as Plaid output, Yodlee output, Akoya FDX-shape output, MX output. The differences between the projections (account-type taxonomy, lot-level depth, distribution-character classification, pending-vs-settled conventions) are themselves the test surface; a corpus that produces only one shape can't exercise multi-aggregator integration.
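As a minimal sketch of re-projection, the snippet below renders one canonical account through two invented shapes: one that flattens lots into positions and one that keeps lot detail. The field names and taxonomy labels are assumptions made up for illustration; they are not the actual Plaid, Yodlee, Akoya, or MX schemas.

```python
# Canonical record (hypothetical schema); amounts are integer cents.
CANONICAL = {
    "account_id": "acct-001",
    "account_type": "taxable_brokerage",
    "positions": [
        {
            "symbol": "VTI",
            "lots": [
                {"quantity": 10, "cost_basis_cents": 200000},
                {"quantity": 5, "cost_basis_cents": 120000},
            ],
        }
    ],
}

# Invented taxonomy mappings: one coarse, one granular.
TYPE_MAP_COARSE = {"taxable_brokerage": "investment"}
TYPE_MAP_GRANULAR = {"taxable_brokerage": "brokerage.individual.taxable"}


def project_position_level(account):
    """Projection that drops lots and reports aggregated positions only."""
    return {
        "id": account["account_id"],
        "type": TYPE_MAP_COARSE[account["account_type"]],
        "holdings": [
            {
                "symbol": p["symbol"],
                "quantity": sum(lot["quantity"] for lot in p["lots"]),
                "cost_basis_cents": sum(
                    lot["cost_basis_cents"] for lot in p["lots"]
                ),
            }
            for p in account["positions"]
        ],
    }


def project_lot_level(account):
    """Projection that keeps per-lot detail and a granular account type."""
    return {
        "accountId": account["account_id"],
        "accountType": TYPE_MAP_GRANULAR[account["account_type"]],
        "positions": [
            {"symbol": p["symbol"], "lots": [dict(lot) for lot in p["lots"]]}
            for p in account["positions"]
        ],
    }


def shape_differences(a, b):
    """Top-level keys present in exactly one of the two projections."""
    return sorted(set(a) ^ set(b))
```

A test corpus built this way can assert that an integration reads both shapes and still arrives at the same household totals.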

Common pitfalls

- Hard-coding to one aggregator's schema: when the platform expands to a second aggregator, the schema differences require non-trivial rework. Designing the data layer to be aggregator-agnostic from the start saves months of refactoring.
- Treating aggregator data as fully equivalent to custodian-direct data: the lot-level and distribution-character gaps are real and matter for tax-aware features.
- Not testing the token-management lifecycle: token expiry, revocation, and re-auth flows are operational reality and have customer-facing UX implications.
- Underestimating per-call costs: aggregator pricing typically scales with API call volume, and high-frequency refresh patterns can produce surprising bills.
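The token-lifecycle pitfall is easier to test when link state is modeled explicitly. The following is a rough sketch, assuming a hypothetical `Link` record and made-up action labels; real aggregators expose their own status and error codes, which are not reproduced here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from enum import Enum


class LinkState(Enum):
    ACTIVE = "active"
    EXPIRED = "expired"
    REVOKED = "revoked"


@dataclass
class Link:
    """Hypothetical record of one user-to-institution connection."""
    expires_at: datetime
    revoked: bool = False


def link_state(link: Link, now: datetime) -> LinkState:
    # Revocation takes precedence: a revoked link cannot simply be refreshed.
    if link.revoked:
        return LinkState.REVOKED
    if now >= link.expires_at:
        return LinkState.EXPIRED
    return LinkState.ACTIVE


def expires_within(link: Link, now: datetime, window: timedelta) -> bool:
    """True when an active link should trigger a proactive re-auth prompt."""
    if link_state(link, now) is not LinkState.ACTIVE:
        return False
    return link.expires_at - now <= window


def next_action(state: LinkState) -> str:
    """Map each state to a platform action; the labels are illustrative."""
    return {
        LinkState.ACTIVE: "refresh_data",
        LinkState.EXPIRED: "prompt_reauth",
        LinkState.REVOKED: "prompt_relink",
    }[state]


# Example clock value for deterministic tests.
NOW = datetime(2026, 1, 1, tzinfo=timezone.utc)
```

Passing `now` in explicitly, rather than reading the clock inside the functions, lets tests cover expiry and revocation paths deterministically.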

Examples

Multi-aggregator household

A household with accounts at Schwab (linked via Plaid), Fidelity (linked via Akoya FDX), and a regional credit union (linked via MX) is reflected as three distinct aggregator views with different account-type taxonomy mappings, different lot-level depths, and different transaction-categorization labels. The platform's data model has to merge these into a single household record while preserving the per-source provenance for reconciliation.
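One way to sketch that merge is shown below. The source type labels, the `CANONICAL_TYPE` mapping, and the record fields are all invented for illustration and do not reflect what any of these aggregators actually return.

```python
# Hypothetical (source, source-type) pairs mapped to one canonical taxonomy.
CANONICAL_TYPE = {
    ("plaid", "brokerage"): "taxable_brokerage",
    ("akoya", "INVESTMENT"): "taxable_brokerage",
    ("mx", "SAVINGS"): "savings",
}

# Three per-aggregator views of the same household (made-up payloads).
VIEWS = {
    "plaid": [{"id": "p-1", "institution": "Schwab", "type": "brokerage"}],
    "akoya": [{"id": "k-7", "institution": "Fidelity", "type": "INVESTMENT"}],
    "mx": [{"id": "m-3", "institution": "Regional CU", "type": "SAVINGS"}],
}


def merge_household(views):
    """Merge per-source views into one list, keeping provenance per account."""
    merged = []
    for source, accounts in views.items():
        for acct in accounts:
            merged.append(
                {
                    "institution": acct["institution"],
                    "type": CANONICAL_TYPE[(source, acct["type"])],
                    # Provenance is kept verbatim so reconciliation can trace
                    # each canonical field back to the source's own labels.
                    "provenance": {
                        "source": source,
                        "source_id": acct["id"],
                        "source_type": acct["type"],
                    },
                }
            )
    return merged
```

Keeping the original type label alongside the canonical one means a mapping mistake can be corrected later without re-fetching from the source.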

Frequently asked questions

Should I use one aggregator or multiple?

Multiple, in most institutional cases, for four reasons. (1) Coverage diversity: no single aggregator covers every institution your customers use. (2) Operational resilience: aggregator outages do happen, and multi-aggregator routing lets the platform fail over gracefully. (3) Per-institution fidelity: the best aggregator for a given institution depends on that institution. (4) Negotiation leverage: having two aggregators in production reduces vendor lock-in. The cost is integration complexity; most institutional platforms run two aggregators concurrently.
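A minimal sketch of per-institution routing with failover follows, assuming each aggregator client is a plain callable that raises a hypothetical `AggregatorUnavailable` exception during an outage. The route table and client names are invented.

```python
class AggregatorUnavailable(Exception):
    """Hypothetical error raised when an aggregator cannot serve a request."""


def fetch_with_failover(institution, routes, clients):
    """Try each aggregator configured for an institution, in priority order.

    routes:  institution -> ordered list of aggregator names
    clients: aggregator name -> callable(institution) returning data
    Returns (aggregator_name, data) from the first route that succeeds.
    """
    errors = []
    for name in routes[institution]:
        try:
            return name, clients[name](institution)
        except AggregatorUnavailable as exc:
            # Record the failure and fall through to the next route.
            errors.append(f"{name}: {exc}")
    raise AggregatorUnavailable(
        f"all routes failed for {institution}: {'; '.join(errors)}"
    )
```

Because the route order is set per institution, the same table can also encode the fidelity preference described in reason (3).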

What does aggregator pricing look like?

Highly variable. Plaid is typically per-active-link per month or per-API-call, with negotiated volume discounts; Yodlee is more bespoke, often with platform-fee plus per-account components; Akoya is per-institution fee plus per-call; MX is bespoke. Per-customer cost ranges from ~$0.50 to ~$5 per month at typical wealth-tech volume; per-call costs can dominate for high-frequency refresh patterns. Most platforms negotiate rates rather than accept rate-card pricing.
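To see how refresh frequency can come to dominate the bill, the arithmetic can be sketched as below. The rates are made-up round numbers chosen only to show the scaling, not quotes from any vendor; amounts are integer cents to avoid floating-point rounding.

```python
def per_link_cost_cents(links, fee_per_link_cents):
    """Monthly cost under flat per-active-link pricing."""
    return links * fee_per_link_cents


def per_call_cost_cents(links, refreshes_per_day, fee_per_call_cents, days=30):
    """Monthly cost under per-call pricing, one call per refresh per link."""
    return links * refreshes_per_day * days * fee_per_call_cents


LINKS = 10_000  # assumed number of active links

# Assumed rates: 150 cents per link per month, or 1 cent per call.
flat = per_link_cost_cents(LINKS, 150)       # 1,500,000 cents = $15,000
daily = per_call_cost_cents(LINKS, 1, 1)     # 300,000 cents = $3,000
hourly = per_call_cost_cents(LINKS, 24, 1)   # 7,200,000 cents = $72,000
```

Under these assumed rates, moving from daily to hourly refresh multiplies the per-call bill by 24, while the per-link bill is unchanged.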

How does aggregator-API depth compare to direct custodian?

Direct custodian is consistently deeper. Aggregator data is a normalized projection of custodian data; the projection is lossy by design, and lot-level basis, distribution-character classification, and corporate-action history are the fields most commonly lost. For tax-aware features, the gap is structural: aggregator-only platforms ship with a known correctness gap on these features, and the gap matters most for the institutional and HNW use cases that have the highest revenue per customer.