Aggregator API
An aggregator API is a third-party service that connects to thousands of financial institutions on behalf of consuming applications, normalizes the data into a portable schema, and presents a single API. The major US aggregator APIs — Plaid, Yodlee, Akoya, MX — together cover essentially every US financial institution by account count.
The aggregator-API category exists because direct integration with every relevant financial institution is prohibitively expensive for most consuming applications. An aggregator absorbs the per-institution work (credential or token management, screen-scraping or API integration, schema normalization, error handling, support) and presents a uniform interface. Building an integration that covers 12,000+ institutions in three months is essentially impossible without an aggregator; building it via an aggregator is a routine engineering project.
The four major US aggregators have differentiated positions. **Plaid** is the largest by consumer-app integration count, with the broadest institution coverage and the most widely adopted developer-facing API; its wealth/investments product is younger than its banking product and reflects that maturity gap. **Yodlee** (an Envestnet subsidiary) is the longest-established US aggregator, with the deepest wealth-management feature set (better lot-level coverage, a more granular account-type taxonomy, more support for alternatives) at the cost of higher integration complexity. **Akoya** is bank-consortium-backed (Fidelity, US Bank, Wells Fargo), FDX-native, and routes through institutions' own published APIs rather than screen-scraping; the result is high-fidelity data on a smaller institution footprint. **MX** is enrichment-focused (categorization, transaction labeling, and normalization layered on top of the raw aggregator data), with a strong credit-union heritage.
For wealth-tech platforms, the aggregator decision shapes everything downstream. Aggregator-only platforms have a known fidelity ceiling on tax-aware features (no consistent lot-level basis, no consistent distribution-character classification). Direct-custodian platforms have a known coverage ceiling (the major custodians, no long tail). Hybrid platforms — aggregator for breadth, direct for depth — are where most institutional platforms ultimately land, and the integration layer for those platforms has to handle [aggregator-vs-custodian reconciliation](/articles/reconciling-aggregator-vs-source-of-truth) as a routine concern.
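The hybrid pattern described above can be sketched as a small reconciliation step. This is a minimal illustration, not a reference implementation: the `Position` type, the `reconcile` function, the `(account_id, symbol)` key, and the quantity tolerance are all assumptions made for the example.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Position:
    account_id: str
    symbol: str
    quantity: Decimal
    source: str  # hypothetical labels: "aggregator" or "custodian"


def reconcile(aggregator, custodian, tolerance=Decimal("0.0001")):
    """Prefer custodian-direct positions; fall back to aggregator data.

    Returns (merged_positions, mismatches). A mismatch is recorded when both
    sources report the same (account_id, symbol) with different quantities.
    """
    direct = {(p.account_id, p.symbol): p for p in custodian}
    merged = dict(direct)  # custodian data wins wherever it exists
    mismatches = []
    for pos in aggregator:
        key = (pos.account_id, pos.symbol)
        if key not in direct:
            merged[key] = pos  # long-tail coverage comes from the aggregator
        elif abs(direct[key].quantity - pos.quantity) > tolerance:
            mismatches.append((key, pos.quantity, direct[key].quantity))
    return list(merged.values()), mismatches


# Made-up sample data for illustration only.
agg = [
    Position("a1", "VTI", Decimal("10"), "aggregator"),
    Position("a2", "BND", Decimal("5"), "aggregator"),
]
cus = [Position("a1", "VTI", Decimal("10.5"), "custodian")]
merged, mismatches = reconcile(agg, cus)
```

Treating the custodian feed as the source of truth is a design choice for this sketch; a real platform might instead surface mismatches for review before picking a winner.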
| | Plaid | Yodlee | Akoya | MX |
|---|---|---|---|---|
| Position-level data | Yes | Yes | Yes | Yes |
| Lot-level basis | Inconsistent | Often | Institution-dependent | Inconsistent |
| FDX conformance | Mostly | Hybrid | Native | Aligned |
| Best fit | Consumer breadth | Wealth depth | Bank consortium | Enrichment |
Aggregator-API-aware synthetic data needs schemas projected through each major aggregator's shape — the same canonical household JSON re-projected as Plaid output, Yodlee output, Akoya FDX-shape output, MX output. The differences between the projections (account-type taxonomy, lot-level depth, distribution-character classification, pending-vs-settled conventions) are themselves the test surface; a corpus that produces only one shape can't exercise multi-aggregator integration.
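One way to picture the re-projection is a single canonical household passed through per-aggregator profiles. The sketch below is illustrative only: the canonical field names, the `PROFILES` table, the account-type labels, and the lot-depth toggles are all hypothetical and do not reproduce any vendor's real schema.

```python
# Canonical household record (made-up sample data).
CANONICAL = {
    "household_id": "hh-001",
    "accounts": [
        {
            "id": "acct-1",
            "type": "roth_ira",
            "positions": [
                {
                    "symbol": "VTI",
                    "quantity": 10.0,
                    "lots": [
                        {"acquired": "2021-03-01", "quantity": 10.0, "cost_basis": 1900.0}
                    ],
                }
            ],
        }
    ],
}

# Hypothetical per-aggregator conventions. The type labels are placeholders,
# and the lot toggles are arbitrary choices for a test corpus.
PROFILES = {
    "plaid": {"type_map": {"roth_ira": "plaid_like.roth"}, "include_lots": False},
    "yodlee": {"type_map": {"roth_ira": "yodlee_like.ROTH_IRA"}, "include_lots": True},
    "akoya": {"type_map": {"roth_ira": "fdx_like.ROTH"}, "include_lots": True},
    "mx": {"type_map": {"roth_ira": "mx_like.RETIREMENT"}, "include_lots": False},
}


def project(household, aggregator):
    """Re-project the canonical household into one aggregator-like shape."""
    profile = PROFILES[aggregator]
    accounts = []
    for acct in household["accounts"]:
        positions = []
        for pos in acct["positions"]:
            out = {"symbol": pos["symbol"], "quantity": pos["quantity"]}
            if profile["include_lots"]:
                out["lots"] = [dict(lot) for lot in pos["lots"]]
            positions.append(out)
        accounts.append(
            {
                "id": acct["id"],
                "type": profile["type_map"][acct["type"]],
                "positions": positions,
            }
        )
    return {"aggregator": aggregator, "accounts": accounts}


projections = {name: project(CANONICAL, name) for name in PROFILES}
```

Comparing `projections["plaid"]` against `projections["yodlee"]` exposes exactly the differences (type labels, presence of lots) that a multi-aggregator integration test would need to handle.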
Common pitfalls
- Hard-coding to one aggregator's schema — when the platform expands to a second aggregator, the schema differences require non-trivial rework. Designing the data layer aggregator-agnostic from the start saves months of refactoring.
- Treating aggregator data as fully equivalent to custodian-direct data — the lot-level and distribution-character gaps are real and matter for tax-aware features.
- Not testing the token-management lifecycle — token expiry, revocation, re-auth flows are operational reality and have customer-facing UX implications.
- Underestimating per-call costs — aggregator pricing typically scales with API call volume; high-frequency refresh patterns can produce surprising bills.
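Several of these pitfalls can be addressed at the interface boundary. The sketch below assumes a hypothetical `AggregatorAdapter` protocol that the rest of the platform codes against, a `Link` object carrying token state, and a `ReauthRequired` exception; none of these names come from a real aggregator SDK. The in-memory `FakeAdapter` also counts calls, which is one possible hook for tracking per-call cost.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Protocol


class ReauthRequired(Exception):
    """The link's token is expired or revoked; the user must re-authenticate."""


@dataclass
class Link:
    institution: str
    expires_at: datetime
    revoked: bool = False

    def usable(self, now: datetime) -> bool:
        return not self.revoked and now < self.expires_at


class AggregatorAdapter(Protocol):
    """Hypothetical aggregator-agnostic interface for the data layer."""

    name: str

    def fetch_accounts(self, link: Link, now: datetime) -> list[dict]: ...


class FakeAdapter:
    """In-memory stand-in used to exercise the token lifecycle in tests."""

    def __init__(self, name, accounts):
        self.name = name
        self._accounts = accounts
        self.calls = 0  # successful billable calls, for cost tracking

    def fetch_accounts(self, link, now):
        if not link.usable(now):
            raise ReauthRequired(link.institution)
        self.calls += 1
        return [dict(a, source=self.name) for a in self._accounts]


def refresh(adapter, link, now):
    """Return (accounts, needs_reauth) so the UI can prompt instead of failing."""
    try:
        return adapter.fetch_accounts(link, now), False
    except ReauthRequired:
        return [], True


# Usage with made-up data: live, expired, and revoked links.
now = datetime(2024, 1, 1, tzinfo=timezone.utc)
adapter = FakeAdapter("fake-aggregator", [{"id": "acct-1"}])
live = Link("example-bank", expires_at=now + timedelta(days=30))
stale = Link("example-bank", expires_at=now - timedelta(seconds=1))
revoked = Link("example-bank", expires_at=now + timedelta(days=30), revoked=True)
```

Passing `now` explicitly rather than reading the clock inside the adapter keeps expiry behavior deterministic in tests.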
Examples
A household with accounts at Schwab (linked via Plaid), Fidelity (linked via Akoya FDX), and a regional credit union (linked via MX) is reflected as three distinct aggregator views with different account-type taxonomy mappings, different lot-level depths, and different transaction-categorization labels. The platform's data model has to merge these into a single household record while preserving the per-source provenance for reconciliation.
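A minimal sketch of that merge follows, assuming each aggregator view has already been fetched as a list of account dictionaries. The field names, the `TYPE_MAP` entries, the source type labels, and the balances are all made up for illustration and are not the vendors' actual payloads.

```python
# Hypothetical mapping from (aggregator, source type label) to a canonical type.
TYPE_MAP = {
    ("plaid", "brokerage"): "taxable_brokerage",
    ("akoya", "INVESTMENT"): "taxable_brokerage",
    ("mx", "SAVINGS"): "savings",
}

# Made-up per-aggregator views of the same household.
VIEWS = {
    "plaid": [
        {"id": "p-1", "institution": "Schwab", "type": "brokerage", "balance": 1000.0}
    ],
    "akoya": [
        {
            "id": "k-1",
            "institution": "Fidelity",
            "type": "INVESTMENT",
            "balance": 2000.0,
            "lots": [{"symbol": "VTI", "quantity": 5.0}],
        }
    ],
    "mx": [
        {"id": "m-1", "institution": "Regional CU", "type": "SAVINGS", "balance": 300.0}
    ],
}


def merge_household(household_id, views):
    """Merge per-aggregator views into one record, keeping provenance per account."""
    record = {"household_id": household_id, "accounts": []}
    for aggregator, accounts in views.items():
        for acct in accounts:
            record["accounts"].append(
                {
                    "institution": acct["institution"],
                    "canonical_type": TYPE_MAP.get((aggregator, acct["type"]), "unknown"),
                    "balance": acct["balance"],
                    "has_lots": bool(acct.get("lots")),
                    # Provenance is preserved so reconciliation can trace each
                    # field back to the aggregator and source account it came from.
                    "provenance": {
                        "aggregator": aggregator,
                        "source_account_id": acct["id"],
                        "source_type": acct["type"],
                    },
                }
            )
    return record


household = merge_household("hh-001", VIEWS)
```

Keeping the original `source_type` next to the canonical type means a later taxonomy change can be re-applied without re-fetching from the aggregator.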