Comparison

Domestic-Only vs. Cross-Border Test Data: When the Long Tail Becomes the Bulk

Published May 9, 2026

Most wealth-tech platforms start with a domestic-only test corpus — US-resident, USD-denominated, single-country households — because that's the simplest case to build, the largest customer segment, and the easiest target for an MVP. The decision usually works fine until the first cross-border customer arrives, the first foreign-fund holding hits the tax engine, the first internationally-mobile employee sends support an FBAR question, or the first auditor asks 'how do you handle PFIC?' This comparison walks through what changes when cross-border becomes part of the test surface, the cases where the platform can defer the work, and the cases where deferring it ships a known correctness gap from day one.

The two options

Domestic-only test data

A test corpus of US-resident, USD-denominated households with US-source income, US-domiciled holdings, and US-only filing requirements. The MVP test surface for most US-focused wealth-tech platforms; covers the customer-segment majority but ships with a structural gap on cross-border edge cases.

Pros

Smallest test surface — fewer schemas to build, fewer edge cases to reason about, fastest path to coverage of the modal customer
Matches the customer-base majority — most US-focused wealth-tech platforms have 80-95% US-resident, USD-denominated customers
Tax computation is tractable — single-country sourcing, single currency, single set of forms; the tax engine is essentially an implementation of the Internal Revenue Code
Aligns with most domestic regulatory frameworks — Reg BI, fair-lending testing, SR 11-7 model validation are all US-domestic-focused
Mock-data tooling is mature — every generic synthetic-data tool produces US-domestic shapes adequately

Cons

Structural correctness gap on any cross-border holding — foreign-domiciled mutual funds, ADRs, foreign equities all get incorrect tax treatment
FBAR/FATCA/PFIC reporting unreachable — the test corpus has no households that would trigger these forms; the code paths are untested
Multi-currency code paths unexercised — even if the platform supports non-USD holdings in code, no test data validates the FX-translation logic
Internationally-mobile employees unsupported — equity-comp test data assumes single-country residency throughout
First non-US customer triggers a significant rework — the gaps surface as production bugs rather than test failures
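
To make the multi-currency gap concrete, here is a minimal sketch of a test case that a domestic-only corpus cannot contain. It assumes a hypothetical `Lot` record and a simplified convention of translating cost and proceeds to USD at the spot rate on each transaction date; the field names and the exchange rates are illustrative, not a real platform's API or real market data.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Lot:
    """Hypothetical foreign-currency lot; all field names are illustrative."""
    quantity: Decimal
    buy_price_local: Decimal   # per-share price in the holding's currency
    sell_price_local: Decimal
    fx_at_buy: Decimal         # USD per 1 unit of local currency on the buy date
    fx_at_sell: Decimal        # USD per 1 unit of local currency on the sell date


def local_gain(lot: Lot) -> Decimal:
    """Gain measured in the holding's own currency."""
    return lot.quantity * (lot.sell_price_local - lot.buy_price_local)


def usd_gain(lot: Lot) -> Decimal:
    """Gain in USD, translating each leg at its own transaction-date rate
    (an assumed convention for this sketch)."""
    basis_usd = lot.quantity * lot.buy_price_local * lot.fx_at_buy
    proceeds_usd = lot.quantity * lot.sell_price_local * lot.fx_at_sell
    return proceeds_usd - basis_usd


# The price is flat in EUR, but the euro strengthened between purchase and sale.
lot = Lot(
    quantity=Decimal("100"),
    buy_price_local=Decimal("50"),
    sell_price_local=Decimal("50"),
    fx_at_buy=Decimal("1.10"),
    fx_at_sell=Decimal("1.20"),
)
print(local_gain(lot))  # no gain in EUR
print(usd_gain(lot))    # a gain in USD, driven entirely by the FX move
```

In a domestic-only corpus both rates are effectively 1 on every lot, so a bug that applies a single rate to both legs would never fail a test.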

When to choose

Choose domestic-only when: (1) the platform's customer segmentation is genuinely US-resident-only — and the platform has a documented mechanism to reject non-US customers at onboarding; (2) the MVP timeline is short and the cross-border work can be sequenced as a Phase 2 effort with explicit acknowledgment of the coverage gap; (3) the regulatory framework being targeted (Reg BI, SR 11-7) is US-domestic-only; or (4) the platform is deliberately scoping to the modal customer and is willing to ship with an explicit 'we don't support cross-border' product position.

Cross-border test data

A test corpus that includes households with multi-currency portfolios, foreign-domiciled holdings (including PFICs), internationally-mobile employees, FBAR/FATCA reporting triggers, and treaty-tier-withholding scenarios. The test surface required for any platform that doesn't deliberately exclude non-US customers.

Pros

Covers the actual customer distribution — even US-focused platforms typically have 5-15% cross-border customers (expats, dual-residents, holders of foreign assets)
PFIC handling exercised — the test corpus has foreign-fund holdings under each of the three regimes (default §1291, QEF, mark-to-market); Form 8621 generation is testable
Multi-currency code paths validated — FX-consistency rules, hedge-overlay logic, dual-currency basis tracking all exercised
FBAR/FATCA forms exercisable — test households trigger filing thresholds, exercising the form-generation logic before a real customer needs it
Internationally-mobile employees supported — cross-border equity-comp logic, residency-day-counting, treaty interaction all testable
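
Residency-day-counting is one place where a worked rule helps. The following is a minimal sketch of the day-count arithmetic in the IRS substantial presence test: at least 31 days in the current year, and at least 183 days when counting all current-year days, one third of prior-year days, and one sixth of the days from the year before that. The function name is hypothetical, and the sketch deliberately ignores exempt-individual days, the closer-connection exception, and treaty tie-breaker rules, all of which a real test corpus would also need to exercise.

```python
from fractions import Fraction


def meets_substantial_presence(days_current: int, days_prior: int, days_two_prior: int) -> bool:
    """Day-count portion of the substantial presence test (simplified sketch)."""
    # Condition 1: physically present at least 31 days in the current year.
    if days_current < 31:
        return False
    # Condition 2: weighted three-year total of at least 183 days.
    # Fractions avoid floating-point error right at the boundary.
    weighted = days_current + Fraction(days_prior, 3) + Fraction(days_two_prior, 6)
    return weighted >= 183


# Boundary households worth having in a cross-border corpus:
print(meets_substantial_presence(120, 120, 120))  # weighted total is 180
print(meets_substantial_presence(122, 122, 122))  # weighted total is exactly 183
print(meets_substantial_presence(30, 365, 365))   # fails the 31-day condition
```

The boundary cases are the useful ones: households sitting one or two days either side of the threshold are exactly what a generic domestic generator never produces.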

Cons

Larger test surface — more schemas, more edge cases, more per-country-pair logic to maintain
Cross-border tax law is fast-moving — annual treaty changes, IRS guidance updates, foreign-jurisdiction rule changes; test data has to be refreshed
Per-country expertise required — building or buying calibrated test data for each major treaty partner requires either internal expertise or vendor coverage
Reporting overlay is heavy — FBAR, FATCA, PFIC, FTC, GILTI, Subpart F each have their own form-generation logic and their own test-data requirements
More expensive to source — cross-border-aware synthetic data is a smaller market and typically commands a premium over domestic-only

When to choose

Choose cross-border test data when: (1) the platform's customer base includes any non-US-resident customers — even a small percentage means the cross-border code paths run in production; (2) the platform supports any foreign-domiciled or foreign-issuer holdings (including UCITS ETFs, ADRs, foreign-listed equities, and non-US mutual funds); (3) the platform serves multinational employees with cross-border equity grants; (4) the platform claims FBAR/FATCA/PFIC handling — the forms have to be testable; or (5) the platform's regulatory position requires demonstrating cross-border data handling (some institutional procurement explicitly requires this).

Decision framework

The decision usually reduces to two questions: does the platform support non-US customers (deliberately or accidentally), and does the platform claim to handle cross-border features?

If the platform is genuinely US-only — with a documented onboarding mechanism that rejects non-US-resident customers and a clear product position — domestic-only test data is sufficient for the platform's claimed scope. The risk is that customer support, marketing, and sales sometimes onboard customers who shouldn't have been onboarded; if the platform has any history of cross-border customers, the test data probably needs to expand.

If the platform's customer base includes any non-US-resident customers — even at a 5-10% rate — the cross-border code paths are running in production and the test data has to exercise them. Shipping with domestic-only test data and live cross-border customers is a known-correctness-gap posture that often shows up in audit findings or customer-experience problems before it shows up in regulatory action.

If the platform claims any cross-border feature — multi-currency reporting, foreign-fund handling, internationally-mobile employee support, FBAR/FATCA form generation — domestic-only test data structurally cannot validate the feature. The feature has to ship without test coverage, which is a posture few platforms accept once they recognize what's happening.

The sequencing pattern that works for most teams: ship Phase 1 with domestic-only test data and explicit cross-border-not-supported product scope; build Phase 2 with cross-border test data and supported scope; gate the Phase 2 launch on the test data being in place. The pattern fails when Phase 1 quietly accepts cross-border customers anyway and Phase 2 gets deprioritized — a common path that produces production bugs that look like product bugs and are actually test-data gaps.
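
One way to make "gate the Phase 2 launch on the test data being in place" mechanical is a coverage check that fails while any required scenario has no test household. The sketch below assumes each household record carries a `scenarios` list of tags; the tag names, the record shape, and the function names are all hypothetical.

```python
# Hypothetical scenario tags a Phase 2 launch might require coverage for.
REQUIRED_SCENARIOS = frozenset({
    "multi_currency",
    "pfic_1291",
    "pfic_qef",
    "pfic_mtm",
    "fbar_trigger",
    "fatca_trigger",
    "mobile_employee",
})


def missing_scenarios(corpus):
    """Return the required scenario tags that no household in the corpus covers."""
    covered = set()
    for household in corpus:
        covered.update(household.get("scenarios", ()))
    return REQUIRED_SCENARIOS - covered


def phase2_launch_allowed(corpus):
    """The gate passes only when every required scenario has at least one household."""
    return not missing_scenarios(corpus)


domestic_only = [{"id": "H1", "scenarios": []}, {"id": "H2"}]
print(sorted(missing_scenarios(domestic_only)))  # every required tag is missing
print(phase2_launch_allowed(domestic_only))
```

Running a check like this in CI turns a deprioritized Phase 2 into a visible failing gate instead of a silent gap.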

A cross-cutting consideration: the cross-border test data shape is qualitatively different from domestic-only, not just larger. Multi-currency rules, PFIC tracking, FTC carryforward state, and residency-day-counting all require data structures that a domestic-only corpus doesn't have at all. The platform's data model has to accommodate these shapes from the schema design forward; trying to retrofit them onto a US-resident-only schema is the path that produces the rework most platforms regret.
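
As a rough illustration of what those shapes might look like, here is a schema sketch using Python dataclasses. Every class and field name is an assumption made for illustration, not WealthSynth's or any platform's actual schema; the point is only that fields such as a PFIC regime, a dual-currency basis, residency periods, and per-year FTC carryforwards have no counterpart in a US-resident-only model.

```python
from dataclasses import dataclass, field
from datetime import date
from decimal import Decimal
from enum import Enum
from typing import Dict, List, Optional


class PficRegime(Enum):
    SECTION_1291 = "1291"      # default excess-distribution regime
    QEF = "qef"                # qualified electing fund
    MARK_TO_MARKET = "mtm"


@dataclass
class Holding:
    symbol: str
    currency: str              # ISO 4217 code of the holding's currency
    domicile: str              # ISO 3166 country code of the fund or issuer
    quantity: Decimal
    basis_local: Decimal       # cost basis in the holding's currency
    basis_usd: Decimal         # cost basis translated to USD
    pfic_regime: Optional[PficRegime] = None  # None when the holding is not a PFIC


@dataclass
class ResidencyPeriod:
    country: str
    start: date
    end: date                  # inclusive

    def days(self) -> int:
        return (self.end - self.start).days + 1


@dataclass
class Household:
    household_id: str
    base_currency: str = "USD"
    holdings: List[Holding] = field(default_factory=list)
    residency: List[ResidencyPeriod] = field(default_factory=list)
    ftc_carryforward_usd: Dict[int, Decimal] = field(default_factory=dict)  # keyed by tax year


household = Household(
    household_id="HH-0001",
    holdings=[Holding("CAFUND", "CAD", "CA", Decimal("250"), Decimal("5000"),
                      Decimal("3700"), PficRegime.SECTION_1291)],
    residency=[ResidencyPeriod("CA", date(2025, 1, 1), date(2025, 6, 30))],
    ftc_carryforward_usd={2024: Decimal("412.50")},
)
print(household.residency[0].days())
```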

Bottom line

Domestic-only test data is fine for platforms with explicit US-only scope and disciplined customer onboarding. Cross-border test data is mandatory for any platform that supports — or accidentally serves — non-US customers, foreign-domiciled holdings, internationally-mobile employees, or any of the FBAR/FATCA/PFIC reporting machinery. The WealthSynth cross-border bundles include households calibrated to each of the major cross-border test cases, with realistic distributions of treaty partners, PFIC regimes, and residency-mobility patterns; for platforms that need cross-border test data, the [WealthSynth catalog](/datasets) has the bundles to drop in.

FAQ

What percentage of customers are cross-border for a typical US-focused wealth-tech platform?

Highly variable, but typically 5-15% even for explicitly US-focused platforms. The drivers are usually: US citizens living abroad (roughly 9 million people in total), green-card holders splitting time between countries, dual citizens, recent immigrants who retain home-country accounts, internationally-mobile employees, and US residents holding foreign-domiciled investments (Vanguard UCITS ETFs are surprisingly common in retail portfolios, often via inadvertent purchase through international-broker access). Platforms that haven't measured tend to underestimate.

Can I add cross-border features incrementally without overhauling my test data?

Partially. Some cross-border features (basic FBAR threshold checking, simple foreign-dividend FTC handling) can be added incrementally with targeted test cases. Others (PFIC tracking, residency-day-counting, multi-currency consistency) require schema changes that benefit from full cross-border test data from the start. The pragmatic path is to identify which specific cross-border feature is being added and request the corresponding subset of test data — most cross-border test corpora are organized to support this.
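
As an example of the "targeted test cases" approach, basic FBAR threshold checking can be covered with a handful of boundary households. The sketch below relies on the rule that an FBAR (FinCEN Form 114) is required when the aggregate value of foreign financial accounts exceeds $10,000 at any time during the calendar year. Summing each account's maximum value is a conservative simplification assumed here, the values are assumed to be already converted to USD, and the function and case names are hypothetical.

```python
from decimal import Decimal

FBAR_THRESHOLD_USD = Decimal("10000")


def fbar_required(max_values_usd):
    """True when the summed per-account maximum values exceed the threshold.

    Simplification: per-account maxima are summed even if they occurred on
    different days, which can only over-report, never under-report.
    """
    return sum(max_values_usd, Decimal("0")) > FBAR_THRESHOLD_USD


# Targeted boundary cases: (description, per-account maxima in USD, expected result)
CASES = [
    ("no foreign accounts", [], False),
    ("single account just under", [Decimal("9999.99")], False),
    ("exactly at the threshold", [Decimal("10000")], False),
    ("two small accounts that aggregate over", [Decimal("6000"), Decimal("4000.01")], True),
]

for description, values, expected in CASES:
    assert fbar_required(values) == expected, description
print("all FBAR boundary cases pass")
```

The aggregate case is the one most often missed: neither account crosses the threshold alone, so a per-account check passes while the filing requirement is actually triggered.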

Is cross-border test data more expensive than domestic-only?

Usually yes, modestly. The reason is calibration: the cross-border test data has to be calibrated against per-country tax-treaty rates, per-country withholding conventions, per-country fund domicile patterns. Domestic-only data has one country's calibration; cross-border has 20+ for the major treaty partners. The price differential at WealthSynth and at most cross-border-aware vendors is in the 30-100% premium range over domestic-only equivalents.

Does cross-border test data work for non-US-base-currency platforms (e.g. UK-based platforms)?

Yes, with the same basic structure but with different sourcing. A UK-base-currency platform needs UK-domestic + cross-border test data; the cross-border layer covers the inverse direction (US-source income to UK-resident customer, etc.) plus the same multi-country complications. The WealthSynth cross-border bundles are USD-base-currency by default but support non-USD base currency on request; the underlying calibration generalizes.

How do I tell whether my platform has hidden cross-border customers?

Three signals: (1) any account onboarded with a non-US tax-residency declaration; (2) any 1099-DIV with foreign tax withheld in the customer's recent tax history; (3) any aggregator-linked account at a non-US institution. A simple production query against any of these will surface the cross-border customer count; most platforms find numbers higher than they expected. The follow-up question is whether those customers are being served correctly, which usually leads to the test-data conversation.
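
A minimal sketch of that check, written against in-memory account records rather than a real production store. The record shape and every field name (`tax_residency`, `forms_1099_div`, `foreign_tax_paid`, `linked_accounts`, `institution_country`) are assumptions for illustration and would need to be mapped onto the platform's actual schema.

```python
from typing import Dict, List


def cross_border_signals(account: Dict) -> List[str]:
    """Return which of the three cross-border signals an account record shows."""
    signals = []
    # Signal 1: non-US tax-residency declaration at onboarding.
    if account.get("tax_residency", "US") != "US":
        signals.append("non_us_tax_residency")
    # Signal 2: any 1099-DIV in the tax history reporting foreign tax.
    if any(form.get("foreign_tax_paid", 0) > 0 for form in account.get("forms_1099_div", [])):
        signals.append("foreign_tax_on_1099_div")
    # Signal 3: any aggregator-linked account held at a non-US institution.
    if any(link.get("institution_country", "US") != "US" for link in account.get("linked_accounts", [])):
        signals.append("non_us_linked_institution")
    return signals


accounts = [
    {"id": "A1", "tax_residency": "US"},
    {"id": "A2", "tax_residency": "US", "forms_1099_div": [{"foreign_tax_paid": 37.12}]},
    {"id": "A3", "tax_residency": "GB", "linked_accounts": [{"institution_country": "GB"}]},
]
flagged = {a["id"]: cross_border_signals(a) for a in accounts if cross_border_signals(a)}
print(len(flagged), "of", len(accounts), "accounts show at least one signal")
```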

What about Canada, UK, Australia — close-allied countries with similar systems?

Each has its own test-data requirements. The UK has its own ISA and SIPP tax-advantaged account structures, different equity-comp tax treatment, and different FX handling. Canada has TFSA/RRSP accounts, different mutual-fund tax treatment, and is the largest single source of PFIC holdings for US holders. Australia has SMSFs (self-managed super funds) and different employee-share-scheme rules. 'Close-allied' doesn't mean 'same'; cross-border test data has to be country-specific even for the major treaty partners.