Wash-sale tracking algorithms — why cross-account reconciliation is harder than most engines assume

WealthSchema Staff | Tax-aware investment software | May 30, 2026 | 5 min read

The wash-sale rule, codified at IRC §1091, is forty words of statute and a thousand pages of edge cases. For most retail investors with a single brokerage account, it's a minor annoyance — sell at a loss, buy back within 30 days, lose the loss. For tax-loss harvesting engines, direct-indexing platforms, multi-account households, and the wealth platforms that serve them, the rule is a structural problem whose data and algorithmic requirements are routinely underestimated.

This guide is for engineering teams building tax-loss harvesting algorithms, direct-indexing engines, or any system that has to track wash-sale risk across accounts and over time. It walks through what the rule actually requires (versus what most engines actually compute), the cross-account scenarios that break naive implementations, and the data-model decisions that determine whether an engine produces correct results or quietly produces ones that fail under audit.

What the rule actually says

§1091 disallows a loss on the sale of stock or securities if, within 30 days before or after the sale, the taxpayer has acquired (or entered into a contract or option to acquire) substantially identical stock or securities. The statute itself speaks only of the taxpayer; IRS guidance (Publication 550) extends the same treatment to purchases by the taxpayer's spouse or by a corporation the taxpayer controls. The disallowed loss is added to the basis of the replacement security (§1091(d)), and the holding period of the replacement security is the original lot's holding period plus any time the replacement security has been held independently (§1223).

Three structural requirements emerge from those forty words:

Requirement 1: The 61-day window

30 days before the loss sale, the day of the sale, and 30 days after. Any purchase of a substantially identical security in this window triggers the rule.

Requirement 2: Substantially identical

Identical CUSIPs are clearly substantially identical. Different share classes are sometimes. Two ETFs tracking the same index are sometimes. The IRS has not published bright-line rules — material risk for direct-indexing engines.

Requirement 3: 'You' includes related parties

Spouse counts. Controlled corporations count. Per Rev. Rul. 2008-5, traditional and Roth IRAs of the taxpayer count — meaning a loss in a taxable account can be disallowed by a purchase in the IRA.
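
A minimal sketch of the window test from Requirement 1, assuming trade dates are available as plain calendar dates. The function names are illustrative and not taken from any particular engine.

```python
from datetime import date, timedelta

WINDOW_DAYS = 30  # days on each side of the loss-sale trade date


def wash_sale_window(loss_trade_date: date) -> tuple[date, date]:
    """Return the inclusive (start, end) dates of the 61-day window."""
    delta = timedelta(days=WINDOW_DAYS)
    return loss_trade_date - delta, loss_trade_date + delta


def in_window(loss_trade_date: date, purchase_trade_date: date) -> bool:
    """True if a purchase trade date falls inside the window around the loss sale."""
    start, end = wash_sale_window(loss_trade_date)
    return start <= purchase_trade_date <= end
```

Both endpoints are inclusive, which is what makes the window 61 days rather than 60.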

Where most engines fall short

Most wash-sale tracking engines we've reviewed handle the basic single-account case correctly. They fall short in several specific ways:

Failure	Mechanism
Cross-account tracking incomplete	Many engines track within a single account but not across accounts at the same firm; almost none track across firms. Schwab taxable + Fidelity IRA: usually invisible.
Spouse-account tracking is rare	IRS guidance (Publication 550) treats a spouse's purchase as triggering the rule. Tracking requires a household-level data structure or explicit spouse linkage. Most direct-indexing engines don't track spouse accounts at all.
Substantially-identical determination is simplistic	Many engines use only CUSIP equality. A correct implementation needs a substitutability matrix accounting for share-class variations and ETF-pair similarity.
Wash-sale carry-over computed incorrectly	When the rule fires, the disallowed loss adds to the basis of the replacement security. Engines that record disallowance but don't update replacement basis understate eventual taxable gain.
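Re-purchase timing tracked incorrectly	The 61-day window is measured on trade date, not settlement date. Engines using settlement date are off by the settlement lag (one business day for most US equities since the move to T+1 in May 2024, longer across weekends and holidays), which is enough to generate false positives and negatives at window boundaries.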
Partial-lot tracking incorrect	A 100-share lot sold at loss with a 30-share replacement triggers wash-sale on 30 of 100 shares only. Engines that treat the entire lot as disallowed overstate; engines that treat none as disallowed understate.
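
The partial-lot row can be illustrated with a short sketch. It assumes the loss is spread evenly across the shares of the lot sold, and the function name is hypothetical.

```python
from decimal import Decimal


def prorate_disallowance(total_loss: Decimal, shares_sold: int, replacement_shares: int):
    """Split a realized loss into disallowed and allowed parts.

    total_loss is a positive number. Only the shares that are actually
    replaced (the lesser of the two share counts) carry a disallowed loss.
    Returns (matched_shares, disallowed_loss, allowed_loss).
    """
    if shares_sold <= 0:
        raise ValueError("shares_sold must be positive")
    matched = min(shares_sold, max(replacement_shares, 0))
    disallowed = total_loss * matched / shares_sold
    return matched, disallowed, total_loss - disallowed
```

For the 100-share lot with a 30-share replacement and a $1,000 loss, this yields 30 matched shares, $300 disallowed, and $700 allowed.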

The data-model requirements

A correct wash-sale tracking algorithm requires a data model that captures, at minimum:

Per-lot data. Security identifier (CUSIP, plus a substantially-identical-key for cross-CUSIP matching). Account holding the lot. Acquisition date (trade date). Acquisition cost basis. Lot quantity. Sale date and proceeds. Realized gain or loss. Wash-sale disallowed amount. Reference to the replacement lot.

Per-account data. Account-holder taxpayer ID. Account-holder spouse taxpayer ID. Account type (taxable, traditional IRA, Roth IRA, 401(k), HSA — different rules apply). Linked accounts at the same firm; explicitly-linked accounts at other firms.

Per-taxpayer data. Taxpayer ID. Spouse taxpayer ID. Linked entities (corporations the taxpayer controls). Filing status (which determines whether spouse aggregation is required).

Per-security data. Security identifier. Substantially-identical key (the abstraction that lets the algorithm match across CUSIPs when appropriate). Security type. Index reference (for index-tracking funds).
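
One way to express these four record types is as plain dataclasses. This is a sketch under stated assumptions: every class and field name below is illustrative, and a production schema would carry more fields and validation.

```python
from dataclasses import dataclass, field
from datetime import date
from decimal import Decimal
from enum import Enum
from typing import Optional


class AccountType(Enum):
    TAXABLE = "taxable"
    TRADITIONAL_IRA = "traditional_ira"
    ROTH_IRA = "roth_ira"
    K401 = "401k"
    HSA = "hsa"


@dataclass
class Security:
    security_id: str                 # e.g. a CUSIP
    si_key: str                      # substantially-identical key
    security_type: str
    index_ref: Optional[str] = None  # for index-tracking funds


@dataclass
class Taxpayer:
    taxpayer_id: str
    spouse_id: Optional[str] = None
    linked_entity_ids: list[str] = field(default_factory=list)
    filing_status: Optional[str] = None

    def related_ids(self) -> set[str]:
        """Every owner ID whose purchases must be searched for this taxpayer."""
        ids = {self.taxpayer_id, *self.linked_entity_ids}
        if self.spouse_id:
            ids.add(self.spouse_id)
        return ids


@dataclass
class Account:
    account_id: str
    owner_id: str            # taxpayer ID of the account holder
    account_type: AccountType
    custodian: str           # firm holding the account


@dataclass
class Lot:
    lot_id: str
    account_id: str
    security_id: str
    acquired: date           # trade date, not settlement date
    quantity: Decimal
    cost_basis: Decimal
    sold: Optional[date] = None
    proceeds: Optional[Decimal] = None
    wash_disallowed: Decimal = Decimal("0")
    replacement_lot_id: Optional[str] = None

    @property
    def realized(self) -> Optional[Decimal]:
        """Realized gain (positive) or loss (negative); None while the lot is open."""
        if self.proceeds is None:
            return None
        return self.proceeds - self.cost_basis
```

Keeping the related-party set on the taxpayer record, rather than on each account, means a cross-firm account only needs an owner ID to be included in the search.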

The substantially-identical key is the part most engines under-design. A clean implementation has, per security, a list of securities the system considers substantially identical, with the basis for each pairing documented (CUSIP equality, share-class variation, same-index-tracking with similar mandate). The substitutability matrix is a configuration the firm maintains; it embodies the firm's policy on how aggressively to interpret the substantially-identical test.
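
A substitutability matrix along these lines could be sketched as a symmetric lookup that stores the documented basis with each pairing. The class names and basis categories are assumptions made for illustration, tickers stand in for real security identifiers, and the VOO/IVV entry is an example of a firm policy choice rather than a tax determination.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class PairingBasis(Enum):
    SAME_IDENTIFIER = "cusip_equality"
    SHARE_CLASS = "share_class_variation"
    SAME_INDEX = "same_index_similar_mandate"


@dataclass(frozen=True)
class Pairing:
    other: str
    basis: PairingBasis
    note: str = ""  # free-text documentation of the firm's reasoning


class SubstitutabilityMatrix:
    """Firm-maintained configuration of pairs treated as substantially identical."""

    def __init__(self) -> None:
        self._pairs: dict[str, dict[str, Pairing]] = {}

    def add_pair(self, a: str, b: str, basis: PairingBasis, note: str = "") -> None:
        # Store both directions so lookups are symmetric.
        self._pairs.setdefault(a, {})[b] = Pairing(b, basis, note)
        self._pairs.setdefault(b, {})[a] = Pairing(a, basis, note)

    def lookup(self, a: str, b: str) -> Optional[Pairing]:
        """Return the documented pairing, or None if the firm treats the pair as distinct."""
        if a == b:
            return Pairing(b, PairingBasis.SAME_IDENTIFIER)
        return self._pairs.get(a, {}).get(b)


matrix = SubstitutabilityMatrix()
matrix.add_pair("VOO", "IVV", PairingBasis.SAME_INDEX, "hypothetical policy: flag for review")
```

Returning the pairing record instead of a bare boolean keeps the determination basis available for the audit trail.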

The algorithm

Given the data model above, the wash-sale tracking algorithm runs roughly:

1. Identify the security and its substantially-identical key, for each loss sale.
2. Define the 61-day window, centered on the trade date.
3. Search all accounts of the taxpayer, spouse, and linked entities for purchases of substantially-identical securities within the window.
4. Compute the disallowance, proportional to the lesser of loss-sale shares and matching-purchase shares.
5. Add the disallowed amount to the replacement-lot basis, per §1091(d).
6. Tack the holding period: the loss lot's holding period transfers to the replacement.
7. Record the wash-sale event in the loss lot's history, with a reference to the matching purchase.
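
The seven steps can be sketched as a single function. This is an illustration under several assumptions, not a reference implementation: the record types are hypothetical, the purchase list excludes the lot being sold, purchases are matched earliest first, and lots are not split when only partly matched.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from decimal import Decimal


@dataclass
class Purchase:
    lot_id: str
    owner_id: str            # taxpayer, spouse, or linked entity
    si_key: str              # substantially-identical key
    trade_date: date
    shares: int
    basis: Decimal
    basis_adjustable: bool = True  # False for IRA lots (no basis adjustment)
    matched_shares: int = 0        # shares already used by earlier wash sales


@dataclass
class LossSale:
    owner_id: str
    si_key: str
    trade_date: date
    acquired: date
    shares: int
    loss: Decimal            # positive number


def apply_wash_sale(sale: LossSale, purchases: list, related_ids: set) -> list:
    """Match a loss sale against in-window purchases and return the wash-sale events."""
    window = timedelta(days=30)
    lo, hi = sale.trade_date - window, sale.trade_date + window   # step 2
    candidates = sorted(                                          # steps 1 and 3
        (p for p in purchases
         if p.owner_id in related_ids
         and p.si_key == sale.si_key
         and lo <= p.trade_date <= hi
         and p.matched_shares < p.shares),
        key=lambda p: p.trade_date,
    )
    loss_per_share = sale.loss / sale.shares
    held_days = (sale.trade_date - sale.acquired).days
    remaining = sale.shares
    events = []
    for p in candidates:
        if remaining == 0:
            break
        n = min(remaining, p.shares - p.matched_shares)           # step 4
        disallowed = loss_per_share * n
        if p.basis_adjustable:
            p.basis += disallowed                                 # step 5
        p.matched_shares += n
        remaining -= n
        events.append({                                           # steps 6 and 7
            "lot_id": p.lot_id,
            "shares": n,
            "disallowed": disallowed,
            "tacked_days": held_days,
        })
    return events
```

Tracking `matched_shares` on each purchase prevents the same replacement shares from absorbing two different losses.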

Straightforward in pseudocode and treacherous in production. The treachery is in the cross-account search, the spouse aggregation, the substantially-identical matching, and the partial-lot proration.

The direct-indexing case

Direct indexing is a category whose entire economics depends on aggressive tax-loss harvesting. A direct-indexing engine running against a customer's portfolio harvests losses opportunistically, sometimes daily. Wash-sale risk is the single biggest constraint on the engine's operation.

Several direct-indexing-specific challenges:

Cross-customer-portfolio wash sales. A direct-indexing engine harvesting a loss on Apple in one customer's portfolio doesn't trigger the rule against another customer's portfolio (different taxpayers). But within a customer's portfolio, the engine has to coordinate: harvesting a loss in the taxable account while another part of the engine simultaneously rebalances a similar position in the IRA can trigger the cross-account rule.

Replacement-security selection. When the engine sells a position at a loss, it has to replace it with something. The replacement can't be substantially identical (or the loss is disallowed), but it should be close enough to maintain index tracking. Conservative engines substitute into clearly-different securities (S&P 500 to Russell 1000) at the cost of slightly worse tracking; aggressive engines substitute within the substantially-identical-uncertain space and accept the audit risk.

Re-balance interaction. Direct-indexing engines also run periodic rebalances. If the engine harvests a loss on Monday and rebalances on Friday by reintroducing the harvested security — that's a wash sale, even if the engine didn't intend it.

Tax-lot specification. Direct indexing typically uses specific-ID lot accounting to maximize harvesting. Wash-sale tracking interacts with lot specification in ways that simpler average-cost engines don't have to handle.

The engineering implication: direct-indexing engines have to model wash-sale risk as a constraint in the optimization that drives every transaction, not as a post-hoc check after transactions are made.
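
As a sketch of what "constraint, not post-hoc check" might look like, the guards below would run before an order is generated. The function names and the (key, trade date) tuple shape are assumptions for illustration; a real optimizer would fold these checks into its feasibility constraints.

```python
from datetime import date, timedelta
from typing import Iterable, Optional, Tuple

WINDOW_DAYS = 30
Trade = Tuple[str, date]  # (substantially-identical key, trade date)


def buy_would_wash(si_key: str, buy_date: date, household_loss_sales: Iterable[Trade]) -> bool:
    """True if buying now would fall within 30 days after a household loss sale."""
    return any(
        k == si_key and 0 <= (buy_date - d).days <= WINDOW_DAYS
        for k, d in household_loss_sales
    )


def harvest_would_wash(si_key: str, sale_date: date, household_purchases: Iterable[Trade]) -> bool:
    """True if an executed or scheduled household purchase sits inside the window."""
    return any(
        k == si_key and abs((sale_date - d).days) <= WINDOW_DAYS
        for k, d in household_purchases
    )


def earliest_rebuy_date(si_key: str, household_loss_sales: Iterable[Trade]) -> Optional[date]:
    """First trade date on which a repurchase no longer hits any recorded loss sale."""
    dates = [d for k, d in household_loss_sales if k == si_key]
    return max(dates) + timedelta(days=WINDOW_DAYS + 1) if dates else None
```

With a loss harvested on Monday, June 1, 2026, `buy_would_wash` rejects a rebalance buy on Friday, June 5, and `earliest_rebuy_date` gives July 2 as the first clear date.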

What we'd test against

A defensible wash-sale tracking engine should pass each of these structural test cases:

Ten structural wash-sale test cases

Single-account, in-window, exact-CUSIP match. Sell 100 XYZ at $1,000 loss; buy 100 XYZ 15 days later. Disallow $1,000; add to replacement basis.
Single-account, partial replacement. Sell 100 XYZ at $1,000 loss; buy 30 XYZ 15 days later. Disallow $300 (proportional); pass through $700.
Cross-account, taxable→IRA. Sell 100 XYZ at $1,000 loss in taxable; buy 100 XYZ in IRA 10 days later. Disallow per Rev. Rul. 2008-5; loss permanently disallowed (no IRA basis adjustment).
Cross-account, spouse aggregation. Sell 100 XYZ at $1,000 loss in customer's account; spouse buys 100 XYZ in spouse's account 5 days later (MFJ). Disallow per spouse-aggregation rule.
Substantially-identical ETF substitution. Sell VOO at loss; buy IVV 10 days later. Engine flags for review per the firm's substitutability matrix.
Trade-date-vs-settlement-date edge. Sell at loss on day T; buy substantially-identical with trade date T+30 and settlement after T+30 (outside the window if settlement date were used). Engine treats trade date as trigger; wash sale fires.
Partial wash-sale with subsequent sale. Sell 100 at $1,000 loss; buy 30 (triggers $300 disallowance + basis bump); sell those 30 six months later for $200 more than their purchase cost. Engine computes (cost + $200) - (cost + $300) = -$100, i.e. a $100 loss, with tacked holding period.
Replacement of replacement. Sell A at a loss; buy B (substantially identical), wash-sale fires; sell B at a loss; buy A again 10 days later. Engine fires a second wash sale on B's loss, with the basis adjustment from the first firing carried through.
QSBS interaction. QSBS-eligible position sold at loss with wash-sale-triggering replacement. Engine disallows per §1091 AND preserves QSBS attribution on replacement lot.
Direct-indexing reentry. Engine harvests loss on Monday; separate rebalancing process reintroduces security 25 days later. Engine detects cross-process wash-sale and either prevents the rebalance or correctly applies the rule.
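
The arithmetic behind the partial-wash-sale-with-subsequent-sale case can be checked with a small sketch. The $1,500 purchase cost and the dates are made-up inputs, the helper names are hypothetical, and the holding-period count is a plain day difference that ignores day-count conventions.

```python
from datetime import date
from decimal import Decimal


def gain_on_replacement_sale(purchase_cost: Decimal, disallowed_loss: Decimal,
                             proceeds: Decimal) -> Decimal:
    """Gain (positive) or loss (negative) once the disallowed loss is added to basis."""
    adjusted_basis = purchase_cost + disallowed_loss
    return proceeds - adjusted_basis


def tacked_holding_days(orig_acquired: date, orig_sold: date,
                        repl_acquired: date, repl_sold: date) -> int:
    """Days the original lot was held plus days the replacement was held."""
    return (orig_sold - orig_acquired).days + (repl_sold - repl_acquired).days


cost = Decimal("1500")                 # assumed cost of the 30 replacement shares
proceeds = cost + Decimal("200")       # sold for $200 more than cost
result = gain_on_replacement_sale(cost, Decimal("300"), proceeds)
```

Here `result` is -100: the apparent $200 gain becomes a $100 loss once the $300 disallowed amount is in the basis.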

Test data exercising each of these patterns is hard to construct from production systems — partly because cross-account scenarios require households with multiple linked accounts, partly because the trade-date edge cases require sub-day timing precision that production systems don't always preserve in their testing environments. Our Tax-Loss Harvesting Simulator pack is built specifically to provide this — 350 households with cross-account wash-sale conflict scenarios, holding-period edge cases, and the QSBS interactions most engines underweight.

The audit-defense angle

A wash-sale tracking engine's most consequential test is audit defense. When a customer's CPA, or the IRS, asks why a particular loss was disallowed (or wasn't), the engine has to produce documentation that traces the determination to the underlying transactions, the substantially-identical key, and the firm's policy. Engines that produce a flat "wash sale: yes/no" flag without the supporting derivation make audit defense harder, not easier.

Per disallowance event, the engine should produce:

The loss-sale transaction
The matching purchase transaction
The substantially-identical determination basis
The disallowance computation (proportional partial-lot if applicable)
The resulting basis adjustment to the replacement security

Without all five, the audit defense is weaker than it should be.
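
A per-event record covering the five items might look like the following sketch. The field names are illustrative, and amounts are held as strings here purely to keep the serialized form exact.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class WashSaleAuditRecord:
    loss_sale_tx_id: str            # 1. the loss-sale transaction
    matching_purchase_tx_id: str    # 2. the matching purchase transaction
    si_basis: str                   # 3. substantially-identical determination basis
    shares_sold: int                # 4. inputs to the disallowance computation
    shares_matched: int
    total_loss: str
    disallowed_loss: str
    replacement_basis_before: str   # 5. resulting basis adjustment
    replacement_basis_after: str

    def to_json(self) -> str:
        """Serialize with stable key order so records can be diffed and archived."""
        return json.dumps(asdict(self), sort_keys=True)


record = WashSaleAuditRecord(
    loss_sale_tx_id="tx-sale-001",
    matching_purchase_tx_id="tx-buy-002",
    si_basis="cusip_equality",
    shares_sold=100,
    shares_matched=30,
    total_loss="1000.00",
    disallowed_loss="300.00",
    replacement_basis_before="3000.00",
    replacement_basis_after="3300.00",
)
```

Making the record immutable (`frozen=True`) reflects the idea that a determination, once written, is amended by a new record rather than edited in place.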

Closing

Wash-sale tracking is the kind of problem that looks simple in pseudocode and turns out to be a multi-system, multi-taxpayer, multi-account engineering problem in practice. The engines that handle it best are the ones that treat it as a constraint in the transaction-decision logic rather than a reporting concern after the fact, and that model the substantially-identical determination as a configurable matrix rather than CUSIP equality.

If you're building a tax-loss harvesting algorithm, a direct-indexing engine, or a multi-account tax-aware platform, the test corpus has to exercise the cross-account, spouse-aggregation, and substantially-identical edge cases explicitly. Production data, even at scale, generally doesn't contain enough of these patterns to validate the engine.

Our Tax-Loss Harvesting Simulator is the corpus we'd test against. The free sample on GitHub lets you inspect the schema and the lot-level structure before any commitment.

Key takeaways

Wash-sale tracking is mechanically simple at the single-account level and structurally treacherous across accounts, across spouses, across CUSIPs, and across the trade-date / settlement-date boundary.
Per Rev. Rul. 2008-5, IRA purchases can disallow taxable-account losses — and unlike normal §1091 firings, the loss is permanently disallowed because IRAs have no external basis to adjust.
The substantially-identical determination should be a configurable matrix (per security → list of substitutes with documented basis), not CUSIP equality. Direct-indexing engines depend on aggressive but defensible substitutions in this space.
The 61-day window runs on trade date, not settlement date; engines using settlement date are off by the settlement lag (at least one business day under T+1) at the boundaries.
Direct-indexing engines must model wash-sale risk as a constraint in the optimization that drives every transaction, not as a post-hoc reporting concern. Per-disallowance documentation (loss tx + matching tx + substitutability basis + computation + basis adjustment) is what audit-defense actually requires.


This document is general guidance for engineering teams building tax-aware investment software. It is not tax or legal advice. Firms operating in this space must engage qualified tax counsel for product-specific validation.