Building a crypto / DeFi tax engine — every receipt is a basis event

WealthSchema Staff | Tax modeling | May 9, 2026 | 2 min read

A traditional brokerage tax engine has a clear taxonomy of events: buy, sell, dividend, split, return-of-capital. Crypto tax engines start with that vocabulary and discover a half-dozen new event types within the first month of real customer data, and a dozen more within the first year. An engine that handles all of them is a substantially more complex product than a securities tax engine; an engine that doesn't will produce wrong tax filings for its customers.

This article is the working note for engineering teams building crypto / DeFi tax engines. It covers the events that matter, the basis-tracking complications DeFi introduces, and the synthetic-data shape needed to test the engine's coverage of the long tail.

What "every receipt is a basis event" means

In traditional securities, basis is established at acquisition and adjusted at a few defined events (splits, returns of capital, wash-sale adjustments). In crypto, basis is potentially affected by every receipt — and "receipt" includes far more than purchases.

Event	Tax treatment at receipt	Basis of received asset
Purchase (fiat → crypto)	Non-taxable acquisition	USD spent + fees
Trade (crypto → crypto)	Taxable disposition of source asset	FMV at trade for received asset
Hard fork	Ordinary income at FMV when new units are received; no income if the fork delivers nothing (per Rev. Rul. 2019-24)	FMV at receipt
Airdrop	Ordinary income at FMV	FMV at receipt
Staking rewards	Ordinary income at FMV once the holder has dominion and control (per Rev. Rul. 2023-14; contested in Jarrett v. United States)	FMV at receipt
Mining rewards	Ordinary income at FMV (per IRS Notice 2014-21; self-employment income if a trade or business)	FMV at receipt
Liquidity-pool token receipt	Disposition vs. non-disposition is unsettled	Multiple defensible treatments
Wrapped token (e.g. WETH ↔ ETH)	Most likely non-taxable; conservative treatment as taxable	Carry-over basis (most common interpretation)
Cross-chain bridge	Typically taxable if the bridge is custodial; typically non-taxable if non-custodial	Carry-over basis if non-taxable
NFT mint	Generally non-taxable (cost-of-goods)	Cost of minting + gas
NFT receipt as gift	Non-taxable to recipient; donor's gift-tax considerations	Carry-over basis from donor
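Wash sale / disposition for capital loss	Capital loss; the wash-sale rule (IRC §1091) has historically reached only stock and securities, so whether it applies to a given digital asset must be confirmed for the tax year	Loss disallowed if a wash sale is triggered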

Each row is a code path. The engine that handles only purchase + trade has gaps for every other row.
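
One way to make "each row is a code path" concrete is a dispatch table keyed by event type. The sketch below is illustrative only: `ReceiptEvent`, `ReceiptResult`, and the handler names are hypothetical, and the treatments cover just a few rows of the table under the assumptions stated there, not a complete rule set.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Callable, Dict


@dataclass(frozen=True)
class ReceiptEvent:
    """Hypothetical normalized receipt event (field names are illustrative)."""
    event_type: str                     # e.g. "purchase", "airdrop"
    units: Decimal                      # quantity of the asset received
    fmv_usd: Decimal                    # FMV of the whole receipt, in USD
    cost_usd: Decimal = Decimal("0")    # USD spent, if any
    fees_usd: Decimal = Decimal("0")    # fees paid, if any


@dataclass(frozen=True)
class ReceiptResult:
    ordinary_income_usd: Decimal
    basis_usd: Decimal


def _purchase(e: ReceiptEvent) -> ReceiptResult:
    # Non-taxable acquisition: basis is USD spent plus fees.
    return ReceiptResult(Decimal("0"), e.cost_usd + e.fees_usd)


def _income_at_fmv(e: ReceiptEvent) -> ReceiptResult:
    # Ordinary income at FMV; basis equals the income recognized.
    return ReceiptResult(e.fmv_usd, e.fmv_usd)


HANDLERS: Dict[str, Callable[[ReceiptEvent], ReceiptResult]] = {
    "purchase": _purchase,
    "airdrop": _income_at_fmv,
    "staking_reward": _income_at_fmv,
    "mining_reward": _income_at_fmv,
}


def process_receipt(e: ReceiptEvent) -> ReceiptResult:
    handler = HANDLERS.get(e.event_type)
    if handler is None:
        # Fail loudly instead of silently treating an unknown event as a purchase.
        raise NotImplementedError(f"no handler for event type {e.event_type!r}")
    return handler(e)
```

Raising on an unregistered event type is a deliberate choice in this sketch: an uncovered row surfaces as an error in testing rather than as a silently wrong basis.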

The basis-tracking model

A crypto basis-tracking model has to extend the traditional lot-level model with crypto-specific fields:

Crypto lot record

crypto_lot = {
  lot_id, wallet_id, asset_id, network_id,
  shares, acquisition_date, basis_per_share,
  acquisition_method,
  fmv_at_acquisition,
  parent_lot_ids,
  network_specific: {
    transaction_hash, block_number,
    on_chain_metadata
  },
  special_status: {
    is_staking_reward, is_airdrop, is_fork,
    associated_position (LP token), nft_metadata
  }
}

wallet_id: Identifier for the holding wallet. A user can have multiple wallets, each tracked separately for custody and movement events.
network_id: Blockchain (Ethereum, Bitcoin, Solana, etc.). Cross-chain movement is a tax-relevant transition.
fmv_at_acquisition: Fair market value at the moment of receipt. Required for ordinary-income events (forks, airdrops, staking).
transaction_hash: On-chain identifier for the specific transaction that created this lot.
associated_position: For LP tokens, the underlying position they represent.

The on-chain metadata isn't just an audit trail — it's the source-of-truth for the events that affect basis. Engines that don't capture transaction hashes can't reconcile their basis calculations against blockchain history if disputed.
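
As a rough illustration, the lot record could be expressed as a Python dataclass. This is a minimal sketch, not a reference schema: the types, the `OnChainRef` helper, and the `is_reconcilable` check are assumptions added here, and the `special_status` fields are omitted for brevity.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from decimal import Decimal
from typing import List, Optional


@dataclass
class OnChainRef:
    """Hypothetical holder for the network_specific fields."""
    transaction_hash: str
    block_number: int


@dataclass
class CryptoLot:
    lot_id: str
    wallet_id: str
    asset_id: str
    network_id: str
    shares: Decimal                 # units held in this lot
    acquisition_date: datetime
    basis_per_share: Decimal        # USD basis per unit
    acquisition_method: str         # e.g. "purchase", "airdrop"
    fmv_at_acquisition: Optional[Decimal] = None
    parent_lot_ids: List[str] = field(default_factory=list)
    on_chain: Optional[OnChainRef] = None

    @property
    def total_basis(self) -> Decimal:
        return self.shares * self.basis_per_share

    def is_reconcilable(self) -> bool:
        # A lot without a transaction hash cannot be checked against chain history.
        return self.on_chain is not None and bool(self.on_chain.transaction_hash)


lot = CryptoLot(
    lot_id="lot-1", wallet_id="wallet-1", asset_id="ETH", network_id="ethereum",
    shares=Decimal("2"), acquisition_date=datetime(2024, 1, 1, tzinfo=timezone.utc),
    basis_per_share=Decimal("1500"), acquisition_method="purchase",
)
```

Using `Decimal` rather than floats avoids rounding drift when many small lots are summed.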

The DeFi-specific complications

DeFi protocols introduce events that don't have securities analogs. The hardest:

DeFi complication inventory

Liquidity pool deposits: the depositor receives LP tokens representing the pool position. Whether the deposit is a taxable disposition is unsettled: the conservative treatment is disposition, the aggressive treatment is non-disposition. The engine has to support both interpretations.
Liquidity pool withdrawals: symmetric to deposits. LP tokens are returned in exchange for the underlying assets, in amounts that reflect accrued protocol fees and impermanent loss.
Yield-farming compound rewards: periodic claims of rewards (yield) from staking LP tokens. Each claim is an ordinary-income event at receipt FMV.
Flash loan interactions: within-block borrow + transact + repay. Whether this creates basis events on the borrow side is contested; most engines treat these as non-events.
Governance token receipts: protocol governance distributions, similar to airdrops. Some are delivered automatically; others require active claiming, which itself may be a tax event.
Synthetic / wrapped derivatives: exposure to underlying assets without ownership. Generally treated as constructive ownership for tax purposes, but the line is unclear.
Bridges and rollups: cross-chain or layer-2 transfers. Custodial bridges are typically taxable and non-custodial bridges typically non-taxable; verifying which is which is non-trivial.
Token migrations: V1-to-V2 token swaps mandated by the protocol. Generally non-taxable with carryover basis, depending on whether the swap is mandatory and the underlying value is preserved.
Slashing: proof-of-stake validators losing staked tokens for misbehavior. Generally treated as a capital loss event.
Maximum extractable value (MEV): searcher rewards from arbitrage. Ordinary income for the searcher; for affected pool participants it changes the execution prices, and therefore the proceeds and basis, that the engine records.

A real engine has to support each of these — or explicitly disclaim coverage and require the user to handle the event manually with a dedicated tax preparer.
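
Supporting both interpretations of an LP deposit can be as simple as making the treatment an explicit parameter. The following is a sketch under simplifying assumptions (a single deposited position, USD-denominated values); `LpTreatment`, `LpDepositResult`, and `apply_lp_deposit` are hypothetical names.

```python
from decimal import Decimal
from enum import Enum
from typing import NamedTuple


class LpTreatment(Enum):
    DISPOSITION = "disposition"          # conservative interpretation
    NON_DISPOSITION = "non_disposition"  # aggressive interpretation


class LpDepositResult(NamedTuple):
    realized_gain_usd: Decimal
    lp_token_basis_usd: Decimal


def apply_lp_deposit(
    deposited_basis_usd: Decimal,
    deposited_fmv_usd: Decimal,
    treatment: LpTreatment,
) -> LpDepositResult:
    if treatment is LpTreatment.DISPOSITION:
        # Deposited assets are treated as sold at FMV; LP tokens take FMV basis.
        gain = deposited_fmv_usd - deposited_basis_usd
        return LpDepositResult(gain, deposited_fmv_usd)
    if treatment is LpTreatment.NON_DISPOSITION:
        # No gain recognized; LP tokens carry over the deposited assets' basis.
        return LpDepositResult(Decimal("0"), deposited_basis_usd)
    raise ValueError(f"unsupported treatment: {treatment!r}")
```

For assets with a basis of 1,000 USD and an FMV of 1,500 USD at deposit, the first treatment recognizes a 500 USD gain and gives the LP tokens a 1,500 USD basis, while the second recognizes nothing and carries the 1,000 USD basis forward.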

What synthetic test data needs to include

A synthetic crypto-tax test corpus has to span:

Spread 1: Asset coverage
Bitcoin, Ethereum, top 50 by market cap, plus stablecoins. NFTs. Wrapped tokens. LP tokens. Staking-derivative tokens. Each has different tax treatment.
Spread 2: Network coverage
Multi-chain holders are common: Ethereum + Solana + Bitcoin + L2s. Cross-chain bridge events are a category of edge case.
Spread 3: Activity types
Pure HODL, active trader, DeFi yield farmer, NFT collector, staker, validator. Each profile has a different event distribution.
Spread 4: Lifetime patterns
Year-1 portfolio (basis cleanly tracked), year-3 portfolio (multiple migrations / forks), legacy portfolio (pre-2018 acquisitions with incomplete records).
Spread 5: Edge cases
Hard fork events. Airdrop receipts. NFT minting. Staking with re-staking compounded rewards. LP tokens through impermanent loss. Cross-chain bridge transitions. Slashing events.
Spread 6: Tax-jurisdiction variations
US federal + state (varies). UK, Germany, Singapore, India each have distinct crypto tax regimes. Multi-jurisdiction users are increasingly common.

A test corpus missing any spread is a corpus where the engine has untested branches. Crypto-tax engines shipped to production on incomplete corpora typically discover the gaps when the first crypto-experienced user files a return.
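
A coverage check over the corpus can catch missing spreads before users do. The sketch below assumes each synthetic scenario is a dict tagged with one value per spread; the `SPREADS` values are abbreviated, illustrative examples (three of the six dimensions), and the function names are hypothetical.

```python
from itertools import product
from typing import Dict, Iterable, List, Set

# Illustrative subset of the spreads and their values.
SPREADS: Dict[str, List[str]] = {
    "asset": ["BTC", "ETH", "stablecoin", "NFT", "LP_token"],
    "network": ["bitcoin", "ethereum", "solana", "l2"],
    "activity": ["hodl", "trader", "yield_farmer", "staker"],
}


def full_grid() -> List[Dict[str, str]]:
    """Every combination of spread values, one scenario tag set per combination."""
    names = list(SPREADS)
    return [
        dict(zip(names, combo))
        for combo in product(*(SPREADS[n] for n in names))
    ]


def missing_values(corpus: Iterable[Dict[str, str]]) -> Dict[str, Set[str]]:
    """Return, per spread, the values that no scenario in the corpus exercises."""
    seen: Dict[str, Set[str]] = {name: set() for name in SPREADS}
    for scenario in corpus:
        for name in SPREADS:
            value = scenario.get(name)
            if value is not None:
                seen[name].add(value)
    gaps: Dict[str, Set[str]] = {}
    for name, expected in SPREADS.items():
        unseen = set(expected) - seen[name]
        if unseen:
            gaps[name] = unseen
    return gaps
```

This checks each dimension independently; a stricter variant would compare the corpus against `full_grid()` to find untested combinations, at the cost of a much larger corpus.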

Key takeaways

Crypto tax engines have a substantially larger event taxonomy than equity engines — every receipt can be a basis event, and DeFi adds protocol-specific events that don't have securities analogs.
The basis-tracking model extends traditional lots with wallet, network, transaction-hash, and special-status fields. Engines that don't capture on-chain metadata can't reconcile their basis against blockchain history.
DeFi complications include LP deposits, yield farming, governance tokens, synthetic derivatives, bridges, token migrations, slashing, and MEV. Each is a distinct code path.
IRS Form 1099-DA reporting starts with 2025 transactions, with gross proceeds reported first and basis reporting phasing in later. Tax engines have to recompute basis from on-chain data because broker-side basis reporting will be incomplete in early years.
Test corpus has to span asset, network, activity-type, lifetime-pattern, edge-case, and jurisdiction dimensions. Missing any leaves untested branches that real users will eventually exercise.

Frequently asked questions

How do we handle pre-2018 records where on-chain data may be incomplete?

The IRS allows reasonable reconstruction. Engines should support manual entry of pre-broker-reporting acquisitions with documentation requirements (exchange CSVs, wallet exports, third-party tax tools' historical data). The reconstruction quality affects audit defensibility — engines should flag low-confidence basis entries so the user knows their downstream filing carries audit risk.
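
Flagging low-confidence entries could be driven by the documentation source attached to each basis entry. This is only a sketch: the source labels, the confidence ranking in `SOURCE_CONFIDENCE`, and the `BasisEntry` type are assumptions made for illustration, and a real engine would choose its own ranking.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Dict, List

# Hypothetical confidence ranking per documentation source.
SOURCE_CONFIDENCE: Dict[str, str] = {
    "on_chain": "high",
    "exchange_csv": "high",
    "wallet_export": "medium",
    "third_party_tool": "medium",
    "manual_estimate": "low",
}


@dataclass(frozen=True)
class BasisEntry:
    lot_id: str
    basis_usd: Decimal
    source: str


def low_confidence_entries(entries: List[BasisEntry]) -> List[BasisEntry]:
    # Unknown sources default to "low" so they are surfaced, not hidden.
    return [
        e for e in entries
        if SOURCE_CONFIDENCE.get(e.source, "low") == "low"
    ]
```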

What about holders who use centralized exchanges and DeFi simultaneously?

Common — most active crypto users have multiple custody relationships. The engine has to ingest from multiple sources (exchange APIs / CSVs, wallet addresses, manual entries) and consolidate into a single per-user view. The consolidation involves reconciling acquisition events across sources and de-duplicating. This is engineering-heavy and where many engines have correctness issues; explicit user review of consolidated data is the standard mitigation.
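
A minimal de-duplication pass might key records on network, transaction hash, and asset, and route anything without a hash to user review. The record shape (plain dicts with `network_id`, `transaction_hash`, `asset_id`, `source`) and the function name are assumptions for this sketch; hash normalization is network-specific and is not attempted here.

```python
from typing import Dict, Iterable, List, Tuple

Record = Dict[str, object]


def consolidate(records: Iterable[Record]) -> Tuple[List[Record], List[Record]]:
    """Merge records describing the same on-chain transaction.

    Returns (merged, needs_review). Records without a transaction hash
    cannot be matched automatically and are returned for manual review.
    """
    merged: Dict[Tuple[object, object, object], Record] = {}
    needs_review: List[Record] = []
    for rec in records:
        tx = rec.get("transaction_hash")
        if not tx:
            needs_review.append(rec)
            continue
        key = (rec["network_id"], tx, rec["asset_id"])
        if key in merged:
            # Same transaction seen from another source: keep one record,
            # remember every source that reported it.
            merged[key]["sources"].append(rec["source"])
        else:
            merged[key] = {**rec, "sources": [rec["source"]]}
    return list(merged.values()), needs_review
```

Keeping the list of contributing sources on each merged record gives the user something concrete to verify during the review step.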

Does the engine need to support institutional crypto custodians?

If serving institutional users, yes. Coinbase Custody, Anchorage, BitGo, Fidelity Digital Assets all have specific data formats and event types. Institutional staking, yield products, and wrapped-token derivatives have different implementations than retail. The engine architecture has to support multiple custodian-specific adapters.

How important is supporting multi-jurisdiction tax rules?

Increasingly important. The US, UK, Germany, Switzerland, Singapore, Australia, India, and Japan all have distinct crypto tax regimes; the EU is moving toward harmonization but member states vary. Engines serving global users need jurisdiction-specific tax modules layered on the universal event-tracking core. The synthetic test corpus has to include jurisdiction-specific scenarios for each supported jurisdiction.
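
The layering of jurisdiction modules over a shared core could look like a small registry of pluggable rule objects. Everything here is hypothetical: the `JurisdictionModule` interface, the placeholder codes `XX-A` and `XX-B`, and the two example rules are stand-ins for illustration, not statements of any country's actual tax law.

```python
from decimal import Decimal
from typing import Dict, Protocol


class JurisdictionModule(Protocol):
    code: str

    def taxable_gain(self, gain: Decimal, holding_days: int) -> Decimal: ...


class AlwaysTaxable:
    """Placeholder rule: every realized gain is taxable."""
    code = "XX-A"

    def taxable_gain(self, gain: Decimal, holding_days: int) -> Decimal:
        return gain


class ExemptAfterHoldingPeriod:
    """Placeholder rule: gains are exempt once a holding period is exceeded."""

    def __init__(self, code: str, exempt_after_days: int) -> None:
        self.code = code
        self.exempt_after_days = exempt_after_days

    def taxable_gain(self, gain: Decimal, holding_days: int) -> Decimal:
        if holding_days > self.exempt_after_days:
            return Decimal("0")
        return gain


REGISTRY: Dict[str, JurisdictionModule] = {}


def register(module: JurisdictionModule) -> None:
    REGISTRY[module.code] = module


def taxable_gain_for(code: str, gain: Decimal, holding_days: int) -> Decimal:
    # The event-tracking core computes gain and holding period once;
    # the jurisdiction module only decides how they are taxed.
    return REGISTRY[code].taxable_gain(gain, holding_days)


register(AlwaysTaxable())
register(ExemptAfterHoldingPeriod("XX-B", exempt_after_days=365))
```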