The textbook drawdown rule taught in CFP coursework since the 1990s is one sentence: in retirement, draw from taxable first, then tax-deferred, then tax-free Roth. The reasoning is that tax-deferred and tax-free assets compound under tax protection, so postponing their withdrawal maximizes terminal after-tax wealth.
The rule is correct for one specific household profile (ordinary-income tax exposure only, no other binding constraint) and produces the wrong sequence for the rest. RMDs override the chosen order at age 73+. IRMAA tiers add discrete premium cliffs. NIIT layers 3.8% onto investment income above the thresholds. ACA premium subsidies decay with rising MAGI. LTCG stacks at 0% until ordinary income shoulders it into 15%. Each constraint binds for a different household segment. When any of them binds, the textbook sequence stops being optimal. The textbook also ignores the HSA's stealth-retirement role; see HSA investment and triple-tax-advantage modeling for the account class most retirement engines route through as if it were a checking account.
What the textbook rule gets right
For a household with no other tax exposure, drawing taxable first preserves tax-protected compounding in retirement accounts. A simple example:
W_after_tax_horizon = balance × (1 + r × (1 − tax_rate))^N
vs.
W_after_tax_horizon = balance × (1 + r)^N × (1 − tax_rate_at_withdrawal)
- r = Pre-tax return
- N = Years until withdrawal
- tax_rate = Annual tax on returns (taxable account) vs. one-time tax at withdrawal (tax-deferred)
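The two formulas can be compared numerically. The sketch below is a direct transcription of them into Python; the function names and the sample inputs are illustrative assumptions, not part of any engine. Note that with the same starting balance the tax-deferred formula taxes principal as well as growth, so the size of the gap depends on the horizon.

```python
def taxable_terminal(balance: float, r: float, tax_rate: float, years: int) -> float:
    """Taxable account: returns are taxed every year.

    balance * (1 + r * (1 - tax_rate)) ** N
    """
    return balance * (1 + r * (1 - tax_rate)) ** years


def deferred_terminal(balance: float, r: float,
                      tax_rate_at_withdrawal: float, years: int) -> float:
    """Tax-deferred account: untaxed growth, one tax at withdrawal.

    balance * (1 + r) ** N * (1 - tax_rate_at_withdrawal)
    """
    return balance * (1 + r) ** years * (1 - tax_rate_at_withdrawal)


if __name__ == "__main__":
    # Illustrative inputs only: $100,000, 6% pre-tax return, 22% rate, 30 years.
    print(round(taxable_terminal(100_000, 0.06, 0.22, 30)))
    print(round(deferred_terminal(100_000, 0.06, 0.22, 30)))
```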
The single-tax-rate calculation is right. The problem is that real households don't have a single tax rate.
What the textbook rule misses
RMD timing overrides the chosen sequence
At age 73 (75 for owners born 1960+), §401(a)(9) requires a minimum withdrawal from tax-deferred accounts whether the optimization wanted it or not. The textbook "deplete taxable first" sequence breaks here: taxable and tax-deferred have to flow simultaneously once RMDs start.
A constraint-aware engine commonly recommends pre-RMD draws from tax-deferred accounts in the gap years (early retirement through age 72) to level the lifetime tax burden, even though the textbook rule says defer. The pre-RMD gap years are the lowest-marginal-rate windows in most retirement plans; not using them often leaves five-figure annual tax savings on the table.
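The forced flow can be made concrete with a small sketch. It assumes the usual RMD arithmetic (prior year-end balance divided by a distribution period from the IRS Uniform Lifetime Table); the function names are hypothetical, and the 26.5 divisor used for age 73 should be checked against the current IRS table before anyone relies on it.

```python
def rmd_floor(prior_year_end_balance: float, distribution_period: float) -> float:
    """Minimum required withdrawal: prior year-end balance / distribution period."""
    if distribution_period <= 0:
        raise ValueError("distribution period must be positive")
    return prior_year_end_balance / distribution_period


def split_withdrawal(spending_need: float, rmd: float) -> dict[str, float]:
    """Tax-deferred supplies at least the RMD; other sources cover any remainder.

    If the RMD exceeds the spending need, the excess still has to come out
    of the tax-deferred account (it can be reinvested in taxable).
    """
    return {
        "tax_deferred": rmd,
        "other_sources": max(0.0, spending_need - rmd),
    }


if __name__ == "__main__":
    # Assumption: 26.5 is the Uniform Lifetime Table period for age 73.
    rmd = rmd_floor(1_000_000, 26.5)
    print(split_withdrawal(60_000, rmd))
```

The point of the split is that once the floor exists, "taxable first" stops being a free choice: the optimizer only controls what happens above the RMD.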
IRMAA bracket cliffs
A household near an IRMAA bracket boundary faces a discrete cost cliff. Taking $5,000 more from a tax-deferred account in a year that would otherwise put MAGI at $128,000 might push them over the $133,000 boundary, adding $1,041/year in Medicare premium surcharge for two years. The cost of that $5,000 withdrawal is then the federal tax plus $2,082 of surcharge. The surcharge alone is about 41.6% of the withdrawal; stacked on a 22% federal rate, the effective marginal rate is roughly 64%.
The optimal sequence in this case may be to draw less from tax-deferred (avoiding the bracket) and more from taxable. The textbook rule, ignoring IRMAA, picks the wrong source.
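The cliff arithmetic is easy to get wrong by hand, so here is a minimal sketch of it. The function and parameter names are hypothetical; the surcharge figures are the ones used in the example above, and the $128,500 starting MAGI is an assumption chosen so that a $5,000 draw lands strictly above the $133,000 boundary.

```python
def effective_marginal_rate(extra_withdrawal: float, federal_rate: float,
                            magi_before: float, tier_boundary: float,
                            annual_surcharge: float, surcharge_years: int = 1) -> float:
    """Federal tax plus any tier-crossing surcharge, per dollar withdrawn."""
    if extra_withdrawal <= 0:
        raise ValueError("extra_withdrawal must be positive")
    federal_tax = extra_withdrawal * federal_rate
    magi_after = magi_before + extra_withdrawal
    # The surcharge applies only if this draw moves MAGI across the boundary.
    crosses = magi_before <= tier_boundary < magi_after
    surcharge = annual_surcharge * surcharge_years if crosses else 0.0
    return (federal_tax + surcharge) / extra_withdrawal


if __name__ == "__main__":
    crossing = effective_marginal_rate(5_000, 0.22, 128_500, 133_000, 1_041, 2)
    safe = effective_marginal_rate(5_000, 0.22, 120_000, 133_000, 1_041, 2)
    print(f"crossing the tier: {crossing:.1%}, staying under: {safe:.1%}")
```

Because the surcharge is a fixed amount, the effective rate depends heavily on the size of the draw that triggers it: a smaller crossing draw produces a much higher rate.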
NIIT, ACA, and LTCG stacking
The same constraint structure applies for NIIT (3.8% surtax on investment income above thresholds), ACA premium subsidies (cliff or slope at MAGI thresholds), and capital-gains rate stacking (0% → 15% → 20% transitions). Each adds a non-linearity to the tax function that the textbook rule doesn't see.
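Capital-gains stacking is the least intuitive of these, so a short sketch may help. It assumes the standard stacking order (ordinary taxable income fills the bands first, gains sit on top) and takes the band ceilings as parameters; the ceilings in the usage example are round hypothetical numbers, not current-year thresholds.

```python
def ltcg_tax(ordinary_taxable_income: float, gains: float,
             zero_ceiling: float, fifteen_ceiling: float) -> float:
    """Tax on long-term gains stacked on top of ordinary taxable income."""
    start = ordinary_taxable_income
    end = start + gains
    # Portion of the gains that falls in the 15% band.
    at_fifteen = max(0.0, min(end, fifteen_ceiling) - max(start, zero_ceiling))
    # Portion that falls above the 15% band.
    at_twenty = max(0.0, end - max(start, fifteen_ceiling))
    # Whatever is left sits under zero_ceiling and is taxed at 0%.
    return 0.15 * at_fifteen + 0.20 * at_twenty


if __name__ == "__main__":
    # Hypothetical ceilings: 0% band ends at $50,000, 15% band at $500,000.
    base = ltcg_tax(40_000, 30_000, 50_000, 500_000)
    bumped = ltcg_tax(45_000, 30_000, 50_000, 500_000)
    # The extra $5,000 of ordinary income pushes $5,000 of gains from 0% to 15%.
    print(base, bumped, bumped - base)
```

The difference between the two calls is the hidden cost: extra ordinary income is taxed at its own rate and also raises the tax on gains that were previously in the 0% band.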
What a real optimizer does
A constraint-aware drawdown optimizer:
- Step 1: Map account classes. Taxable (basis tracked at lot level), tax-deferred (Traditional IRA, 401(k)), tax-free (Roth IRA, Roth 401(k)), other (HSA, 529, annuity). Each has different withdrawal tax.
- Step 2: Identify binding constraints per year. RMD floor, IRMAA tier, NIIT threshold, ACA subsidy slope, LTCG stacking, state tax. Compute per-year per-household.
- Step 3: Discretize the withdrawal grid. Per year, decision is how much to draw from each account class. Discretize at $5K increments per class.
- Step 4: Compute full marginal cost per grid point. Federal marginal + state marginal + IRMAA delta + NIIT delta + ACA subsidy delta + LTCG delta. Sum is the total marginal cost of the year's withdrawal.
- Step 5: Solve via dynamic programming. Multi-year DP, evaluating each year's grid points with the optimal continuation. The state space is account balances by class.
- Step 6: Output recommendation + sensitivity. Recommended per-year per-class withdrawal amounts plus how the recommendation changes with tax-rate assumptions, market returns, life expectancy.
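The steps above can be sketched as a toy dynamic program. This is a deliberately simplified illustration, not a production design, and every modeling choice in it is an assumption: zero investment growth (so balances stay on the $5K grid), taxable draws treated as tax-free return of basis, a hypothetical `year_cost` function with made-up brackets and a made-up premium cliff, and leftover tax-deferred money taxed at a flat `terminal_rate`.

```python
from functools import lru_cache

STEP = 5_000  # grid increment per account class (Step 3)


def year_cost(deferred_draw: int) -> float:
    """Hypothetical one-year cost of a tax-deferred draw (stand-in for Step 4)."""
    tax = 0.12 * min(deferred_draw, 40_000) + 0.22 * max(0, deferred_draw - 40_000)
    if deferred_draw > 60_000:  # made-up premium cliff
        tax += 2_000
    return tax


def solve(years: int, need: int, taxable: int, deferred: int, roth: int,
          terminal_rate: float = 0.22):
    """Return (min total cost, plan); plan holds per-year (taxable, deferred, roth) draws.

    All dollar inputs are assumed to be multiples of STEP.
    """
    need_u = need // STEP

    @lru_cache(maxsize=None)
    def best(t: int, tx: int, td: int, ro: int):
        if t == years:
            # Assumption: whatever remains tax-deferred is taxed at a flat rate.
            return terminal_rate * td * STEP, ()
        result = (float("inf"), ())
        for d in range(min(td, need_u) + 1):           # tax-deferred units
            for x in range(min(tx, need_u - d) + 1):   # taxable units
                r = need_u - d - x                      # Roth covers the rest
                if r > ro:
                    continue
                future, plan = best(t + 1, tx - x, td - d, ro - r)
                total = year_cost(d * STEP) + future
                if total < result[0]:
                    result = (total, ((x * STEP, d * STEP, r * STEP),) + plan)
        return result

    return best(0, taxable // STEP, deferred // STEP, roth // STEP)


if __name__ == "__main__":
    cost, plan = solve(years=3, need=40_000,
                       taxable=60_000, deferred=120_000, roth=20_000)
    print(cost, plan)
```

Under these toy assumptions the solver draws from tax-deferred every year to fill the low bracket rather than depleting taxable first, which is the qualitative behavior described in the RMD section. A real engine would add growth, per-lot basis, and the full constraint set to `year_cost`.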
The optimization fits comfortably in standard tooling. The state space (account balances by class) is small, the grid (5–25 years × 30 grid points) is manageable, and the per-grid-point cost computation is fast. A real optimizer runs in seconds; the latency is not the constraint.
The strategies that emerge
The recommendation patterns we see across household types:
| Household profile | Typical optimal pattern |
|---|---|
| Pre-Medicare retiree (60–65), high QSBS basis | Live on cash savings + LTCG at 0% bracket. Bridge to Medicare. Defer SS to maximize survivor benefit and ACA subsidy. |
| Pre-Medicare retiree, modest savings | Mix of taxable + small tax-deferred withdrawals to keep AGI below ACA cliff. ACA subsidy preservation drives the sequence. |
| Post-Medicare retiree, high tax-deferred balance | Roth conversions in early gap years. Then draw partly from tax-deferred and partly from tax-free as RMDs approach. |
| Post-RMD retiree, high tax-deferred balance | RMD forced. Optimize residual: tax-free for excess spending, taxable to manage IRMAA bracket. |
| HNW retiree with estate goals | Tax-deferred fastest to draw down (avoiding heir-tax-rate inheritance), Roth grows for heirs (10-year rule but tax-free). |
| Low-balance retiree | Often optimal at the textbook rule simply because the constraints don't bind. Taxable first, tax-deferred next, Roth last. |
The pattern: real optimization rarely produces the textbook rule for households with meaningful tax-deferred balances. The textbook rule is correct for the segment that would benefit least from the optimization in absolute terms.
A middle tier: bracket-aware sequencing
Products that can't justify a full DP optimizer can still beat the textbook rule with a rules engine that walks each year's tax surface bracket-by-bracket. The pattern below is what such an engine evaluates at construction time; it is not a recommendation to a retiree.
Bracket-aware sequencing logic (engine internals)
- If the 0% LTCG bracket has unused room and the household holds appreciated taxable lots, draw from those lots first up to the band ceiling.
- Fill the 12% federal ordinary bracket from tax-deferred. This realizes deferred income at a low rate, the same logic that motivates a Roth conversion, even without an explicit conversion.
- Above 12%, switch the marginal source to taxable basis (return of basis is tax-free).
- Fill residual tax-deferred up to but not crossing the IRMAA tier the household is sitting under.
- Above the IRMAA / NIIT thresholds, source incremental cash from Roth (no further AGI impact).
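The five rules above can be sketched as a single-year waterfall. This is a minimal illustration under stated assumptions: all field names are hypothetical, every "room" or "headroom" figure is assumed to be precomputed by the caller, and draws from appreciated lots and from tax-deferred are both assumed to count dollar-for-dollar against the IRMAA headroom.

```python
from dataclasses import dataclass


@dataclass
class YearInputs:
    need: float                # cash to raise this year
    ltcg_zero_room: float      # unused room in the 0% LTCG band
    appreciated_lots: float    # appreciated taxable lots available
    room_in_12_bracket: float  # unused room in the 12% ordinary bracket
    taxable_basis: float       # taxable money available as return of basis
    irmaa_headroom: float      # MAGI distance to the next IRMAA tier
    tax_deferred: float
    roth: float


def sequence(y: YearInputs) -> tuple[list[tuple[str, float]], float]:
    """Apply the rules in order; return (plan, unfunded shortfall)."""
    remaining = y.need
    plan: list[tuple[str, float]] = []

    def take(label: str, cap: float) -> float:
        nonlocal remaining
        amount = max(0.0, min(remaining, cap))
        if amount > 0:
            plan.append((label, amount))
            remaining -= amount
        return amount

    lots = take("appreciated lots at 0% LTCG", min(y.appreciated_lots, y.ltcg_zero_room))
    fill = take("tax-deferred, fill 12% bracket", min(y.tax_deferred, y.room_in_12_bracket))
    take("taxable basis", y.taxable_basis)
    headroom = max(0.0, y.irmaa_headroom - lots - fill)
    take("tax-deferred, up to IRMAA tier", min(y.tax_deferred - fill, headroom))
    take("Roth", y.roth)
    return plan, remaining


if __name__ == "__main__":
    plan, shortfall = sequence(YearInputs(
        need=70_000, ltcg_zero_room=10_000, appreciated_lots=25_000,
        room_in_12_bracket=20_000, taxable_basis=15_000,
        irmaa_headroom=45_000, tax_deferred=300_000, roth=100_000))
    for label, amount in plan:
        print(f"{label}: {amount:,.0f}")
    print("shortfall:", shortfall)
```

Each rule is a capped source, which is why the engineering cost is low; the same structure is also why it cannot see across years.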
This rule set captures most of the lifetime-PV gain a full DP optimizer recovers, with materially less engineering. The trade is sensitivity: the rules don't anticipate multi-year couplings (e.g., a small draw this year that opens a Roth conversion window next year), so the DP optimizer still wins on households with multiple binding constraints across the horizon. For products serving the mass-affluent retirement segment, the rules tier is often the right cost/benefit.
What this means for synthetic test data
Test data for drawdown-sequencing engines has to include:
- Households at every IRMAA bracket boundary
- Pre-Medicare households at every FPL band
- Households with significant capital-loss carryforwards (changes the LTCG calculation)
- Households with each binding-constraint combination (no constraint, IRMAA only, ACA only, NIIT only, multiple)
- Multi-account households with realistic balance ratios across taxable / tax-deferred / Roth
- Lifetime gift histories (changes the wealth-transfer math)
- State-tax variations
- Pre-RMD and post-RMD years
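One way to make sure a corpus spans the matrix is to enumerate it rather than hand-pick cases. The sketch below covers only three of the dimensions in the list (binding-constraint combination, RMD phase, state-tax regime); the labels and the `build_test_matrix` name are hypothetical, and a real generator would still need to attach balances, bracket-boundary incomes, and the remaining dimensions to each case.

```python
from itertools import product

CONSTRAINTS = ("irmaa", "aca", "niit")
RMD_PHASES = ("pre_rmd", "post_rmd")
STATE_TAX_REGIMES = ("none", "flat", "progressive")  # hypothetical labels


def build_test_matrix() -> list[dict]:
    """Enumerate every constraint on/off combination x RMD phase x state regime."""
    cases = []
    on_off = product((False, True), repeat=len(CONSTRAINTS))
    for flags, phase, regime in product(on_off, RMD_PHASES, STATE_TAX_REGIMES):
        binding = tuple(name for name, on in zip(CONSTRAINTS, flags) if on)
        cases.append({"binding": binding, "rmd_phase": phase, "state_tax": regime})
    return cases


if __name__ == "__main__":
    matrix = build_test_matrix()
    print(len(matrix), "cases")
    print(matrix[0])
```

Some enumerated combinations may be rare or only reachable for mixed-age couples (for example ACA and IRMAA binding together); whether to filter them or keep them as edge cases is a judgment call for the test owner.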
A test corpus that doesn't span this matrix produces an engine tested only on the easy cases. The hard cases are where the optimization differs from the textbook rule and where the bugs ship.
Key takeaways
- The textbook 'taxable first, tax-deferred next, tax-free last' rule is correct for ~10–20% of household-years and wrong for the majority where any other constraint binds.
- RMDs force a specific sequence regardless of optimization. Engines have to plan around the forced flow.
- IRMAA bracket cliffs, NIIT thresholds, ACA premium subsidies, and LTCG stacking each add non-linearities to the tax function. Each binds for some household types.
- A real optimizer uses dynamic programming over a discrete grid of per-class withdrawal amounts. Runs in seconds; the engineering is the inputs and the constraint logic.
- Bracket-aware sequencing is a reasonable middle ground for consumer-grade tools — captures most of the optimizer's value without the DP machinery.
- Test data needs to span the full matrix of binding constraints. Engines tested only on the no-constraint case never exercise the hard code paths.