CA vs federal income tax share among high earners: PolicyEngine microsimulation showing that LTCG treatment gives filers with $5M+ AGI (a rough proxy for billionaires) a 1.34x larger share of CA state income tax than of federal income tax
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# CA vs federal income tax share among high earners\n",
"\n",
"California taxes long-term capital gains (LTCG) as ordinary income, at rates up to 13.3%, while the federal code taxes LTCG at a preferential rate (a top rate of 23.8%, including the 3.8% net investment income tax, versus 37% for wages). Billionaires, whose income is disproportionately LTCG, therefore account for a larger share of CA state income tax than their share of federal income tax would suggest.\n",
"\n",
"We test this using PolicyEngine's CA-calibrated microsimulation model, which combines CPS microdata with [PUF-imputed income variables](https://policyengine.org/us/research/enhanced-cps-beta)."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"execution": {
"iopub.execute_input": "2026-03-17T13:22:11.345317Z",
"iopub.status.busy": "2026-03-17T13:22:11.345233Z",
"iopub.status.idle": "2026-03-17T13:22:19.310940Z",
"shell.execute_reply": "2026-03-17T13:22:19.309895Z"
}
},
"outputs": [],
"source": [
"from policyengine_us import Microsimulation\n",
"import numpy as np\n",
"import pandas as pd"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"execution": {
"iopub.execute_input": "2026-03-17T13:22:19.313024Z",
"iopub.status.busy": "2026-03-17T13:22:19.312912Z",
"iopub.status.idle": "2026-03-17T13:23:12.569297Z",
"shell.execute_reply": "2026-03-17T13:23:12.568623Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Total federal income tax (CA filers): $257.2B\n",
"Total CA state income tax: $87.8B\n",
"Raw tax unit records: 411,806\n"
]
}
],
"source": [
"sim = Microsimulation(\n",
"    dataset=\"hf://policyengine/policyengine-us-data/states/CA.h5\"\n",
")\n",
"\n",
"agi = sim.calc(\"adjusted_gross_income\", period=2026)\n",
"fed_tax = sim.calc(\"income_tax\", period=2026)\n",
"state_tax = sim.calc(\"state_income_tax\", period=2026)\n",
"\n",
"w = agi.weights\n",
"agi_v = agi.values\n",
"fed_v = fed_tax.values\n",
"state_v = state_tax.values\n",
"\n",
"fed_total = (fed_v * w).sum()\n",
"state_total = (state_v * w).sum()\n",
"\n",
"print(f\"Total federal income tax (CA filers): ${fed_total/1e9:,.1f}B\")\n",
"print(f\"Total CA state income tax: ${state_total/1e9:,.1f}B\")\n",
"print(f\"Raw tax unit records: {len(agi_v):,}\")"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"execution": {
"iopub.execute_input": "2026-03-17T13:23:12.599265Z",
"iopub.status.busy": "2026-03-17T13:23:12.599109Z",
"iopub.status.idle": "2026-03-17T13:23:12.632351Z",
"shell.execute_reply": "2026-03-17T13:23:12.631796Z"
}
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>AGI threshold</th>\n",
" <th>Share of federal income tax</th>\n",
" <th>Share of CA state income tax</th>\n",
" <th>Ratio (state/fed)</th>\n",
" <th>Raw records</th>\n",
" <th>Weighted tax units</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>&gt;$0.5M</td>\n",
" <td>80.5%</td>\n",
" <td>80.4%</td>\n",
" <td>1.00x</td>\n",
" <td>20,929</td>\n",
" <td>415,158</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>&gt;$1M</td>\n",
" <td>69.7%</td>\n",
" <td>69.2%</td>\n",
" <td>0.99x</td>\n",
" <td>12,687</td>\n",
" <td>217,369</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>&gt;$5M</td>\n",
" <td>6.6%</td>\n",
" <td>8.9%</td>\n",
" <td>1.34x</td>\n",
" <td>5,872</td>\n",
" <td>499</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>&gt;$10M</td>\n",
" <td>6.5%</td>\n",
" <td>8.8%</td>\n",
" <td>1.34x</td>\n",
" <td>5,271</td>\n",
" <td>332</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" AGI threshold Share of federal income tax Share of CA state income tax \\\n",
"0 >$0.5M 80.5% 80.4% \n",
"1 >$1M 69.7% 69.2% \n",
"2 >$5M 6.6% 8.9% \n",
"3 >$10M 6.5% 8.8% \n",
"\n",
" Ratio (state/fed) Raw records Weighted tax units \n",
"0 1.00x 20,929 415,158 \n",
"1 0.99x 12,687 217,369 \n",
"2 1.34x 5,872 499 \n",
"3 1.34x 5,271 332 "
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"thresholds = [500_000, 1_000_000, 5_000_000, 10_000_000]\n",
"rows = []\n",
"\n",
"for t in thresholds:\n",
"    # Tax units with AGI above the threshold\n",
"    mask = agi_v > t\n",
"    fed_share = (fed_v[mask] * w[mask]).sum() / fed_total\n",
"    state_share = (state_v[mask] * w[mask]).sum() / state_total\n",
"    rows.append({\n",
"        # :g keeps the $500K threshold as 0.5 instead of rounding it to 0\n",
"        \"AGI threshold\": f\">${t/1e6:g}M\",\n",
"        \"Share of federal income tax\": f\"{fed_share:.1%}\",\n",
"        \"Share of CA state income tax\": f\"{state_share:.1%}\",\n",
"        \"Ratio (state/fed)\": f\"{state_share/fed_share:.2f}x\",\n",
"        \"Raw records\": f\"{mask.sum():,}\",\n",
"        \"Weighted tax units\": f\"{w[mask].sum():,.0f}\",\n",
"    })\n",
"\n",
"df = pd.DataFrame(rows)\n",
"df"
]
},
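{
"cell_type": "markdown",
"metadata": {},
"source": [
"The share calculation above reduces to a ratio of weighted sums. As a minimal sketch on made-up toy arrays (the values and the `weighted_share` helper are illustrative assumptions, not PolicyEngine output or API), the next cell shows how one group can hold a larger share of one tax than of another."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# Hypothetical helper: share of the weighted tax total paid by masked records.\n",
"def weighted_share(tax, weights, mask):\n",
"    return (tax[mask] * weights[mask]).sum() / (tax * weights).sum()\n",
"\n",
"# Toy data (made up for illustration): three tax units, the last with high AGI.\n",
"toy_agi = np.array([50_000.0, 400_000.0, 8_000_000.0])\n",
"toy_fed = np.array([5_000.0, 100_000.0, 1_900_000.0])\n",
"toy_state = np.array([1_000.0, 30_000.0, 1_000_000.0])\n",
"toy_w = np.array([1_000.0, 100.0, 1.0])\n",
"\n",
"toy_mask = toy_agi > 5_000_000\n",
"toy_fed_share = weighted_share(toy_fed, toy_w, toy_mask)\n",
"toy_state_share = weighted_share(toy_state, toy_w, toy_mask)\n",
"print(f\"Federal share: {toy_fed_share:.1%}\")\n",
"print(f\"State share: {toy_state_share:.1%}\")\n",
"print(f\"Ratio (state/fed): {toy_state_share / toy_fed_share:.2f}x\")"
]
},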
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Why this matters for the CA Billionaire Tax debate\n",
"\n",
"Saez et al. (2026) cite ~2.5% of total CA income tax receipts as the billionaire share, a figure derived from Balkir et al. (NBER WP 34170), which uses federal tax data. But the federal code taxes LTCG at a discount, so billionaires' share of *federal* income tax is lower than their share of *CA* income tax.\n",
"\n",
"At the $5M+ AGI level (a rough proxy for ultra-high-net-worth filers), the ratio is **1.34x**: these filers pay 8.9% of CA state income tax but only 6.6% of federal income tax. At the two lower thresholds in the table the ratio is about 1.0x, so the gap appears only at the very top. Scaling Saez et al.'s 2.5% by this ratio gives ~3.4% (2.5% × 1.34 ≈ 3.35%), or roughly $4B/year, closer to Rauh et al.'s range of $3.3-5.8B.\n",
"\n",
"This strengthens Rauh et al.'s argument that the income tax cost of billionaire departures is larger than Saez et al. acknowledge."
]
},
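{
"cell_type": "markdown",
"metadata": {},
"source": [
"The scaling step described above is simple arithmetic. The sketch below reproduces it from the figures quoted in the text (2.5%, 1.34x, and roughly $4B/year). The receipts base it prints is backed out from those three figures as an assumption; it is not taken from a revenue source."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"saez_share = 0.025  # billionaire share of CA income tax cited in the text\n",
"state_fed_ratio = 1.34  # state/fed share ratio at the $5M+ AGI threshold\n",
"scaled_share = saez_share * state_fed_ratio\n",
"\n",
"quoted_annual_cost = 4e9  # the rough $4B/year figure quoted in the text\n",
"# Assumption: the receipts base is implied by the quoted figures, not sourced.\n",
"implied_receipts_base = quoted_annual_cost / scaled_share\n",
"\n",
"print(f\"Scaled billionaire share: {scaled_share:.2%}\")\n",
"print(f\"Implied CA income tax receipts base: ${implied_receipts_base / 1e9:,.0f}B\")"
]
},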
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"execution": {
"iopub.execute_input": "2026-03-17T13:23:12.633694Z",
"iopub.status.busy": "2026-03-17T13:23:12.633570Z",
"iopub.status.idle": "2026-03-17T13:23:15.674849Z",
"shell.execute_reply": "2026-03-17T13:23:15.674364Z"
}
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Income</th>\n",
" <th>Type</th>\n",
" <th>Federal tax</th>\n",
" <th>CA tax</th>\n",
" <th>Eff. federal rate</th>\n",
" <th>Eff. CA rate</th>\n",
" <th>CA/Fed ratio</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>$1M</td>\n",
" <td>Wages</td>\n",
" <td>$320,000</td>\n",
" <td>$102,800</td>\n",
" <td>32.0%</td>\n",
" <td>10.3%</td>\n",
" <td>0.32x</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>$1M</td>\n",
" <td>LTCG</td>\n",
" <td>$230,400</td>\n",
" <td>$102,800</td>\n",
" <td>23.0%</td>\n",
" <td>10.3%</td>\n",
" <td>0.45x</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>$10M</td>\n",
" <td>Wages</td>\n",
" <td>$3,650,000</td>\n",
" <td>$1,300,644</td>\n",
" <td>36.5%</td>\n",
" <td>13.0%</td>\n",
" <td>0.36x</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>$10M</td>\n",
" <td>LTCG</td>\n",
" <td>$2,372,400</td>\n",
" <td>$1,300,644</td>\n",
" <td>23.7%</td>\n",
" <td>13.0%</td>\n",
" <td>0.55x</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>$100M</td>\n",
" <td>Wages</td>\n",
" <td>$36,950,000</td>\n",
" <td>$13,279,644</td>\n",
" <td>37.0%</td>\n",
" <td>13.3%</td>\n",
" <td>0.36x</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>$100M</td>\n",
" <td>LTCG</td>\n",
" <td>$23,792,400</td>\n",
" <td>$13,279,644</td>\n",
" <td>23.8%</td>\n",
" <td>13.3%</td>\n",
" <td>0.56x</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Income Type Federal tax CA tax Eff. federal rate Eff. CA rate \\\n",
"0 $1M Wages $320,000 $102,800 32.0% 10.3% \n",
"1 $1M LTCG $230,400 $102,800 23.0% 10.3% \n",
"2 $10M Wages $3,650,000 $1,300,644 36.5% 13.0% \n",
"3 $10M LTCG $2,372,400 $1,300,644 23.7% 13.0% \n",
"4 $100M Wages $36,950,000 $13,279,644 37.0% 13.3% \n",
"5 $100M LTCG $23,792,400 $13,279,644 23.8% 13.3% \n",
"\n",
" CA/Fed ratio \n",
"0 0.32x \n",
"1 0.45x \n",
"2 0.36x \n",
"3 0.55x \n",
"4 0.36x \n",
"5 0.56x "
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Verify the mechanism: compare effective tax rates on wages vs LTCG\n",
"from policyengine_us import Simulation\n",
"\n",
"results = []\n",
"for income in [1_000_000, 10_000_000, 100_000_000]:\n",
"    for income_type, var in [(\"Wages\", \"employment_income\"), (\"LTCG\", \"long_term_capital_gains\")]:\n",
"        # Single 40-year-old CA filer whose only income is `var`\n",
"        s = Simulation(\n",
"            situation={\n",
"                \"people\": {\"person\": {\"age\": {\"2026\": 40}, var: {\"2026\": income}}},\n",
"                \"tax_units\": {\"tax_unit\": {\"members\": [\"person\"]}},\n",
"                \"families\": {\"family\": {\"members\": [\"person\"]}},\n",
"                \"spm_units\": {\"spm_unit\": {\"members\": [\"person\"]}},\n",
"                \"marital_units\": {\"marital_unit\": {\"members\": [\"person\"]}},\n",
"                \"households\": {\"household\": {\"members\": [\"person\"], \"state_code\": {\"2026\": \"CA\"}}},\n",
"            }\n",
"        )\n",
"        fed = float(s.calculate(\"income_tax\", \"2026\")[0])\n",
"        ca = float(s.calculate(\"ca_income_tax\", \"2026\")[0])\n",
"        results.append({\n",
"            \"Income\": f\"${income/1e6:.0f}M\",\n",
"            \"Type\": income_type,\n",
"            \"Federal tax\": f\"${fed:,.0f}\",\n",
"            \"CA tax\": f\"${ca:,.0f}\",\n",
"            \"Eff. federal rate\": f\"{fed/income:.1%}\",\n",
"            \"Eff. CA rate\": f\"{ca/income:.1%}\",\n",
"            \"CA/Fed ratio\": f\"{ca/fed:.2f}x\" if fed > 0 else \"n/a\",\n",
"        })\n",
"\n",
"pd.DataFrame(results)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The CA tax is identical regardless of income type. Switching from wages to LTCG cuts the federal tax by about 28% at $1M and about 35-36% at $10M and $100M, but CA's doesn't move at all. This is why billionaires (whose income is mostly LTCG) contribute relatively more to CA income tax than a federal-derived ratio would suggest."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.14.0"
}
},
"nbformat": 4,
"nbformat_minor": 4
}