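CA vs federal income tax share among high earners: a PolicyEngine microsimulation showing that, because California taxes LTCG as ordinary income, filers with AGI above $5M (a rough proxy for billionaires) account for a 1.34x larger share of CA state income tax than of federal income tax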
| { | |
| "cells": [ | |
| { | |
| "cell_type": "markdown", | |
| "metadata": {}, | |
| "source": [ | |
| "# CA vs federal income tax share among high earners\n", | |
| "\n", | |
| "California taxes long-term capital gains (LTCG) as ordinary income (up to 13.3%), while the federal code gives LTCG a preferential rate (a top rate of 23.8%, including the 3.8% net investment income tax, versus 37% for wages). As a result, billionaires, whose income is disproportionately LTCG, contribute relatively more to CA state income tax than their share of federal income tax would suggest.\n", | |
| "\n", | |
| "We test this using PolicyEngine's CA-calibrated microsimulation model, which combines CPS microdata with [PUF-imputed income variables](https://policyengine.org/us/research/enhanced-cps-beta)." | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 1, | |
| "metadata": { | |
| "execution": { | |
| "iopub.execute_input": "2026-03-17T13:22:11.345317Z", | |
| "iopub.status.busy": "2026-03-17T13:22:11.345233Z", | |
| "iopub.status.idle": "2026-03-17T13:22:19.310940Z", | |
| "shell.execute_reply": "2026-03-17T13:22:19.309895Z" | |
| } | |
| }, | |
| "outputs": [], | |
| "source": [ | |
| "from policyengine_us import Microsimulation\n", | |
| "import numpy as np\n", | |
| "import pandas as pd" | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 2, | |
| "metadata": { | |
| "execution": { | |
| "iopub.execute_input": "2026-03-17T13:22:19.313024Z", | |
| "iopub.status.busy": "2026-03-17T13:22:19.312912Z", | |
| "iopub.status.idle": "2026-03-17T13:23:12.569297Z", | |
| "shell.execute_reply": "2026-03-17T13:23:12.568623Z" | |
| } | |
| }, | |
| "outputs": [ | |
| { | |
| "name": "stdout", | |
| "output_type": "stream", | |
| "text": [ | |
| "Total federal income tax (CA filers): $257.2B\n", | |
| "Total CA state income tax: $87.8B\n", | |
| "Raw tax unit records: 411,806\n" | |
| ] | |
| } | |
| ], | |
| "source": [ | |
| "sim = Microsimulation(\n", | |
| " dataset=\"hf://policyengine/policyengine-us-data/states/CA.h5\"\n", | |
| ")\n", | |
| "\n", | |
| "# Tax-unit-level results for tax year 2026\n", | |
| "agi = sim.calc(\"adjusted_gross_income\", period=2026)\n", | |
| "fed_tax = sim.calc(\"income_tax\", period=2026)\n", | |
| "state_tax = sim.calc(\"state_income_tax\", period=2026)\n", | |
| "\n", | |
| "# Weights and underlying values, used for the manual weighted sums below\n", | |
| "w = agi.weights\n", | |
| "agi_v = agi.values\n", | |
| "fed_v = fed_tax.values\n", | |
| "state_v = state_tax.values\n", | |
| "\n", | |
| "# Weighted totals across all records in the CA dataset\n", | |
| "fed_total = (fed_v * w).sum()\n", | |
| "state_total = (state_v * w).sum()\n", | |
| "\n", | |
| "print(f\"Total federal income tax (CA filers): ${fed_total/1e9:,.1f}B\")\n", | |
| "print(f\"Total CA state income tax: ${state_total/1e9:,.1f}B\")\n", | |
| "print(f\"Raw tax unit records: {len(agi_v):,}\")" | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 3, | |
| "metadata": { | |
| "execution": { | |
| "iopub.execute_input": "2026-03-17T13:23:12.599265Z", | |
| "iopub.status.busy": "2026-03-17T13:23:12.599109Z", | |
| "iopub.status.idle": "2026-03-17T13:23:12.632351Z", | |
| "shell.execute_reply": "2026-03-17T13:23:12.631796Z" | |
| } | |
| }, | |
| "outputs": [ | |
| { | |
| "data": { | |
| "text/html": [ | |
| "<div>\n", | |
| "<style scoped>\n", | |
| " .dataframe tbody tr th:only-of-type {\n", | |
| " vertical-align: middle;\n", | |
| " }\n", | |
| "\n", | |
| " .dataframe tbody tr th {\n", | |
| " vertical-align: top;\n", | |
| " }\n", | |
| "\n", | |
| " .dataframe thead th {\n", | |
| " text-align: right;\n", | |
| " }\n", | |
| "</style>\n", | |
| "<table border=\"1\" class=\"dataframe\">\n", | |
| " <thead>\n", | |
| " <tr style=\"text-align: right;\">\n", | |
| " <th></th>\n", | |
| " <th>AGI threshold</th>\n", | |
| " <th>Share of federal income tax</th>\n", | |
| " <th>Share of CA state income tax</th>\n", | |
| " <th>Ratio (state/fed)</th>\n", | |
| " <th>Raw records</th>\n", | |
| " <th>Weighted tax units</th>\n", | |
| " </tr>\n", | |
| " </thead>\n", | |
| " <tbody>\n", | |
| " <tr>\n", | |
| " <th>0</th>\n", | |
| " <td>>$0.5M</td>\n", | |
| " <td>80.5%</td>\n", | |
| " <td>80.4%</td>\n", | |
| " <td>1.00x</td>\n", | |
| " <td>20,929</td>\n", | |
| " <td>415,158</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>1</th>\n", | |
| " <td>>$1M</td>\n", | |
| " <td>69.7%</td>\n", | |
| " <td>69.2%</td>\n", | |
| " <td>0.99x</td>\n", | |
| " <td>12,687</td>\n", | |
| " <td>217,369</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>2</th>\n", | |
| " <td>>$5M</td>\n", | |
| " <td>6.6%</td>\n", | |
| " <td>8.9%</td>\n", | |
| " <td>1.34x</td>\n", | |
| " <td>5,872</td>\n", | |
| " <td>499</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>3</th>\n", | |
| " <td>>$10M</td>\n", | |
| " <td>6.5%</td>\n", | |
| " <td>8.8%</td>\n", | |
| " <td>1.34x</td>\n", | |
| " <td>5,271</td>\n", | |
| " <td>332</td>\n", | |
| " </tr>\n", | |
| " </tbody>\n", | |
| "</table>\n", | |
| "</div>" | |
| ], | |
| "text/plain": [ | |
| " AGI threshold Share of federal income tax Share of CA state income tax \\\n", | |
| "0 >$0.5M 80.5% 80.4% \n", | |
| "1 >$1M 69.7% 69.2% \n", | |
| "2 >$5M 6.6% 8.9% \n", | |
| "3 >$10M 6.5% 8.8% \n", | |
| "\n", | |
| " Ratio (state/fed) Raw records Weighted tax units \n", | |
| "0 1.00x 20,929 415,158 \n", | |
| "1 0.99x 12,687 217,369 \n", | |
| "2 1.34x 5,872 499 \n", | |
| "3 1.34x 5,271 332 " | |
| ] | |
| }, | |
| "execution_count": 3, | |
| "metadata": {}, | |
| "output_type": "execute_result" | |
| } | |
| ], | |
| "source": [ | |
| "thresholds = [500_000, 1_000_000, 5_000_000, 10_000_000]\n", | |
| "rows = []\n", | |
| "\n", | |
| "for t in thresholds:\n", | |
| " mask = agi_v > t\n", | |
| " fed_share = (fed_v[mask] * w[mask]).sum() / fed_total\n", | |
| " state_share = (state_v[mask] * w[mask]).sum() / state_total\n", | |
| " rows.append({\n", | |
| " \"AGI threshold\": f\">${t/1e6:g}M\",\n", | |
| " \"Share of federal income tax\": f\"{fed_share:.1%}\",\n", | |
| " \"Share of CA state income tax\": f\"{state_share:.1%}\",\n", | |
| " \"Ratio (state/fed)\": f\"{state_share/fed_share:.2f}x\",\n", | |
| " \"Raw records\": f\"{mask.sum():,}\",\n", | |
| " \"Weighted tax units\": f\"{w[mask].sum():,.0f}\",\n", | |
| " })\n", | |
| "\n", | |
| "df = pd.DataFrame(rows)\n", | |
| "df" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "metadata": {}, | |
| "source": [ | |
| "## Why this matters for the CA Billionaire Tax debate\n", | |
| "\n", | |
| "Saez et al. (2026) cite ~2.5% of total CA income tax receipts as the billionaire share, a figure derived from Balkir et al. (NBER WP 34170), which uses federal tax data. But the federal code taxes LTCG at a preferential rate, so billionaires' share of *federal* income tax understates their share of *CA* income tax.\n", | |
| "\n", | |
| "At the $5M+ AGI level (a rough proxy for ultra-high-net-worth filers), the ratio is **1.34x**: these filers pay 8.9% of CA state income tax but only 6.6% of federal income tax. Scaling Saez et al.'s 2.5% by this ratio gives 2.5% × 1.34 ≈ 3.4%, or roughly $4B/year, which is closer to Rauh et al.'s range of $3.3-5.8B.\n", | |
| "\n", | |
| "This strengthens Rauh et al.'s argument that the income tax cost of billionaire departures is larger than Saez et al. acknowledge." | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 4, | |
| "metadata": { | |
| "execution": { | |
| "iopub.execute_input": "2026-03-17T13:23:12.633694Z", | |
| "iopub.status.busy": "2026-03-17T13:23:12.633570Z", | |
| "iopub.status.idle": "2026-03-17T13:23:15.674849Z", | |
| "shell.execute_reply": "2026-03-17T13:23:15.674364Z" | |
| } | |
| }, | |
| "outputs": [ | |
| { | |
| "data": { | |
| "text/html": [ | |
| "<div>\n", | |
| "<style scoped>\n", | |
| " .dataframe tbody tr th:only-of-type {\n", | |
| " vertical-align: middle;\n", | |
| " }\n", | |
| "\n", | |
| " .dataframe tbody tr th {\n", | |
| " vertical-align: top;\n", | |
| " }\n", | |
| "\n", | |
| " .dataframe thead th {\n", | |
| " text-align: right;\n", | |
| " }\n", | |
| "</style>\n", | |
| "<table border=\"1\" class=\"dataframe\">\n", | |
| " <thead>\n", | |
| " <tr style=\"text-align: right;\">\n", | |
| " <th></th>\n", | |
| " <th>Income</th>\n", | |
| " <th>Type</th>\n", | |
| " <th>Federal tax</th>\n", | |
| " <th>CA tax</th>\n", | |
| " <th>Eff. federal rate</th>\n", | |
| " <th>Eff. CA rate</th>\n", | |
| " <th>CA/Fed ratio</th>\n", | |
| " </tr>\n", | |
| " </thead>\n", | |
| " <tbody>\n", | |
| " <tr>\n", | |
| " <th>0</th>\n", | |
| " <td>$1M</td>\n", | |
| " <td>Wages</td>\n", | |
| " <td>$320,000</td>\n", | |
| " <td>$102,800</td>\n", | |
| " <td>32.0%</td>\n", | |
| " <td>10.3%</td>\n", | |
| " <td>0.32x</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>1</th>\n", | |
| " <td>$1M</td>\n", | |
| " <td>LTCG</td>\n", | |
| " <td>$230,400</td>\n", | |
| " <td>$102,800</td>\n", | |
| " <td>23.0%</td>\n", | |
| " <td>10.3%</td>\n", | |
| " <td>0.45x</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>2</th>\n", | |
| " <td>$10M</td>\n", | |
| " <td>Wages</td>\n", | |
| " <td>$3,650,000</td>\n", | |
| " <td>$1,300,644</td>\n", | |
| " <td>36.5%</td>\n", | |
| " <td>13.0%</td>\n", | |
| " <td>0.36x</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>3</th>\n", | |
| " <td>$10M</td>\n", | |
| " <td>LTCG</td>\n", | |
| " <td>$2,372,400</td>\n", | |
| " <td>$1,300,644</td>\n", | |
| " <td>23.7%</td>\n", | |
| " <td>13.0%</td>\n", | |
| " <td>0.55x</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>4</th>\n", | |
| " <td>$100M</td>\n", | |
| " <td>Wages</td>\n", | |
| " <td>$36,950,000</td>\n", | |
| " <td>$13,279,644</td>\n", | |
| " <td>37.0%</td>\n", | |
| " <td>13.3%</td>\n", | |
| " <td>0.36x</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>5</th>\n", | |
| " <td>$100M</td>\n", | |
| " <td>LTCG</td>\n", | |
| " <td>$23,792,400</td>\n", | |
| " <td>$13,279,644</td>\n", | |
| " <td>23.8%</td>\n", | |
| " <td>13.3%</td>\n", | |
| " <td>0.56x</td>\n", | |
| " </tr>\n", | |
| " </tbody>\n", | |
| "</table>\n", | |
| "</div>" | |
| ], | |
| "text/plain": [ | |
| " Income Type Federal tax CA tax Eff. federal rate Eff. CA rate \\\n", | |
| "0 $1M Wages $320,000 $102,800 32.0% 10.3% \n", | |
| "1 $1M LTCG $230,400 $102,800 23.0% 10.3% \n", | |
| "2 $10M Wages $3,650,000 $1,300,644 36.5% 13.0% \n", | |
| "3 $10M LTCG $2,372,400 $1,300,644 23.7% 13.0% \n", | |
| "4 $100M Wages $36,950,000 $13,279,644 37.0% 13.3% \n", | |
| "5 $100M LTCG $23,792,400 $13,279,644 23.8% 13.3% \n", | |
| "\n", | |
| " CA/Fed ratio \n", | |
| "0 0.32x \n", | |
| "1 0.45x \n", | |
| "2 0.36x \n", | |
| "3 0.55x \n", | |
| "4 0.36x \n", | |
| "5 0.56x " | |
| ] | |
| }, | |
| "execution_count": 4, | |
| "metadata": {}, | |
| "output_type": "execute_result" | |
| } | |
| ], | |
| "source": [ | |
| "# Verify the mechanism: compare effective tax rates on wages vs LTCG\n", | |
| "from policyengine_us import Simulation\n", | |
| "\n", | |
| "results = []\n", | |
| "for income in [1_000_000, 10_000_000, 100_000_000]:\n", | |
| " for income_type, var in [(\"Wages\", \"employment_income\"), (\"LTCG\", \"long_term_capital_gains\")]:\n", | |
| " s = Simulation(\n", | |
| " situation={\n", | |
| " \"people\": {\"person\": {\"age\": {\"2026\": 40}, var: {\"2026\": income}}},\n", | |
| " \"tax_units\": {\"tax_unit\": {\"members\": [\"person\"]}},\n", | |
| " \"families\": {\"family\": {\"members\": [\"person\"]}},\n", | |
| " \"spm_units\": {\"spm_unit\": {\"members\": [\"person\"]}},\n", | |
| " \"marital_units\": {\"marital_unit\": {\"members\": [\"person\"]}},\n", | |
| " \"households\": {\"household\": {\"members\": [\"person\"], \"state_code\": {\"2026\": \"CA\"}}},\n", | |
| " }\n", | |
| " )\n", | |
| " fed = float(s.calculate(\"income_tax\", \"2026\")[0])\n", | |
| " ca = float(s.calculate(\"ca_income_tax\", \"2026\")[0])\n", | |
| " results.append({\n", | |
| " \"Income\": f\"${income/1e6:.0f}M\",\n", | |
| " \"Type\": income_type,\n", | |
| " \"Federal tax\": f\"${fed:,.0f}\",\n", | |
| " \"CA tax\": f\"${ca:,.0f}\",\n", | |
| " \"Eff. federal rate\": f\"{fed/income:.1%}\",\n", | |
| " \"Eff. CA rate\": f\"{ca/income:.1%}\",\n", | |
| " \"CA/Fed ratio\": f\"{ca/fed:.2f}x\" if fed > 0 else \"n/a\",\n", | |
| " })\n", | |
| "\n", | |
| "pd.DataFrame(results)" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "metadata": {}, | |
| "source": [ | |
| "The CA tax is identical regardless of income type. When switching from wages to LTCG, the federal tax falls by about 28% at $1M and by about 35-36% at $10M and $100M, but CA's doesn't move at all. This is why billionaires (whose income is mostly LTCG) contribute relatively more to CA income tax than a federal-derived ratio would suggest." | |
| ] | |
| } | |
| ], | |
| "metadata": { | |
| "kernelspec": { | |
| "display_name": "Python 3", | |
| "language": "python", | |
| "name": "python3" | |
| }, | |
| "language_info": { | |
| "codemirror_mode": { | |
| "name": "ipython", | |
| "version": 3 | |
| }, | |
| "file_extension": ".py", | |
| "mimetype": "text/x-python", | |
| "name": "python", | |
| "nbconvert_exporter": "python", | |
| "pygments_lexer": "ipython3", | |
| "version": "3.14.0" | |
| } | |
| }, | |
| "nbformat": 4, | |
| "nbformat_minor": 4 | |
| } |
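
The share comparison and the rescaling step in the notebook can be illustrated on toy data without PolicyEngine. The following is a minimal sketch, not the notebook's code: the helper name `share_ratio` and the four toy records are hypothetical and chosen only to show how a weighted state/federal share ratio is formed. The 2.5% and 1.34 figures are the ones quoted in the notebook text.

```python
import numpy as np


def share_ratio(agi, fed_tax, state_tax, weights, threshold):
    """Weighted tax shares of records with AGI above `threshold`.

    Returns (fed_share, state_share, state_share / fed_share).
    Hypothetical helper for illustration only.
    """
    agi = np.asarray(agi, dtype=float)
    fed_tax = np.asarray(fed_tax, dtype=float)
    state_tax = np.asarray(state_tax, dtype=float)
    weights = np.asarray(weights, dtype=float)

    mask = agi > threshold
    # Weighted tax paid above the threshold, divided by the weighted total
    fed_share = (fed_tax[mask] * weights[mask]).sum() / (fed_tax * weights).sum()
    state_share = (state_tax[mask] * weights[mask]).sum() / (state_tax * weights).sum()
    return fed_share, state_share, state_share / fed_share


# Toy records (assumed values, NOT PolicyEngine output)
agi = [100_000, 400_000, 2_000_000, 20_000_000]
fed = [10_000, 90_000, 600_000, 5_000_000]
state = [3_000, 30_000, 220_000, 2_600_000]
w = [1_000, 100, 10, 1]

fed_share, state_share, ratio = share_ratio(agi, fed, state, w, 5_000_000)
print(f"Federal share: {fed_share:.1%}")
print(f"State share:   {state_share:.1%}")
print(f"Ratio:         {ratio:.2f}x")

# Rescaling described in the notebook: federally derived share times the ratio
federal_derived_share = 0.025  # ~2.5%, as quoted in the notebook
notebook_ratio = 1.34          # state/fed ratio for AGI > $5M in the notebook
adjusted_share = federal_derived_share * notebook_ratio
print(f"Adjusted CA share: {adjusted_share:.2%}")
```

On the toy data the top group holds a larger share of state tax than of federal tax, which mirrors the direction, though not the magnitude, of the notebook's result. Turning the adjusted share into a dollar figure additionally requires a total for CA income tax receipts, which this sketch deliberately leaves out.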