@MaxGhenis
Last active December 26, 2025 19:37
85th Percentile Household Income in Los Angeles City (2023-2024) using ACS 1-Year PUMS
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 85th Percentile Household Income in Los Angeles City\n",
"\n",
"Using 1-year ACS PUMS data to calculate the weighted 85th percentile household income for 2023 and 2024.\n",
"\n",
"LA City PUMAs identified from Census ACS API by name matching. These 23 PUMAs are explicitly labeled as LA City areas."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"execution": {
"iopub.execute_input": "2025-12-26T19:37:14.383845Z",
"iopub.status.busy": "2025-12-26T19:37:14.383768Z",
"iopub.status.idle": "2025-12-26T19:37:14.673109Z",
"shell.execute_reply": "2025-12-26T19:37:14.672617Z"
}
},
"outputs": [],
"source": [
"import pandas as pd\n",
"import numpy as np\n",
"from io import BytesIO\n",
"from zipfile import ZipFile\n",
"import requests"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## LA City PUMAs\n",
"\n",
"Los Angeles City spans 23 PUMAs (2020 geography). These were identified from the Census ACS API by matching PUMA names containing \"LA City\" and \"Los Angeles County\".\n",
"\n",
"Note: Some LA City areas may be in shared PUMAs (e.g., 03707, 03748) which include portions of other cities - these are excluded to avoid overestimation."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"execution": {
"iopub.execute_input": "2025-12-26T19:37:14.674668Z",
"iopub.status.busy": "2025-12-26T19:37:14.674546Z",
"iopub.status.idle": "2025-12-26T19:37:14.676644Z",
"shell.execute_reply": "2025-12-26T19:37:14.676259Z"
}
},
"outputs": [],
"source": [
"# LA City PUMAs (2020 geography) - from Census ACS API name matching\n",
"LA_CITY_PUMAS = [\n",
" \"03705\", # LA City (Northwest/Chatsworth & Porter Ranch)\n",
" \"03706\", # LA City (North Central/Granada Hills & Sylmar)\n",
" \"03708\", # LA City (Northeast/Sunland, Sun Valley & Tujunga)\n",
" \"03721\", # LA City (Northeast/North Hollywood & Valley Village)\n",
" \"03722\", # LA City (North Central/Van Nuys & North Sherman Oaks)\n",
" \"03723\", # LA City (North Central/Mission Hills & Panorama City)\n",
" \"03724\", # LA City (Northwest/Encino & Tarzana)\n",
" \"03725\", # LA City (Northwest/Canoga Park, Winnetka & Woodland Hills)\n",
" \"03730\", # LA City (Central/Hancock Park & Mid-Wilshire)\n",
" \"03732\", # LA City (East Central & Hollywood)\n",
" \"03733\", # LA City (Central/Koreatown)\n",
" \"03734\", # LA City (East Central/Silver Lake, Echo Park & Westlake)\n",
" \"03735\", # LA City (Mount Washington, Highland Park & Glassell Park)\n",
" \"03744\", # LA City (East Central/Central City & Boyle Heights)\n",
" \"03745\", # LA City (Southeast/East Vernon)\n",
" \"03746\", # LA City (Central/Univ. of Southern California & Exposition Park)\n",
" \"03747\", # LA City (Central/West Adams & Baldwin Hills)\n",
" \"03750\", # LA City (South Central/Westmont)\n",
" \"03751\", # LA City (South Central/Watts)\n",
" \"03767\", # LA City (South/San Pedro)\n",
" \"03770\", # LA City (West Los Angeles, Century City & Palms)\n",
" \"03775\", # LA City (Central)\n",
" \"03776\", # LA City (Central/Westwood & West Los Angeles)\n",
"]"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"execution": {
"iopub.execute_input": "2025-12-26T19:37:14.677693Z",
"iopub.status.busy": "2025-12-26T19:37:14.677629Z",
"iopub.status.idle": "2025-12-26T19:37:14.679801Z",
"shell.execute_reply": "2025-12-26T19:37:14.679429Z"
}
},
"outputs": [],
"source": [
"def download_pums_data(year: int) -> pd.DataFrame:\n",
" \"\"\"Download ACS 1-year PUMS household data for California.\"\"\"\n",
" url = f\"https://www2.census.gov/programs-surveys/acs/data/pums/{year}/1-Year/csv_hca.zip\"\n",
" print(f\"Downloading {year} PUMS data from {url}...\")\n",
" \n",
" response = requests.get(url)\n",
" response.raise_for_status()\n",
" \n",
" with ZipFile(BytesIO(response.content)) as z:\n",
" csv_name = [n for n in z.namelist() if n.startswith(\"psam_h\") and n.endswith(\".csv\")][0]\n",
" print(f\"Reading {csv_name}...\")\n",
" with z.open(csv_name) as f:\n",
" df = pd.read_csv(f, usecols=[\"PUMA\", \"HINCP\", \"WGTP\"], dtype={\"PUMA\": str})\n",
" \n",
" return df"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"execution": {
"iopub.execute_input": "2025-12-26T19:37:14.680776Z",
"iopub.status.busy": "2025-12-26T19:37:14.680699Z",
"iopub.status.idle": "2025-12-26T19:37:14.682648Z",
"shell.execute_reply": "2025-12-26T19:37:14.682282Z"
}
},
"outputs": [],
"source": [
"def weighted_percentile(values: np.ndarray, weights: np.ndarray, percentile: float) -> float:\n",
" \"\"\"Calculate weighted percentile.\"\"\"\n",
" mask = ~np.isnan(values) & ~np.isnan(weights) & (weights > 0)\n",
" values = values[mask]\n",
" weights = weights[mask]\n",
" \n",
" sorted_indices = np.argsort(values)\n",
" sorted_values = values[sorted_indices]\n",
" sorted_weights = weights[sorted_indices]\n",
" \n",
" cumsum = np.cumsum(sorted_weights)\n",
" cutoff = percentile / 100.0 * cumsum[-1]\n",
" \n",
" idx = np.searchsorted(cumsum, cutoff)\n",
" return sorted_values[min(idx, len(sorted_values) - 1)]"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"execution": {
"iopub.execute_input": "2025-12-26T19:37:14.683570Z",
"iopub.status.busy": "2025-12-26T19:37:14.683501Z",
"iopub.status.idle": "2025-12-26T19:37:14.685501Z",
"shell.execute_reply": "2025-12-26T19:37:14.685089Z"
}
},
"outputs": [],
"source": [
"def calculate_la_city_85th_percentile(year: int) -> dict:\n",
" \"\"\"Calculate 85th percentile household income for LA City.\"\"\"\n",
" df = download_pums_data(year)\n",
" \n",
" # Pad PUMA codes to 5 digits for matching\n",
" df[\"PUMA\"] = df[\"PUMA\"].str.zfill(5)\n",
" la_city = df[df[\"PUMA\"].isin(LA_CITY_PUMAS)].copy()\n",
" \n",
" p85 = weighted_percentile(\n",
" la_city[\"HINCP\"].values,\n",
" la_city[\"WGTP\"].values,\n",
" 85\n",
" )\n",
" \n",
" return {\n",
" \"year\": year,\n",
" \"p85_income\": p85,\n",
" \"total_households\": la_city[\"WGTP\"].sum(),\n",
" \"sample_size\": len(la_city)\n",
" }"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Calculate for 2023 and 2024"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"execution": {
"iopub.execute_input": "2025-12-26T19:37:14.686546Z",
"iopub.status.busy": "2025-12-26T19:37:14.686468Z",
"iopub.status.idle": "2025-12-26T19:37:19.609782Z",
"shell.execute_reply": "2025-12-26T19:37:19.609339Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Downloading 2023 PUMS data from https://www2.census.gov/programs-surveys/acs/data/pums/2023/1-Year/csv_hca.zip...\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Reading psam_h06.csv...\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"2023 Results:\n",
" 85th Percentile Household Income: $192,000\n",
" Total Households (weighted): 1,394,196\n",
" Sample Size: 15,755\n",
"Downloading 2024 PUMS data from https://www2.census.gov/programs-surveys/acs/data/pums/2024/1-Year/csv_hca.zip...\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Reading psam_h06.csv...\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"2024 Results:\n",
" 85th Percentile Household Income: $200,000\n",
" Total Households (weighted): 1,415,235\n",
" Sample Size: 15,789\n"
]
}
],
"source": [
"results = []\n",
"for year in [2023, 2024]:\n",
" result = calculate_la_city_85th_percentile(year)\n",
" results.append(result)\n",
" print(f\"\\n{year} Results:\")\n",
" print(f\" 85th Percentile Household Income: ${result['p85_income']:,.0f}\")\n",
" print(f\" Total Households (weighted): {result['total_households']:,.0f}\")\n",
" print(f\" Sample Size: {result['sample_size']:,}\")"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"execution": {
"iopub.execute_input": "2025-12-26T19:37:19.610949Z",
"iopub.status.busy": "2025-12-26T19:37:19.610863Z",
"iopub.status.idle": "2025-12-26T19:37:19.617791Z",
"shell.execute_reply": "2025-12-26T19:37:19.617402Z"
}
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>year</th>\n",
" <th>p85_income</th>\n",
" <th>total_households</th>\n",
" <th>sample_size</th>\n",
" <th>p85_income_formatted</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>2023</td>\n",
" <td>192000.0</td>\n",
" <td>1394196</td>\n",
" <td>15755</td>\n",
" <td>$192,000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>2024</td>\n",
" <td>200000.0</td>\n",
" <td>1415235</td>\n",
" <td>15789</td>\n",
" <td>$200,000</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" year p85_income total_households sample_size p85_income_formatted\n",
"0 2023 192000.0 1394196 15755 $192,000\n",
"1 2024 200000.0 1415235 15789 $200,000"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"results_df = pd.DataFrame(results)\n",
"results_df[\"p85_income_formatted\"] = results_df[\"p85_income\"].apply(lambda x: f\"${x:,.0f}\")\n",
"results_df"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"execution": {
"iopub.execute_input": "2025-12-26T19:37:19.618945Z",
"iopub.status.busy": "2025-12-26T19:37:19.618803Z",
"iopub.status.idle": "2025-12-26T19:37:19.621015Z",
"shell.execute_reply": "2025-12-26T19:37:19.620704Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"Year-over-year change (2023 to 2024):\n",
" Absolute change: $8,000\n",
" Percent change: 4.2%\n"
]
}
],
"source": [
"# Year-over-year change\n",
"change = results[1][\"p85_income\"] - results[0][\"p85_income\"]\n",
"pct_change = (change / results[0][\"p85_income\"]) * 100\n",
"print(f\"\\nYear-over-year change (2023 to 2024):\")\n",
"print(f\" Absolute change: ${change:,.0f}\")\n",
"print(f\" Percent change: {pct_change:.1f}%\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Validation\n",
"\n",
"Compare against Census ACS published income distribution for LA City."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"execution": {
"iopub.execute_input": "2025-12-26T19:37:19.622129Z",
"iopub.status.busy": "2025-12-26T19:37:19.622059Z",
"iopub.status.idle": "2025-12-26T19:37:20.151509Z",
"shell.execute_reply": "2025-12-26T19:37:20.151159Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Census ACS 2023 LA City:\n",
" Median household income: $79,701\n",
" Total households: 1,460,167\n",
"\n",
"Our PUMS estimate: 1,394,196 households\n",
"Coverage: 95.5%\n"
]
}
],
"source": [
"# Get income distribution for LA City from Census API\n",
"url = \"https://api.census.gov/data/2023/acs/acs1\"\n",
"params = {\n",
" \"get\": \"NAME,B19013_001E,B19001_001E,\" + \",\".join([f\"B19001_{i:03d}E\" for i in range(2, 18)]),\n",
" \"for\": \"place:44000\",\n",
" \"in\": \"state:06\"\n",
"}\n",
"\n",
"resp = requests.get(url, params=params, timeout=30)\n",
"data = resp.json()\n",
"\n",
"median = int(data[1][data[0].index('B19013_001E')])\n",
"total_hh = int(data[1][data[0].index('B19001_001E')])\n",
"print(f\"Census ACS 2023 LA City:\")\n",
"print(f\" Median household income: ${median:,}\")\n",
"print(f\" Total households: {total_hh:,}\")\n",
"print(f\"\\nOur PUMS estimate: {results[0]['total_households']:,} households\")\n",
"print(f\"Coverage: {results[0]['total_households']/total_hh*100:.1f}%\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.13.11"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
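
To check the weighted-percentile logic without downloading the PUMS file, the following minimal sketch restates the notebook's `weighted_percentile` function and runs it on four made-up households. The incomes and weights are hypothetical, chosen only so the arithmetic is easy to follow; they are not ACS data.

```python
import numpy as np


def weighted_percentile(values, weights, percentile):
    """Smallest value whose cumulative weight reaches `percentile` percent of the total."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)

    # Drop missing values and records without a positive weight.
    keep = ~np.isnan(values) & ~np.isnan(weights) & (weights > 0)
    values, weights = values[keep], weights[keep]

    # Sort by value and accumulate the weights.
    order = np.argsort(values)
    values, weights = values[order], weights[order]
    cumulative = np.cumsum(weights)

    # First position where the cumulative weight meets or exceeds the cutoff.
    cutoff = percentile / 100.0 * cumulative[-1]
    idx = np.searchsorted(cumulative, cutoff)
    return values[min(idx, len(values) - 1)]


# Hypothetical toy data (not from the ACS): income and household weight.
incomes = np.array([50_000, 120_000, 80_000, 300_000])
weights = np.array([40, 30, 20, 10])

p85 = weighted_percentile(incomes, weights, 85)
unweighted = np.percentile(incomes, 85)

print(f"Weighted 85th percentile:   ${p85:,.0f}")
print(f"Unweighted 85th percentile: ${unweighted:,.0f}")
```

Sorted by income, the cumulative weights are 40, 60, 90, and 100, so the 85% cutoff is first reached at the 120,000 household. The unweighted `np.percentile` ignores the weights, interpolates between the top two incomes, and lands higher, which illustrates why the notebook weights each record by `WGTP`.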