How to extract dual‑vintage PUMA data from the 2022 ACS 5‑year PUMS using MDAT
Summary
A step‑by‑step guide showing how to build two MDAT tables (2010 and 2020 PUMA boundaries), select the HHLANP variable and the appropriate PUMA codes, then combine totals to produce full 2022 ACS 5‑year PUMS estimates for specific PUMAs.
Tyson, the presenter, demonstrated how to use the Microdata Access Tool (MDAT) to get detailed PUMA‑level estimates from the 2022 American Community Survey (ACS) 5‑year Public Use Microdata Sample (PUMS). The method is to create two separate tables, one using 2010 PUMA boundaries and one using 2020 PUMA boundaries, and then combine their totals.
The tutorial is aimed at users who need more detailed categories (for example, the language spoken at home) than the prebuilt tables on data.census.gov provide. Tyson said, “Because the 2022 ACS 5‑year PUMS has dual vintages, we’ll actually have to create 2 separate tables to account for this,” and then walked through the exact MDAT steps.
He began by opening MDAT at data.census.gov/app/mdat, selecting the ACS 5‑year PUMS dataset and switching the vintage to 2022. Then Tyson added the detailed household language variable (HHLANP) and the PUMA variable as working variables. For the first table he selected the PUMA10 (2010) variable, warned that MDAT limits the number of PUMAs per table, and filtered the geography by state (New York and Pennsylvania) to disambiguate repeated PUMA codes across states.
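The reason for the state filter can be illustrated outside MDAT. The sketch below is not part of the tutorial; it assumes a hypothetical helper, `label_for`, and uses the standard state FIPS codes (36 for New York, 42 for Pennsylvania) to show that a PUMA code only identifies an area once it is paired with its state. Codes are kept as strings so that leading zeros are not lost.

```python
from typing import NamedTuple, Optional


class PumaKey(NamedTuple):
    """A PUMA is only unique within a state, so the key carries both parts."""
    state_fips: str  # two-digit state FIPS code, e.g. "42" for Pennsylvania
    puma: str        # five-digit PUMA code, kept as text to preserve leading zeros


# Labels for the two 2010-boundary PUMAs used in the tutorial.
LABELS_2010 = {
    PumaKey("42", "01600"): "PUMA 01600 Butler County, PA",
    PumaKey("36", "04110"): "PUMA 04110 Queens Community District 5, NY",
}


def label_for(state_fips: str, puma: str) -> Optional[str]:
    """Hypothetical helper: look up a label by state and PUMA code together."""
    key = PumaKey(state_fips.zfill(2), puma.zfill(5))
    return LABELS_2010.get(key)


if __name__ == "__main__":
    print(label_for("42", "1600"))   # padded to 01600, found under Pennsylvania
    print(label_for("36", "01600"))  # same code, different state: no label here
```

Looking up a code under the wrong state returns nothing, which is the collision the state filter avoids in the tool.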
Using the cart, he created a custom recode for the PUMA10 variable, searched for and selected code 01600 (Butler County, Pennsylvania) and 04110 (New York City Queens Community District 5 under 2010 boundaries), and gave each recode a readable label (for example, “Puma 01600 Butler County, PA”). He then built the table layout by moving geographies to columns and placing the recoded PUMA variable ahead of the language variable in the rows to show the two 2010 PUMAs with detailed language categories.
Tyson showed how to hide the total selected geographies column (so the table displays only the individual PUMAs) and noted that some columns will show zeros when a PUMA code exists in a state the user did not select. He pointed out that the first table covers only part of the five‑year period (2018–2021), because records from those years are the ones coded with 2010 PUMA boundaries.
To capture 2022 data within the same ACS 5‑year PUMS, Tyson instructed viewers to repeat the process using the PUMA20 (2020) variable: return to the MDAT landing page, reselect the ACS 5‑year PUMS with vintage 2022, add the same HHLANP variable, choose PUMA20, and then recode the PUMA20 values (selecting 01600 for Butler County and 04405 for New York City Queens Community District 5 under 2020 boundaries) and relabel the groups accordingly.
After building the second table with the 2020 boundaries and arranging the variables the same way, Tyson advised combining the two tables by adding the totals for each detailed language category by PUMA to obtain the full 2022 ACS 5‑year PUMS estimate. He closed by directing viewers to the resources link for additional guidance and mapping tools (data.census.gov mapping and TIGERweb) to confirm GEOID or boundary changes.
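The combining step can be sketched in a few lines once the two tables have been copied out of MDAT. This is an illustration under assumptions, not the presenter's method: the function name `combine_vintages`, the dictionary layout, the language categories, and every number are made‑up placeholders. Only the PUMA codes come from the tutorial. Because a PUMA code can change between vintages (04110 in 2010, 04405 in 2020 for Queens Community District 5), the sketch matches rows by area through a small crosswalk rather than by code.

```python
from collections import Counter

# Area label -> (2010 PUMA code, 2020 PUMA code), as given in the tutorial.
CROSSWALK = {
    "Butler County, PA": ("01600", "01600"),
    "Queens Community District 5, NY": ("04110", "04405"),
}


def combine_vintages(table_2010, table_2020, crosswalk):
    """Sum matching (area, language) cells from the two MDAT tables.

    Each table is a dict mapping (puma_code, language) -> weighted estimate.
    Assumes each cell was already read from the correct state column.
    """
    combined = Counter()
    for area, (puma10, puma20) in crosswalk.items():
        for (code, language), estimate in table_2010.items():
            if code == puma10:
                combined[(area, language)] += estimate
        for (code, language), estimate in table_2020.items():
            if code == puma20:
                combined[(area, language)] += estimate
    return dict(combined)


# Placeholder values only; these are NOT real ACS estimates.
table_2010 = {
    ("01600", "Spanish"): 100,
    ("04110", "Spanish"): 400,
    ("04110", "Polish"): 50,
}
table_2020 = {
    ("01600", "Spanish"): 30,
    ("04405", "Spanish"): 120,
    ("04405", "Polish"): 10,
}

totals = combine_vintages(table_2010, table_2020, CROSSWALK)

if __name__ == "__main__":
    for (area, language), estimate in sorted(totals.items()):
        print(f"{area} | {language}: {estimate}")
```

Keying the result by area label keeps the two Queens codes in a single row, which mirrors adding the matching cells by hand.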
Practical takeaways: create one MDAT table for the 2010 PUMA boundaries and one for the 2020 boundaries, limit PUMAs per table as required, filter by state to avoid cross‑state code collisions, give recoded groups clear labels, hide the overall totals column for clarity, and sum the matching cells across the two tables to generate complete 2022 estimates.

