Thursday 27 August 2015

r - Understanding U.S. Census MSA to place relationships?


I need to aggregate place-level data by MSA (metropolitan statistical area). I'm referring to U.S. Census "places" and MSAs. "Places" might be townships, boroughs, villages, etc.; they are the level under counties.


The U.S. Census provides several types of relationships (https://www.census.gov/geo/maps-data/data/ua_rel_download.html):



  • Urban area to MSA

  • Urban area to place

  • MSA to principal city (principal cities are places, but not the only places in MSAs)


The DOL provides MSA to county relationships (https://www.dol.gov/owcp/regs/feeschedule/fee/fs04ctst.xls).


My understanding of MSAs may be off, but I expect MSAs are not made up of only whole counties. Some counties might fall partly into an MSA and partly out of it. Assuming that is correct, I am looking at the next level under counties, which are places, for a more granular/precise definition of MSAs. As it happens, even some places might fall partly into an MSA, but my data's definition doesn't go that far. I also have zip codes, but some zip codes are multi-place, so place is more precise than zip code for my purpose.



I am unable to find MSA to place relationships specifically.


Is there a direct source for these relationships?


If not, can they be derived from other data?


Or am I going about this the wrong way?


EDIT: I have hundreds of thousands of records of business establishments. I aim to aggregate business establishment data by MSA and conduct regression analysis by MSA. The address data is not clean, and I guess it would take too much work to normalise it enough for geocoding into longitude and latitude. So my strategy has been to work with R to at least normalise "places" (in the Census meaning of the term) within addresses. (I realise some of my places straddle multiple counties, but counties are generally not specified in addresses.) Unfortunately, after months of part-time data cleansing, I have only just realised that there is no relationship table for MSA to place. (I had mistakenly thought there was one until now.) So I'm now wondering whether I can reconstruct such a table from other available Census data, or whether I have to change my aggregation strategy, for example by aggregating by zip code (though zip codes are less precise, for my purpose, than places) or by aiming for full geocoding in spite of the difficulty I perceive.



Answer



According to the U.S. Census, "Counties or equivalent entities form the geographic 'building blocks' for metropolitan and micropolitan statistical areas throughout the United States and Puerto Rico." In other words, MSAs are made up of whole counties only. Additionally, states are made up of counties; there are no multi-state counties. One step "under" counties are "places"; places can straddle multiple counties, but not multiple states.


MSA to County Relationships are provided by the U.S. Census here, under the heading "Core based statistical areas (CBSAs), metropolitan divisions, and combined statistical areas (CSAs)".


County to Place Relationships are provided by the U.S. Census here. This includes ONLY census places, i.e. incorporated places and census-designated places (CDPs). Many populated "places" are not included; these can lie outside census places (e.g. a rural community) or inside them (e.g. a neighbourhood of a city). I have not yet found a list of unincorporated/non-CDP places (and their counties and states).


Finally, to obtain MSA to place relationships, join the MSA to County Relationships file with the County to Place Relationships file on county (together with state), which can be done for example in Excel. Some places straddle multiple counties, and place names can repeat across counties within a state or across states. County names can also repeat across states. Hence, an accurate representation of MSA to place relationships must also include the relevant counties and states, all of which are included in the above-mentioned files.
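If Excel is not convenient, the same join can be sketched in a few lines of code. The following is a minimal Python illustration, not the actual Census file layout: the column order, the MSA names and the sample rows are all made up for the example, and the real files would need to be read in first (and, if they carry state and county codes, joining on those codes rather than on names would be safer). It joins the two tables on state plus county, then flags any place name that ends up in more than one MSA within a state, since such places cannot be assigned from a place name alone.

```python
from collections import defaultdict

# Hypothetical rows standing in for the two Census files. The layout and
# values are illustrative assumptions, not the real file format.
msa_to_county = [
    # (msa, state, county)
    ("Metro A", "PA", "Adams"),
    ("Metro B", "PA", "Baker"),
    ("Metro C", "OH", "Adams"),      # same county name, different state
]
county_to_place = [
    # (state, county, place)
    ("PA", "Adams", "Springfield"),
    ("PA", "Baker", "Springfield"),  # same place name in another county
    ("PA", "Baker", "Riverton"),
    ("OH", "Adams", "Riverton"),     # same place name in another state
    ("OH", "Clark", "Lakeside"),     # county that belongs to no MSA
]


def join_msa_to_place(msa_rows, place_rows):
    """Inner-join the two tables on the (state, county) pair."""
    msas_by_county = defaultdict(list)
    for msa, state, county in msa_rows:
        msas_by_county[(state, county)].append(msa)

    joined = []
    for state, county, place in place_rows:
        # Places in counties outside every MSA are dropped by the join.
        for msa in msas_by_county.get((state, county), []):
            joined.append((msa, state, county, place))
    return joined


def ambiguous_places(joined):
    """Return (state, place) keys that map to more than one MSA."""
    msas = defaultdict(set)
    for msa, state, _county, place in joined:
        msas[(state, place)].add(msa)
    return {key: sorted(names) for key, names in msas.items() if len(names) > 1}


msa_to_place = join_msa_to_place(msa_to_county, county_to_place)
needs_review = ambiguous_places(msa_to_place)

for row in msa_to_place:
    print(row)
print("Ambiguous without a county:", needs_review)
```

Keying on state and county together is what keeps "Adams, PA" and "Adams, OH" apart. Records whose place appears in needs_review would need a county (or some other extra information, such as a zip code) before they can be assigned to a single MSA.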


