Wednesday, 14 March 2018

postgresql - PostGIS query to retrieve the largest polygon for multi-polygons by grouping on attribute field


I'm working with the Natural Earth 10m Admin 1 States Provinces polygon data. This dataset contains multi-polygons for each state / province in the entire world. What I'd like to do is to grab the largest polygon for each state / province, then create a centroid that is inside that polygon for labeling purposes in web cartography.


I went ahead and calculated the area for each polygon using ST_Area(geom::geography) for the dataset and so far my PostGIS query looks like this, which returns the the largest polygon for a single state /province:


SELECT area, name, ST_Centroid(geom) geom
FROM ne_10m_admin_1_states_provinces
WHERE area IN (
SELECT area

FROM ne_10m_admin_1_states_provinces
WHERE name = 'state province name'
ORDER BY area DESC
LIMIT 1
)

However I'm unsure of how to determine the best way to iterate over all features in the dataset to grab the polygon for each state / province with the largest area. Would the best way to do this be to create a function in PGSQL? Or is there a simpler way via plain old SQL?



Answer



You want to one-step it, using the ST_Dump function to get the subcomponents...


WITH geoms AS (

SELECT name, (ST_Dump(geom)).geom AS geom
FROM ne_10m_admin_1_states_provinces
)
SELECT DISTINCT ON (name) name, ST_Centroid(geom) AS geom
FROM geoms
ORDER BY name ASC, ST_Area(geom) DESC;

The dump strips out the individual components. The order by sets up the relation in a good order so that the distinct on can pick off the first entry of each category (name) we care about.


No comments:

Post a Comment

arcpy - Changing output name when exporting data driven pages to JPG?

Is there a way to save the output JPG, changing the output file name to the page name, instead of page number? I mean changing the script fo...