Thursday 15 September 2016

postgis - Only union/dissolve intersecting features to save time?



I have one table of about 300k polygons, some overlapping but most do not. I need to dissolve the overlapping ones and also keep the non-overlapping ones. I only need the geometries, no attributes. I can do this with:


create table polygons123_dissolved as
(select (st_dump(st_union(polygons123.wkb_geometry))).geom as geom
from polygons123
where polygons123.group_id = 1 and polygons123.type_id in (1,2,3))

Example: enter image description here


But this is very slow, more than an hour. How can i speed it up?


Explain output:


Result  (cost=37684.32..37949.59 rows=1000 width=32)

-> ProjectSet (cost=37684.32..37689.59 rows=1000 width=32)
-> Aggregate (cost=37684.32..37684.33 rows=1 width=32)
-> Seq Scan on polygons123 (cost=0.00..36538.54 rows=229155 width=849)
Filter: ((group_id = 1) AND (type_id = ANY ('{1,2,3}'::integer[])))

Answer



I would look into using ST_ClusterDBSCAN. I have had tremendous success using this function to solve many cluster like geometric problems.


WITH clusters 
AS(select st_clusterdbscan(shape,0,2) over() cluster_id,shape from table
)
select st_union(shape) shape from clusters

where cluster_id is not null
group by cluster_id
union
select shape from clusters where cluster_id is null

In this example the clusters cte will assign an id to each polygon that intersects another polygon with a distance of 0. Essentially any polygons that intersect another polygon will be added to a unique cluster group. the third parameter specifies how many intersecting polygons there needs to be for the cluster to be created. I chose 2 for this example. so a polygon that is alone will result in a null value, you could wrap COALESCE around this whole function and assign a 0 to the single polygons.


after the cte you should simply st_union the geometries that are not null and make sure to group by the cluster id number. then merge/union with the rest of the single polygons.


No comments:

Post a Comment

arcpy - Changing output name when exporting data driven pages to JPG?

Is there a way to save the output JPG, changing the output file name to the page name, instead of page number? I mean changing the script fo...