Sunday, 15 December 2019

sql - Join based on maximum overlap in PostGIS/PostGresQL?


I have two sets of polygons in two tables. The sets overlap each other. For each polygon in set A, I would like to get the ID of the polygon in set B that it overlaps the most. I'm using PostgreSQL with the PostGIS extension.


I know just enough about SQL to know that you can only join based on true/false conditions. So this won't work:


SELECT
a.id as a_id,

b.id as b_id,
FROM
a
JOIN
b
ON
max(ST_Area(ST_Intersection(a.geom, b.geom)))

because max() can't be in the ON clause.


ST_Intersects() is a true/false test, so I could join on that, but polygons in set A will often overlap with more than one polygon in set B, and I need to know which one overlaps the most. ST_Intersects would presumably just return the first overlapping ID it came across, regardless of the extent of the overlap.



This seems like it should be do-able, but it's beyond me. Any thoughts?



Answer



You could use something like:


SELECT DISTINCT ON (a.id)
a.id as a_id,
b.id as b_id,
ST_Area(ST_Intersection(a.geom, b.geom)) as intersect_area
FROM a, b
ORDER BY a.id, ST_Area(ST_Intersection(a.geom, b.geom)) DESC


It:


1) Calculates ST_Area(ST_Intersection(a.geom, b.geom)) for every (a,b) pair of records.


2) Orders them by a.id and by intersect_area when a.id are equal.


3) In every group of equal a.id it picks the firs record (the first record has the highest intersect_area because of ordering on step 2).


No comments:

Post a Comment

arcpy - Changing output name when exporting data driven pages to JPG?

Is there a way to save the output JPG, changing the output file name to the page name, instead of page number? I mean changing the script fo...