I have a series of points in a horizontal line, arranged into discrete spatial clusters by the underlying data pattern. Here's an example of what they look like:
How can I use PostGIS to group them spatially into the highlighted sets A, B and C?
This is a harder problem than it initially seemed to me. So far I've tried:
Using ST_X() to order them, then lag() to identify points between which there are gaps. Unfortunately this just groups the points into "gap" groups and "interior" groups.
Using convex hulls around arbitrary spatial thresholds. This both failed to separate all clusters and had the side effect of creating mid-cluster separations. (For future reference, this was a dumb idea.)
Using ST_GeoHash() to cluster them. This doesn't accurately catch the linear nature of the geometry.
Answer
I'm not at a computer that has access to PostGIS right now, but I feel as though this algorithm might work. Of course if you have vertical groups, you would need to use an exclusion or inclusion clause for ST_Y().
DECLARE @totalUnique int = 0
DECLARE @lastUnique int = 1
CREATE TABLE #TABLEX (ID1 int, ID2 int)
CREATE TABLE #TABLEX2 (ID1 int, ID2 int)
--Get distances of objects
INSERT INTO #TABLEX
SELECT ID1, ID2
FROM (
SELECT T1.ID AS ID1,
T2.ID AS ID2
FROM BaseTable AS T1
INNER JOIN
BaseTable AS T2
ON ST_Distance(T1.Shape, T2.Shape) <= SeparationDistance
) AS X
--Loop for as long as new connections can be made
WHILE(@lastUnique <> @totalUnique)
BEGIN
--Count the number of current connections
SELECT @lastUnique = COUNT(*)
FROM (
SELECT * FROM #TABLEX
GROUP BY ID1, ID2
) AS XX
--Look for new connections via current known paths
INSERT INTO #TABLEX (ID1, ID2)
SELECT A.ID1, B.ID2
FROM #TABLEX AS A
INNER JOIN
#TABLEX AS B
ON A.ID2 = B.ID1
AND
A.ID1 <> B.ID2
--Count the number of current connections
SELECT @totalUnique = COUNT(*)
FROM (
SELECT * FROM #TABLEX
GROUP BY ID1, ID2
) AS XX
--Group each path set by the lowest ID
INSERT INTO #TABLEX2(ID1, ID2)
SELECT MIN(ID1) AS theGroup, ID2
FROM #TABLEX
GROUP BY ID2
TRUNCATE TABLE #TABLEX
--Reload our new path sets
INSERT INTO #TABLEX (ID1, ID2)
SELECT ID1, ID2 FROM #TABLEX2
TRUNCATE TABLE #TABLEX2
END
--Show final results
SELECT ID1 AS theGroup, ID2
FROM #TABLEX
DROP TABLE #TABLEX
DROP TABLE #TABLEX2
No comments:
Post a Comment