I'm looking for an efficient way to cluster lines independent of their direction. That means that a line between New York and Los Angeles should be in the same cluster as a line in the other direction between Los Angeles and New York. The start / end point locations should be similar (i.e. San Diego to Long Island should be in the same cluster as LA-NY but probably not San Francisco to Boston) and there are no intermediate points. Input data would be similar to this example:
(By Cassiopeia sweet at Japanese Wikipedia GFDL or CC-BY-SA-3.0, via Wikimedia Commons)
I have previously tried to sort the lines in advance, e.g. to make them all run from west to east, but this does not solve the problem for lines running from north to south and other way around.
Do you know any algorithm dealing with this problem? I've been looking but besides Algorithm to compute average direction of undirected segments I haven't found anything remotely helpful, so I must be using the wrong search terms.
Answer
If I understand you right you want to cluster lines that is about the same without respect to direction.
Here is an idea that I think could work.
split the lines in start point and end point
Cluster the points and get cluster id
Find lines with the same combination of cluster id. Those are a cluster
This should be possible in PostGIS (of course :-) ) version 2.3
I haven't tested the ST_ClusterDBSCAN function, but it should do the work.
If you have a line table like this:
CREATE TABLE the_lines
(
geom geometry(linestring),
id integer primary key
)
And you want to create the cluster where start and end points is max 10 km apart. And there must be at least 2 points to be a cluster then the query could be something like:
WITH point_id AS
(SELECT (ST_DumpPoints(geom)).geom, id FROM the_lines),
point_clusters as
(SELECT ST_ClusterDBSCAN(geom, 10000, 2) cluster_id, id line_id FROM point_id)
SELECT array_agg(a.line_id), a.cluster_id, b.cluster_id
FROM point_clusters a
INNER JOIN point_clusters b
ON a.line_id = b.line_id AND a.cluster_id < b.cluster_id
GROUP BY a.cluster_id, b.cluster_id
By joining with a.cluster_id
No comments:
Post a Comment