I want to cluster all villages (and towns, ...) around the world. So instead of having millions of them, I'd like to reduce them by combining villages that are close enough (like 10km or so) to each other.
So I was looking into ST_ClusterDBSCAN
and it's doing quite a good job. This is how I made my table:
INSERT INTO villages_clustered
SELECT
name,
way,
ST_ClusterDBSCAN(way, eps := 2000, minpoints := 1) over () AS cluster_id
FROM villages;
Now when I look at my data, I get lots of small clusters which is what I wanted. But there are also some, which are just suuuuper large that should split into smaller ones.
Would really love to know how to improve my query to get better results. What I basically want is clusters of like 10km or so.
Answer
The eps distance is the maximum distance between points in the cluster, not the maximum width of the entire cluster.
So if you have points A, B, and C, as long as each point is within the eps distance of one other point, then it gets included in the cluster. If the eps distance was 1 km, A could be within 1 km of B, and C can be within 1 km of B, but A can be 2 km from C and ABC are still a cluster because A & C are within 1 km of B.
https://en.wikipedia.org/wiki/DBSCAN
No comments:
Post a Comment