Sunday 23 August 2015

clustering - Finding clusters of points based distance rule using R?



I have some points that I need to divide into groups based on clusters of points:



x <- c(-1.482156, -1.482318, -1.482129, -1.482880, -1.485735, -1.485770, -1.485913, -1.484275, -1.485866)
y <- c(54.90083, 54.90078, 54.90077, 54.90011, 54.89936, 54.89935, 54.89935, 54.89879, 54.89902)

library(sp)
xy <- SpatialPoints(matrix(c(x,y), ncol=2))
proj4string(xy) <- CRS("+proj=longlat +ellps=WGS84 +datum=WGS84")
xyTransformed <- spTransform(xy, CRS("+init=epsg:27700 +datum=WGS84"))

plot(xyTransformed)


enter image description here


Each group should be defined using this rule: all points in the group should be within 40m of each other.


Based on the points in code above, I believe I should end up with 3-4 groups.


Does anybody know how to do this in R?



Answer



You can use a hierarchical clustering approach. By applying hclust and cutree you can derive clusters that are within a specified distance. Another way is to use the spdep package and calculate a distance matrix using dnearneigh. If you would like code for the dnearneigh approach let me know and I will post it.


require(sp)
require(rgdal)
d=40 # Distance threshold


# Create example data and transform into projected coordinate system
x <- c(-1.482156, -1.482318, -1.482129, -1.482880, -1.485735, -1.485770, -1.485913, -1.484275, -1.485866)
y <- c(54.90083, 54.90078, 54.90077, 54.90011, 54.89936, 54.89935, 54.89935, 54.89879, 54.89902)

xy <- SpatialPointsDataFrame(matrix(c(x,y), ncol=2), data.frame(ID=seq(1:length(x))),
proj4string=CRS("+proj=longlat +ellps=WGS84 +datum=WGS84"))

xy <- spTransform(xy, CRS("+init=epsg:27700 +datum=WGS84"))

chc <- hclust(dist(data.frame(rownames=rownames(xy@data), x=coordinates(xy)[,1],

y=coordinates(xy)[,2])), method="complete")

# Distance with a 40m threshold
chc.d40 <- cutree(chc, h=d)

# Join results to meuse sp points
xy@data <- data.frame(xy@data, Clust=chc.d40)

# Plot results
plot(xy, col=factor(xy@data$Clust), pch=19)

box(col="black")
title(main="Clustering")
legend("topleft", legend=paste("Cluster", 1:4,sep=""),
col=palette()[1:4], pch=rep(19,4), bg="white")

No comments:

Post a Comment

arcpy - Changing output name when exporting data driven pages to JPG?

Is there a way to save the output JPG, changing the output file name to the page name, instead of page number? I mean changing the script fo...