Friday 1 February 2019

spatial statistics - Self-organizing maps and spatially constrained clustering in R


There are a few algorithms for spatial clustering in R, but most of them focus on point pattern analysis (for example here). I would like to perform spatially constrained clustering on polygon data, something similar to this. There are some interesting open-source tools for this task, such as SOMVIS and the other tools listed on the site at this link. But rather than using yet another tool, I would like to try this in R, since I work in R a lot and would like to keep that flexible working environment.


So there seems to be very good information here. Does anybody have experience with a similar task, and can anybody give me some advice on good resources, packages, books, etc.?


Best, Johannes



Answer



I think you can do this by setting up a dissimilarity matrix between cells such that diss((i,j),(k,l)) is large for non-neighbours and equals the absolute difference between the cell values at (i,j) and (k,l) for neighbours. You then feed the dissimilarity matrix into any clustering algorithm that accepts one: hclust, pam, or any of several others in the Clustering Task View.
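As a rough illustration of that construction, here is a minimal sketch for a regular grid. The grid size, the random cell values, the rook (shared-edge) definition of "neighbour" and the size of the penalty are all assumptions made for the example, not part of the answer above.

```r
# Hypothetical example data: a small N x M grid of cell values.
set.seed(1)
N <- 4
M <- 5
vals <- matrix(rnorm(N * M), nrow = N, ncol = M)

# Row and column index of every cell, in the same (column-major) order
# as as.vector(vals), so cell number 2 is grid position (2, 1).
rows <- as.vector(row(vals))
cols <- as.vector(col(vals))
v <- as.vector(vals)

# Assumed neighbour rule (rook adjacency): two cells are neighbours
# when they are exactly one step apart horizontally or vertically.
adjacent <- as.matrix(dist(cbind(rows, cols), method = "manhattan")) == 1

# Non-spatial dissimilarity between every pair of cells.
attr_diss <- abs(outer(v, v, "-"))

# Assumed penalty: any value larger than every neighbour dissimilarity.
big <- 10 * max(attr_diss) + 1

# Neighbours keep their attribute dissimilarity, everything else gets the penalty.
diss <- ifelse(adjacent, attr_diss, big)
diag(diss) <- 0

dim(diss)  # (N * M) x (N * M)
```

The result is a full square matrix; as.dist(diss) converts it to the lower-triangle form that hclust and pam accept.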



Then, for example, hclust would start with each cell in its own cluster and, on the first step, merge the two cells that are closest in the dissimilarity matrix. These would have to be two spatially adjacent cells, and all subsequent steps of the clustering algorithm would only ever add adjacent cells to clusters.
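A small sketch of that clustering step follows, again on hypothetical data. The answer does not name a linkage method; single linkage is assumed here because the distance between two clusters is then the smallest pairwise dissimilarity, so any merge below the penalty must involve at least one adjacent pair of cells. With other linkages (complete or average, say) the penalty enters the cluster distances differently and contiguity may not be preserved.

```r
# Hypothetical 4 x 5 grid of values, rebuilt so this block runs on its own.
set.seed(1)
vals <- matrix(rnorm(20), nrow = 4)
v <- as.vector(vals)
coords <- cbind(as.vector(row(vals)), as.vector(col(vals)))

# Assumed neighbour rule: cells sharing an edge (rook adjacency).
adjacent <- as.matrix(dist(coords, method = "manhattan")) == 1

# Attribute dissimilarity for neighbours, a large penalty otherwise.
attr_diss <- abs(outer(v, v, "-"))
big <- 10 * max(attr_diss) + 1
diss <- ifelse(adjacent, attr_diss, big)
diag(diss) <- 0

# Single linkage (an assumption, see above) on the constrained dissimilarities.
hc <- hclust(as.dist(diss), method = "single")

# Cut the tree into an arbitrary number of clusters, here 3.
groups <- cutree(hc, k = 3)

# Map the cluster labels back onto the grid layout for inspection.
cluster_grid <- matrix(groups, nrow = nrow(vals))
print(cluster_grid)
```

If every merge height in hc$height is below the penalty, no merge was forced across non-neighbouring cells, which is a quick sanity check on the construction.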


Note the difference between spatial distance and dissimilarity. Clustering works on the dissimilarity matrix, which in a conventional clustering problem is the (non-spatial) "distance" between two of the objects you are trying to cluster (e.g. age difference, blood pressure difference, etc.). What I'm trying to do here is define that dissimilarity so that for non-adjacent cells the distance is large, whatever the value of the spatial variable you are trying to cluster.


I had a quick play to see if I could get this going, but I'd done something wrong. Part of the problem is that an N x M grid gives you an (N x M) x (N x M) dissimilarity matrix (or a triangle of it), and I was probably getting a dimension wrong somewhere. Maybe later...

