I have a pretty big point feature class in a file geodatabase (~4 000 000 records). It is a regular grid of points with a 100 m resolution.
I need to perform a kind of generalization on this layer. For this, I create a new grid where each point lies in the middle of 4 "old" points:
* * * *
o o o
* * * *
o o o
* * * *
[*] = point of the original grid - [o] = point of the new grid
The attribute value of each new point is calculated as a weighted combination of the values of its 4 neighbors in the old grid. I therefore loop over all the points of my new grid and, for each of them, loop over all the points of my old grid to find the neighbors (by comparing the X and Y values in the attribute table). Once the 4 neighbors have been found, I break out of the inner loop.
There is no methodological complexity here, but my problem is that, based on my first tests, this script will take weeks to complete...
Do you see any way to make it more efficient? A few ideas off the top of my head:
- Index the fields X and Y => I did that but didn't notice any significant performance change
- Do a spatial query to find the neighbors rather than an attribute-based one. Would that actually help? Which spatial function in ArcGIS would do the job? I doubt that, e.g., buffering each new point would prove more efficient
- Transform the feature class into a NumPy array. Would that help? I haven't worked much with NumPy so far and I wouldn't like to dive into it unless someone tells me it might really help reduce the processing time
- Anything else?
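For the NumPy idea, a minimal sketch of how it could look, assuming the X, Y and attribute values have already been read into arrays (for instance with `arcpy.da.FeatureClassToNumPyArray`). Because the grid is regular, the coordinates can be turned into row/column indices, so the 4 neighbors of each new point are found by array slicing instead of a search. The function name `generalize_grid`, the `cell` parameter and the equal default `weights` are assumptions made for illustration; the actual weighting scheme is not specified here.

```python
import numpy as np

def generalize_grid(x, y, values, cell=100.0, weights=(0.25, 0.25, 0.25, 0.25)):
    """Combine each 2x2 block of old points into one new, centered point.

    weights are given as (lower-left, lower-right, upper-left, upper-right);
    equal weights are an assumption, not the original weighting.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    values = np.asarray(values, dtype=float)

    # Convert coordinates to integer column/row indices on the regular grid.
    col = np.rint((x - x.min()) / cell).astype(int)
    row = np.rint((y - y.min()) / cell).astype(int)

    # Place the values in a 2-D array; missing old points stay NaN.
    grid = np.full((row.max() + 1, col.max() + 1), np.nan)
    grid[row, col] = values

    # Each new point uses the 4 old points around it, found by slicing.
    w_ll, w_lr, w_ul, w_ur = weights
    new_values = (w_ll * grid[:-1, :-1] + w_lr * grid[:-1, 1:]
                  + w_ul * grid[1:, :-1] + w_ur * grid[1:, 1:])

    # Coordinates of the new points: half a cell off the old grid.
    new_x = x.min() + cell * (np.arange(grid.shape[1] - 1) + 0.5)
    new_y = y.min() + cell * (np.arange(grid.shape[0] - 1) + 0.5)
    return new_x, new_y, new_values
```

Here `new_values[r, c]` belongs to the point at (`new_x[c]`, `new_y[r]`), and any new point with a missing old neighbor comes out as NaN. The work is a single pass over the points plus a few array operations, rather than one full scan of the old grid per new point.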
Answer
Thanks everybody for your help!
I finally found a very non-pythonic way to solve this issue... What was actually taking the most computing time was finding the 4 neighbors of each point. Rather than using the X and Y attributes (either with an arcpy cursor or within another data structure, such as a Python dictionary), I ended up using the ArcGIS tool Generate Near Table. I assume it takes advantage of the spatial indexes; the performance is clearly much better, and I don't have to implement the index myself.
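For readers who want to try the same route, here is a rough sketch of how it might be wired up, not the exact script I used. The helper names (`build_near_table`, `mean_of_neighbors`), the 100 m search radius and the plain mean are assumptions made for illustration; `arcpy` is imported inside the function so the aggregation part can be run without ArcGIS.

```python
from collections import defaultdict

def build_near_table(new_points, old_points, out_table, cell=100.0):
    """Run Generate Near Table and return (IN_FID, NEAR_FID) pairs.

    Requires ArcGIS. The 4 true neighbors lie about 0.71 * cell away and
    the next closest points about 1.58 * cell away, so a search radius of
    one cell (an assumption) keeps only the 4 neighbors.
    """
    import arcpy  # imported lazily; only needed for this step
    arcpy.analysis.GenerateNearTable(
        new_points, old_points, out_table,
        search_radius="{} Meters".format(cell),
        closest="ALL", closest_count=4)
    with arcpy.da.SearchCursor(out_table, ["IN_FID", "NEAR_FID"]) as cursor:
        return [(in_fid, near_fid) for in_fid, near_fid in cursor]

def mean_of_neighbors(pairs, old_values, required=4):
    """Average the old values per new point (equal weights are assumed).

    pairs: iterable of (new point id, old point id) from the near table.
    old_values: dict mapping old point id to its attribute value.
    New points with fewer than `required` neighbors are skipped.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for in_fid, near_fid in pairs:
        sums[in_fid] += old_values[near_fid]
        counts[in_fid] += 1
    return {fid: sums[fid] / counts[fid]
            for fid in sums if counts[fid] == required}
```

The old values can be loaded once into a dictionary keyed by object ID, so the only expensive neighbor search is the one done by the tool itself.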