Sunday, 9 October 2016

arcpy points in polygon check


I have a list of points and a list of polygons both in a separate shape files. I am trying to write a ArcPy script to loop through the points, and then the polygons respectively and tell me when a point is within a polygon. I have tried both the contains and touches methods on polygons but haven't had any luck. Any help would be appreciated. So far I have the following:


    pointFeatureClass = "C:\\Data\\POINTS2.dbf"

pointFieldList = arcpy.ListFields(pointFeatureClass)
for field in pointFieldList:
curField = field.name
arcpy.AddMessage(curField)

ptRows = arcpy.SearchCursor("C:\\Data\\POINTS2.dbf", "", "", "CID;Shape","CID;Shape")
for ptRow in ptRows:
#Now loop through poly's testing for intersect
shpRows = arcpy.SearchCursor("C:\\Data\\counties.shp", "", "", "NAME;Shape","NAME;Shape")
for shpRow in shpRows:

#arcpy.AddMessage("Checking for point "+ str(ptRow.CID) +" in: " + shpRow.NAME)
if shpRow.Shape.touches(ptRow.Shape):
arcpy.AddMessage("Got it: " + shpRow.NAME)

Answer



When using ArcPy geometry methods like contains, the coordinates are taken literally with no reprojection. If your XY coordinates are in UTM and your polygon coordinates are in Albers, ArcPy will not correct for the difference, and as a result the contains method will return False.


From a performance perspective, your current "brute force" approach is fairly inefficient: it is looping through all 17 million points times the number of polygons, and performing a computationally "expensive" geometry operation. The built-in Spatial Join tool probably has some space-partitioning algorithms to improve performance, but I think you are right to worry that a Spatial Join with so many points might hang. There are still some other ways you could improve performance:




  • Add a break statement when the contains method returns true. This will make it so your script doesn't continue iterating through other polygons after it has found one that intersects the point.





  • Add some basic partitioning to your points, where possible. Find the extent of your entire polygon layer (not just individual polygon features) and store the XMin, XMax, YMin, and YMax values. If the point has an X coordinate less than the XMin or greater than the XMax, or a Y coordinate less than the YMin or greater than the YMax, then the point must be outside of the polygon layer's extent and can be skipped before even doing any iterating using a continue statement.




  • Apply the same kind of partitioning described in the above bullet to each shape. The contains method of the ArcPy polygon performs a classic "point-in-polygon" search which involves finding the number of intersections between a line going through the point and the perimeter of the shape. This doesn't use any really advanced trig or calc, but it still takes awhile to run, especially for polygons with lots of vertices. You can avoid this by getting the XMin, XMax, YMin, and YMax values of the Extent property of the Polygon. Just as before, if the point has an X coordinate less than the XMin or greater than the XMax, or a Y coordinate less than the YMin or greater than the YMax, then the point must be outside of the polygon extent and can be skipped using a continue statement.




  • If you are feeling brave, you can leave Arc entirely and try using Shapely or another library with less overhead to do your Point-In-Polygon checking. You will want to avoid oddities and edge cases (like multipart polygon features) but I've found that libraries like Shapely have better performance than ArcPy.





No comments:

Post a Comment

arcpy - Changing output name when exporting data driven pages to JPG?

Is there a way to save the output JPG, changing the output file name to the page name, instead of page number? I mean changing the script fo...