Wednesday, 20 March 2019

arcpy - Python code to compare two fields in different feature layers?


I have two shapefiles which encompass 3897 features each. Corresponding shapefiles (village and watershed that flows into it) have the same ID (ws_id in one, OBJECTID in the other). Many watersheds overlap. Because of errors in the watershed layer, I want to compare the areas of each watershed with its corresponding village (settelment_area for villages and Shape_Area for the watersheds). If the watershed has a smaller area than the village it flows into, it must be deleted and the corresponding village's data added to that of the village into which the next watershed it flows in (data includes urban area and population).


I've tried doing this using cursors and temporary tables, but my python skills are not great and I seem to loose a lot of the related data along the way (ending up with empty lists).


using arcgis desktop 10.2 with advanced license


Code


#the aim of this code is to get rid of erroneous data by creating a new serviceshed_v0 shapefile which would contain only serviceshed which are larger than the settlement they run into.
#it could then be extended to have different minium size requirements for servicesheds


import arcpy
arcpy.CheckOutExtension("spatial")
arcpy.env.overwriteOutput = True

workdir = r'C:\Users\xxx\beneficiarylayers.gdb'
arcpy.env.workspace = workdir
fc1 = workdir + r'\servicesheds_v0'
fc2 = workdir + r'\miyun_settlements'
fc1table = arcpy.CreateTable_management(workdir, 'fc1table')
fc2table = arcpy.CreateTable_management(workdir, 'fc2table')

template = workdir + r'\template'
arcpy.CreateTable_management(workdir, 'servicesheds_v1', template)
newservicesheds = workdir + r'\servicesheds_v1'

rows = arcpy.SearchCursor(fc1)
for row in rows:
arcpy.Append_management('Shape_Area', fc1table)

del row, rows


rows = arcpy.SearchCursor(fc2)
for row in rows:
arcpy.Append_management('settlement_area', fc2table)

del row, rows

rows = arcpy.SearchCursor(fc1table)
for row in rows:
if fc1table > fc2table:
arcpy.Append_management(row, newservicesheds)


del row, rows

that's a first attempt. Problem is I realise tables are not the way to go because the data relating each tuple to a polygon is lost. i'd like to get something like this to work (below) but i'm not sure if it's allowed in python synthax


arcpy.CreateTable_management(workdir, 'servicesheds_v1')
newservicesheds = workdir + r'\servicesheds_v1'

cursor1=arcpy.da.SearchCursor(fc1, "Shape_Area")
cursor2=arcpy.da.SearchCursor(fc2, "settlement_area")


for row in cursor1:
if 'Shape_Area' in cursor1 > 'settlement_area' in cursor2:
arcpy.Append_management(row, newservicesheds)

FINAL CODE here's a method which allows a rapid creation of a list giving all the polygons that fulfil the function of being too small.


import arcpy
arcpy.CheckOutExtension("spatial")
arcpy.env.overwriteOutput = True

workdir = r'C:\Users\xx\beneficiarylayers.gdb'

arcpy.env.workspace = workdir
fc1 = workdir + r'\servicesheds_v0'
fc2 = workdir + r'\miyun_settlements'

#set up cursors
cursor1 = arcpy.da.SearchCursor(fc1, ["ws_id", "Shape_Area"])
cursor2 = arcpy.da.SearchCursor(fc2, ["OBJECTID", "settlement_area"])
wrongsheds = []
#make a dictionary and store values from watershed table
serviceshed_area = {}

for row in cursor1:
serviceshed_areas[row[0]] = row[1]

#loop through other table
for row in cursor2:
if row[1] > serviceshed_areas[row[0]]:
# if serviceshed_areas < village_Areas add to list and print
wrongsheds.append(row[0])
print "serviceshed {} is wrong".format(row[0])

Answer




it looks like there's a wee bit of confusion on what's greater/smaller than what but I'll leave that for you to manipulate. This solution will iterate ONLY ONCE through each table which will not confuse the cursors and will save you 10 million iterations(from loop in a loop iteration). Using a dictionary lookup is so fast it's considered 'free'. So populate a dictionary in one loop and refer to it in another loop (that is NOT inside the first loop). I can't speak for exactness/syntax as I did not execute it.


#set up cursors
cursor1 = arcpy.da.SearchCursor(fc1, ["OBJECTID", "Shape_Area"])
cursor2 = arcpy.da.SearchCursor(fc2, ["ws_id", "settlement_area"])

#make a dictionary and store values from watershed table
watershed_areas = {}
for row in cursor1:
watershed_areas[row[0]] = row[1]


#loop through other table
for row in cursor2:
if row[1] > watershed_areas[row[0]]:
# if watershed_area < village_Area
print "watershed {} is smaller".format(row[0])

No comments:

Post a Comment

arcpy - Changing output name when exporting data driven pages to JPG?

Is there a way to save the output JPG, changing the output file name to the page name, instead of page number? I mean changing the script fo...