The problem: I have a geodatabase with several datasets and many more feature classes within them. The fields within the feature classes have been populated through joins with shapefiles and manual edits. Oftentimes string fields become populated with whitespace (e.g. '', ' ', '  ', etc.) or the string "Null", and numeric fields become populated with a zero (0). I would like to find these records and replace them with a true NULL value. I have the following code, which uses an UpdateCursor, but it is still very slow and doesn't catch all of the NULL-like records. Does anyone know of other ways to accomplish this task?
import os
import arcpy

GDB = arcpy.GetParameterAsText(0)  # input geodatabase
arcpy.env.workspace = GDB
datasetList = arcpy.ListDatasets()  # list datasets
for dataset in datasetList:
    arcpy.env.workspace = os.path.join(GDB, dataset)
    fcList = arcpy.ListFeatureClasses()
    for fc in fcList:
        arcpy.AddMessage("Processing %s..." % fc)
        # count features
        arcpy.MakeTableView_management(fc, "tempTableView")
        count = int(arcpy.GetCount_management("tempTableView").getOutput(0))
        if count > 0:
            fieldList = arcpy.ListFields(fc)
            for field in fieldList:
                arcpy.AddMessage("...%s" % field.name)
                rows = arcpy.UpdateCursor(fc)
                for row in rows:
                    count = 0
                    if row.isNull(field.name):
                        continue  # if already null
                    elif field.type == "Text":
                        value = row.getValue(field.name)
                        if value.lstrip(' ') == '' or value.lower() == 'null':  # looks for whitespace, '', or "Null"
                            row.setNull(field.name)
                            count += 1
                    elif field.type == "ShortInteger" or field.type == "LongInteger" or field.type == "Float" or field.type == "Double":
                        value = row.getValue(field.name)
                        if value == 0:
                            row.setNull(field.name)
                            count += 1
                    if count > 0:  # update row if the record has changed
                        rows.updateRow(row)
                del rows
        else:
            arcpy.AddMessage("...NO RECORDS FOUND.")
Answer
Since I'm most familiar with 10.1+, and cursors in general are a lot better in the da module, here's a potential solution. Currently, you are creating a new cursor each time you change fields, which means you are taking a performance hit there. Furthermore, you are checking the field type for every record instead of using the field types once, up front, to filter the field list. I've also changed how Null values are checked, but I haven't thoroughly tested it against all possible values. For the small sample dataset I had, it worked at 10.2.2.
# Return None if the value needs to be changed, else return the value
def nullify(value):
    x = value
    if value is not None:  # true null fields are read as None types
        if isinstance(value, str):
            if value.strip() == '' or value.lower() == 'null':  # whitespace-only or the string "Null"
                x = None
        else:
            if value == 0:
                x = None
    return x

# We're only interested in some fields
ftypes = ("String", "SmallInteger", "Integer", "Double")
fieldList = [f.name for f in arcpy.ListFields(fc) if f.type in ftypes]
with arcpy.da.UpdateCursor(fc, fieldList) as rows:
    for row in rows:
        nulled = list(map(nullify, row))
        if row != nulled:  # only update if the row actually needs to be changed
            rows.updateRow(nulled)
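Because nullify is pure Python, you can sanity-check the cleanup rules without opening ArcGIS at all. Here's a quick standalone sketch; the sample row values below are made up for illustration, not taken from any real feature class:

```python
# Standalone copy of the nullify helper so the rules can be tested without arcpy.
def nullify(value):
    """Return None for NULL-like values, else return the value unchanged."""
    if value is None:  # true nulls stay null
        return None
    if isinstance(value, str):
        # whitespace-only strings and the literal string "Null" become None
        if value.strip() == '' or value.lower() == 'null':
            return None
    elif value == 0:  # zero in a numeric field becomes None
        return None
    return value

# A fabricated row mixing the problem values with legitimate ones
row = ['Null', '   ', 0, 42, 'Main St']
nulled = [nullify(v) for v in row]
print(nulled)  # [None, None, None, 42, 'Main St']
```

Legitimate values (nonzero numbers, non-empty text) pass through untouched, which is what makes the `row != nulled` comparison in the cursor loop a cheap way to skip rows that don't need an edit.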