Wednesday, 5 October 2016

arcgis desktop - Merge from arcpy.ListTables() producing duplicate rows?



I was trying to manually merge file geodatabase tables, but the rows of the output table came out with Null values, so I decided to write a Python script to merge the tables instead. Here's my script:

import arcpy
from arcpy import env
env.workspace = "G:/US county_Climate Hazard models/NWS_County/SUMCounty_NWS_new_frq.gdb"
arcpy.env.overwriteOutput = True
tableList = arcpy.ListTables()
arcpy.Merge_management(tableList, "C:/Documents/ArcGIS/Default.gdb/Merge_sum_freq")

The code takes some time to run but finishes without errors. However, when I reopen ArcMap and check the output, the table I made contains duplicate entries. Here's what the output looks like:


[Screenshot: output table in which every entry is duplicated]


As you can see, every entry is duplicated. Could someone explain why that is and how I could fix it? Is something wrong with the way I wrote the code?



****EDIT # 1****** I also added:


print tableList

and I got:


[u'CT_09001', u'CT_09003', u'CT_09005', u'Merge_sum_freq']

(I'm not sure why I get this "u" thing)
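
On the "u": that prefix is how Python 2 displays Unicode strings when it prints a whole list; it is not part of the table names. A minimal sketch using the names from the output above, written so that it also runs on Python 3, where the prefix is accepted and has no effect:

```python
# Table names as printed above; the u prefix marks a Unicode string literal.
table_list = [u'CT_09001', u'CT_09003', u'CT_09005', u'Merge_sum_freq']

# Printing the items one by one shows the plain names, without the prefix.
for name in table_list:
    print(name)

# The prefix does not change the text itself.
first_is_plain = (table_list[0] == 'CT_09001')
```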


****EDIT #2***** The original question is now resolved thanks to Michael Miles-Stimson. I accidentally created Merge_sum_freq inside of SUMCounty_NWS_new_frq.gdb
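
One way to guard against this kind of mix-up is to drop any listed table that has the same name as the output before merging. The sketch below is plain Python with no arcpy calls; `exclude_output` is a hypothetical helper, and the list is the one printed in EDIT #1:

```python
import os

def exclude_output(table_names, out_table):
    """Return table_names without any entry named like the output table."""
    # Geodatabase names are compared case-insensitively here.
    out_name = os.path.basename(out_table).lower()
    return [name for name in table_names if name.lower() != out_name]

# Names as returned by arcpy.ListTables() in EDIT #1.
listed = ['CT_09001', 'CT_09003', 'CT_09005', 'Merge_sum_freq']
inputs = exclude_output(listed, "C:/Documents/ArcGIS/Default.gdb/Merge_sum_freq")
print(inputs)
```

Since arcpy.ListTables accepts a wildcard, arcpy.ListTables("CT_*") would be another way to pick up only the county tables, assuming they all share that prefix.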


****EDIT #3 NEW Question ***** I have a performance-related question: every geodatabase table I need to merge has 1 row and 6 fields (including OBJECTID), as you can see in the screenshot I posted. When I run the above script for 10 tables, it takes 10 sec; for 30 tables, 8 sec; for 264 tables, ~2 min. In the end I need to merge >3000 tables (every table has 1 row, so the final table will have >3000 rows). I'm not sure I'm extrapolating this right, but it seems the full run would take >20 min. Is this considered good performance? If not, is there any way I could optimize it?
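
The extrapolation can be checked with a few lines of arithmetic. This is only a rough sketch: it assumes the time per table stays constant as the number of tables grows, which the 10- and 30-table timings do not confirm, and it uses the 264-table run as the only reference point.

```python
# Timing reported above: 264 tables in roughly 2 minutes.
measured_tables = 264
measured_seconds = 2 * 60

# Assumption: the cost per table stays constant as the table count grows.
seconds_per_table = measured_seconds / float(measured_tables)

target_tables = 3000
estimated_minutes = seconds_per_table * target_tables / 60.0
print("Estimated: %.1f minutes" % estimated_minutes)
```

Under that assumption the estimate comes out at about 22.7 minutes, which agrees with the ">20 min" guess.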



Answer




Try this:


import sys

import arcpy
from arcpy import env

# Two traps here for beginners, well spotted!
env.workspace = "G:/US county_Climate Hazard models/NWS_County/SUMCounty_NWS_new_frq.gdb"
env.overwriteOutput = True

outTable = "C:/Documents/ArcGIS/Default.gdb/Merge_sum_freq"

# Clear the way, just to be sure
if arcpy.Exists(outTable):
    try:
        arcpy.Delete_management(outTable)
    except Exception:
        arcpy.AddError("Unable to delete table, may be locked!")
        sys.exit(-1)  # exit here, unable to complete

inTables = arcpy.ListTables()

# Report the tables; check here for duplicates
for Tbl in inTables:
    arcpy.AddMessage("Table : " + Tbl)

arcpy.Merge_management(inTables, outTable)

Have a look at your messages. I used AddMessage and AddError because I don't know whether you're running this from a toolbox or from the command line, and these reporting functions work in both. If any table name shows up more than once, there's a problem with ListTables (unlikely); most likely the script was unable to remove the previous table and has appended to it instead.
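
If the list is too long to check by eye, the duplicate check can be automated. A minimal sketch in plain Python; `find_duplicates` is a hypothetical helper, and the `repeated` list is made up purely to show what a problem case would look like:

```python
from collections import Counter

def find_duplicates(table_names):
    """Return a sorted list of names that occur more than once."""
    counts = Counter(table_names)
    return sorted(name for name, n in counts.items() if n > 1)

clean = ['CT_09001', 'CT_09003', 'CT_09005']
repeated = ['CT_09001', 'CT_09003', 'CT_09001']  # illustrative only

print(find_duplicates(clean))
print(find_duplicates(repeated))
```

In the script above, the same function could be called on inTables just before the merge, with any non-empty result reported through arcpy.AddError.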

