Monday, 9 May 2016

arcgis 10.1 - Increasing speed of Python scripts using arcpy



Update 4/11/2014


It looks like the script was getting hung up on the Delete Features tool, so I switched to Truncate Table, as suggested in the answer below. I also removed the unused variables from the append tool.


Update 4/10/2014


I ran this script on my co-worker's computer (his machine has more memory AND contains ArcGIS 10.0/Python26) and it ran quickly. Hooray! Once my tech support finds the ArcGIS 10.0 CD, I will install and test to see if that improves the speed on my machine. To be clear, we are running the same script, our network drive and database connection are mapped identically, and the print statements are the same. I'll post an update here once that happens.


End updates


I need to increase the speed of some Python scripts that perform updates on an Oracle database. I had these Python scripts running well for a year+, via scheduled tasks and batch files to initiate the scripts. Last week I moved from an XP to a Windows 7 machine and ArcGIS 10.0 --> 10.1. Since then the scripts have become terribly slow. If I run this script using a small feature class (containing ~20 features) it runs in 30 seconds. If I use a medium feature class (~80,000 records) it runs in 15 minutes. The feature class I really need to be able to transfer quickly contains about 1,000,000 records - the script only goes as far as the print statement to check if the files exist (if statement in code below). This process would take only 35 minutes to complete on my XP/ArcGIS 10.0 machine.


Below is the simplified code I have been testing with. Does anyone have suggestions on what I can do to increase the speed? Thanks, Patty


import arcpy, os, sys
from arcpy import env
arcpy.env.overwriteOutput = True

from datetime import datetime
import smtplib
import string
import urllib

#Define variables
inWorkspace = "O:/LANDING_PAD/BOE/example.gdb"
lpFeatures = inWorkspace + os.sep + "fc1"
outWorkspace = "Database Connections\\THIS.sde"
arcpy.env.workspace = outWorkspace

workspace = ""
copyFC = outWorkspace + os.sep + "SDE.fc1_1" #The feature class the script will update via delete and append
schema_type = "NO_TEST"
fieldMappings = ""
subtype = ""
t = datetime.now()
print "This script began at: " + str(t)

if arcpy.Exists(lpFeatures) is True and arcpy.Exists(copyFC) is True:
print "Both files exist. Beginning delete..."

arcpy.DeleteFeatures_management(copyFC) #(copyFC)

print "ALL DONE DELETING!"

arcpy.Append_management(lpFeatures, copyFC, schema_type, fieldMappings, subtype) #Append data from landing pad to db
print "ALL DONE APPENDING!"
record_count = arcpy.GetCount_management(lpFeatures)
print record_count
r = datetime.now()
print "This script ended at: " + str(r)


Answer



I have wanted to comment first, but then it seemed more appropriate to wrap it to be an answer (even though it might be incomplete).


I've run your code on my machine (top hardware laptop with SSD) appending a file geodatabase feature class to a SQL Server geodatabase feature class on the same machine which took me around 13 min. I cannot tell you for sure why the speed of execution differs so much in your environment (10.0 >> 10.1), but you have asked for suggestions on what you can do to increase the speed, so here are some ideas which might increase the speed of running the script.


1) I run the script from the command line which is equivalent to running a .bat file (I run the script in the 64-bit flavor, I have ArcGISx6410.2 64bit Python installed).


c:\Python27\ArcGISx6410.2\python.exe c:\scripts\appendfc.py

From my experience, it is generally faster to run 64bit version of Python for executing long and heavy GP operations like Append. So you want to make sure you run this version of Python when executing the script.


2) I wouldn't recommend using arcpy.DeleteFeatures_management; it is much slower than running Truncate Table since the latter one doesn't use database transactions which improves performance over row-by-row deletion.


You mentioned that the script only goes as far as the print statement to check if the files exist (if statement in code). There is a good chance that it keeps deleting row by row which might be a very slow process when you accessing a table in a remote Oracle (or really any DBMS) database. Try running the script with Truncate Table, but with no Append first just to see the performance difference on the deletion stage.


3) You seem to be using "Database Connections\\THIS.sde" in the code. It is better however to refer to the connection file itself (.sde file) with the file system or UNC path, not the Catalog window "Database Connections" folder. You can access the .sde file created at C:\Users\%user%\AppData\Roaming\ESRI\Desktop10.1\ArcCatalog. You can move this .sde file around as you need and place in the folder the Python script will have access to.



4) In the arcpy.Append_management function, you use a couple of empty parameters. In theory, it shouldn't make any difference, but I would suggest running the function without specifying those parameters just because you don't need them. You never know what's going on behind the scenes and whether those empty strings are evaluated at some point and whether this can impact the performance. Just go with arcpy.Append_management(lpFeatures, copyFC, schema_type) and don't specify parameters for which you don't supply any values.


5) I discourage using os.sep when building a path to a feature class. Use os.path.join(geodatabase,featureclassname) for that instead. It is just cleaner and more readable.


You can add more details to the question after you've tried the things above and had some tests and code review.


A couple of good questions to read to get more insight on how to speed up the Python scripts in ArcGIS:


Performance of ArcGISScripting and large spatial data sets


Background Geoprocessing (64-bit)


Arcgis CopyFeatures tool extremely slow when exporting to SDE


Ways to Speed Up Python Scripts Running As ArcGIS Tools


Geoprocessing considerations for ArcSDE data


No comments:

Post a Comment

arcpy - Changing output name when exporting data driven pages to JPG?

Is there a way to save the output JPG, changing the output file name to the page name, instead of page number? I mean changing the script fo...