Saturday 30 May 2015

Clip and copy files to same directory structure using OGR/GDAL with Python


We need to clip all our data (many TB) by a new boundary and save it while maintaining the same data structure. I want to use OGR/GDAL as the data is in shp/tab/kml/geoTif/ecw etc.


I can find the file and create the data structure in python but need to use ogr/gdal to then clip the file and save it to the new directory in the same projection and file type as the input. I have written the process to do this using arcpy but that doesn't work for tab and ecw, so I would prefer to go opensource but am new to this...


So the python is as follows --


import os, glob, sys, shutil, datetime, subprocess
top = r'C:\Program Files (x86)\MapInfo\Offline Dataset\MapInfo_Tables\Property Information' # Source directory for all files to check and process
#top = os.getcwd()
RootOutput = r'C:\junk' # Directory to copy/move files to
RootDirDelete='C:\\' #repatative name of root file path to remove from new directory name

clip_polygon='c:\junk\boundary.shp'
filecount=0
successcount=0
errorcount=0

print "Working in: "+ top

list =[]
f = open(RootOutput+'\\Success_LOG.txt', 'a')
f.write("Log of files Succesfully processed. RESULT of process run @:"+str(datetime.datetime.now())+"\n")

f.close()


for root, dirs, files in os.walk(top, topdown=False):
for directory in dirs:
currentPath=os.path.join(root,directory)
os.chdir(currentPath)
#arcpy.env.workspace = currentPath
print os.getcwd()
lstFCs = glob.glob('*.*')

print lstFCs
for fc in lstFCs:
OutPutDir=RootOutput+"\\"+str(os.getcwd()).strip(RootDirDelete)
filecount=filecount+1
if OutPutDir.endswith(".gdb"):
print "File in GDB...SKIPPING"
else:
try:
os.makedirs(OutPutDir)
print "Created: "+OutPutDir

except:
#pass
print "Unexpected error:", sys.exc_info()[0]

## try:
fcIn=currentPath+"\\"+fc
fcOut=OutPutDir+"\\"+fc
ext=fc[fc.find('.')+1:]
if ext=='shp':
subprocess.call(["ogr2ogr", "-f", "ESRI Shapefile", "-clipsrc", clip_polygon, fcOut, fcIn], shell=True)

successcount=successcount+1
print "Data clipped: "+fcIn+" to " +fcOut
else:
print 'extension not shp'



I need




  • ogr clip function to clip data by clip_polygon for vector, raster and elevation



    -in arcpy it is like


    #arcpy.Clip_analysis(outPath+'\'+outFileN+'.shp', clip_features, outPath+'\'+outFileN+AOI+'.shp', xy_tolerance)


    -in ogr2ogr as a process I guess the following would work (Error when using gdal/ogr and python to clip a "shp" file by a "shp")


    #subprocess.call(["ogr2ogr", "-f", "ESRI Shapefile", "-clipsrc", clipping_shp, output_shp, input_shp], shell=True)




  • projection should be same as input (usually EPSG 28355 or 4283)




UPDATE: I ran the above process and it gives the following errors, the subprocess when run independently reports a 1 but no file is written...



[Dbg]>>> fcIn
'C:\\Program Files (x86)\\MapInfo\\Offline Dataset\\MapInfo_Tables\\Property Information\\shp\\TRC_DCDB.shp'
[Dbg]>>> fcOut
'C:\\junk\\Program Files (x86)\\MapInfo\\Offline Dataset\\MapInfo_Tables\\Property Information\\shp\\TRC_DCDB.shp'
[Dbg]>>> clip_polygon
'c:\\junk\\boundary.shp'
[Dbg]>>> subprocess.call(["ogr2ogr", "-f", "ESRI Shapefile", "-clipsrc", clip_polygon, fcOut, fcIn], shell=True)
1
[Dbg]>>>
[Dbg]>>> subprocess.check_call(['gdalinfo', fcIn], shell=True)

Traceback (most recent call last):
File "", line 1, in
File "C:\Python27\ArcGIS10.1\lib\subprocess.py", line 511, in check_call
raise CalledProcessError(retcode, cmd)
CalledProcessError: Command '['gdalinfo', 'C:\\Program Files (x86)\\MapInfo\\Offline Dataset\\MapInfo_Tables\\Property Information\\shp\\TRC_DCDB.shp']' returned non-zero exit status 1
[Dbg]>>>

UPDATED


        if ext=='shp':
subprocess.call(["ogr2ogr", "-f", "ESRI Shapefile", "-clipsrc", clip_polygon, fcOut, fcIn])


gives me the following error.


>>> 
Working in: C:\Program Files (x86)\MapInfo\Offline Dataset\MapInfo_Tables\Property Information
C:\Program Files (x86)\MapInfo\Offline Dataset\MapInfo_Tables\Property Information\shp
['TRC_DCDB.dbf', 'TRC_DCDB.prj', 'TRC_DCDB.sbn', 'TRC_DCDB.sbx', 'TRC_DCDB.shp', 'TRC_DCDB.shx', 'TRC_DCDB_Raw.dbf', 'TRC_DCDB_Raw.prj', 'TRC_DCDB_Raw.sbn', 'TRC_DCDB_Raw.sbx', 'TRC_DCDB_Raw.shp', 'TRC_DCDB_Raw.shx']
Unexpected error:
extension not shp
Unexpected error:
extension not shp

Unexpected error:
extension not shp
Unexpected error:
extension not shp
Unexpected error:
Traceback (most recent call last):
File "C:\junk\Clip_And_Copy_Files.py", line 45, in
subprocess.call(["ogr2ogr", "-f", "ESRI Shapefile", "-clipsrc", clip_polygon, fcOut, fcIn])
File "C:\Python27\ArcGIS10.1\lib\subprocess.py", line 493, in call
return Popen(*popenargs, **kwargs).wait()

File "C:\Python27\ArcGIS10.1\lib\subprocess.py", line 679, in __init__
errread, errwrite)
File "C:\Python27\ArcGIS10.1\lib\subprocess.py", line 893, in _execute_child
startupinfo)
WindowsError: [Error 2] The system cannot find the file specified
>>>

From FWTools ogr2ogr in Python


I tried


>>> command = 'ogr2ogr -f "ESRI Shapefile" ' + fcIn + fcOut

>>> os.system

>>> os.system(command)
1
>>> os.getcwd()
'C:\\junk'
>>> fcOut
'C:\\junk\\Program Files (x86)\\MapInfo\\Offline Dataset\\MapInfo_Tables\\Property Information\\shp\\TRC_DCDB.shp'
>>> fcIn
'C:\\Program Files (x86)\\MapInfo\\Offline Dataset\\MapInfo_Tables\\Property Information\\shp\\TRC_DCDB.shp'

>>>

but again I get no output..




No comments:

Post a Comment

arcpy - Changing output name when exporting data driven pages to JPG?

Is there a way to save the output JPG, changing the output file name to the page name, instead of page number? I mean changing the script fo...