Saturday 25 January 2020

Reprojecting coordinates without building geometry object in ArcPy?


Is it possible with arcpy to reproject single x,y coordinates without building a complete geometry object?


For instance, something like this:


input = [[-8081441,5685214], [-8081446,5685216], [-8081442,5685219], [-8081440,5685211], [-8081441,5685214]]
output = [arcpy.A_Reproj_Fn((coord[0], coord[1]), source_SR, dest_SR) for coord in input]


I have huge arrays of coordinates to reproject, but for performance issue I do not want to build a geometry object before reprojecting.



Answer



There are some options available. Here is a very simple benchmarking. I am on a decent i7 machine with 16GB RAM and SSD disk, Windows 10, Python 2.7 (the built-in Python in ArcGIS 10.4.1).


Ideally you would be using pyproj from Anaconda (it is easy to connect Anaconda with arcpy, check this guide). Excellent in terms of performance: 8secs for transforming 1mln XY pairs.



import time
from functools import wraps
import numpy as np
from pyproj import Proj, transform


src_sr = Proj(init='epsg:4326')
dest_sr = Proj(init='epsg:3857')

def timethis(func):
'''
Decorator that reports the execution time.
'''
@wraps(func)
def wrapper(*args, **kwargs):
start = time.time()

result = func(*args, **kwargs)
end = time.time()
print(func.__name__, round(end-start))
return result
return wrapper

#----------------------------------------------------------------------
@timethis
def prepare_data(size):
"""create numpy array with coords and fields"""

ys = np.random.uniform(0,89,size).tolist()
xs = np.random.uniform(-179,179,size).tolist()
return zip(xs,ys)

@timethis
def pyproj(data):
res = [list(transform(inProj,outProj,pair[0],pair[1])) for pair in data]
return res

data = prepare_data(1000000)

print data[0:2]
print pyproj(data)[0:2]

The output:


('prepare_data', 0.0)
[(-68.23467501281871, 15.374738820652137), (3.4997809786621303, 22.93714083606868)]
('pyproj', 8.0)
[[-7595849.276871487, 1732425.641861154], [389593.8364326526, 2624418.652993309]]



Next in terms of performance is using arcpy.da.NumPyArrayToFeatureClass and then back with arcpy.da.FeatureClassToNumPyArray. The worst one is arcpy.PointGeometry:


import arcpy
import numpy as np
import time
from functools import wraps

def timethis(func):
'''
Decorator that reports the execution time.
'''

@wraps(func)
def wrapper(*args, **kwargs):
start = time.time()
result = func(*args, **kwargs)
end = time.time()
print(func.__name__, round(end-start))
return result
return wrapper

temp_fc = r'in_memory\temp_fc'


src_sr = arcpy.SpatialReference(4326)
dest_sr = arcpy.SpatialReference(3857)

#----------------------------------------------------------------------
@timethis
def prepare_data(size):
"""create numpy array with coords and fields"""
ys = np.random.uniform(0,89,size).tolist()
xs = np.random.uniform(-179,179,size).tolist()

if arcpy.Exists(temp_fc):
arcpy.Delete_management(temp_fc)
return zip(xs,ys)

#----------------------------------------------------------------------
@timethis
def reproject_numpy(data):
"""reproject coords with arcpy.da module"""
arr = np.array(data,np.dtype([('x', np.float64), ('y', np.float64)]))
arcpy.da.NumPyArrayToFeatureClass(arr,temp_fc,['x', 'y'],src_sr)

res = arcpy.da.FeatureClassToNumPyArray(temp_fc,'SHAPE@XY',spatial_reference=dest_sr)
return [list(i[0]) for i in res]

#----------------------------------------------------------------------
@timethis
def reproject_geometry_objects(data):
"""reproject coords with arcpy.PointGeometry.projectAs()"""
geometries = (arcpy.PointGeometry(arcpy.Point(pair[0],pair[1]),src_sr).projectAs(dest_sr) for pair in data)
res = [[geom.firstPoint.X,geom.firstPoint.Y] for geom in geometries]
return res


sizes = [10**i for i in xrange(8)]
for size in sizes:
print
print '{size}'.format(size=size).center(50,'-')
data = prepare_data(size)
print data[0:2]
print reproject_numpy(data)[0:2]
print reproject_geometry_objects(data)[0:2]


The output:


------------------------1-------------------------
('prepare_data', 2.0)
[(-167.4990262605815, 20.72648057680592)]
('reproject_numpy', 0.0)
[[-18645906.311743677, 2359294.0419395217]]
('reproject_geometry_objects', 0.0)
[[-18645906.311697092, 2359294.0419164128]]

------------------------10------------------------

('prepare_data', 2.0)
[(-84.10710438738408, 54.32017524422917), (149.53056786201995, 60.5826412111139)]
('reproject_numpy', 0.0)
[[-9362760.0324575398, 7231028.3662902983], [16645666.672426889, 8530614.7980484683]]
('reproject_geometry_objects', 0.0)
[[-9362760.032500302, 7231028.3663340295], [16645666.672429098, 8530614.79807427]]

-----------------------100------------------------
('prepare_data', 0.0)
[(99.45852992001386, 10.777413690256289), (-128.66525647722594, 22.3855564491687)]

('reproject_numpy', 0.0)
[[11071672.905741969, 1206874.3036907075], [-14322950.83380558, 2557879.2814767426]]
('reproject_geometry_objects', 0.0)
[[11071672.905743506, 1206874.3037197446], [-14322950.833830737, 2557879.281497049]]

-----------------------1000-----------------------
('prepare_data', 0.0)
[(-173.27374656367525, 8.223262860596597), (-139.61519355591957, 10.62062833548439)]
('reproject_numpy', 0.0)
[[-19288745.235347211, 918568.44701982441], [-15541892.253658244, 1189112.2559826041]]

('reproject_geometry_objects', 1.0)
[[-19288745.235311065, 918568.4469744475], [-15541892.253649296, 1189112.25603746]]

----------------------10000-----------------------
('prepare_data', 0.0)
[(11.194744968264729, 7.877971732593098), (54.58669010557381, 7.112696412558614)]
('reproject_numpy', 0.0)
[[1246193.3093983289, 879748.16972375475], [6076562.5466901502, 793823.26923564379]]
('reproject_geometry_objects', 9.0)
[[1246193.309427791, 879748.1696780229], [6076562.546642701, 793823.2691861242]]


----------------------100000----------------------
('prepare_data', 0.0)
[(-119.9078743699582, 9.317778132275732), (-46.418166430278006, 4.464482461184021)]
('reproject_numpy', 1.0)
[[-13348083.516972212, 1041852.8393190451], [-5167246.6505450243, 497487.58635374776]]
('reproject_geometry_objects', 90.0)
[[-13348083.516967565, 1041852.8393501436], [-5167246.650575973, 497487.5863742894]]

---------------------1000000----------------------

('prepare_data', 0.0)
[(-69.46181556994199, 5.285715860826253), (-150.64679833442418, 17.05875285677861)]
('reproject_numpy', 7.0)
[[-7732453.938828676, 589239.59320861078], [-16769924.88017785, 1927665.2915218703]]

---------------------10000000---------------------
('prepare_data', 5.0)
[(110.4244111629609, 30.50859019906087), (-89.59380469024654, 30.291453068724223)]
('reproject_numpy', 84.0)
[[12292389.221812241, 3569093.3287834972], [-9973536.7163228001, 3541068.701018624]]


1mln XY pairs can be transformed in ~7secs with the numpy array conversion method. 10mln XY pairs can be transformed in ~85secs with the numpy array conversion method.


Not too bad! Pretty much the same as using pyproj. Unless you can't use pyproj, stick to this one.


No comments:

Post a Comment

arcpy - Changing output name when exporting data driven pages to JPG?

Is there a way to save the output JPG, changing the output file name to the page name, instead of page number? I mean changing the script fo...