Friday, 23 October 2015

python - Querying Thousands of Points with ST_Value()?


Currently I have an application using the psychopg2 Python library to query a database for elevation data. My current python implementation looks like the following:


   def GetElevation(lat, lon, cur):
point = "'SRID=4326;POINT({} {})'::geometry".format(lat,lon)
cur.execute("SELECT ST_Value(rast, {}) FROM dted0 WHERE ST_Intersects(rast, {});".format(point, point))
return cur.fetchone()[0]


This works, but I am was curious if I could pass in an array of latitutdes and an array of longitudes. I tried building a "point string" with several thousand queries, but I get an error saying that I can only pass 100 arguments to the function.


Is there a way to pass all the points I need in a single query?


The reason I am doing this is because I assumed doing only one transaction with the database would be faster rather than looping for each query.


-------------------EDIT 1----------------------


In an attempt to implement the top rated post, I've done the following:


import json
json_arr = []
for lat, lon in zip(lats, lons):
json_arr.append({'lat:': lat, 'lon':lon})

json_str = json.dumps(json_arr)

Next I defined a function per his advice:


def GetElevationsBatch(json_elevations, cur):
cur.execute(
"""SELECT ST_Value(rast,z.point)
FROM dted0
JOIN (
SELECT ST_SetSRID(ST_MakePoint(lat, lon), 4326) as point
FROM json_to_recordset(%s) AS z(lon double precision, lat double precision)

) AS z
ON ST_Intersects(rast,z.point)""", (json_elevations,))
return cur.fetchall()

However, when I call the function I get I don't get good things:


a = GetElevationsBatch(json_str, cur)
print("a = ", a)
# Result: a = None

-----------------------EDIT 2--------------------------



I've left the python plugin out for now until I can find the best query, so here is my latest attempt:


SELECT (ST_Dump(gv.geom)).geom, gv.val
FROM srtm , LATERAL ST_Intersection(rast, 'SRID=4326;MULTIPOINT(-111.305048568 38.0601633931,-111.822991286 38.6025320796,-111.136977796 38.3631992596,-111.206470006 38.971228396)'::geometry) AS gv
WHERE ST_Intersects(rast, 'SRID=4326;MULTIPOINT(-111.305048568 38.0601633931,-111.822991286 38.6025320796,-111.136977796 38.3631992596,-111.206470006 38.971228396)'::geometry);

In the above query, I am attempting to use the notion of MULTIPOINT geometries. However, I found that these calls "work", but they are much slower than my original method of simply querying for every point. I understand that I am not running on any amazing hardware, but should calls really take on the order of seconds for a simple elevation query? Seems to me there is something awry here. It is taking nearly 3 seconds to retrieve 4 points with the above code. To compare to my original method, it only takes about 100ms to retrieve 4 points.


Shouldn't a solution in which I only query the database once be quick than one where I have to query several times?



Answer



Okay I thought of another reason why my original answer might be slow. If the bounding box of your point covers enough area, it would produce a lot of rasters that require checking the Slow way.


So here is another answer, still using multi-point but doing your original single check approach:



SELECT dp.geom, ST_Value(dted0.rast, geom) AS val
FROM ST_Dump(your_multi_point_here) AS dp
JOIN dted0 ON ST_Intersects(dted0.rast,dp.geom) ;

No comments:

Post a Comment

arcpy - Changing output name when exporting data driven pages to JPG?

Is there a way to save the output JPG, changing the output file name to the page name, instead of page number? I mean changing the script fo...