I have a PostGIS query that will return several million rows:
SELECT
    t1.id AS id1,
    t2.id AS id2,
    ABS(t1.mean_h - t2.mean_h) AS h_diff,
    ST_Distance(t1.the_geom, t2.the_geom) AS dist
FROM tas_ponds AS t1, tas_ponds AS t2
WHERE
    t1.gid > t2.gid AND
    ST_DWithin(t1.the_geom, t2.the_geom, 17000)
When run in psql, I get an "out of memory for query result" error.
Googling suggests that this is an error within psql rather than in Postgres/PostGIS. Would amending the query into the form SELECT ... INTO x FROM ... fix the problem? Are there any other recommended approaches for dealing with very large datasets?
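For reference, the SELECT ... INTO form being asked about would look something like the sketch below (the target table name pond_pairs is made up for illustration). It materializes the rows into a new server-side table instead of streaming them to the client, so the client never has to buffer the result:

-- Sketch of the SELECT ... INTO form from the question.
-- pond_pairs is a hypothetical target table created by this statement.
SELECT
    t1.id AS id1,
    t2.id AS id2,
    ABS(t1.mean_h - t2.mean_h) AS h_diff,
    ST_Distance(t1.the_geom, t2.the_geom) AS dist
INTO pond_pairs
FROM tas_ponds AS t1, tas_ponds AS t2
WHERE
    t1.gid > t2.gid AND
    ST_DWithin(t1.the_geom, t2.the_geom, 17000);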
Answer
Some poking around confirms this is a Postgres client problem, independent of spatial or server considerations: psql buffers the entire result set in memory before displaying it, and a multi-million-row result exceeds that buffer.
The recommended approach is to use DECLARE / FETCH to access the data in blocks smaller than the total result set. You could also create a view containing components of the query (e.g. the distance) to cut down on the memory needed for the query operation itself.
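A minimal sketch of the cursor approach, reusing the query from the question (the cursor name pond_pairs_cur is arbitrary). A cursor declared without WITH HOLD must live inside a transaction, so the exchange is wrapped in BEGIN / COMMIT:

BEGIN;
-- Declare a server-side cursor over the full query.
DECLARE pond_pairs_cur CURSOR FOR
    SELECT
        t1.id AS id1,
        t2.id AS id2,
        ABS(t1.mean_h - t2.mean_h) AS h_diff,
        ST_Distance(t1.the_geom, t2.the_geom) AS dist
    FROM tas_ponds AS t1, tas_ponds AS t2
    WHERE
        t1.gid > t2.gid AND
        ST_DWithin(t1.the_geom, t2.the_geom, 17000);
-- Pull the results in manageable batches; repeat until no rows remain.
FETCH 10000 FROM pond_pairs_cur;
CLOSE pond_pairs_cur;
COMMIT;

In psql you can also get this behaviour without writing the cursor by hand: \set FETCH_COUNT 10000 tells psql to retrieve results in batches of that size rather than buffering the whole result set.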