I have a table of about 150,000 points in PostGIS with a spatial index and a SRID of 27700 (OS). I want to select the points that fall outside England and Wales. I have a multipolygon table with just one record in it (a dissolved polygon of England and Wales).
I would have thought that the following query would select this quite quickly, but it was still running after about 5 hours (I actually forgot I'd started it and was off doing something else). I've tried a few variations using ST_Within to do it the other way around, but I can't get a result in a reasonable time (I have yet to wait for the query to finish running). Considering the points were generating in pgRouting in about 5 mins, I'm surprised it takes this long to do any analysis of them.
Admittedly I'm using an old PC but still with 2GB of ram I would have thought this was a relatively simple query. Is it likely that it should take this long or is something likely to be wrong. The query I've used is below:
SELECT pk
FROM catchment_distance_output, england_wales_os
WHERE ST_Disjoint (catchment_distance_output.geometry, england_wales_os.geometry);
I imagine I'm doing something obvious, but I'm very new to this and learning as I go.
Answer
Ironically, the fastest way to find the set of things not within other things is to do a full join that finds the contained things, but using a LEFT JOIN, so the un-matched things are hanging about to be found, thus:
SELECT pts.*
FROM pts LEFT JOIN polys
ON ST_Contains(polys.geom, pts.geom)
WHERE polys.id IS NULL;
The un-matched rows in a left join are returned as NULL so doing an IS NULL test on a column that you know is declared NOT NULL finds you all the un-matched rows.
No comments:
Post a Comment