I have a large amount of locations that need to be inserted to a PostGIS database. However, I'd like to aggregate these points so that if there's already a location, say nearer than a kilometer, then to discard that point. I'd like a pointer or two on the appropriate query (or, if it can't be done in a reasonable way, then flame me...:)
I reckon it has to be done a bit like:
INSERT INTO locations VALUES (ST_GeographyFromText('POINT(54.55 26.33)'))
IF ST_Distance(ST_GeographyFromText('POINT(54.55 26.33)'), ) > 1000;
As most of the audience has probably understood by now, I'm not that bright when it comes to databases so don't be too harsh on me...
Answer
You can use ST_Distance and convert to geography
to test if there exists at least one location less than 1000 meters away:
SELECT
COUNT(*) = 0 AS should_insert
FROM
locations,
(
VALUES( SetSRID(MakePoint(54.55, 26.33), 4326) )
) AS new_value
WHERE
center IS NOT NULL
AND
ST_Distance( column1::geography, center::geography ) < 1000
LIMIT
1
;
where 54.55, 26.33
is the value you want to test against locations.center
in that example (column1
is the name automatically assigned by the usage of VALUES()
).
Then depending on the boolean result, you can decide to insert or not.
Otherwise, a more efficient method could be to insert all locations to a temporary
(or even better unlogged
if you have PostgreSQL 9.1) table, then let PostGIS cluster into cells, and use the result of this clustering to insert:
INSERT INTO locations (id, center)
SELECT
ids[1] AS id,
centers[1] AS center
FROM
(
SELECT
array_agg(id) AS ids,
array_agg(center) AS centers
FROM temporary_table
GROUP BY ST_SnapToGrid( ST_Transform(ST_SetSRID(center, 4326), 2163), 1000, 1000 )
) AS grouped
;
(this example works only with North American positions because of SRID 2163
).
No comments:
Post a Comment