At the momenent I am working on a query, as described in an earlier question. I have the two tables
- testme (tracked GPS-profile with Point Geometry) and
- roads (geometry roads shapefile)
Besides the distance of each tracked point (row) in table pt in want to find the closest point using a combination of ST_Distance and ST_DWithin.
DROP TABLE IF EXISTS raw_2015_processed;
EXPLAIN ANALYZE
CREATE TABLE raw_2015_processed AS
SELECT pt.id,
pt."DeliveryID",
pt."VehicleID",
pt."TrackID",
pt."Longitude",
pt."Latitude",
pt."Altitude",
pt."Heading",
pt."Speed",
pt."Satelites",
pt."HDOP",
pt."VDOP",
pt."Xfcd",
pt.ts,
pt.received,
pt.the_geom,
(SELECT ST_ClosestPoint(line.geom,pt.the_geom) AS closest_geom
FROM roads AS line
WHERE ST_DWithin(line.geom,pt.the_geom, 0.5) LIMIT 1),
(SELECT ST_Distance(line.geom,pt.the_geom) AS distance
FROM roads AS line
ORDER BY pt.the_geom <#> line.geom LIMIT 1),
(SELECT ST_AsText(ST_ClosestPoint(line.geom, pt.the_geom)) AS closest_coordinates
FROM roads AS line
ORDER BY pt.the_geom <#> line.geom LIMIT 1)
FROM raw_2015 AS pt
ORDER by pt.id;`
Working with a reduced file 'Testme' of 144 rows EXPLAIN ANALYZE returns following Query Plan:
"Sort (cost=84207.04..84207.40 rows=144 width=131) (actual time=11797.677..11797.685 rows=144 loops=1)"
" Sort Key: pt.id"
" Sort Method: quicksort Memory: 54kB"
" -> Seq Scan on testme pt (cost=0.00..84201.87 rows=144 width=131) (actual time=82.458..11797.268 rows=144 loops=1)"
" SubPlan 1"
" -> Limit (cost=9.46..578.16 rows=1 width=155) (actual time=81.508..81.508 rows=1 loops=144)"
" -> Bitmap Heap Scan on roads line (cost=9.46..578.16 rows=1 width=155) (actual time=81.388..81.388 rows=1 loops=144)"
" Recheck Cond: (geom && st_expand(pt.the_geom, 0.5::double precision))"
" Rows Removed by Index Recheck: 0"
" Filter: ((pt.the_geom && st_expand(geom, 0.5::double precision)) AND _st_dwithin(geom, pt.the_geom, 0.5::double precision))"
" -> Bitmap Index Scan on geom_index_roads (cost=0.00..9.46 rows=139 width=0) (actual time=79.722..79.722 rows=450402 loops=144)"
" Index Cond: (geom && st_expand(pt.the_geom, 0.5::double precision))"
" SubPlan 2"
However, when I run the query for a larger dataset with more data points (< 5 Million points) , it gets very slow (i.e. several hours / days). Do you guys see a way to increase the speed of the query? Is there an an alternative option to st_dwithin or a different query structure which can proof to be helpful?
No comments:
Post a Comment