I have a dataset osmMajor containing road data (from OSM; 670,780 features) and another dataset points containing some points of interest (500,000 points). For each feature in my point layer, I want to find the coordinate of the closest point located on the road network.
I have two options to do this: either ArcMap's near tool (Analysis tools/Proximity/Near), or Spatialite. I tried using ArcMap, but it seems a very long task to perform (I had to stop the process after 3 hours). For this reason, I was thinking about using Spatialite to do this, but it seems that the process is less straightforward than what I thought.
How should I write my Spatialite query to get what I want?
For Spatialite, I think that I should follow these steps:
- For each point feature, calculate the distance between the point and the closest road
- Around each point, create a buffer using the distance to the closest feature
- Get the intersection between the buffer and the road data (it should be a point)
- Get the coordinates (x,y) of the resulting point
Trying to implement these 4 steps on a sample dataset with only 10 points, and I encountered several issues.
For point (1.), I can find the overall smallest distance between the points and the roads, but I cannot have it for each point:
SELECT test.ID, min(Distance(test.Geometry, osmMajor.Geometry)) AS distance
FROM test, osmMajor
WHERE osmMajor.ROWID IN (
SELECT ROWID
FROM SpatialIndex
WHERE f_table_name = 'osmMajor'
AND search_frame = test.Geometry)
GROUP BY test.ID
I get a distance for only three points instead of the 10 that are in the test table. I suspect it has something to do with the way the spatial index works, because the query only returns the results for points 2, 3 and 4, which are the closest to the road network.
Can anyone give me advice on the best way to get the coordinate of the point on a feature which is closest to another point?
Answer
The logic of the Spatialite process will be:
1) Find the road that is closest to the reference point with
select 1.ID,2.ID,Min(Distance(1.geometry,2.geometry);
2) Find the closest point of the closest road that was just identified
select ClosestPoint(1.geometry,2.geometry) from 1,2 where 1.ID=x and 2.ID=y;
3) Tune the subquery to make the search from the spatial b-tree index of Spatialite so that it does not drop correct hits from the resultset but it still gives speed-up for the query.
Edit: Test procedure and results to prove that it is possible
It is often better to do the job step-by-step by using temporary tables. I made a line layer with 1602 features and point layer with 294 points and made a simple test.
First step is to create a table that select the closest road from each point and saves the result into a new table "mindist". Point IDs and line IDs are saved into the table for the next step. This is the slow part and spatial index would make the query faster.
create table mindist as
select "p"."ogc_fid" pid,
"l"."ogc_fid" lid,
min(Distance("p"."geometry","l"."geometry")) as distance
from "poi" p, "lin" l
group by pid
order by pid;
The next step is to go through the temporary table and select for each row the closest point from the geometry of the road to the point geometry. Points and lines are found with the IDs and because they are primary keys in the main tables the query will be fast.
create table closest_point as
select m.rowid as pid, lid,
closestpoint(l.geometry,p.geometry)as geometry
from mindist m, lin l, poi p
where m.pid=p.ogc_fid
and m.lid=l.ogc_fid
group by m.rowid;
This image shows the results. Lines are the original line layer (motorways), red dots are the original points (villages) and stars show the closest point on a motorway from the villages. Results look visually good to me.
No comments:
Post a Comment