Friday 24 July 2015

postgis - Increase spatial query performance (st_difference)


I have two link tables


one is 9029 rows (and let's say this table is A) and the other is 11698 rows (and this table is B)


the both tables are road link and A is for tunnel link and B is road link


From table road link (table B), I need to erase segment where tunnels are (Table A)


Basically, it is Table B - Table A.


This is my spatial query though, it took me so much time and I do not know why.. It seems like other GIS specialists also experiences that st_difference takes quite lots of time.


CREATE TABLE rd_tunnel_erased AS
SELECT b.objectid, b.roadclass, b.linkclass, b.linktype, st_difference (b.shape, a.shape) as shape
FROM b, a


As I said b is road link and a is tunnel link (and the two table does have same schema but difference geometry)


Here are two screenshots that might help you


this is overlayed a and b table enter image description here


from b table, I erased (differenced) a table. enter image description here


With only 1 link, the st_difference worked fine and fast.


Somehow, when it goes over my real data, it takes more than 3~4 days (in fact, I do not know how many days and hours it took.. because it failed for trivial mistakes. I am afraid that I might have to waste my time for re-running this query and ended up getting nothing.)


All my tables are spatially indexed.



Answer



If I get this right, you want a new table with the 'difference' of both networks. To achieve that, I´d go for a two step operation:




  • create a copy of the roads table as your later result table

  • update its geometries with the difference.


This will make sure all other geometries (and attributes) will be there, as with the query you intended (provided the proper use of a JOIN), the resulting table would only hold those roads that were actually split!


Note that you will most likely get Multi geometries in return, thus the result table has to be prepared to accept them. Using ST_Multi on the geometries will force the column type to be MULTILINESTRING.

Create the copy:


CREATE TABLE rd_tunnel_erased AS
SELECT objectid,
roadclass,
linkclass,

linktype,
ST_Multi(shape) AS shape
FROM b;

CREATE INDEX rd_tunnel_erased_shape_idx
ON rd_tunnel_erased
USING GIST (shape);

Update geometries:


UPDATE rd_tunnel_erased AS rd

SET shape = ST_Difference(rd.shape, tn.shape)
FROM a AS tn
WHERE ST_Intersects(rd.shape, tn.shape);

VACUUM ANALYZE rd_tunnel_erased;


UPDATE:

If two or more tunnel segments fall on the same road segment, the implicit cross join in the UPDATE query will create an equal amount of the same road segment, each subtracted by one of the intersecting tunnel segment. One option would be to either ST_Collect them by an identifying attribute (e.g. if two-part tunnels have the same tunnel_id) or ST_Collect simply all tunnel segments prior to subtracting the roads:


WITH
blade AS (

SELECT ST_Collect(shape) AS shape
FROM a
--GROUP BY
)

UPDATE rd_tunnel_erased AS rd
SET shape = ST_Difference(rd.shape, tn.shape)
FROM blade AS tn
WHERE ST_Intersects(rd.shape, tn.shape);


Uncomment the GROUP BY clause when there is a proper .

There are other ways, but I find this the most forward and efficient one.


No comments:

Post a Comment

arcpy - Changing output name when exporting data driven pages to JPG?

Is there a way to save the output JPG, changing the output file name to the page name, instead of page number? I mean changing the script fo...