I had a shapefile that I needed to integrate into my SQL Server DB. To do so, I used the very useful Shape2Sql.exe with which I chose 'Set SRID 4326' in the settings.
I now have a table that contains a geometry column while another table is showing to me the latitude and longitude of different addresses along with other useful information. This can be summed up as follows:
CREATE TABLE dbo.myTable1 (
FULL_ADDRESS varchar(120) NULL,
LONGITUDE varchar(50) NULL,
LATITUDE varchar(50) NULL
)
CREATE TABLE dbo.myTable2 (
POLYGON_NAME nvarchar(255) NULL,
geom geometry NULL
)
What I would like is to join the two tables based on the sets of coordinates that match a certain polygon. I took some inspiration from here and I basically did this:
SELECT DISTINCT
mt1.FULL_ADDRESS
, mt1.LONGITUDE
, mt1.LATITUDE
, mt2.POLYGON_NAME
INTO #TEMP_TABLE
FROM myTable1 mt1
LEFT JOIN myTable2 mt2
ON geom.STIntersects(geometry::STGeomFromWKB((geography::Point(mt2.LATITUDE, mt2.LONGITUDE, 4326)).STAsBinary(), 4326)) = 1
(the DISTINCT is useful because, for some reason that I can't explain yet, I get several matches for a single set)
The results thereof are exactly what I expect but, for 10,000 sets of coordinates, it almost took 3 hours. Problem is, I'm supposed to find the polygons of 25,000,000 sets (which means 300 days of computation...). I therefore need to find a trick to improve that query performance by several orders of magnitude.
BONUS QUESTION:
When I don't store the results of my query inside a temp table (i.e. when I want to see the results), I get the following error message:
A .NET Framework error occurred during execution of user-defined routine or aggregate "No extended information available.": .
But everything is fine when using a SELECT... INTO #TEMP_TABLE instead of a plain and simple SELECT. I couldn't find any piece of information about this message. Any idea to solve it?
Answer
The best way to increase spatial join is to create spatial index.
CREATE INDEX table_gist ON table USING GIST(geom);
Since your first table has coordinates but no geometry you can't create a spatial index. That's why it takes so much time.
You should consider adding a geometry column on your first table based on your coordinate.
ALTER TABLE mytable1
ADD COLUMN geom geometry;
UPDATE mytable1
SET geom=ST_MakePoint(longitude, latitude);
Then you can create an index on mytable1. Your ST_Intersects between your two geometries will be much faster.
PS : I haven't talk about geography/geometry and srid but you should handle it :)
No comments:
Post a Comment