I've run into a small problem where my for
loop has been running for 3.5 hrs+. I have a Geodataset in SQL Server
with a field called Shape
that stores co-ordinates in geometry
data type. Firstly, I connect R to my DB via ODBC
and retrieve the information I want (also converting the Shape
column to something readable)
library(sp)
library(rgeos)
library(RODBC)
ch<-odbcConnect("SpatialAnalysis", rows_at_time=1)
df<-sqlQuery(ch, "select OBJECTID, LOT_NO, Shape.STAsText() as WKT FROM SRC_PLI_QLD")
cnt<-sqlQuery(ch, "select count(OBJECTID) from SRC_PLI_QLD")
This has 2.5 million points. I now read them into a SpatialPointsDataFrame
, reading the first element first.
point.sp <- SpatialPointsDataFrame(readWKT(df$WKT[1]),
data=data.frame(OBJECTID=df$OBJECTID[1], LOT_NO=df$LOT_NO[1]))
Now I read the rest of the elements. AND this is where the problem is. It has been 3.5 hrs and still running.
for (n in 2:as.integer(cnt)) {
point.sp <- rbind(point.sp,
SpatialPointsDataFrame(readWKT(df$WKT[n]),
data.frame(OBJECTID=df$OBJECTID[n], LOT_NO=df$LOT_NO[n])))
}
What is the problem in the above mentioned for
loop? Is there another way I can do this?
Answer
I solved this in another way and it took less than 10 seconds. The change was in the query. Instead of retrieving Shape.STAsText()
as a WKT object from Shape
column, I retrieved the Lat
and Long
value from Shape
df<-sqlQuery(ch, "select OBJECTID, LOT_NO, Shape.STY as Lat, Shape.STX as Lon FROM SRC_PLI_QLD")
cnt<-sqlQuery(ch, "select count(OBJECTID) from SRC_PLI_QLD")
Then I did :
coordinates(df) =~Lat+Lon
and my dataframe df
has been converted to a SpatialPointsDataFrame
> class(df)
[1] "SpatialPointsDataFrame"
attr(,"package")
[1] "sp"
No comments:
Post a Comment