Sunday 10 April 2016

r - retrieving all coordinates of gridded temperature data with its metadata that falls in Germany' polygon?


I have yearly gridded temperature observation for all countries, and I picked up recent 35 years' data for climate research. However, in original data, yearly temperature observation has been stored by global level, and I intend to retrieve all coordinates with its metadata (here its metadata is 12 month' average monthly temperature) only for Germany. I used data.table package to read those data in a clean format in R. Now I need to retrieve all points with its metadata that falls in German polygon. I don't need to visualize them first because I have to compute annual temperature for each coordinate that falls in Germany. I used maptools, sp packages to get this done, but my output was not what I expected.


Here is how I load my data in R (old post can be found here: old post):


getwd()
setwd("stella/data/air_temp_1980_2014/")


tempdat <- list.files(path = getwd(), recursive = TRUE)
TempDatas <- lapply(tempdat, function(x) {
data.table::fread(x, sep = " ", header = FALSE,
col.names = c("Long", "Lat", "Jan","Feb", "Mar", "April", "May",
"Jun", "Jul", "Aug", "Sept", "Oct", "Nov", "Dec"))
})

Here is how data look like (below example data is yearly temperature observation for the whole world):


> head(temperaturesDat)
$air_temp.1980

Long Lat Jan Feb Mar April May Jun Jul Aug Sept Oct Nov Dec
1: -179.75 71.25 -24.7 -24.0 -23.2 -18.3 -8.4 0.2 1.5 0.6 -2.0 -9.6 -18.6 -22.4
2: -179.75 68.75 -27.0 -28.2 -27.2 -21.6 -9.0 0.6 2.8 1.9 -0.2 -11.9 -22.7 -25.1
3: -179.75 68.25 -27.8 -28.5 -27.5 -22.0 -9.5 0.4 3.0 1.8 -0.8 -12.7 -23.6 -26.8
4: -179.75 67.75 -26.8 -26.6 -25.7 -20.5 -8.0 2.7 6.0 4.0 0.5 -12.2 -23.2 -27.3
5: -179.75 67.25 -29.1 -28.4 -27.5 -22.3 -9.7 2.2 6.2 3.3 -1.3 -15.4 -26.4 -31.1
---
85790: 179.75 -87.75 -26.7 -39.0 -52.0 -54.9 -54.3 -55.8 -58.0 -59.8 -58.0 -49.6 -32.9 -24.9
85791: 179.75 -88.25 -27.4 -39.8 -53.1 -56.0 -55.5 -56.9 -59.1 -61.2 -59.2 -50.7 -33.4 -25.3
85792: 179.75 -88.75 -27.5 -40.3 -53.7 -56.6 -56.1 -57.5 -59.8 -61.9 -59.8 -51.2 -33.5 -25.4

85793: 179.75 -89.25 -27.4 -40.3 -53.9 -56.7 -56.1 -57.7 -59.9 -62.2 -59.8 -51.3 -33.4 -25.1
85794: 179.75 -89.75 -27.2 -40.2 -54.0 -56.4 -55.8 -57.8 -59.7 -62.4 -59.8 -51.2 -33.5 -24.9

$air_temp.1981
Long Lat Jan Feb Mar April May Jun Jul Aug Sept Oct Nov Dec
1: -179.75 71.25 -22.4 -23.7 -22.2 -16.4 -4.2 0.7 1.7 0.7 -3.1 -9.0 -16.2 -22.0
2: -179.75 68.75 -26.6 -26.5 -25.5 -16.8 -4.2 0.7 1.8 1.2 -2.5 -8.5 -17.6 -26.9
3: -179.75 68.25 -27.1 -27.3 -26.1 -17.2 -5.0 0.7 2.1 1.1 -3.1 -9.8 -18.7 -27.3
4: -179.75 67.75 -25.7 -26.2 -24.8 -15.5 -3.7 3.0 5.3 3.5 -1.6 -9.7 -18.4 -26.1
5: -179.75 67.25 -27.5 -28.6 -27.3 -17.2 -5.5 2.5 5.7 3.0 -3.3 -13.2 -21.7 -28.6

---
85790: 179.75 -87.75 -25.7 -38.3 -50.0 -57.7 -53.3 -55.6 -54.7 -54.3 -60.2 -53.5 -39.6 -27.4
85791: 179.75 -88.25 -25.8 -39.0 -51.3 -58.7 -54.2 -56.8 -55.7 -55.1 -61.3 -54.7 -40.5 -28.0
85792: 179.75 -88.75 -25.6 -39.3 -51.9 -59.1 -54.5 -57.5 -56.3 -55.4 -61.9 -55.4 -40.9 -28.3
85793: 179.75 -89.25 -25.2 -39.3 -52.2 -59.1 -54.2 -57.8 -56.3 -55.3 -62.0 -55.6 -41.1 -28.1
85794: 179.75 -89.75 -24.8 -39.0 -52.3 -58.6 -53.8 -57.9 -56.0 -55.1 -62.0 -55.6 -41.4 -28.1

Here is what I tried to make this happen:


library(rgeos)
library(maptools)

library(sp)

data(wrld_simpl)
germany<- wrld_simpl[wrld_simpl@data$NAME =="Germany",]
pointDatas <- lapply(TempDatas, function(x) {
pts <- data.frame(Long=x$Long, Lat=x$Lat)
res <- pts[, c(1,2)]
})

then I tried Map and SpatialPointsDataFrame to get the coordinates that fall in Germany:



co<-proj4string(germany)
points_in_germany <- Map(SpatialPointsDataFrame, coords=pointDatas,
data=pointDatas, proj4string = CRS(co))

I tried my approach but I got an error down below:


Error in dots[[3L]][[1L]] : this S4 class is not subsettable
In addition: There were 15 warnings (use warnings() to see them)

but the result does not seem what I want. How can I improve implementation above? Any way to make this happen in R? Any idea?


Desired output:



Basically, I need to retrieve all points with its metadata (12-month' temperature observations) that fall in Germany's polygon in new data.frame, then compute annual temperature for each point and add it as a new column. How can I get this done? Any way to do this more efficiently in R? Any thought?


Update:


How to merge multiple SpatialPointsDataFrame into one? Any idea?



Answer



Following your question on reading and formatting this specific climate data:


First, add libraries and country boundaries.


library(sp)
library(raster)
library(maptools)


data(wrld_simpl)
germany<- wrld_simpl[wrld_simpl@data$NAME =="Germany",]

Now, assuming we have the data.frame "stations" created in your previous question (at this point, not sure why you split it into a list) we then create a unique station identifier using a concatenated vector of the coordinate pairs that will be used to match back to the climate data. Then we create a data.frame containing these unique ids along with coordinate pairs [x,y]. This data.frame is then coerced into a SpatialPointsDataFrame object.


xy <- unique( paste(stations[,1], stations[,2], sep="_")  )
stations.xy <- data.frame(long = as.numeric(unlist(lapply(strsplit(xy, "_"), function(x) {x[1]}))),
lat = as.numeric(unlist(lapply(strsplit(xy, "_"), function(x) {x[2]}))))
stations.xy$ID <- xy
coordinates(stations.xy) <- ~long+lat


For sub-setting to the Germany polygon boundary, first we must match the proj4string (assuming that the data are in the same projections). Then we can use a spatial overlay function to subset the station locations.


proj4string(stations.xy) <- proj4string(germany)
germany.pts <- raster::intersect(stations.xy, germany)

plot(germany)
plot(germany.pts,pch=19,add=TRUE)

We can then match the concatenated ids to create an bracket index query that will subset the full "stations" dataset to observations only occurring within the polygon boundary of Germany.


idx <- which(stations$ID %in% germany.pts$ID)
stations <- stations[idx,]


Now, you have a data.frame that matches your station locations. It can easily be subset and merged to the germany.pts SpatialPointsDataFrame.


stations.1929 <- merge(germany.pts, stations[stations$year == 1929,], by = "ID")
head(stations.1929@data)

No comments:

Post a Comment

arcpy - Changing output name when exporting data driven pages to JPG?

Is there a way to save the output JPG, changing the output file name to the page name, instead of page number? I mean changing the script fo...