Tuesday, 9 February 2016

r - Same extent and resolution of rasters, but different number of cells


Based on Unable to writeRaster for signature "rasterPCA", "character", I obtained two rasters that are PC1 and PC2 of a bunch of climatic variables. However, irrespective of having the same extent and resolution, the number of cells differ in my global environment, when loaded into R.



Snapshot to show differences in number of cells.


enter image description here


Below is the code I am using, which is from the appendix of Hamann et al., 2015 and I get this error:


library(SDMTools)     # install package to read and write ESRI ASCII grids
library(yaImpute) # install package for k-nearest neighbour (kNN) search

lg1 <- asc2dataframe("C:\\Users\\rameshv\\LGM\\4_PCAforR\\PC_1.asc") # principal component grids
lg2 <- asc2dataframe("C:\\Users\\rameshv\\LGM\\4_PCAforR\\PC_2.asc")
present1 <-asc2dataframe("C:\\Users\\rameshv\\Present\\4_PCAforR\\PC_1.asc")
present2 <- asc2dataframe("C:\\Users\\rameshv\\Present\\4_PCAforR\\PC_2.asc")


idxy <- cbind(id=1:nrow(lg1),lg1[,1:2]) # data frame of IDs and XY coords
b <- (max(lg1$var.1)-min(lg1$var.1))/120 # bin size for 120 PC1 bins

l1 <- round(lg1$var.1/b) # convert PC1 to 120 bins via rounding
l2 <- round(lg2$var.1/b) # convert PC2 to <120 bins via rounding
p1 <- round(present1$var.1/b) # same for present PC1
p2 <- round(present2$var.1/b) # same for present PC2
l <- paste(l1,l2) # PC1/PC2 combinations in LGM climate
p <- paste(p1,p2) # PC1/PC2 combinations in present climate

u <- unique(p)[order(unique(p))] # list of unique PC1/PC2 combinations

sid <- c() # empty vector for source IDs
tid <- c() # empty vector for target IDs
d <- c() # empty vector for distances

for(i in u){ # loop for each unique PC1/PC2 combination
lxy <- idxy[which(l==i),] # coordinates of i-th combination in LGM
pxy <- idxy[which(p==i),] # coordinates of i-th combination in present
sid <- c(sid, lxy$id) # append i-th PC1/PC2 combination to previous


if(nrow(pxy)>0){ # kNN search unless no-analogue climate
knn <- data.frame(ann(as.matrix(pxy[,-1]), as.matrix(lxy[,-1]), k=1)$knnIndexDist)
tid <- c(tid, pxy[knn[,1],"id"]) # the IDs of the closest matches
d <- c(d, sqrt(knn[,2])) # their corresponding geographic distances
}
else { # else statement for no-analogue climates
tid <- c(tid, rep(NA,nrow(lxy))) # flag destinations as missing for no analogues
d <- c(d, rep(Inf,nrow(lxy))) # flag distances as infinity for no analogues
}

}

At the end of the for loop, I get this error:


Error in ann(as.matrix(pxy[, -1]), as.matrix(lxy[, -1]), k = 1) : 
error: nrow(ref) and nrow(target) must be > 0

I am not sure if the error has something to do with difference in number of cells? Any suggestions?


EDIT Based on Jeffrey Evans's answer, I tried converting it to a Spatial Pixels Dataframe, but I am getting differences in number of rows. See screenshot below.


enter image description here


Code below:



pre_pca<- as(rasterPCA(pres_stack)$map,"SpatialPixelsDataFrame")
head(pre_pca@data)

coor <- data.frame(coordinates(pre_pca), pre_pca@data)
> str(coor)
'data.frame': 54617 obs. of 9 variables:
$ x : num 72.6 72.7 72.7 72.8 72.9 ...
$ y : num 22.2 22.2 22.2 22.2 22.2 ...
$ PC1: num 98 95 93 87.9 83.9 ...
$ PC2: num -43.4 -43.2 -42.2 -43.7 -43.5 ...

$ PC3: num -4.3 -4.5 -4.12 -5 -5.08 ...
$ PC4: num -2.58 -2.55 -2.39 -2.6 -2.55 ...
$ PC5: num 0.726 0.729 0.677 0.387 0.437 ...
$ PC6: num 0.2668 0.1506 0.2056 0.0257 0.0321 ...
$ PC7: num 0.0308 0.0275 0.0287 0.0235 0.0326 ...

lg_pca <- as(rasterPCA(lg_stack)$map,"SpatialPixelsDataFrame")
coor2 <- data.frame(coordinates(lg_pca), lg_pca@data)
> str(coor2)
'data.frame': 54258 obs. of 9 variables:

$ x : num 80.7 80.8 80.8 80.9 80.9 ...
$ y : num 22.2 22.2 22.2 22.2 22.2 ...
$ PC1: num 18.2 18.3 19.3 21.3 22.3 ...
$ PC2: num -28.2 -28.3 -28.2 -27.4 -28.3 ...
$ PC3: num 6.84 7.2 6.78 7.44 7.13 ...
$ PC4: num 0.352 0.722 0.68 0.776 0.611 ...
$ PC5: num -5.74 -4.42 -4.44 -4.48 -4.35 ...
$ PC6: num 1.233 0.982 1.014 1.072 1.189 ...
$ PC7: num 0.0161 0.0146 0.0229 0.0149 0.015 ...


EDIT2


Tried suggestions in R, which included projecting the data, but upon converting it to a SpatialPixels Data Frame, the recurring error emerges again (See dimensions of coor and coor2 in this case). I have provided an explicit example where I have viewed the rasters to look at the nrow, ncell etc.


enter image description here



Answer



It looks like the issue is related to the asc2dataframe function and not the raster class objects. I am wondering if the function is dropping NA values when reading to a data.frame.


I would highly recommend using the raster::getValues function to coerce the PCA stack to a data.frame object. You are adding considerable processing overhead by saving the PCA rasters and then reading them back in using the asc2dataframe function. You can even nest the rasterPCA function in getValues and as.data.frame to do it in one-fell-swoop.


Add libraries and data


library(raster)
library(sp)
library(RStoolbox)

data(rlogo)

Calculate PCA on the raster stack and coerce into data.frame object. Check class and dimensions of results.


rpc <- as.data.frame(getValues(rasterPCA(rlogo)$map))
class(rpc)
head(rpc)
dim(rpc)
ncell(rlogo)

If you want to retain the spatial coordinates then you could use the SpatialPixelsDataFrame class.



rpc <- as(rasterPCA(rlogo)$map, "SpatialPixelsDataFrame") 
head(rpc@data)
class(rpc@data)
coordinates(rpc@data)

To convert to a data.frame with coordinates you would just pull the @data slot and add the coordinates slot using a call to data.frame. If you look at the resulting dimensions you will see that, in this example, the number of rows are the same as the number of raster values (n=7777).


rpc <- data.frame(coordinates(rpc), rpc@data)
dim(rpc)

No comments:

Post a Comment

arcpy - Changing output name when exporting data driven pages to JPG?

Is there a way to save the output JPG, changing the output file name to the page name, instead of page number? I mean changing the script fo...