Wednesday, 12 August 2015

Problems with NA values when reading .DEM file with R 'raster' package in Windows


I am trying to read a raster file in a .DEM format on windows using the 'raster' package in R.


I get problems with NA values, when loading the data into R in Windows 7, but I do not have the problem on a Mac with OSX Lion. On windows, the NA values do not seem to be read correctly. The question is why this happens?


The raster file used was downloaded from USGS with the following R code:


download.file('http://edcftp.cr.usgs.gov/pub/data/gtopo30/global/e020n90.tar.gz', 'e020n90.tar.gz')

untar('e020n90.tar.gz')

Then I read the raster into R using the 'raster' package. In OSX Lion and R64 version 2.13.1, the NA values are recognized:


> onMac <- raster('E020N90.DEM')
> onMac
class : RasterLayer
dimensions : 6000, 4800, 28800000 (nrow, ncol, ncell)
resolution : 0.008333333, 0.008333333 (x, y)
extent : 20, 60, 40, 90 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +ellps=WGS84 +towgs84=0,0,0,0,0,0,0 +no_defs

values : /Users/Tam/Desktop/E020N90.DEM
min value : -9999
max value : 5483

> summary(values(onMac))
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
-137 85 148 213 213 5483 13046160

But on Windows 7 (64Bit, same R version) it converts the cell values that should be NA's into numbers:


> onWindows <- raster('E020N90.DEM')

> onWindows
class : RasterLayer
dimensions : 6000, 4800, 28800000 (nrow, ncol, ncell)
resolution : 0.008333333, 0.008333333 (x, y)
extent : 20, 60, 40, 90 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs +towgs84=0,0,0
values : E:/WorldDegreeDays/gsoddata/gtopo/E020N90.DEM
min value : -9999
max value : 5483


> summary(values(onWindows))
Min. 1st Qu. Median Mean 3rd Qu. Max.
1 150 946 27190 55540 65540

Why are there no NA values in the raster when I read it on Windows? How could I work around it? My guess is it has to do with the way numbers are stored, a lot of the NA values are converted to 55540.


Info from Windows (after loading raster):


SessionInfo()
R version 2.13.1 (2011-07-08)
Platform: x86_64-pc-mingw32/x64 (64-bit)


locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base


other attached packages:
[1] rgdal_0.7-1 raster_1.9-12 sp_0.9-88

loaded via a namespace (and not attached):
[1] grid_2.13.1 lattice_0.19-30

Info from OSX (after loading raster):


R version 2.13.1 (2011-07-08)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)


locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods
[7] base

other attached packages:
[1] rgdal_0.6-33 raster_1.9-12 sp_0.9-88


loaded via a namespace (and not attached):
[1] grid_2.13.1 lattice_0.19-33

Answer



One workaround is just to go for the raw data, since this is a very simple file format.


Not for everyone, but it can be illuminating to see what is happening.


## all these details are in the .HDR file
NROWS <- 6000
NCOLS <- 4800

At this point you can try the different options for integer sign and endianness directly, and reading this way we achieve what Robert does with the > 32767 transformation after the file is read.



x1 <- readBin("E020N90.DEM", "integer", size = 2, signed = TRUE, n = NROWS * NCOLS, endian = "big")

range(x1)
[1] -9999 5483

x1[x1 < -9998] <- NA

## now for the simple georeferencing, also in the HDR file

ULXMAP <- 20.00416666666667

ULYMAP <- 89.99583333333334
XDIM <- 0.00833333333333
YDIM <- 0.00833333333333

## now generate x/y coordinates, and the data matrix (flip on Y)
x <- list(x = seq(ULXMAP, by = XDIM, length = NCOLS),
y = seq(ULYMAP - NROWS * YDIM, by = YDIM, length = NROWS),
z = matrix(x1, nrow = NCOLS)[ , NROWS:1])

library(sp)


x <- image2Grid(x)

library(raster)
r <- raster(x)

plot(r)

enter image description here


Finally, set the projection as it is read by raster (and this would give the same aspect ratio in the plot that is seen when read that way).



projection(r) <- "+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs +towgs84=0,0,0"

EDIT: Whoops, had forgotten to subtract from the top, now fixed - there's still a half-cell issue I haven't gotten to the bottom of as well.


No comments:

Post a Comment

arcpy - Changing output name when exporting data driven pages to JPG?

Is there a way to save the output JPG, changing the output file name to the page name, instead of page number? I mean changing the script fo...