Recently I am working with geospatial data in R, I faced this problem. I downloaded E-OBS
gridded data from E-OBS
grid data, I am interested in 0.25
degree grid data of land surface temperature and precipitation for France only, but E-OBS
grid data is conducted on whole europe and all measurement in daily level. However, those E-OBS
grid data in netCDF
format (.nc
file), after decompressing those grid data, it is almost 1GB
size. I used ncdf4
package to read and process them in R.
Here is what I did in R:
library(ncdf4)
dat <- ncdf4::nc_open("~/rr_0.25deg_reg_1980-1994_v17.0.nc/rr_0.25deg_reg_1980-1994_v17.0.nc")
here is E-OBS
grid data information down below:
> dat
File ~/rr_0.25deg_reg_1980-1994_v17.0.nc/rr_0.25deg_reg_1980-1994_v17.0.nc (NC_FORMAT_CLASSIC):
1 variables (excluding dimension variables):
short rr[longitude,latitude,time]
long_name: rainfall
units: mm
standard_name: thickness_of_rainfall_amount
_FillValue: -9999
scale_factor: 0.100000001490116
3 dimensions:
longitude Size:464
long_name: Longitude values
units: degrees_east
standard_name: longitude
latitude Size:201
long_name: Latitude values
units: degrees_north
standard_name: latitude
time Size:5479 *** is unlimited ***
long_name: Time in days
units: days since 1950-01-01 00:00
standard_name: time
5 global attributes:
Ensembles_ECAD: 17.0
Conventions: CF-1.4
References: http://www.ecad.eu\nhttp://www.ecad.eu/download/ensembles/ensembles.php\nhttp://www.ecad.eu/download/ensembles/Haylock_et_al_2008.pdf
history: Wed Apr 4 11:37:33 2018: ncks -a -d time,10957,16435 rr_0.25deg_regular_1.nc rr_0.25deg_reg_1980-1994_v17.0.nc
Wed Apr 4 11:36:14 2018: ncks -a --mk_rec_dmn time rr_0.25deg_regular.nc rr_0.25deg_regular_1.nc
NCO: 4.6.7
here is the code I want to access its metadata:
dat$dim$longitude$vals
dat$dim$latitude$vals
dat$dim$time$vals
lon = ncvar_get(dat, varid = "longitude")
lat = ncvar_get(dat, varid = "latitude")
but when I hit this code precip = ncvar_get(dat, varid = "rr")
, Rstudio raised memory error down below:
> precip = ncvar_get(dat, varid = "rr")
Error: cannot allocate vector of size 1.9 Gb
I intend to read those grid data and extract all grid that belongs to France. It is not an ordinary task for me because it is my very first time to work with geospatial data. I learned ArcGis
and tried another way to read those grid data in ArcGis Descktop
(version: ArcGis desktop 10.4.1
). So I used some of ArcGis
tool down below:
from Multidimension Tool.tbx
-> Make NetCDF Raster Layer
, load .nc
file and make raster. here is the screenshot down below:
My goal is to retrieve all grid data of France from original E-OBS
grid data. Since R raised memory problem, I tried to find way to read chosen E-OBS
grid data in ArcGis
and export its attribute table data and work with in R later on.
To do this, I also tried like this:
from Multidimension Tool.tbx
-> Make NetCDF Table view
, to see its attribute table of those grid data in ArcGis
, but it returned an empty table.
I tried to read and process those E-OBS
grid data both in R
and ArcGis
, but I couldn't get all grid data of France.
How can I pull out all grid data of one specific country?
I also tried conversion toolbox in ArcGIS
down below:
from Conversion Tools.tbx
-> From Raster
-> Raster to ASCII
, but exported ASCII
file, I have respective precipitation observation, but I lost its respective geocoordinate of the grid node.
Desired output:
My goal is to extract interested grid raster in ASCII
format file because I want those extract raster in vector data. I want a format something like this in ASCII
file:
-33.750 77.250 -34.8 -38.6 -31.2 -29.8 -18.8 -12.1 -9.7 -8.8 -15.8 -24.9 -35.4 -38.7
-33.750 76.750 -34.3 -38.2 -30.8 -29.5 -18.8 -12.2 -9.9 -8.9 -15.6 -25.0 -35.0 -38.9
-33.750 76.250 -34.7 -38.3 -31.4 -29.9 -19.7 -12.9 -10.7 -9.5 -16.0 -25.9 -35.5 -39.8
-33.750 75.750 -34.7 -38.2 -31.5 -29.9 -20.1 -13.1 -11.1 -9.8 -15.8 -26.2 -35.5 -40.3
-33.750 75.250 -34.5 -37.9 -31.2 -29.6 -19.9 -12.9 -10.8 -9.4 -15.0 -25.8 -35.2 -40.3
-33.750 74.750 -33.9 -37.3 -30.3 -28.7 -19.0 -12.2 -10.2 -8.7 -14.0 -24.8 -34.4 -40.1
Answer
Just use the raster and ncdf4 R packages in conjugation. With both of these packages added, read the nc file using the raster::stack
or raster::brick
functions. This will keep everything memory safe as well. If there is any transposition that needs to happen, which is sometimes the case with the NetCDF format, there are some functions in raster for flipping the raster(s).
add libraries
library(raster)
library(ncdf4)
library(R.utils)
library(maptools)
Here we download the data, unzip it and create a raster stack. You should set your working directory using setwd()
so you know where everything is going.
url = "https://www.ecad.eu/download/ensembles/data/Grid_0.25deg_reg/rr_0.25deg_reg_1980-1994_v17.0.nc.gz"
download.file(url, basename(url))
gunzip(basename(url))
rname <- list.files(getwd(), "nc$")
( r <- stack(rname[1]) )
To subset the data you can use crop with a defined extent. I pulled the extent of France from the "wrld_simpl" dataset included with the maptools package.
data(wrld_simpl)
france <- wrld_simpl[wrld_simpl@data$NAME == "France",]
e <- extent(france)
plot(r[[20]])
plot(e,add=TRUE)
plot(france,add=TRUE)
Here is where we crop the entire raster stack to the defined extent.
r <- crop(r, e)
plot(r[[20]])
plot(e,add=TRUE)
plot(france,add=TRUE)
Now that the data is subset, we can write a more friendly compressed format (LZW geoTiff).
writeRaster(r, "France_rr_0.25deg_reg_1980-1994.tif",
options="COMPRESS=LZW", overwrite=TRUE)
You could also write out a flat file format but, I highly recommend against this. This subset is quite small with only 201 rows and 464 columns (93,264 unique coordinate pairs). It however contains 2557 layers resulting in matrix with 238,476,048 values. If the intent is attempting to analyze this date in something like excel, this is just not tractable. The easiest way to format the data with coordinates and then write to a flat file so to coerce to an sp class object, add the coordinates to the data slot and then write the data.frame associated with the object.
r <- as(r, "SpatialPixelsDataFrame")
r@data <- data.frame(coordinates(r), r@data)
str(r@data)
write.csv(r@data, "climate_data.csv", row.names = FALSE)
No comments:
Post a Comment