I am converting vector to raster in R. However the process was too long. Is there possibility to put the script into multithread or GPU processing in order to do it more faster?
My script to rasterized vector.
r.raster = raster()
extent(r.raster) = extent(setor) #definindo o extent do raster
res(r.raster) = 10 #definindo o tamanho do pixel
setor.r = rasterize(setor, r.raster, 'dens_imov')
r.raster
class : RasterLayer dimensions : 9636, 11476, 110582736 (nrow, ncol, ncell) resolution : 10, 10 (x, y) extent : 505755, 620515, 8555432, 8651792 (xmin, xmax, ymin, ymax) coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
setor
class : SpatialPolygonsDataFrame features : 5419 extent : 505755, 620515.4, 8555429, 8651792 (xmin, xmax, ymin, ymax) coord. ref. : +proj=utm +zone=24 +south +ellps=GRS80 +units=m +no_defs variables : 6 names : ID,CD_GEOCODI, TIPO, dens_imov,area_m,domicilios1 min values : 35464, 290110605000001, RURAL, 0.00000003,100004,1.0000 max values : 58468, 293320820000042, URBANO, 0.54581673,99996, 99.0000
Answer
I tried to "parallelize" the function rasterize
using the R
package parallel
in this way:
- split the SpatialPolygonsDataFrame object in
n
parts rasterize
every part separately- merge all the parts into one raster
In my computer, the parallelized rasterize
function took 2.75 times less than the no-parallelized rasterize
function.
Note: the code below download a polygon shapefile (~26.2 MB) from the web. You can use any SpatialPolygonDataFrame object. This is only an example.
Load libraries and example data:
# Load libraries
library('raster')
library('rgdal')
# Load a SpatialPolygonsDataFrame example
# Load Brazil administrative level 2 shapefile
BRA_adm2 <- raster::getData(country = "BRA", level = 2)
# Convert NAMES level 2 to factor
BRA_adm2$NAME_2 <- as.factor(BRA_adm2$NAME_2)
# Plot BRA_adm2
plot(BRA_adm2)
box()
# Define RasterLayer object
r.raster <- raster()
# Define raster extent
extent(r.raster) <- extent(BRA_adm2)
# Define pixel size
res(r.raster) <- 0.1
Figure 1: Brazil SpatialPolygonsDataFrame plot
Simple thread example
# Simple thread -----------------------------------------------------------
# Rasterize
system.time(BRA_adm2.r <- rasterize(BRA_adm2, r.raster, 'NAME_2'))
Time in my laptop:
# Output:
# user system elapsed
# 23.883 0.010 23.891
Multithread thread example
# Multithread -------------------------------------------------------------
# Load 'parallel' package for support Parallel computation in R
library('parallel')
# Calculate the number of cores
no_cores <- detectCores() - 1
# Number of polygons features in SPDF
features <- 1:nrow(BRA_adm2[,])
# Split features in n parts
n <- 50
parts <- split(features, cut(features, n))
# Initiate cluster (after loading all the necessary object to R environment: BRA_adm2, parts, r.raster, n)
cl <- makeCluster(no_cores, type = "FORK")
print(cl)
# Parallelize rasterize function
system.time(rParts <- parLapply(cl = cl, X = 1:n, fun = function(x) rasterize(BRA_adm2[parts[[x]],], r.raster, 'NAME_2')))
# Finish
stopCluster(cl)
# Merge all raster parts
rMerge <- do.call(merge, rParts)
# Plot raster
plot(rMerge)
Figure 2: Brazil Raster plot
Time in my laptop:
# Output:
# user system elapsed
# 0.203 0.033 8.688
More info about parallelization in R:
No comments:
Post a Comment