I have a raster that is not very big on disk, only a few hundred megabytes compressed, but it holds about 10 billion values, each stored in 1 byte (0..255). Uncompressed it would take about 10 GB (lots of empty cells). When I try getValues(myRaster), I get an error that it cannot allocate a vector of size 40 Gb. Judging by the number, it is trying to put everything into a vector of 4-byte values. So in a single line, trying to process a ~300 MB file, I get an error telling me I need 40 GB of RAM. Here is some example code:
library(raster)
myRaster <- raster("myRasterFile.tiff")  # <-- this file is only a few hundred MB on disk
getValues(myRaster)
# Error: cannot allocate vector of size 40 Gb
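For what it's worth, the arithmetic is easy to check with raster's own helpers (a quick sketch, assuming the file opens normally; the numbers in the comments are just my case above):

library(raster)
myRaster <- raster("myRasterFile.tiff")
ncell(myRaster)               # about 10 billion cells
4 * ncell(myRaster) / 1e9     # ~40 GB if every cell becomes a 4-byte value
canProcessInMemory(myRaster)  # raster's own check of whether the values would fit in RAM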
I do understand that R is not made with programmers in mind. But is there any way I can force getValues(myRaster) to return a vector coerced to 1-byte values?
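To be clear about what I mean by 1-byte values: R does have a 1-byte type, raw, so the storage I am hoping for is not unthinkable (a quick illustration of the per-element sizes, not raster code):

object.size(raw(1e6))      # ~1 MB: one byte per element
object.size(integer(1e6))  # ~4 MB: four bytes per element
object.size(numeric(1e6))  # ~8 MB: eight bytes per element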
I know I can just split the raster into chunks that fit in my RAM, and maybe even process them in parallel on as many cores as my CPU has (something like the sketch below), but that code would not showcase the simplicity and elegance of R for statistics at all.
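For completeness, here is roughly what that block-wise version looks like (a minimal sketch using blockSize() and the row/nrows arguments of getValues(); the tally of byte values 0..255 is just a stand-in for whatever statistic I actually need):

library(raster)

myRaster <- raster("myRasterFile.tiff")

# let raster suggest chunks of rows that fit comfortably in memory
bs <- blockSize(myRaster)

# accumulate a count of each byte value 0..255 across all blocks
counts <- numeric(256)  # doubles, so the totals cannot overflow an integer
for (i in seq_len(bs$n)) {
  v <- getValues(myRaster, row = bs$row[i], nrows = bs$nrows[i])
  counts <- counts + tabulate(v + 1, nbins = 256)  # NAs are silently dropped
}

It works, but it is exactly the kind of bookkeeping I was hoping R would let me skip.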
I also know I can temporarily increase the swap to more than 40 GB, and with it the virtual memory, and then just wait until it is done, but that code will literally work only on my computer.
Does anyone have any idea how this could be solved?