My dataset is about 8.3 GB after being loaded into the R environment with the lidR package. I need to remove the points whose return number is greater than their number of returns.
With a small dataset, I could just use:
l <- lasfilter(las, ReturnNumber <= NumberOfReturns)
but that is not possible on my PC with a dataset this large.
What is the right way to filter (to use 'lasfilter' on) huge LiDAR data? I am not sure which expression to choose from the list of filter expressions for the 'filter' argument of the 'readLAS' function. I modified a sample dataset from the lidR package so that it has points whose return number is greater than their number of returns:
library(lidR)
LASfile <- system.file("extdata", "Megaplot.laz", package="lidR")
las = readLAS(LASfile)
# make rn > nr
las@data$ReturnNumber[4] <- 3L
las@data$ReturnNumber[8] <- 3L
writeLAS(las, "/TEMP/rnGTnr.laz")
# trying "-keep_last", a guess from reading the source code (and it is wrong)
las2 <- readLAS("/TEMP/rnGTnr.laz", filter = "-keep_last")
# > Warning message:
# > Invalid data: 2 points with a 'return number' greater than the 'number of returns'.
So, the problem persists.
Can it be done with a LAScatalog object (by dividing the data into chunks)? I don't know how to filter with a LAScatalog object. There is a filter option for LAScatalog too, and again I don't know which expression is appropriate for my case.
Answer
There is no streaming equivalent of ReturnNumber <= NumberOfReturns
I can see some options:
- I'm pretty sure the warning comes from points that have NumberOfReturns = 0. Thus I would try filter = "-drop_number_of_returns 0".
- Go to the GitHub repo of the rlas package and open an issue with a feature request. It would not be hard to add such a filter.
- Apply lasfilter at the R level on small chunks with catalog_apply, which gives access to the core engine of the package. The side effect is that your file will be split into chunks, but that is not actually a problem.
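For the first option, a minimal sketch. Note the assumptions: the file path is a placeholder, and the sketch only helps if the invalid points really do have NumberOfReturns = 0, which is a guess about your data, not a certainty:

```r
library(lidR)

# Stream-filter at read time: points with NumberOfReturns = 0 are dropped
# by the reader before they ever reach memory, so no large in-memory
# filtering is needed. "/path/to/your_big_file.laz" is a placeholder.
las <- readLAS("/path/to/your_big_file.laz",
               filter = "-drop_number_of_returns 0")
```

If the warning about invalid return numbers disappears after this, the guess was right; if it persists, the offending points have some other combination of attributes and the chunked approach below is the way to go.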
For the third option, the chunk-processing function looks like this:
filter_invalid_returns = function(chunk)
{
  # Read the points of the current chunk
  las <- readLAS(chunk)
  # Some chunks may fall outside the point cloud and be empty
  if (is.empty(las)) return(NULL)
  # Keep only the points that respect the LAS specification
  las <- lasfilter(las, ReturnNumber <= NumberOfReturns)
  return(las)
}
LASfile <- system.file("extdata", "Megaplot.laz", package="lidR")
project <- catalog(LASfile)
opt_chunk_buffer(project) <- 0
opt_chunk_size(project) <- 120 # small because this is a dummy example.
opt_output_files(project) <- paste0(tempfile(), "_{ID}")
output <- catalog_apply(project, filter_invalid_returns)
new_proj = catalog(unlist(output))
plot(new_proj)
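To check the result, the filtered output files can be read back and the offending points counted (a sketch; readLAS accepts a vector of file paths, and the count should be 0 if the filter worked):

```r
# Re-read the filtered chunks and count points that still violate
# the LAS specification (ReturnNumber > NumberOfReturns).
las_check <- readLAS(unlist(output))
sum(las_check@data$ReturnNumber > las_check@data$NumberOfReturns)
```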
But the true question is: why do you have a file that does not respect the LAS specification? And is it really pertinent to remove these points? That is another, non-technical question that nobody can answer for you.
Edit: you did not mention whether your 8 GB come from a single file or from several. If they come from several files, you may prefer to use
opt_chunk_buffer(project) <- 0
opt_chunk_size(project) <- 0
opt_output_files(project) <- paste0(tempdir(), "/{ORIGINALFILENAME}")
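Put together for a multi-file dataset, a hedged sketch (the input directory is a placeholder, and filter_invalid_returns is the function defined above):

```r
library(lidR)

# Placeholder path: point this at the folder containing your .las/.laz files
project <- catalog("/path/to/lidar/files/")

opt_chunk_buffer(project) <- 0
# Chunk size 0 means: process file by file, without re-chunking
opt_chunk_size(project)   <- 0
# One cleaned output file per input file, named after the original
opt_output_files(project) <- paste0(tempdir(), "/{ORIGINALFILENAME}")

output <- catalog_apply(project, filter_invalid_returns)
```

With these options each input file is read, filtered, and written once, so peak memory usage is bounded by the largest single file rather than the whole 8 GB dataset.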