I'm trying to load huge .osm files into R in order to run some advanced analysis. I did not expect the loading step to be the hard part, but it has proved difficult.
The issue is that the time it takes to parse the XML file (.osm is XML-based) and populate an object (here an osmar object) seems to grow quadratically with file size.
require(osmar)

# "filename" is a placeholder for the actual .osm path
my_file <- osmsource_file("filename")

# complete_file() tells get_osm() to read the whole file from this source
osmar_object <- get_osm(complete_file(), source = my_file)
And in case you wonder how I found that it was quadratic, here is a plot of the load times for files between 0 and 500 MB.
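In case it helps, here is a minimal sketch of such a timing loop (the file names are hypothetical placeholders standing in for extracts of increasing size):

require(osmar)

# Hypothetical .osm extracts of increasing size (placeholders, not real files)
files <- c("extract_050.osm", "extract_100.osm", "extract_250.osm", "extract_500.osm")

# Wall-clock time of the full parse-and-populate step for each file
timings <- sapply(files, function(f) {
  system.time(get_osm(complete_file(), source = osmsource_file(f)))["elapsed"]
})

# Plot elapsed time against file size in MB
sizes_mb <- file.info(files)$size / 2^20
plot(sizes_mb, timings, xlab = "file size (MB)", ylab = "load time (s)")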
At this point, my only option seems to be to split the country into a 100x100 grid of small tiles and load them separately (a sketch of that follows below). Obviously, I don't really like the idea.
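For the record, osmar can delegate that kind of cutting to the Osmosis command-line tool, so extracting one tile might look like this (assuming Osmosis is installed and on the PATH; the file name and coordinates are placeholders):

require(osmar)

# Use Osmosis as the backend to cut a bounding box out of the big file
src <- osmsource_osmosis(file = "country.osm", osmosis = "osmosis")

# corner_bbox(left, bottom, right, top), in decimal degrees
tile <- get_osm(corner_bbox(11.35, 48.06, 11.63, 48.22), source = src)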
Why is it so slow? And how can I load these files quickly?