Tuesday, 22 May 2018

python - GDAL (or other open-source) equivalent of Arc's Point To Raster Tool


I'm trying to convert a python script that relies on arcpy into one that only relies on open source modules, but I'm having some trouble. The bulk of my processing lies in one Arc tool "Point to Raster." The goal is to take a csv file with lat/long coords and turn it into a raster, where each cell value is the number of points contained within that cell. When I'm able to use arcpy, the workflow looks like this:


Make XY event layer from csv of points -->
Use Event layer as input in Point To Raster tool, with cell assignment = COUNT, specifying spatial extent and cellsize -->
Mask out NoData values -->
Write output to .tif


This works fine, and gives me my desired result, but now I need to write the same script without arcpy. I've found gdal tools that work for everything except the main process: Point to Raster. Do you know of any gdal utility (or reluctantly, some other open source tool) that could do what Arc's Point to Raster tool is doing? The 2 gdal tools that look like they could possibly accomplish this are:




  1. gdal_grid (http://www.gdal.org/gdal_grid.html) -This tool isn't meant to do exactly what I want it to do, although it does take a set of points (could be a shapefile, or a csv written as a VRT**) and turn it into a raster grid. I can specify the extent of the gridd along with its resolution, which is important. There is also an option to set grid cell values based on data metrics, with 'count' being one of them. However, under the count metric it states "A number of data points found in grid node search ellipse." What does 'grid node search ellipse' mean? Can I not assign values based on a number of data points found in each grid cell'? Is there a difference?




  2. gdal_rasterize (http://www.gdal.org/gdal_rasterize.html) -This one looks a little less promising, or maybe that's just because I don't entirely understand it. In summary, the tool "burns vector geometries (points, lines and polygons) into the raster band(s) of a raster image," which sounds like what I need, although there is no keyword to specify point count as the value to be burned, so I don't know.





A few more things:




  • **I was going to test using gdal_grid, but can't find anywhere how to convert a CSV to a VRT in python. I know it's possible by hand, every tutorial/help page I can find says something like: "Say you have a CSV with lat/long, you can write a VRT file for that CSV" then jumps right into "Here's what the VRT file looks like for the CSV." But HOW do I write this VRT file, in python? Several pages say to download an FW toolkit (or something) to create the CSV, but I need to do this within python as I'm automating the process for hundreds of CSVs. I tried ogr2ogr but got the error: "VRT driver does not support data source creation," so I'm at a loss. I could convert the csv to shapefile or some other OGR-recognized format, but the VRT is appealing because the shapefile/whatever that would be created would be very, very large and probably take a while to create. Which leads me to...




  • Each csv has an obsenely large amount of points: ~78 million. So python/numpy won't even read them without crashing. Additionally, the resulting raster will be global with fine resolution (1km), so a hack that manipulates the coordinates or loops through the points/cells is out of the question.




I'm not sure if I should put my CSV to VRT question in a separate thread, so my apologies if this is too much for one question.



Thoughts/suggestions?




No comments:

Post a Comment

arcpy - Changing output name when exporting data driven pages to JPG?

Is there a way to save the output JPG, changing the output file name to the page name, instead of page number? I mean changing the script fo...