Tuesday 24 April 2018

python - List of lat-long, need values in a raster



Apologies in advance as I am not a GIS specialist by any means. I have a set of 1 million points and I'm trying to find their values in a geotiff raster. I've tried various versions of this answer involving affine transformations: https://gis.stackexchange.com/a/221471/143163


def retrieve_pixel_value(geo_coord, data_source):
"""Return floating-point value that corresponds to given point."""
x, y = geo_coord[0], geo_coord[1]
forward_transform = \
affine.Affine.from_gdal(*data_source.GetGeoTransform())
reverse_transform = ~forward_transform
px, py = reverse_transform * (x, y)
px, py = int(px + 0.5), int(py + 0.5)
pixel_coord = px, py


data_array = np.array(data_source.GetRasterBand(1).ReadAsArray())
return data_array[pixel_coord[0]][pixel_coord[1]]

This isn't ideal since it's point by point querying but it's better than nothing. The problem I'm having, however, is that it's not returning the correct values.


As a sanity check, I used some WRF data that had lat/long layers and queried various points and the resultant lat/long that is supposed to be the nearest coordinates in the WRF data are very far off from where they should be. e.g. inputting 33.77864,-117.33142 returns 38.72556, -115.75209 (the range of this layer is 32.597065 to 39.3944,-121.413025 to -113.04607, so there should be much closer matches). Furthermore, switching lat long as inputs doesn't drastically change the return value (when switched, returns 34.820377,-120.55661). Like I said, not an expert, but to me this seems like I'm using the wrong coordinate system as inputs. Does anyone know a way to convert lat long to the appropriate coordinates to find values in a raster?


I realize this isn't the most efficient way to do a list query on a big db, given the original problem of finding raster values for 1 million points, is there a more GIS-ish way to do this?



Answer



Assuming that your raster is not skewed, i.e. both the third and fifth values of the Geotransform are zero, and that your points and raster share the same coordinate system (as @davemfish pointed out), you can leverage numpy to access all the values without the need of using a for loop.


The following snippet shows how to calculate the raster values for each point. It does this by calculating the index of each point in the array containing the raster values and then indexing the array to get the value of each point. However, numpy vectorizes this operations, resulting in a much faster result than a loop.



Note: assume that x and y are both 1D numpy arrays containing the longitudes and latitudes, respectively, for your points.


import gdal
import numpy as np


def get_indices(x, y, ox, oy, pw, ph):
"""
Gets the row (i) and column (j) indices in an array for a given set of coordinates.
Based on https://gis.stackexchange.com/a/92015/86131


:param x: array of x coordinates (longitude)
:param y: array of y coordinates (latitude)
:param ox: raster x origin
:param oy: raster y origin
:param pw: raster pixel width
:param ph: raster pixel height
:return: row (i) and column (j) indices
"""

i = np.floor((oy-y) / ph).astype('int')

j = np.floor((x-ox) / pw).astype('int')

return i, j


# read data and geotransform
ds = gdal.Open('my_raster.tif', 0)
arr = ds.ReadAsArray()
xmin, xres, xskew, ymax, yskew, yres = ds.GetGeoTransform()
del ds


# calculate indices and index array
indices = get_indices(x, y, xmin, ymax, xres, -yres)
values = arr[indices]

Take into account that all of your points have to be within your raster. Otherwise you will get negative indices or indices out of bounds.


No comments:

Post a Comment

arcpy - Changing output name when exporting data driven pages to JPG?

Is there a way to save the output JPG, changing the output file name to the page name, instead of page number? I mean changing the script fo...