Monday 26 August 2019

Interpreting GeoTiff pixel values using Python GDAL?


I was running python script that queries a GeoTiff file at the coordinates of a centroid of blocks and as a result I got a list of pixel values, which I have trouble interpreting. When I run gdalinfo to obtain basic information about my tiff I get the following (abridged output):


~$ gdalinfo arsenci020l.tif
Driver: GTiff/GeoTIFF
Files: arsenci020l.tif
arsenci020l.tfw
Size is 10366, 7273
Coordinate System is:
PROJCS["Lambert Azimuthal Equal Area projection with arbitrary plane grid; projection center

100.0 degrees W, 45.0 degrees N",
......(omitted)............
......(omitted)............
Band 1 Block=10366x1 Type=Byte, ColorInterp=Palette
Color Table (RGB with 256 entries)
0: 0,0,0,255
1: 255,255,255,255
2: 132,193,255,255
3: 204,204,204,255


My interpretation of Color Table given by gdalinfo was that every pixel value actually stands for 4 other entries.


The output of python script prints out 2,3,5, etc. How to interpret them? My first instinct was to look at the Color Table above given by gdalinfo, but I'm curious why it talks about RGB value and gives 4 numbers? (don't we represent RGB be in 3 numbers?) Is there an extra column?


Also, the information file about the GeoTiff has the following table:


  Attribute_Label: Arsenic concentration grid cell value.
Attribute_Definition:
Each grid cell value is a color definition that represents an
estimate of the 75th percentile of arsenic concentration in
that region. The concentrations were recorded in micrograms
per liter (ug/L; equivalent to parts per billion), and were
converted to colors according to the following table. All

arsenic concentrations greater than 50 micrograms per liter
were recoded as 50 micrograms per liter. There may be
intermediate colors in the image that are not included in this
table.
> Arsenic Color RGB values
> Concentration
>--------------------------------------------------------------
> 1 ug/L or less dark green 50.67328 150.3464 0.67328
> 3 ug/L light green 152 251 152
> 5 ug/L yellow 255 255 0

> 10 ug/L orange 255 165 0
> 50 ug/L or greater red 255 0 0
> Insufficient data white 255 255 255
> Non-US land grey 204 204 204
> Water light blue 132 193 255

Here it uses 3 numbers to represent a color. Why then gdalinfo gives 4?


As a side note, how should I programmatically connect pixel value, RGB color and arsenic concentration somehow (without excluding any intermediate colors as this table did). Any advice on how to do that?


This is the first time I'm working with GeoTiff files.



Answer




The fourth value is alpha (i.e. RGBA), which you can ignore. The four value structure is expected.


You can read the colour tables into native lists/dicts with GDAL.


from osgeo import gdal
gdal.UseExceptions()

ds = gdal.Open(fname)
band = ds.GetRasterBand(1)
arr = band.ReadAsArray()
ct = band.GetColorTable()


# index value to RGB (ignore A)
i2rgb = [ct.GetColorEntry(i)[:3] for i in range(ct.GetCount())]

# RGB to index value (assumes RGBs are unique)
rgb2i = {rgb: i for i, rgb in enumerate(i2rgb)}

# Now look up an index, e.g., water is light blue
print(rgb2i[(132, 193, 255)]) # 2

No comments:

Post a Comment

arcpy - Changing output name when exporting data driven pages to JPG?

Is there a way to save the output JPG, changing the output file name to the page name, instead of page number? I mean changing the script fo...