Friday, 6 February 2015

extract scientific layers from MODIS HDF Dataeset using Python GDAL


I have been trying to extract scientific data layers from MODIS HDF files but without success. Before I have been doing this type of task using Arcpy, but I want the script that I am currently working to be based only in GDAL and not in Arcpy so I can easily share it. I used GDAL, but it seems that GDAL version in windows does not have support for HDF files (I browsed some forums about it). I tried to open the HDF file using gdal.Open() but it does not return anything, and I am pretty sure that my directory path to the HDF file is correct. Then I tried pyHDF. So this packages allowed me to extract the scientific layers as 2D array, but I have a problem how to save it as GeoTIFF file. I am aware that for me to save a 2D array to a GeoTIFF file I need some reference, such as projection and raster properties, just like the GetGeoTransform() module in GDAL. In this step, I am not sure how to extract that same kind of information using pyHDF.



Answer




If you are working with a GDAL version that supports HDF5 datasets this is how you can extract a single subdataset from it and convert it to Geotiff in Python. This example was used for MOD09 and MOD13 band extraction (hence the no_data value conversion).


from osgeo import gdal
import numpy as np

def hdf_subdataset_extraction(hdf_file, dst_dir, subdataset):
"""unpack a single subdataset from a HDF5 container and write to GeoTiff"""
# open the dataset
hdf_ds = gdal.Open(hdf_file, gdal.GA_ReadOnly)
band_ds = gdal.Open(hdf_ds.GetSubDatasets()[subdataset][0], gdal.GA_ReadOnly)


# read into numpy array
band_array = band_ds.ReadAsArray().astype(np.int16)

# convert no_data values
band_array[band_array == -28672] = -32768

# build output path
band_path = os.path.join(dst_dir, os.path.basename(os.path.splitext(hdf_file)[0]) + "-sd" + str(subdataset+1) + ".tif")

# write raster

out_ds = gdal.GetDriverByName('GTiff').Create(band_path,
band_ds.RasterXSize,
band_ds.RasterYSize,
1, #Number of bands
gdal.GDT_Int16,
['COMPRESS=LZW', 'TILED=YES'])
out_ds.SetGeoTransform(band_ds.GetGeoTransform())
out_ds.SetProjection(band_ds.GetProjection())
out_ds.GetRasterBand(1).WriteArray(band_array)
out_ds.GetRasterBand(1).SetNoDataValue(-32768)


out_ds = None #close dataset to write to disc

edit: This answer used GDAL version 1.9.


No comments:

Post a Comment

arcpy - Changing output name when exporting data driven pages to JPG?

Is there a way to save the output JPG, changing the output file name to the page name, instead of page number? I mean changing the script fo...