Sunday 16 October 2016

Casting geometry to MULTI using GeoPandas?


Shapefiles can mix simple POLYGONs and MULTIPOLYGONs in the same data source. Spatial databases like PostGIS and SpatiaLite are strict, and will not put a POLYGON in a MULTIPOLYGON geometry column.


I've gotten used to using ST_Multi to fix this issue. But now I am trying to use GeoPandas to do some file processing, including converting from shapefile to GeoPackage (with a bunch of stuff in the middle), and I am running into this error:


gdf.to_file("garbage.gpkg", "GPKG")

ValueError: Record's geometry type does not match collection schema's 
geometry type: 'MultiPolygon' != 'Polygon'

Is there a GeoPandas equivalent to ST_Multi that I can use to fix the geometry before saving to the GeoPackage or SpatiaLite format?




Answer



TL;DR


Given a geopandas data frame gdf with mixed Polygons and MultiPolygons, the geometry column can be converted to Multi with the following:


from shapely.geometry.polygon import Polygon
from shapely.geometry.multipolygon import MultiPolygon

gdf["geometry"] = [MultiPolygon([feature]) if type(feature) == Polygon \
else feature for feature in gdf["geometry"]]



There doesn't appear to be an equivalent of PostGIS's ST_Multi that can accept a Polygon or a MultiPolygon, and casts to Multi while not harming geometries that are already Multi.


type(feature) == Polygon


The MultiPolygon contructor requires a list of polygons. If the feature is a Polygon, the feature has to be wrapped in list brackets: MultiPolygon([feature]). Otherwise, MultiPolygon(feature) throws the following error:


TypeError: 'Polygon' object is not iterable

type(feature) == MultiPolygon


If the feature is already a MultiPolygon, MultiPolygon(feature) is harmless effect, but MultiPolygon([feature]) will extract one polygon part from the multipart feature, and drop all others.


Hence, the type must be determined first, and MultiPolygon only applied to non-Multi features (e.g., simple Polygons). The list comprehension above:



  1. Extracts each feature with for feature in gdf["geometry"].


  2. Checks if it is a Polygon with if type(feature) == Polygon.

  3. Passes Polygons only to the MultiPolygon constructor with MultiPolygon([feature]).

  4. Returns the feature untouched with else feature for features which are already MultiPolygons.

  5. Assigns back to the geometry column with gdf["geometry"] =


No comments:

Post a Comment

arcpy - Changing output name when exporting data driven pages to JPG?

Is there a way to save the output JPG, changing the output file name to the page name, instead of page number? I mean changing the script fo...