Tuesday 23 October 2018

arcpy - python psycopg2 insert table into postgis: geometry requires more points


I have a feature class of 20k records. trying to import it into postgis using arcpy and psycogp2. I have a pretty basic script


cur.execute('''drop table if exists developed_lands.daysemtric_test''')
cur.execute('''CREATE TABLE developed_lands.daysemtric_test
(pid serial primary key,
pop_dns float,

shape geometry);''')
with arcpy.da.SearchCursor(shp, ["POP_DNS","SHAPE@WKT"]) as cur2:
for row in cur2:
sql = "insert into developed_lands.daysemtric_test(pop_dns,shape)values(%s,ST_GeomFromText(%s)"
cur.execute(sql,(row[0],row[1]))
x+=1
print x
conn.commit()

error:



   Traceback (most recent call last):
File "C:\Users\rizagha\Desktop\module1.py", line 16, in
cur.execute(sql,(row[0],row[1]))
InternalError: geometry requires more points
HINT: "...6391.65285167098 737288.47085659206)," <-- parse error at position 13049795 within geometry

^

This script works because If I skip the first feature it inserts the records up to row 2,000 then throws the same error. I put the function st_makevalid() around ST_GeomFromText(%s) and it did nothing to help. running repair geometry on the feature class now, but that usually takes hours...


the first row where the error is happening, the geometry is extremely complex. is it possible the WkT is too long?



UPDATE


repair geometry finished and I reran the script and I got the same error



Answer



TL;DR;: PostGIS restricts some kinds of invalid WKT geometries that are otherwise allowed by some formats and software, such as shapefiles and ArcGIS. You can sneak these in via WKB.




Let's take an invalid geometry, described by WKT:


POLYGON((5 5, 20 30, 30 5, 5 5), (22 13, 16 13, 22 13))

invalid


This shape can be loaded into Esri and GDAL/OGR software.



# Esri
import arcpy
e = arcpy.FromWKT('POLYGON((5 5, 20 30, 30 5, 5 5), (22 13, 16 13, 22 13))')
print(e.JSON)
# {"rings":[[[30,5],[20,30],[5,5],[30,5]],[[16,13],[22,13],[16,13]]],"spatialReference":{"wkid":null}}

# GDAL/OGR
from osgeo import ogr
g = ogr.CreateGeometryFromWkt('POLYGON((5 5, 20 30, 30 5, 5 5), (22 13, 16 13, 22 13))')
print(g.IsValid()) # False

print(g.ExportToJson())
# { "type": "Polygon", "coordinates": [ [ [ 5.0, 5.0 ], [ 20.0, 30.0 ], [ 30.0, 5.0 ], [ 5.0, 5.0 ] ], [ [ 22.0, 13.0 ], [ 16.0, 13.0 ], [ 22.0, 13.0 ] ] ] }

But JTS, GEOS and PostGIS do not allow this kind of invalid geometry in WKT form:


SELECT 'POLYGON((5 5, 20 30, 30 5, 5 5), (22 13, 16 13, 22 13))'::geometry;

ERROR: geometry requires more points
LINE 1: SELECT 'POLYGON((5 5, 20 30, 30 5, 5 5), (22 13, 16 13, 22 1...
^
HINT: "...0, 30 5, 5 5), (22 13, 16 13, 22 13))" <-- parse error at position 55 within geometry


However, you can load the WKB equivalent into PostGIS without issue:


SELECT '010300000002000000040000000000000000001440000000000000144000000000000034400000000000003E400000000000003E400000000000001440000000000000144000000000000014400300000000000000000036400000000000002A4000000000000030400000000000002A4000000000000036400000000000002A40'::geometry AS g
INTO TEMP geoms;
SELECT ST_AsEWKT(g) FROM geoms;

st_asewkt
---------------------------------------------------
POLYGON((5 5,20 30,30 5,5 5),(22 13,16 13,22 13))
(1 row)


So you can modify your INSERT to use hex-encoded WKB instead of WKT. I'm not sure what Esri's e.WKB is (it's not ISO or EWKB), but you can use OGR for this: g.ExportToWkb().encode('hex')


And you should repair these geometries to avoid further complications:


SELECT ST_AsText(ST_CollectionExtract(ST_MakeValid(g), 3)) FROM geoms;

st_astext
--------------------------------------
MULTIPOLYGON(((5 5,20 30,30 5,5 5)))
(1 row)


(but use UPDATE instead).


valid


No comments:

Post a Comment

arcpy - Changing output name when exporting data driven pages to JPG?

Is there a way to save the output JPG, changing the output file name to the page name, instead of page number? I mean changing the script fo...