Tuesday, 7 June 2016

postgresql - Import a shapefile to postgis with ogr2ogr gives: Unable to open datasource


I want to use ogr2ogr in order to import a shapefile in a postgis database. I have successfully installed ogr2ogr and I run from the pgsql the following command:


ogr2ogr -f "PostgreSQL" PG:"host=localhost user=user_1 password=***** dbname=imports" world_boundaries.shp

What I get back is an error message:


Unable to open datasource `world_boundaries.shp' with the following drivers: --a list of drivers follows (ESRI Shape File etc.)

I have also tried to define the full path of the shapefile but I got the same message.



Also I tried to run the:


ogrinfo world_boundaries.shp

Same thing.




After fixing the issues with the permissions of the file I get the following error:


  ERROR 1: AddGeometryColumn failed for layer world_boundaries, layer creation has failed.
ERROR 1: Terminating translation prematurely after failed
translation of layer world_boundaries (use -skipfailures to skip errors)


I also tried to import it through the GUI shp2pgsql and I get the following error:


ALTER TABLE "public".""
Failed in pgui_exec(): ERROR: permission denied for relation spatial_ref_sys
CONTEXT: SQL statement "SELECT SRID FROM spatial_ref_sys WHERE SRID = new_srid"
PL/pgSQL function addgeometrycolumn(character varying,character varying,character varying,character varying,integer,character varying,integer,boolean) line 50 at SQL statement
SQL statement "SELECT AddGeometryColumn('',$1,$2,$3,$4,$5,$6,$7)"
PL/pgSQL function addgeometrycolumn(character varying,character varying,character varying,integer,character varying,integer,boolean) line 5 at SQL statement

Shapefile import failed.


The issue this time was that this database user didn't have sufficient permissions. This fixed it:


GRANT ALL ON TABLE spatial_ref_sys TO my_user_name;

Next error message is:


  Warning 1: Geometry to be inserted is of type 3D Multi Polygon, whereas the layer geometry type is 3D Polygon.
Insertion is likely to fail
ERROR 1: INSERT command for new feature failed.
ERROR: Geometry type (MultiPolygon) does not match column type (Polygon)

So it seems I need to use the parameter: -nlt MULTIPOLYGON But when I do so I get another error, which doesn't make any sense to me:



ERROR 1: PostgreSQL driver doesn't currently support database creation.
Please create database with the `createdb' command.
PostgreSQL driver failed to create PG:host=localhost user=my_user_name dbname=my_database password=password -nlt POLYGON

But it gets loaded using the shp2pgsql GUI.




Comment of @elrobis enabled this to finally work. Data loaded in db correctly!


ogr2ogr -f "PostgreSQL" PG:"host=localhost user=geonode dbname=geonode_imports password=geonode" -nlt GEOMETRY wld_bnd_adm0_gaul_2015.shp

Answer



As you discovered by trial and error, there were few nagging issues you needed to fix, the last of which was resolved using ogr2ogr's -nlt GEOMETRY* argument.



* Note the recommendation in @LeeHachadoorian's comment that -nlt PROMOTE_TO_MULTI be used as a default solution, rather than nlt GEOMETRY, as the former promotes best practice in addition to ancillary benefits.


User Permissions and Error Messages


First, ogr2ogr couldn't open your shapefile, and you realized permissions issues were in fact affecting the OS user accessing your shapefile. But there is an important lesson here for others, specifically, ogr2ogr's error message on this point was misleading! Indeed, one of the first commenters thought your shapefile was invalid, and admittedly, my first guess was that there was probably an error/typo in the path/filename, or that there might have been an unconventional character in the file path—like a space—that was breaking ogr2ogr's ability to point to the shapefile. As you discovered, it was actually just a problem with user permissions. Because the error message creates a red herring, this is a possibility others need to keep in the back of their minds. :)


SQL User Privileges and Mystery Failures


I would have been stumped by your second error for awhile, but by testing your SQL user with a different import utility (shp2pgsql), which was smart, you got a more precise error message and gave your SQL user necessary privileges on the spatial_ref_sys table. So someone having difficulty getting their ogr2ogr import instruction to work properly should make sure their SQL user has sufficient privileges on both the database itself and the 'spatial_ref_sys' table.


Mixed Geometry Types and Best Practices


The last hurdle you encountered seems related to the fact that shapefiles allow for both single and multipart geometries to coexist in the same dataset/file. It's considered bad practice to mix geometry types in the same table, even for single/multipart of the same feature type, and by default, the open source players in your toolchain will try to protect you from mixing geometry types. Fortunately, though, they give you some options. Initially I recommended setting the -nlt GEOMETRY* argument on your ogr2ogr instruction, which allowed you to import your polygon dataset in spite of ESRI's looser convention. Be advised though, this means you probably have both single part and multi-part geometries in your table, and that may create other headaches for your later!


It's worth mentioning that ogr2ogr has another -nlt option you should consider, namely PROMOTE_TO_MULTI. To quote the documentation..



Starting with GDAL 1.10, PROMOTE_TO_MULTI can be used to automatically promote layers that mix polygon or multipolygons to multipolygons, and layers that mix linestrings or multilinestrings to multilinestrings. Can be useful when converting shapefiles to PostGIS and other target drivers that implement strict checks for geometry types.




In other words, if you use the PROMOTE_TO_MULTI flag, then ALL of your features will be converted to multipart features, even when they consist of a single part. As noted by @LeeHachadoorian in the comments—and I'm sure most would agree—you're advised to prefer PROMOTE_TO_MULTI over the looser GEOMETRY flag, as it conforms to best practice, unifying the feature geometries in your table. Basically, any code you write should just expect multipart geometries. Admittedly, this can be cleaner and simplify some of development.


Generic Advice for Someone Having Trouble with a SHP to POST Import



  1. Make sure your paths do not have any funky characters in them and that there are no typos or misspellings in either the path or filename

  2. As @user1919 discovered, make sure your OS user has sufficient privileges to access the shapefile! As they demonstrated, it can help to try opening the shapefile in another software, like QGIS—if it works in one software, then it's not corrupt, and it should work in other software.


At first, consider executing your ogr2ogr command as sudo to rule-out permissions issues until after you know for certain your script is working as intended.



  1. Also as @user1919 realized, make sure your SQL user has sufficient privileges on both the database targeted by your script, as well as the spatial_ref_sys table.



Again, at first, consider using the PostGRESql super user here to rule-out SQL privilege issues until your script is working. If your script works with the superuser then fails with a preferred automation user, you know the problem is related to the SQL user and not your data or your environment (gdal/ogr installation, etc.)




  1. Try setting the -nlt flag to either PROMOTE_TO_MULTI or GEOMETRY. Since shapefiles allow for a looser feature type convention, sometimes you have to instruct your open source utilities to be more accommodating :)




  2. If you're importing to PostGRESql or MySQL, try setting -lco PRECISION=no ..fair warning, I don't exactly understand what this argument does, but here's what I've experienced.. As you know, shapefiles often include the SHAPE_LENGTH and SHAPE_AREA fields, and I've noticed sometimes when I'm experiencing mystery failures, if I delete those fields I can get the data to import correctly. However, if I use -lco PRECISION=no, I can get the data to import without having to delete those fields. My recommendation is to use this parameter as a troubleshooting step, but to understand what issue it's truly resolving before you accept the import in a production solution.





  3. Finally, if you're using MySQL, be mindful that some very large feature geometries may offend MySQL's max_allowed_packet parameter. You can read more about this in the documentation for the MySQL driver ..but the solution is to change your MySQL configuration to allow for a larger than default value.




Example SHP to PostGIS Import Command for ogr2ogr


For the sake of any newbies that may wander through here, this is what most of my SHP to Post imports look like using ogr2ogr. Notice that I wrap file paths/names in quotes, this protects against spaces, weird characters, and line breaking across the terminal.. also I've included most of the arguments discussed above, in addition to overrides for the geometry name field, the FID field, and the layer name:


ogr2ogr -f "PostgreSQL" "PG:host=127.0.0.1 user=myuser dbname=mydb password=mypassw0rd" "C:/path/to/some_shapefile.shp" -lco GEOMETRY_NAME=the_geom -lco FID=gid -lco PRECISION=no -nlt PROMOTE_TO_MULTI -nln new_layername -overwrite


No comments:

Post a Comment

arcpy - Changing output name when exporting data driven pages to JPG?

Is there a way to save the output JPG, changing the output file name to the page name, instead of page number? I mean changing the script fo...