I have shapefile with lines, some of them are perfectly overlaps - duplicate, although each one have unique id etc.
- When I view the file in QGIS I see they overlap.
- When I convert the file with ogr2osm to osm XML & import to JOSM, and running validation I also get warning of duplicate ways (the lines converted to ways...).
How can I "programmatically" detect and remove such situations? I don't mind doing it in the shapefile or later in the OSM XML.
Note 1 - The attributes (tags) are unique - the actual spatial geometry is duplicate.
Note 2 - I want to use something like ogrinfo, ogr2ogr etc GDAL libraries either as commandline tools or as import library for python/java code. If other code tools exits it's perfect also.
Answer
It might be possible to do with ogrinfo and SQLite/SpatiaLite dialect http://www.gdal.org/ogr_sql_sqlite.html directly with shapefiles but it may be hard to handle the fids/rowids with that route. But if you convert your data into SpatiaLite you can use SQL for finding and deleting the duplicates. With spatialite command line tools https://www.gaia-gis.it/fossil/spatialite-tools/index you can use SQL in scripts.
Convert data from shapefile into SpatiaLite with ogr2ogr
ogr2ogr -f SQLite -dsco spatialite=yes my_data.sqlite my_data.shp
A query for finding the duplicates
SELECT a.ogc_fid AS a_fid, b.ogc_fid AS b_fid
FROM equal_lines AS a, equal_lines AS b
WHERE ST_Equals(a.geometry,b.geometry)
AND a.ogc_fid!=b.ogc_fid
AND a.ogc_fid
A result will be like
a_fid b_fid
1 3
5 6
5 7
6 7
Thus 1=3, 5=6, 5=7, and 6=7 (means that 5=6=7). Delete rows which have their ogc_fid listed in the right hand column and you have done.
DELETE FROM equal_lines WHERE ogc_fid IN
(SELECT b.ogc_fid
FROM equal_lines AS a, equal_lines AS b
WHERE ST_Equals(a.geometry,b.geometry)
AND a.ogc_fid!=b.ogc_fid
AND a.ogc_fid
If you really need shapefiles you can export the processed table also with SQL by using the ExportSHP function https://www.gaia-gis.it/gaia-sins/spatialite-sql-latest.html.
No comments:
Post a Comment