I am using several shapefiles from Swiss open data. The whole set has just been updated. My question is: how can I compare two layers (the new and the old version)? I would like to know the differences in geometry as well as in attributes. The data contain points, lines and polygons, and some of the sets contain a large number of features (more than 300 000 elements).
I have already checked all the similar questions, but they all focus on creating a third layer with the differences, or on one particular layer example.
What I want to know is just what has been changed.
I am working with QGIS 2.14.3 and I am familiar with Python.
Answer
This does not use QGIS or PyQGIS, but if you know Python and the Pandas and GeoPandas modules (Python 2.7 and 3.x), it is easy with the solution from "Outputting difference in two pandas dataframes side by side - highlighting the difference", provided the two shapefiles have the same schema and the same record indexes:
import geopandas as gp
# convert shapefiles to GeoDataFrame
old = gp.GeoDataFrame.from_file("shape_old.shp")
new = gp.GeoDataFrame.from_file("shape_new.shp")
old
new
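Since the comparison below only works when both tables are identically labelled, it may be worth checking that first. This is a minimal sketch with plain pandas; the helper name `comparable` is hypothetical and not part of the original answer. It should also accept GeoDataFrames, since they are pandas DataFrames.

```python
import pandas as pd

def comparable(old, new):
    """Return True if both tables have the same columns and the same index."""
    same_columns = list(old.columns) == list(new.columns)
    same_index = old.index.equals(new.index)
    return same_columns and same_index

# Toy tables standing in for the two loaded layers (assumed data)
a = pd.DataFrame({"test": [1, 2]})
b = pd.DataFrame({"test": [1, 3]})
print(comparable(a, b))
```

If this returns False, `old != new` is likely to raise an error about differently labelled objects rather than produce a result.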
import numpy as np
import pandas as pd
# which entries have changed
ne_stacked = (old != new).stack()
changed = ne_stacked[ne_stacked]
changed.index.names = ['id', 'col']
print(changed)
id  col
0   geometry    True
    test        True
1   ensayo      True
2   ensayo      True
    geometry    True
Compare the columns that have changed:
difference_locations = np.where(old != new)
changed_from = old.values[difference_locations]
changed_to = new.values[difference_locations]
pd.DataFrame({'from': changed_from, 'to': changed_to}, index=changed.index)
But with more than 300 000 elements...
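If the record indexes do not line up, for example because features were added or removed in the update, one possible approach is to align both tables on a shared identifier column first and only then compare. The sketch below uses plain pandas. The function name `summarize_changes`, the key column `fid` and the toy data (with geometry held as WKT strings) are assumptions for illustration, and the keys are assumed to be unique. For a real geometry column, GeoPandas' `GeoSeries.geom_equals` may be a better fit than `!=`.

```python
import pandas as pd

def summarize_changes(old, new, key):
    """Return lists of added, removed and changed key values."""
    old_k = old.set_index(key)
    new_k = new.set_index(key)
    added = new_k.index.difference(old_k.index)
    removed = old_k.index.difference(new_k.index)
    common = old_k.index.intersection(new_k.index)
    # Compare only the columns present in both versions
    shared_cols = old_k.columns.intersection(new_k.columns)
    a = old_k.loc[common, shared_cols]
    b = new_k.loc[common, shared_cols]
    # A cell differs unless both values are equal or both are missing
    differs = (a != b) & ~(a.isna() & b.isna())
    changed = common[differs.any(axis=1).to_numpy()]
    return added.tolist(), removed.tolist(), changed.tolist()

# Toy data (assumed): feature 1 removed, feature 4 added, feature 3 renamed
old = pd.DataFrame({"fid": [1, 2, 3],
                    "name": ["a", "b", "c"],
                    "wkt": ["POINT (0 0)", "POINT (1 1)", "POINT (2 2)"]})
new = pd.DataFrame({"fid": [2, 3, 4],
                    "name": ["b", "x", "d"],
                    "wkt": ["POINT (1 1)", "POINT (2 2)", "POINT (3 3)"]})
added, removed, changed = summarize_changes(old, new, "fid")
print(added, removed, changed)
```

This reports only which features changed, without building a third layer. The changed ids can then be passed to the from/to comparison above to see the actual values.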