I am doing some work with networkx, which involves the conversion of a point and polyline shapefile into a graph with nodes and links. The documentation about the read_shp() method states that it:
Generates a networkx DiGraph from shapefiles. Point geometries are translated into nodes, lines into edges. Coordinate tuples are used as keys. Attributes are preserved, line geometries are simplified into start and end coordinates. Accepts a single shapefile or directory of many shapefiles.
My first observation is that the method does not read all the features in the shapefiles e.g. there are over 31000 points in my nodes shapefile but the length of nodelist returns only 4991 as against 31760 nodes. I observed the same behaviour for the lines shapefile. I have a small code snippet below as an illustration.
import networkx as nx
G = nx.read_shp("Sample_nodes.shp", simplify=False)
print len(G.nodes())
#output: 4991 instead of 31760
Is there something I am missing?
Answer
It is not complicated.
nx.read_shp
uses ogr to read a shapefile , look at nx_shp.py line 78-88then the script use a dictionary to add nodes to the Graph (lines 87-88,
net.add_node((g.GetPoint_2D(0)), attributes)
)
First part of the script, ogr only
shp = ogr.Open("network_pts.shp")
for lyr in shp:
fields = [x.GetName() for x in lyr.schema]
for f in lyr:
flddata = [f.GetField(f.GetFieldIndex(x)) for x in fields]
g = f.geometry()
attributes = dict(zip(fields, flddata))
attributes["ShpName"] = lyr.GetName()
# Note: Using layer level geometry type
print g.GetPoint_2D(0), attributes
(204097.29746070135, 89662.23095525998) {'ShpName': 'network_pts', 'type': 'one'}
(204168.65175332528, 89745.26602176542) {'ShpName': 'network_pts', 'type': 'two'}
(204110.75574365177, 89765.58041112455) {'ShpName': 'network_pts', 'type': 'three'}
(204220.19951632406, 89794.7823458283) {'ShpName': 'network_pts', 'type': 'three'}
(204097.29746070135, 89662.23095525998) {'ShpName': 'network_pts', 'type': 'one-bis'}
The shapefile contains 5 features with "one" and "one-bis" with the same coordinates and two others with the same type.
2) with read_shp
G = nx.read_shp("network_pts.shp")
G.number_of_nodes()
4
Why
print G.node
{(204097.29746070135, 89662.23095525998): {'ShpName': 'network_pts', 'type': 'one-bis'}, (204110.75574365177, 89765.58041112455): {'ShpName': 'network_pts', 'type': 'three'}, (204220.19951632406, 89794.7823458283): {'ShpName': 'network_pts', 'type': 'three'}, (204168.65175332528, 89745.26602176542): {'ShpName': 'network_pts', 'type': 'two'}}
And for the points with same coordinates
print G.node[(204097.29746070135, 89662.23095525998)]
{'ShpName': 'network_pts', 'type': 'one-bis'}
Only one point was retained, the last one (insertions in a dictionary with same key)
net = nx.DiGraph()
net.node[(204097.29746070135, 89662.23095525998)] = {'ShpName': 'network_pts', 'type': 'one'}
net.node[(204097.29746070135, 89662.23095525998)] = {'ShpName': 'network_pts', 'type': 'one-bis'}
print net.node
{(204097.29746070135, 89662.23095525998): {'ShpName': 'network_pts', 'type': 'one-bis'}}
No comments:
Post a Comment