Sunday, 3 July 2016

arcgis desktop - Efficient Network Neighbors?


I would like to identify the Kth order neighbors of an edge on a network, specifically the neighbors of a large set of streets. For example, I have a street that I'm interested in looking at, call this the focal street. For each focal street I want to find the streets that share an intersection, these are the first order neighbors. Then for each of those streets that share an intersection with the focal street I would like to find their neighbors (these would be the second order neighbors), and so on...


Calculating the first order neighbors using ArcGIS' geoprocessing library (arcpy) took 6+ hours, second order neighbors are taking 18+ hours. I have posted on this in the ESRI forums HERE (link includes arcpy code that works but is crazy slow). Needless to say I want to find a more efficient solution. I have created a python dictionary which is keyed on each street and contains the connected streets as values. For example:


st2neighs = {street1: [street2, street3, street5], street2: [street1, street4], ...}.  


street 1 is connected to street 2, 3, 5; street 2 is connected to street 1 and 4; etc.. There are around 30,000 streets in the study area most have fewer than 7 connected streets. A pickled version of the data used in the code below IS HERE.


I assumed that knowing the first order neighbors would allow me to efficiently trace the higher order neighbors. But the following code is providing incorrect results:


##Select K-order neighbors from a set of sampled streets.
##saves in dictionary format such that
##the key is the sampled street and the neighboring streets are the values

##################
##IMPORT LIBRARIES
##################


import random as random
import pickle

#######################
##LOAD PICKLED DATA
#######################

seg_file = open("seg2st.pkl", "rb")
st_file = open("st2neighs.pkl", "rb")
seg2st = pickle.load(seg_file)

st2neigh = pickle.load(st_file)

##################
##DEF FUNCTIONS
##################

##Takes in a dict of segments (key) and their streets (values).
##returns the desired number of sampled streets per segment
##returns a dict keyed segment containing tlids.
def selectSample(seg2st, nbirths):

randSt = {}
for segK in seg2st.iterkeys():
ranSamp = [int(random.choice(seg2st[segK])) for i in xrange(nbirths)]
randSt[segK] = []
for aSamp in ranSamp:
randSt[segK].append(aSamp)

return randSt

##Takes in a list of all streets (keys) and their first order neighbors (values)

##Takes in a list of sampled streets
##returns a dict of all sampled streets and their neighbors.
##Higher order selections should be possible with findMoreNeighbors
##logic is the same but replacing sample (input) with output from
##findFirstNeighbors

def findFirstNeighbors(st2neigh, sample):
compSts = {}
for samp in sample.iterkeys():
for rSt in sample[samp]:

if rSt not in compSts:
compSts[rSt] = []
for compSt in st2neigh[rSt]:
compSts[rSt].append(compSt)

return compSts

def findMoreNeighbors(st2neigh, compSts):
for aSt in compSts:
for st in compSts[aSt]:

for nSt in st2neigh[st]:
if nSt not in compSts[aSt]:
compSts[aSt].append(nSt)
moreNeighs = compSts
return moreNeighs

#####################
##The nHoods
#####################


samp = selectSample(seg2st, 1)
n1 = findFirstNeighbors(st2neigh, samp)
n2 = findMoreNeighbors(st2neigh, n1)
n3 = findMoreNeighbors(st2neigh, n2)

#####################
##CHECK RESULTS
#####################
def checkResults(neighList):
cntr = {}

for c in neighList.iterkeys():
cntr[c] = 0
for a in neighList[c]:
cntr[c] += 1
return cntr

##There is an error no streets **should** have 2000+ order neighbors
c1 = checkResults(n1)
c2 = checkResults(n2)
c3 = checkResults(n3)


Help!



Answer



This is simpler than it looks, provided you use an effective data structure. One that works well is a queue of (street, order, {list of neighbors}) tuples. The algorithm is:


Initialize all orders to missing.
Set the street's order to 0.
Queue the street.
While the queue is nonempty {
Remove a street S from the queue.
For each neighbor of S {

If the neighbor's order is missing {
Set the neighbor's order to 1 + the order of S
Queue the neighbor
}
}
}

Queues are easy to implement. In this situation, if you have an array A of (street, order, neighbors) tuples, the queue can be as simple as a pair of indexes k, l into A. They start out both initialized to -1 (for indexing beginning at 0). To place a member of the array into the queue, say A[i], increment l and swap entries i and l. To remove the head of the queue, check that the queue is nonempty and if it is, increment k and read A[k]. The queue is empty whenever k is greater than or equal to l. After the algorithm terminates, the elements of the array may have been permuted and their orders have been calculated.


You should be able to process millions of streets a second.


No comments:

Post a Comment

arcpy - Changing output name when exporting data driven pages to JPG?

Is there a way to save the output JPG, changing the output file name to the page name, instead of page number? I mean changing the script fo...