I'm trying to map coordinate data to a municipality. Reverse geocoders aren't as accurate as I need them to be (I was using geocoder pkg before), so I found a .shp that maps and names all the municipalities I need to refer to.
I have 9000 records in a CSV that need to be mapped, and was wondering If anyone knew how to check if the individual coordinates are in these shapes.
I'm not familiar with .shp files, and don't know if this would be possible through a package like pyshp
With help, I got it. Here is the code in case anyone has a similar issue in the future, it's sloppy and I may update when I'm done cleaning it up. Loading up the library of shapes is pretty slow, just to warn. And checking each point against each shape has high complexity, but for my sample of 9000 values against ~45 shapes it ran pretty quickly:
import shapefile as shp
from shapely.geometry import Point
import csv
from shapely.geometry.polygon import Polygon
fileName = 'TestLatLong' #Open CSV file with points to check
sf = shp.Reader("NewMuni/NewMuni111") #Open Shapefile with shapes to check points against
sfRec = sf.records() #Read records in shapefile
n = 0
m = 1
coor = ''
coorDict = {}
matplotDict = []
muniFinal = {}
for shape in sf.shapeRecords(): #Iterate through shapes in shapefile
x = [i[0] for i in shape.shape.points[:]] #Initially for use in matplotlib to check shapefile
y = [i[1] for i in shape.shape.points[:]] #Initially for use in matplotlib to check shapefile
for i in x:
matplotDict.append((x[x.index(i)],y[x.index(i)])) #Convert coordinates to be read by Shapely pkg
munishp = Polygon(matplotDict)
muniFinal[sfRec[n][1]] = munishp #Store shape in dictionary with key of municipality
matplotDict = [] #refresh coordinate store for next shape
n += 1
n = 0
with open(fileName + '.csv') as csvfile:
reader = csv.DictReader(csvfile)
for row in reader:
coor = (row['latitude'],row['longitude'])
rlat = float(row['latitude'])
rlong = float(row['longitude'])
if coor == ' , ' or coor == ', ':
coorDict[row['PrimaryKey']] = 'No Data' #PrimaryKey is my primary key that I will use to write the data back into the .csv
else:
if float(row['longitude']) > 0:
coorDict[row['PrimaryKey']] = (rlat,rlong)
else:
coorDict[row['PrimaryKey']] = (rlong,rlat)
m += 1
#proof of concept- save the results however you'd like
for j in coorDict:
for k in muniFinal:
if muniFinal[k].contains(Point(coorDict[j])):
print(j, 'in', k)
Answer
PyShp will let you read the shapefile, but won't help you figure out if a point is in a boundary. You'll have to combine it with something like Shapely to do the geometric calculations. Luckily, the two modules can interoperate through the Python Geo Interface. Some basic functionality (untested) would be like:
import shapefile
from Shapely.geometry import Point # Point class
from Shapely.geometry import shape # shape() is a function to convert geo objects through the interface
point = (1234,5678) # an x,y tuple
shp = shapefile.Reader('path/to/shp') #open the shapefile
all_shapes = shp.shapes() # get all the polygons
all_records = shp.records()
for i in len(all_shapes):
boundary = all_shapes[i] # get a boundary polygon
if Point(pt).within(shape(boundary)): # make a point and see if it's in the polygon
name = all_records[i][2] # get the second field of the corresponding record
print "The point is in", name
Make sure your points and boundaries are in the same coordinate system!
No comments:
Post a Comment