Tuesday 24 October 2017

python - geodjango slowness and debugging


I'm using GeoDjango + Postgres to display polygons from TIGER/Line data on a map. Pretty simple stuff so far.


My issue is that when I use the GPolygon object (django.contrib.gis.maps.google.GPolygon) there is a considerable slowdown. (See below for the code I'm using.)


from django.contrib.gis.maps.google import GPolygon, GoogleMap

locations = Location.objects.filter(mpoly__contains=point)
polygons = []
for location in locations:
    for poly in location.mpoly:
        gpoly = GPolygon(poly,
                         stroke_color=location.location_type.stroke_color,
                         stroke_weight=location.location_type.stroke_weight,
                         stroke_opacity=location.location_type.stroke_opacity,
                         fill_color=location.location_type.fill_color,
                         fill_opacity=location.location_type.fill_opacity)
        gpoly.location = location.id
        #raise Exception(gpoly)
        polygons.append(gpoly)
#raise Exception(len(polygons))

# Google Map abstraction
the_map = GoogleMap(polygons=polygons)


  • all of my locations' mpoly fields are MultiPolygons

  • there are only 84 polygons total when I raise Exception(len(polygons))

  • this tiny block of code incurs a 10-second load time on localhost with 4 GB of RAM and an i5 processor, so I'm not resource-bound (see the timing sketch below)
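To separate the query time from the GPolygon construction time, a rough timing harness like the following can be dropped in (a sketch only; it assumes point, Location and GPolygon are already in scope as above):

import time

t0 = time.time()
locations = list(Location.objects.filter(mpoly__contains=point))  # force the queryset to evaluate
t1 = time.time()

polygons = []
for location in locations:
    for poly in location.mpoly:
        polygons.append(GPolygon(poly))
t2 = time.time()

print('query: %.2fs, GPolygon construction: %.2fs' % (t1 - t0, t2 - t1))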


Does anybody have any idea what GPolygon is doing? Is GPolygon not ideal for production? Is it just for prototyping?



I'm now setting up my GPolygon like so:



gpoly = GPolygon(poly.simplify(float(get_tolerance(poly.num_points))),
                 stroke_color=location.location_type.stroke_color,
                 stroke_weight=location.location_type.stroke_weight,
                 stroke_opacity=location.location_type.stroke_opacity,
                 fill_color=location.location_type.fill_color,
                 fill_opacity=location.location_type.fill_opacity)

using the function:


def get_tolerance(num_points):
    """Figure out a good simplify tolerance from the point count."""
    tolerance = 0
    if num_points <= 500:
        tolerance = 0
    elif num_points <= 750:
        tolerance = .001
    elif num_points <= 1000:
        tolerance = .002
    elif num_points > 2000:
        tolerance = .007
    # note: 1001-2000 points falls through and keeps the default tolerance of 0

    return tolerance

which sets a tolerance for the GEOS simplify call based on the total number of points. (I noticed that for states like Florida, where the keys (islands) are separate polygons, the ones with fewer points broke at a high tolerance, while the big mainland polygons are far too large without aggressive simplification.)


This has made my program an order of magnitude faster, but I still think there is plenty of room for improvement.
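One avenue I have not explored yet: GEOSGeometry.simplify also accepts a preserve_topology flag, and the tolerance could be scaled to each polygon's extent instead of its point count. A rough, untested sketch (get_extent_tolerance and the 0.1% factor are just guesses to tune, not something I've benchmarked):

def get_extent_tolerance(poly):
    """Guess a simplify tolerance from the polygon's bounding-box size."""
    xmin, ymin, xmax, ymax = poly.extent
    # roughly 0.1% of the larger bounding-box dimension
    return 0.001 * max(xmax - xmin, ymax - ymin)

# preserve_topology=True keeps small polygons (e.g. the keys) from collapsing
simplified = poly.simplify(get_extent_tolerance(poly), preserve_topology=True)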


Which leads me to my newest questions:



  • Is there a better way to guess acceptable tolerances?

  • What other possible speed gains could I explore?




Something I did was pass all of the polygons to the view as GeoJSON objects and use JavaScript to build the polygon objects, which is much faster than django.contrib.gis.maps.google.GPolygon.


Server-side Python:


import json

from django.http import HttpResponse
from django.utils.translation import ugettext as _
from django.views.decorators.csrf import csrf_exempt


@csrf_exempt
def get_location_polygons(request):
    """Return a location's polygons and styling as GeoJSON."""
    response_data = {'ack': None, 'data': None, 'messages': []}
    if request.method == 'POST':
        #try:
        # Get the location
        location = Location.objects.get(pk=request.POST['location_id'])

        # Build the polygons
        response_data['ack'] = True
        response_data['messages'] = [_(u'OK')]
        response_data['data'] = {
            'stroke_color': location.location_type.stroke_color,
            'stroke_weight': location.location_type.stroke_weight,
            'stroke_opacity': location.location_type.stroke_opacity,
            'fill_color': location.location_type.fill_color,
            'fill_opacity': location.location_type.fill_opacity,
            'polygons': location.mpoly.geojson,
            'title': location.title
        }
        #except:
        #    # Fetch failed
        #    response_data['ack'] = False
        #    response_data['messages'] = [_(u'Polygon for location could not be fetched.')]
    else:
        response_data['ack'] = False
        response_data['messages'] = [_(u'HTTP Method must be POST')]

    return HttpResponse(json.dumps(response_data), mimetype="application/json")

Client-side JS:


poly = JSON.parse(data.data['polygons']);
var paths = coord_to_paths(poly.coordinates, bucket, location_id);
polygons[bucket][location_id] = new google.maps.Polygon({
    paths: paths,
    strokeColor: data.data.stroke_color,
    strokeOpacity: data.data.stroke_opacity,
    strokeWeight: data.data.stroke_weight,
    fillColor: data.data.fill_color,
    fillOpacity: data.data.fill_opacity
});

function coord_to_paths(coords, bucket, location_id)
{
    var paths = [];
    poly_bounds[bucket][location_id] = new google.maps.LatLngBounds();
    for (var i = 0; i < coords.length; i++)
    {
        for (var j = 0; j < coords[i].length; j++)
        {
            var path = [];
            for (var k = 0; k < coords[i][j].length; k++)
            {
                // GeoJSON coordinates are [lng, lat]; LatLng wants (lat, lng)
                var ll = new google.maps.LatLng(coords[i][j][k][1], coords[i][j][k][0]);
                poly_bounds[bucket][location_id].extend(ll);
                path.push(ll);
            }
            paths.push(path);
        }
    }

    return paths;
}

Answer



Reducing fidelity (i.e. reducing the number of vertices) will help, since there is less data to pass to Google Maps.


Nevertheless, I would hope you are not doing this for every request directly in the view, and that it is something you do only once (at the first save) or through an asynchronous queue mechanism like Celery.


You can always have a shape that you use for analysis (with the full vertex count) and another one that you use for display.
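For example, a minimal sketch of that idea (the display_mpoly field name and the 0.002 tolerance are placeholders, not from your code): keep the full TIGER/Line geometry for spatial queries and store a simplified copy at save time, or from a Celery task.

from django.contrib.gis.db import models
from django.contrib.gis.geos import MultiPolygon, Polygon


class Location(models.Model):
    # other fields (location_type, title, ...) omitted
    mpoly = models.MultiPolygonField()                               # full fidelity, for analysis and queries
    display_mpoly = models.MultiPolygonField(null=True, blank=True)  # simplified copy, for the map

    objects = models.GeoManager()

    def save(self, *args, **kwargs):
        # simplify once here instead of on every request
        simplified = self.mpoly.simplify(0.002, preserve_topology=True)
        if isinstance(simplified, Polygon):
            # simplify() can hand back a plain Polygon; wrap it again
            simplified = MultiPolygon(simplified)
        self.display_mpoly = simplified
        super(Location, self).save(*args, **kwargs)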

