I'm trying to build a mobile app, and find the x number of points closest to a users location that satisfy certain conditions that the user chooses on the front end.
My current solution is to incrementally search a small area around the user and expand the search area until the x items are found or the maximum bounds are reached, using a JTS QuadTree.
public static Set searchQuadTree(Quadtree quadTree,
final MyBoundary, data_bounds,
final ArrayList types) {
//used to search QuadTree
double shell = SHELL;
final double EXPAND_SHELL = 5000;
final double MAX_SHELL_SIZE = 70000;
final int MAX_SEARCH_ITEMS = 20;
Set resultSet = new HashSet();
//create polygon to search for points that intersect with it
Polygon polygon = data_bounds.searchAreaPoly(shell, 0);
List items = quadTree.query(polygon.getEnvelopeInternal());
if(items != null) {
GeometryFactory factory = new GeometryFactory();
//search until the min search items are found using search criteria
while ((resultSet.size() < MAX_SEARCH_ITEMS) && (shell < MAX_SHELL_SIZE)) {
Iterator iterator = items.iterator();
while (iterator.hasNext()) {
MyQuadNode item = iterator.next();
Point point = factory.createPoint(getWGSCoord(item.getLongitude(), item.getLatitude()));
if (types.contains(item.getType()) && polygon.contains(point) && resultSet.size() < MAX_SEARCH_ITEMS) {
resultSet.add(item.getImageUrl());
}
}
}
if(resultSet.size() < MAX_SEARCH_ITEMS) {
//expand shell slowly at first;
if (shell < 1000) {
shell *= 5;
} else if (shell < 5000) {
shell *= 2;
} else {
shell += EXPAND_SHELL;
}
//search a larger area
polygon = data_bounds.searchAreaPoly(shell, 0);
List allItems = quadTree.query(polygon.getEnvelopeInternal());
//avoid filtering the same items twice
allItems.removeAll(items);
items = allItems;
}
}
}
return resultSet;
}
This will work fine (although it doesn't really find the nearest neighbors) for a small data set, if the data set grows large however this seems inefficient, i'm estimating efficiency is around x(nlogn + n + n), where x = MAX_SEARCH_ITEMS.
Has this problem been solved previously?
Answer
I am not a Java developer and don't know of any existing Java implementations for this, but using the k-Nearest Neighbors algorithm with a k-d tree will likely give much better performance.
However, if accuracy is important, you will need to implement the Haversine (spherical) or Vincenty (ellipsoidal) distance formulae for the distance metric, since Euclidean distance on latitude-longitude coordinates is very inaccurate depending on the distance of each point from each other and from the equator.
Depending on how much data you have and your particular setup, you might want to look into using a spatial database like PostGIS for this, as in this example: A generic solution to PostGIS nearest neighbor
No comments:
Post a Comment