## Friday, 12 June 2015

### openstreetmap - How to find a point half way between two other points?

I'm working on implementing the OSM Mobile Binary format for map data, and because it uses a 16 bit delta for each point along a way, it cannot represent a way if there is a large distance between two points.

The documented solution is to inject new points into the way, to keep the gap smaller.

Here's the relevant section of my code:

// calculate the distance between this point and the previous point,// multiplied by 1,000,000 so it can be represented as an integer.int32_t realNodeLatDelta = ((point.latitude - lastPoint.latitude) * 1000000);int32_t realNodeLonDelta = ((point.longitude - lastPoint.longitude) * 1000000);// cast as a 16 bit int (reduces filesize by about half)int16_t nodeLatDelta = realNodeLatDiff;int16_t nodeLonDelta = realNodeLonDiff;// check if the cast caused corruptionif (realNodeLatDelta != nodeLatDelta || realNodeLatDelta != nodeLatDelta) {  // if this point is reached, we need to add new points in the way,  // which can be represented in a 16 bit delta}

I'm not sure how to calculate the position of new points along the way? I assume it needs to be a great circle?

Points far enough apart to cause this issue are rare, but when they do happen it's usually a political boundary or similar, which tend to be quite long, and need to be accurately represented.

The best accuracy is obtained with ellipsoidal models. In the interests of simplicity you want to avoid those when you have to code distances yourself. We pay a price: given that the earth's flattening is about 1/300, using a purely spherical model can potentially introduce a relative distance error of up to 1/300 for very long routes: about 3000 parts per million. This is worth exploring.

First, the spherical formula: simply convert (lat, lon) to cartesian coordinates, average the two cartesian points, then convert the average back to spherical coordinates. Here is pseudocode:

function cartesian(f,l) { // f = latitude, l = longitude    return (cos(f)*cos(l), cos(f)*sin(l), sin(f))}function spherical(x,y,z) {    r = sqrt(x^2 + y^2)    if (r == 0) {        if (z > 0) return (90, 0)         elseif (z < 0) return (-90, 0)        else return Undefined // (x,y,z) == (0,0,0)    } else {        return (atan2(r, z), atan2(x, y)) // atan2 must return *degrees*    }}function midpoint(f0,l0, f1,l1) {     return spherical((cartesian(f0,l0) + cartesian(f1,l1))/2)}

(The arithmetic for midpoint involves a vector sum and a scalar division of that sum, so it's really hiding three sums and three divisions.)

That's the midpoint in spherical geometry. The calculation requires two cosines, two sines, a square root, two arctangents, and some multiplication and addition: reasonably fast and easy. There will be no problems near the poles or crossing the +-180 meridian. The result will be undefined when the two points are diametrically opposite.

One way to measure the error is to compute the increased distance traveled via the midpoint compared to the distance between the original points. If the increase is tiny compared to the original distance, we have little to complain about. I have computed these errors using the accurate ellipsoidal distance for the WGS84 ellipsoid. As a typical example of the results, here is a plot of relative errors when one of the endpoints is fixed at (lat, lon) = (45, 0):

The contours are on a logarithmic (base 10) scale: the -6 contours show points where the relative error is 10^(-6); that is, one part per million (ppm). The -5 contours (barely visible near (-45, 180), the diametrically opposite point) are 10 ppm. The -7, -8, etc. are fractions of a ppm: highly accurate.

Evidently, as long as we're not trying to compute midpoints of two points that are almost diametrically opposite, we will do fine. (Remember, the calculation is perfectly correct for the sphere; these errors are due to the flattening of the sphere.)

Given that 16 bit precision is about 16 ppm (a base-10 log equal to -4.8), it looks ok to use the spherical formula for midpoint-finding provided the two points are more than one degree away from being diametrically opposite.

What about the simpler linear formula? To study this, let's compare the distance between the linear midpoint (obtained by averaging the two latitudes and two longitudes) to the spherical midpoint, relative to the distance between the two endpoints. The next figure fixes one endpoint at (45, 180) and explores a relatively small region around it.

Most of these contours (base-10 logarithms once again) are near -2: that's one part per hundred (1%) error. For north-south directions there is no error, but for all other directions the error is unacceptable for many applications.

To see whether the linear approximation ever becomes OK, let's zoom into that previous map by a factor of 10. Now it is one degree wide (50 miles at this latitude) and one-half degree across (35 miles): we're looking at the scale of a large city or a small city and its suburbs.

Now the contours are around -3 to -4: that's 100 to 1000 parts per million (0.01% to 0.1%). Pretty crude and just barely noticeable on a high-res computer screen if you look closely.

Looking back, it is apparent that the slightly more complicated--but still easily implementable--spherical formula achieves greater accuracy across the globe than the simple linear formula does even in nearby locations. (I'm fudging a little, because I have used two different ways to measure errors, so they are not directly comparable.)

The bottom line:

--The linear formula will misposition the midpoint by a relative error of 0.01% to 0.1% at a city scale; over larger areas, the mispositioning can be grossly wrong (1% on up to hundreds of %).

--The spherical formula is absolutely correct for a spherical earth model. Compared to the more accurate ellipsoidal formula, it should still do well except for points that are almost diametrically opposite.