Averaging Angles
Problem Setup
Imagine you have a sensor that outputs angles. However, this sensor is not ideal and outputs data with some error, typical of real-world sensors.
For example, the true angle might be 60°, but the sensor data might vary slightly with readings like 58°, 59°, 63°, 61°, 57°, 64°, and so on.
Let’s explore how to calculate the average of such angle data from a real-world sensor.
Assume the angle data is output in the range of 0° to 360°.
While this might seem like a simple problem, it has some surprising pitfalls.
The Simple But Flawed Method: Adding and Dividing by the Number of Data Points
The average is defined as “adding up the data and dividing by the number of data points.” The implementation might look like this:
def mean_angle(angles: list[float]) -> float:
    return sum(angles) / len(angles)
For data with variations around 60°:
angles = [58, 59, 63, 61, 57, 64]
Using this method to calculate the average gives:
m = mean_angle(angles)
print(m) # 60.333333333333336
It seems to work well.
Problematic Data
However, this simple method has issues.
Consider this data:
angles = [358, 359, 3, 1, 357, 4]
This data has variations around 0°.
Calculating the average with the same method yields:
m = mean_angle(angles)
print(m) # 180.33333333333334
The data is clustered near 0°, so the expected average is also near 0°. The calculation instead gives about 180°, which points in the opposite direction.
The Cause of the Problem
The problem stems from representing angles in the range 0° to 360°, which cuts the circle at 0°.
0° and 359° are only 1° apart, but their numerical values differ by 359.
The simple method does not account for this wraparound, so it fails for data that straddles the cut.
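The effect of the cut can be seen by rotating the readings: shifting every angle by the same offset should shift a sensible average by that same offset. The following is a minimal sketch, where `naive_mean` and `rotate` are hypothetical helper names introduced only for this illustration.

```python
def naive_mean(angles: list[float]) -> float:
    """Plain arithmetic mean, ignoring wraparound."""
    return sum(angles) / len(angles)


def rotate(angles: list[float], offset: float) -> list[float]:
    """Shift every angle by offset degrees, wrapping back into 0..360."""
    return [(a + offset) % 360 for a in angles]


readings = [358, 359, 3, 1, 357, 4]

# The readings straddle the cut at 0 degrees, so the mean lands near 180.
print(naive_mean(readings))              # about 180.33

# After rotating by 10 degrees nothing straddles the cut, so the mean is near 10.
print(naive_mean(rotate(readings, 10)))  # about 10.33
```

Rotating the data by 10° moved the naive mean by about 170°, so the result depends on where the circle is cut and not only on the data.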
Solution?
One approach to solving this is to convert the angle data to the range of “-180° to 180°”:
def mean_angle_2(angles: list[float]) -> float:
    # Adjust angles to the range of -180° to 180°
    angles = [angle - 360 if angle > 180 else angle for angle in angles]
    return sum(angles) / len(angles)
m = mean_angle_2(angles)
print(m) # 0.3333333333333333
This works well for this data.
Another Issue
An observant reader might notice another problem: this method can also fail for data with variations around 180°. For example:
angles = [178, 179, 183, 181, 177, 184]
Using the same approach:
m = mean_angle_2(angles)
print(m) # 0.3333333333333333
The expected average is around 180°, but the result is close to 0°, so this method does not work in this case.
With either straightforward approach, problems arise when the data varies around a boundary of the chosen range: 0° for the first method, 180° for the second.
Working Solution 1: Use Both Methods and Choose the Better One
You could calculate the average using both the 0° to 360° range and the -180° to 180° range, then select the one that seems better.
However, what defines “better”?
This can be quantified using a property of the arithmetic mean.
Property of the Arithmetic Mean
Consider a data set $x_1, x_2, \dots, x_n$. Its arithmetic mean is defined as
$$ \bar{x} = \frac{x_1 + x_2 + \dots + x_n}{n} $$
The mean squared error (MSE) is the function
$$ f(x) = \frac{(x - x_1)^2 + (x - x_2)^2 + \dots + (x - x_n)^2}{n} $$
and $f(x)$ reaches its minimum value at $x = \bar{x}$.
Since $f(x)$ is a quadratic function, it can be shown (using basic algebraic techniques) that $\bar{x}$ minimizes $f(x)$. The minimum value $f(\bar{x})$ is called the variance.
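The algebra behind this claim can be sketched by expanding the squares and completing the square in $x$:
$$ f(x) = \frac{1}{n}\sum_{i=1}^{n}(x - x_i)^2 = x^2 - 2\bar{x}x + \frac{1}{n}\sum_{i=1}^{n}x_i^2 = (x - \bar{x})^2 + \left(\frac{1}{n}\sum_{i=1}^{n}x_i^2 - \bar{x}^2\right) $$
The second term does not depend on $x$, and the first term is zero exactly when $x = \bar{x}$, so the minimum is attained at $\bar{x}$ and the minimum value is $\frac{1}{n}\sum_{i=1}^{n}x_i^2 - \bar{x}^2$.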
To quantify what is “better” in terms of average angles, we can use this property.
Focusing on the numerator of $f(x)$, each term is the squared difference (distance) between $x$ and one data point.
To apply this approach to an angle data set, we interpret the "difference" of two angles geometrically rather than arithmetically.
For example, the difference between 0° and 359° is not 359° but 1°.
We can use this modified MSE to determine which mean is better. The specific code for this approach is:
def dist_angle(x: float, y: float) -> float:
    """Calculate the difference between two angles"""
    a = abs(x - y) % 360
    if a > 180:
        return 360 - a
    else:
        return a

def mse_angle(x: float, angles: list[float]) -> float:
    """Calculate the mean squared error for angle data"""
    return sum([dist_angle(x, a) ** 2 for a in angles]) / len(angles)

def mean_angle_3(angles: list[float]) -> float:
    """Choose the better mean from the 0° to 360° and -180° to 180° ranges"""
    m1 = mean_angle(angles)
    m2 = mean_angle_2(angles)
    s1 = mse_angle(m1, angles)
    s2 = mse_angle(m2, angles)
    if s1 < s2:
        return m1
    else:
        return m2
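This approach works well for the three data sets above. Note that the result can be negative when the -180° to 180° candidate is chosen.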
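To see why the selection step picks the right candidate, it helps to look at the two scores for the data around 0°. The sketch below is self-contained and uses hypothetical names (`circular_distance`, `circular_mse`) that mirror the functions above; it is an illustration, not a replacement for them.

```python
def circular_distance(x: float, y: float) -> float:
    """Shortest angular distance between two angles, in degrees."""
    d = abs(x - y) % 360
    return min(d, 360 - d)


def circular_mse(x: float, angles: list[float]) -> float:
    """Mean squared circular distance from x to each angle."""
    return sum(circular_distance(x, a) ** 2 for a in angles) / len(angles)


readings = [358, 359, 3, 1, 357, 4]

# Candidate 1: arithmetic mean in the 0..360 representation.
candidate_a = sum(readings) / len(readings)

# Candidate 2: arithmetic mean after shifting to the -180..180 representation.
shifted = [a - 360 if a > 180 else a for a in readings]
candidate_b = sum(shifted) / len(shifted)

score_a = circular_mse(candidate_a, readings)
score_b = circular_mse(candidate_b, readings)
print(candidate_a, score_a)  # candidate near 180 has a very large score
print(candidate_b, score_b)  # candidate near 0 has a small score
```

The candidate near 180° is almost half a turn away from every reading, so its score is several thousand times larger than that of the candidate near 0°.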
Working Solution 2: Consider the 2D Space
While the previous approach works, it is somewhat complex.
Is there a simpler method?
Indeed, another approach is to consider angles in 2D space.
Angles themselves have no inherent special points; they simply represent a circular continuum.
The apparent special points arise because we project this continuum onto a linear scale (by cutting the circle).
So, instead of cutting the circle, consider angles in 2D space by converting each angle to its corresponding direction (unit vector), and then compute the average vector. The angle corresponding to this average vector can be considered the average angle.
Here’s how you could implement this:
import math

def mean_angle_4(angles: list[float]) -> float:
    """Calculate the mean angle in 2D space"""
    c = [math.cos(math.radians(a)) for a in angles]
    s = [math.sin(math.radians(a)) for a in angles]
    mc = sum(c) / len(c)
    ms = sum(s) / len(s)
    return math.degrees(math.atan2(ms, mc))
The average computed this way differs slightly from the result of the previous method:
angles_around_0 = [358.0, 359.0, 3.0, 1.0, 357.0, 4.0]
angles_around_60 = [58.0, 59.0, 63.0, 61.0, 57.0, 64.0]
angles_around_180 = [178.0, 179.0, 183.0, 181.0, 177.0, 184.0]
print(mean_angle_3(angles_around_0)) # 0.3333333333333333
print(mean_angle_4(angles_around_0)) # 0.3331940884210799
print(mean_angle_3(angles_around_180)) # 180.33333333333334
print(mean_angle_4(angles_around_180)) # -179.66680591157893
print(mean_angle_3(angles_around_60)) # 60.333333333333336
print(mean_angle_4(angles_around_60)) # 60.33319408842108
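In practice, this difference is usually minor. Note that mean_angle_4 returns values in the range -180° to 180°, so -179.67° denotes the same direction as 180.33°.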
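Because `math.atan2` returns results between -180° and 180°, a result may need to be mapped back to the 0° to 360° range that the sensor uses. A minimal sketch follows; `normalize_angle` is a hypothetical helper name, not part of the code above.

```python
def normalize_angle(angle: float) -> float:
    """Map an angle in degrees to the range 0 <= angle < 360."""
    wrapped = angle % 360.0
    # For tiny negative inputs, floating-point rounding can yield exactly 360.0.
    return 0.0 if wrapped == 360.0 else wrapped


print(normalize_angle(-179.66680591157893))  # about 180.3332
print(normalize_angle(0.3331940884210799))   # already in range, unchanged
```

Python's `%` operator with a positive divisor returns a non-negative result, so no separate branch for negative angles is needed.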
Conclusion
The second method (averaging unit vectors in 2D space) is generally more versatile.
For instance, when dealing with 3D direction data using two angles, applying this method is advisable.
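As a rough illustration of the 3D case, the sketch below averages directions given as (azimuth, elevation) pairs in degrees. The choice of these two angles, the axis convention, and the name `mean_direction` are all assumptions made for this example; other conventions need different formulas.

```python
import math


def mean_direction(directions: list[tuple[float, float]]) -> tuple[float, float]:
    """Average (azimuth, elevation) pairs in degrees via 3D unit vectors."""
    n = len(directions)
    x = y = z = 0.0
    for azimuth, elevation in directions:
        az, el = math.radians(azimuth), math.radians(elevation)
        # Assumed convention: azimuth in the x-y plane, elevation toward +z.
        x += math.cos(el) * math.cos(az)
        y += math.cos(el) * math.sin(az)
        z += math.sin(el)
    x, y, z = x / n, y / n, z / n
    mean_azimuth = math.degrees(math.atan2(y, x))
    mean_elevation = math.degrees(math.atan2(z, math.hypot(x, y)))
    return mean_azimuth, mean_elevation


# Two directions on either side of azimuth 0, both at elevation 10.
az, el = mean_direction([(350.0, 10.0), (10.0, 10.0)])
print(az, el)  # the azimuth is close to 0, not 180
```

The same idea as in 2D applies: convert to unit vectors, average the components, and convert the average vector back to angles.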
The 2D method is also simpler to implement. For example, using numpy, you can write:
import numpy as np
def mean_angle_5(angles: list[float]) -> float:
    """Calculate the mean angle using numpy"""
    a = np.asarray(angles)
    return np.degrees(np.angle(np.sum(np.exp(np.radians(a) * 1j))))
Though this uses complex numbers, it achieves the same result.
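The complex-number form is equivalent because $e^{i\theta} = \cos\theta + i\sin\theta$: summing complex exponentials sums the unit vectors, and the phase of the sum is the angle of the average vector. The sketch below shows the same idea with the standard library's `cmath`. The function name `mean_angle_complex` and the `eps` threshold are assumptions added for illustration. The check for a near-zero sum is an addition that covers inputs such as 0° and 180°, where the unit vectors cancel and no mean direction exists.

```python
import cmath
import math


def mean_angle_complex(angles: list[float], eps: float = 1e-12) -> float:
    """Mean angle in degrees, computed from a sum of complex unit vectors."""
    total = sum(cmath.exp(1j * math.radians(a)) for a in angles)
    # If the unit vectors cancel out, the phase of the sum is meaningless.
    if abs(total) / len(angles) < eps:
        raise ValueError("mean direction is undefined: the unit vectors cancel out")
    return math.degrees(cmath.phase(total))


print(mean_angle_complex([58.0, 59.0, 63.0, 61.0, 57.0, 64.0]))  # about 60.33
```

Dividing the sum by the number of angles does not change its phase, which is why the sum alone is enough to find the angle.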
However, because the 2D method involves trigonometric calculations, it might be less suitable for resource-constrained environments.
In such cases, you might use the first method instead. For large datasets, you could first compute a rough average with the first method (for example, from a small sample) and then refine it with a simple arithmetic mean of all the data, expressed in a range centered on that rough average.
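One possible reading of this refinement step is sketched below: express every angle as a signed offset from a rough reference angle, average the offsets arithmetically, and add the result back to the reference. The names `signed_offset` and `mean_angle_about` are hypothetical. Using the first reading as the rough reference is only a stand-in for illustration, and the sketch assumes all data lie within 180° of the reference.

```python
def signed_offset(angle: float, reference: float) -> float:
    """Difference angle - reference in degrees, wrapped into -180 <= d < 180."""
    return (angle - reference + 180.0) % 360.0 - 180.0


def mean_angle_about(angles: list[float], reference: float) -> float:
    """Arithmetic mean of angles, taken in a range centered on reference."""
    offsets = [signed_offset(a, reference) for a in angles]
    return (reference + sum(offsets) / len(offsets)) % 360.0


angles_around_0 = [358.0, 359.0, 3.0, 1.0, 357.0, 4.0]
angles_around_180 = [178.0, 179.0, 183.0, 181.0, 177.0, 184.0]

# Stand-in rough reference: the first reading of each data set.
print(mean_angle_about(angles_around_0, angles_around_0[0]))      # about 0.33
print(mean_angle_about(angles_around_180, angles_around_180[0]))  # about 180.33
```

This uses only additions, one modulo per reading, and a single division, with no trigonometric functions.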
An alternative to the MSE for scoring candidate averages is the mean absolute error (MAE):
$$ g(x) = \frac{|x - x_1| + |x - x_2| + \dots + |x - x_n|}{n} $$
In summary, using the second method for a quick calculation is often best. For more efficient computation, especially with large datasets, consider using the first method with adjustments.