Kyle Corry


Offline Temperature Prediction Using Historical Data

2023-08-05

Knowing what the temperature will be like in the future can be very helpful when planning a trip. In this article, I'll cover the system used in Trail Sense, which allows for offline temperature prediction using historical data. This system can predict the temperature at any time, location, and elevation all while running offline and using minimal processing power and storage.

Compressed climate normal data map

Background

Climate Normals

Climate normals are the average values of meteorological variables over a 30-year period. On average, the weather in a given location will be similar to the climate normals for that location. For example, if the climate normals for a location show that the average temperature in July is 20°C, then the temperature in July will likely be around 20°C.

The climate normals I focused on are the monthly average high and low temperatures. These are the average high and low temperatures for each month over a 30-year period.

Phone Thermometers

Most smartphones have a thermometer built into them. This thermometer is used to measure the temperature of the device itself, not the ambient temperature. While the temperature of the device is impacted by the ambient temperature, the largest contributor to the temperature of the device is the heat generated by the CPU and battery. This means that the temperature of the device is not a good indicator of the ambient temperature because just using the device will cause the temperature to rise.

Calibrating for this may be possible, but it would require a lot of data collection and would be very device-specific.

Bilinear Interpolation

Bilinear interpolation is a method of estimating a value at an arbitrary point in two-dimensional space from the four surrounding grid points. This technique can be used to interpolate the value of a pixel in an image based on the values of the surrounding pixels. If the pixel values transition smoothly from one value to another, the interpolated value will be close to the actual value. This technique is useful for geographic data, such as temperature, because temperature does not normally change drastically over short distances.
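
As a rough sketch (not necessarily how Trail Sense implements it), bilinear interpolation over a grid of values stored as a 2D array might look like this:

```kotlin
// Bilinear interpolation of a value at fractional pixel coordinates (x, y)
// using the four surrounding grid points. grid[y][x] holds the known values.
fun bilinear(grid: Array<DoubleArray>, x: Double, y: Double): Double {
    val x0 = x.toInt()
    val y0 = y.toInt()
    val x1 = minOf(x0 + 1, grid[0].size - 1)
    val y1 = minOf(y0 + 1, grid.size - 1)
    val fx = x - x0 // horizontal distance from the left column
    val fy = y - y0 // vertical distance from the top row

    // Interpolate along x on the top and bottom rows, then along y between them
    val top = grid[y0][x0] * (1 - fx) + grid[y0][x1] * fx
    val bottom = grid[y1][x0] * (1 - fx) + grid[y1][x1] * fx
    return top * (1 - fy) + bottom * fy
}
```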

Cubic Interpolation

Cubic interpolation is a method of interpolating between data points in one dimension using a cubic polynomial. This technique can be used to smooth out a set of data points. For example, if you have one temperature data point per month, you can use cubic interpolation to get a temperature for any day of the month. The advantage of cubic interpolation over linear interpolation is that it is smoother and has no sharp corners, so it better matches how temperature changes over time.
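
One common cubic variant is Catmull-Rom interpolation, which passes through the data points; the sketch below assumes that variant, though the exact polynomial Trail Sense uses may differ:

```kotlin
// Catmull-Rom style cubic interpolation between p1 and p2, where p0 and p3 are
// the neighboring data points and t is the fraction of the way from p1 to p2
// (0.0 to 1.0). For monthly temperatures, p1 and p2 would be the months
// surrounding the target day and p0/p3 their neighbors.
fun cubic(p0: Double, p1: Double, p2: Double, p3: Double, t: Double): Double {
    val a = -0.5 * p0 + 1.5 * p1 - 1.5 * p2 + 0.5 * p3
    val b = p0 - 2.5 * p1 + 2.0 * p2 - 0.5 * p3
    val c = -0.5 * p0 + 0.5 * p2
    val d = p1
    return ((a * t + b) * t + c) * t + d
}
```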

Hourly Temperatures

Typically, the temperature is lowest right before sunrise and highest a few hours after noon (let's say 3PM). A rough approximation can be made by splitting the day into two segments: a sine interpolation from the low at sunrise up to the high at 3PM, and a quadratic interpolation from the high at 3PM down to the low at the next sunrise.

While this isn't an exact representation, I found it does a good job of approximating the temperature throughout the day.
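
Here is one way to sketch that two-segment approximation. The 3PM peak and the exact easing curves are assumptions taken from the description above, and a real implementation would compute sunrise per day and location rather than using a fixed default:

```kotlin
import kotlin.math.PI
import kotlin.math.sin

// Approximate the temperature at `hour` (0..24, fractional) given the day's low
// (reached at sunrise) and high (reached around 3 PM). The exact curves used in
// practice may differ; this sketch uses a half-sine rise and a quadratic fall.
fun hourlyTemperature(
    hour: Double,
    low: Double,
    high: Double,
    sunrise: Double = 6.0 // placeholder; compute per day and location in practice
): Double {
    val peak = 15.0 // 3 PM
    return if (hour in sunrise..peak) {
        // Rising segment: sine interpolation from the sunrise low to the 3 PM high
        val t = (hour - sunrise) / (peak - sunrise)
        low + (high - low) * sin(t * PI / 2)
    } else {
        // Falling segment: quadratic interpolation from the 3 PM high down to the
        // low at the next sunrise (reusing the same low for simplicity)
        val hoursAfterPeak = if (hour >= peak) hour - peak else hour + 24 - peak
        val t = hoursAfterPeak / (sunrise + 24 - peak)
        high + (low - high) * t * t
    }
}
```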

Elevation and Temperature

The temperature decreases as elevation increases. The rate of decrease is about 0.0065°C per meter, which means the temperature at sea level will be about 6.5°C higher than the temperature at an elevation of 1000 meters. The following formulas can be used:

sea_level_temperature = temperature + (elevation * 0.0065)

temperature = sea_level_temperature - (elevation * 0.0065)
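
These two formulas translate directly into a pair of small helpers:

```kotlin
// Lapse rate: roughly 0.0065°C of cooling per meter of elevation gain
const val LAPSE_RATE = 0.0065

// Convert a temperature measured/modeled at `elevation` meters to sea level
fun toSeaLevel(temperature: Double, elevation: Double): Double =
    temperature + elevation * LAPSE_RATE

// Convert a sea level temperature back to a given elevation
fun fromSeaLevel(seaLevelTemperature: Double, elevation: Double): Double =
    seaLevelTemperature - elevation * LAPSE_RATE
```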

Solution

At a high level, the monthly average high and low temperatures are encoded as geographic images. Using image compression, the entire model can be stored in less than 300KB. At runtime, the pixel corresponding to the desired location can be extracted from the image and interpolation can be used to get the temperature at any time. [1]

Image generation

  1. Download the monthly average high and low temperatures for the last 30 years from MERRA2. These temperatures do not account for elevation. [2]

  2. Download the elevation data from ETOPO. This data also includes bathymetry, which is not needed; if factored in, it leads to inaccurate values over bodies of water and near the coast. [3]

  3. Download a land mask from Natural Earth. This data is used to remove the bathymetry from the elevation data. [4]

  4. Combine the elevation data and land mask to get the elevation data for land only.

  5. Convert the temperature data to sea level using the elevation data.

  6. Average the sea level temperature data for each month over the last 30 years. The output is a high and low temperature image for each month.

  7. Map the high and low temperature images to the range [0, 255] using linear interpolation. This allows the images to be stored as 8-bit grayscale images without losing much information.

  8. Group the high and low images into sets of 3 months. This reduces the number of images from 24 to 8. The first month goes into the red channel, the second into the green channel, and the third into the blue channel. The output of this is 8 images, each containing 3 months of data (see the sketch after this list).

  9. Compress the images using WebP. This reduces the size of the images by around 80% while maintaining the quality.
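
To make steps 7 and 8 concrete, here is a sketch of the remapping and channel packing using plain arrays. The actual generation pipeline likely uses different tooling, and the value ranges and grid shapes are unspecified here, so treat the names and signatures as assumptions:

```kotlin
// Step 7: remap a temperature grid to 8-bit values given the grid's min/max.
// The min/max must be stored alongside the image so the mapping can be reversed.
fun remapTo8Bit(grid: Array<DoubleArray>, min: Double, max: Double): Array<IntArray> =
    Array(grid.size) { y ->
        IntArray(grid[y].size) { x ->
            (((grid[y][x] - min) / (max - min)) * 255).toInt().coerceIn(0, 255)
        }
    }

// Step 8: pack three monthly 8-bit grids into packed RGB pixels
// (month 1 -> red, month 2 -> green, month 3 -> blue)
fun packRgb(month1: Array<IntArray>, month2: Array<IntArray>, month3: Array<IntArray>): Array<IntArray> =
    Array(month1.size) { y ->
        IntArray(month1[y].size) { x ->
            (month1[y][x] shl 16) or (month2[y][x] shl 8) or month3[y][x]
        }
    }
```

At runtime, the decoder reverses this: read one channel out of the pixel, then map [0, 255] back to the stored temperature range.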

Interpolation

  1. Retrieve the monthly high and low temperatures for the given location from the compressed images. Use bilinear interpolation to get the pixel value for the given location, then remap the pixel value from the range [0, 255] back to the range of the original data. The values for the desired location are cached to reduce the number of times the images need to be decompressed.

  2. Use cubic interpolation to approximate the high and low temperature for the given day. The 15th of the month is used as the reference point for the interpolation. This is done separately for the high and low temperatures.

  3. If the temperature is needed at a specific time, use the hourly approximation described above. This will require calculating the sunrise and noon times for the given day, which can be done offline as well.

  4. Adjust the temperature to the elevation of the given location (e.g. from GPS or a barometer). A sketch combining these steps follows this list.
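
Putting the runtime steps together, the lookup might look like the following sketch, reusing the hypothetical helpers from the earlier sketches (cubic, hourlyTemperature, fromSeaLevel). The anchor-date and month-wrapping handling here are assumptions rather than Trail Sense's exact logic:

```kotlin
import java.time.LocalDate
import java.time.temporal.ChronoUnit

// Sketch of the runtime lookup. monthlyHighs/monthlyLows hold the 12 sea level
// values decoded from the images for this location (index 0 = January).
fun predictTemperature(
    date: LocalDate,
    hour: Double,
    elevation: Double,
    sunrise: Double,
    monthlyHighs: DoubleArray,
    monthlyLows: DoubleArray
): Double {
    // Monthly values are anchored to the 15th; find the surrounding anchor dates
    val anchor = if (date.dayOfMonth >= 15) date.withDayOfMonth(15)
                 else date.minusMonths(1).withDayOfMonth(15)
    val next = anchor.plusMonths(1)
    val t = ChronoUnit.DAYS.between(anchor, date).toDouble() /
            ChronoUnit.DAYS.between(anchor, next)

    // Cubic interpolation over the neighboring months (wrapping around the year)
    val m = anchor.monthValue - 1
    fun value(values: DoubleArray, offset: Int) = values[(m + offset + 12) % 12]
    val high = cubic(value(monthlyHighs, -1), value(monthlyHighs, 0),
        value(monthlyHighs, 1), value(monthlyHighs, 2), t)
    val low = cubic(value(monthlyLows, -1), value(monthlyLows, 0),
        value(monthlyLows, 1), value(monthlyLows, 2), t)

    // Sea level temperature at the requested hour, then adjust to the elevation
    val seaLevel = hourlyTemperature(hour, low, high, sunrise)
    return fromSeaLevel(seaLevel, elevation)
}
```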

Results

I compared my model to Weather Spark, a system that displays climate normals for a given location. I tested using 14 locations across 75 different days. The maximum error was 5.8°C and the average error was 1.7°C with a standard deviation of 1.4°C.

In practice, I've found this model generally matches the current temperature. The largest errors I've seen are on stormy days or during heat waves. This is because the climate normals are an average, so they do not account for extreme weather events.

Further research is needed to see if combining this model with other sensors on the device, such as a barometer, can improve its accuracy during extreme weather events.

Source code

References

  1. Julian, B. & Angermann, M. (2023, April 24-27). Resource Efficient and Accurate Altitude Conversion to Mean Sea Level. Retrieved from https://doi.org/10.1109/PLANS53410.2023.10140105

  2. Global Modeling and Assimilation Office (GMAO) (2015), MERRA-2 statM_2d_slv_Nx: 2d,Monthly,Aggregated Statistics,Single-Level,Assimilation,Single-Level Diagnostics V5.12.4, Greenbelt, MD, USA, Goddard Earth Sciences Data and Information Services Center (GES DISC), Accessed: 2023-05-22, https://doi.org/10.5067/KVIMOMCUO83U

  3. NOAA National Centers for Environmental Information (2022). ETOPO 2022 15 Arc-Second Global Relief Model. NOAA National Centers for Environmental Information. https://doi.org/10.25921/fd45-gt74. Accessed 2023-05-26.

  4. Made with Natural Earth. Free vector and raster map data @ naturalearthdata.com.