The Problem with NWP Models
Numerical Weather Prediction (NWP) models simulate the atmosphere using physical equations. Despite their sophistication, they produce systematic biases — consistent over- or under-predictions tied to model grid resolution, boundary parameterization, and terrain approximations.
For example, a model might consistently over-predict coastal temperatures by 2–3°C. These biases are predictable and repeatable, making them ideal targets for ML-based correction.
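A systematic bias like this shows up directly as a non-zero mean of forecast-minus-observation differences. A minimal sketch with made-up paired values for one hypothetical coastal station:

```python
import numpy as np

# Hypothetical paired samples for one coastal station (degrees C)
forecast = np.array([21.5, 23.1, 24.8, 22.0, 25.3])
observed = np.array([19.2, 20.6, 22.1, 19.8, 22.9])

# Systematic (mean) bias: positive means the model runs warm
mean_bias = np.mean(forecast - observed)
print(round(mean_bias, 2))  # -> 2.42
```

A bias that stays this consistent across samples is exactly the kind of repeatable error a learned correction can remove.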
IEEE Publication: this post covers our paper presented at ICRITO 2025 (IEEE). Full citation + DOI
Our LSTM Approach
Traditional post-processing uses linear regression — cheap but only corrects mean bias. Our LSTM learns non-linear, time-dependent bias patterns that vary with season, time of day, and weather regime.
Key insight: the bias on a rainy Tuesday morning differs from a clear Friday afternoon. LSTM captures this because it has memory across time steps.
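To see why a linear baseline falls short, consider a synthetic case (the data below is illustrative, not from the paper) where the true bias varies with hour of day. A linear fit on the raw forecast alone can only learn an average offset, so regime-dependent structure survives in the residuals:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Synthetic example: bias depends on hour of day (regime-dependent)
hours = rng.integers(0, 24, size=500)
true_bias = 2.0 + 1.5 * np.sin(2 * np.pi * hours / 24)
forecast_temp = 15 + 10 * rng.random(500)  # uncorrelated with the bias cycle

# Linear baseline on the forecast alone: learns roughly the mean offset,
# but cannot represent the time-of-day structure in the bias
lin = LinearRegression().fit(forecast_temp.reshape(-1, 1), true_bias)
resid = true_bias - lin.predict(forecast_temp.reshape(-1, 1))

# Substantial variation remains uncorrected
print(resid.std())
```

A sequence model with time features can absorb that remaining cyclic structure, which is the gap the LSTM targets.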
Data Pipeline
import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler
nwp_df = pd.read_csv('nwp_forecasts.csv', parse_dates=['time'])
obs_df = pd.read_csv('station_obs.csv', parse_dates=['time'])
merged = nwp_df.merge(obs_df, on='time')
merged['bias'] = merged['temp_forecast'] - merged['temp_observed']
# Cyclical time encoding (prevents hour 23 and hour 0 being "far apart")
merged['hour_sin'] = np.sin(2 * np.pi * merged['time'].dt.hour / 24)
merged['hour_cos'] = np.cos(2 * np.pi * merged['time'].dt.hour / 24)
merged['doy_sin'] = np.sin(2 * np.pi * merged['time'].dt.dayofyear / 365)
merged['doy_cos'] = np.cos(2 * np.pi * merged['time'].dt.dayofyear / 365)
features = ['temp_forecast','humidity_nwp','pressure_nwp',
'wind_speed_nwp','hour_sin','hour_cos','doy_sin','doy_cos']
scaler = StandardScaler()
X_scaled = scaler.fit_transform(merged[features])
y_bias = merged['bias'].values
Always encode time cyclically! Raw hour 0–23 encoding makes hour 23 and hour 0 seem very different. Sin/cos encoding makes them adjacent in feature space.
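Keras LSTM layers expect 3-D input of shape (samples, time steps, features), while the scaled feature matrix above is 2-D. A minimal sliding-window helper bridges the gap (`make_sequences` is an illustrative name, not from the paper):

```python
import numpy as np

def make_sequences(X, y, look_back):
    """Stack sliding windows so each sample carries look_back steps of history."""
    X_seq, y_seq = [], []
    for i in range(look_back, len(X)):
        X_seq.append(X[i - look_back:i])  # the look_back rows before step i
        y_seq.append(y[i])                # target bias at step i
    return np.array(X_seq), np.array(y_seq)

# e.g. X_seq, y_seq = make_sequences(X_scaled, y_bias, 72)
# gives X_seq.shape == (n_samples - 72, 72, n_features)
```

Splitting into train/test should be done chronologically before windowing, so no test-period observations leak into training windows.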
Bias Correction Model
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout
LOOK_BACK = 72 # 72-hour history window
model = Sequential([
LSTM(64, return_sequences=True, input_shape=(LOOK_BACK, len(features))),
Dropout(0.2),
LSTM(32, return_sequences=False),
Dropout(0.2),
Dense(16, activation='relu'),
Dense(1) # regression: predict bias in degrees Celsius
])
model.compile(optimizer='adam', loss='mse', metrics=['mae'])
# Apply correction after training:
predicted_bias = model.predict(X_test_seq)
corrected_temp = nwp_test_forecast - predicted_bias.flatten()
Results vs Baseline
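A standard way to report the improvement is RMSE of raw versus corrected forecasts against station observations. A minimal sketch with made-up hold-out values (not results from the paper):

```python
import numpy as np

def rmse(a, b):
    """Root-mean-square error between two arrays."""
    return float(np.sqrt(np.mean((a - b) ** 2)))

# Hypothetical held-out values (degrees C)
observed  = np.array([18.0, 19.5, 21.0, 20.2])
raw_nwp   = np.array([20.4, 21.8, 23.1, 22.6])  # warm-biased raw forecasts
corrected = np.array([18.3, 19.2, 21.4, 20.0])  # after subtracting predicted bias

print(rmse(raw_nwp, observed), rmse(corrected, observed))
```

Comparing the same metric for a linear-regression baseline on the same hold-out period isolates the gain from the LSTM's regime-dependent correction.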
Key Research Insights
- 72h look-back outperformed 24h and 168h — enough multi-day weather context without too much noise.
- Cyclical time encoding was critical — replacing with raw integers dropped R² by 0.04.
- Bias is non-stationary — linear regression corrects average; LSTM corrects regime-dependent bias.
- Ensemble NWP outputs benefit more — more systematic bias means bigger LSTM improvement.