Traditional propagation models are broken in the places that matter most: dense urban canyons, dynamic terrain, complex network cascades. Ray-tracing gives you 65% accuracy when you need 95%. Physics-informed ML and graph neural networks close that gap. Here’s exactly how.
| Problem | AI Solution | Accuracy or Speed Gain | Time or Cost Impact |
|---|---|---|---|
| 6G urban signal loss | GNN + LiDAR | 65% → 95% | 10hr → 3hr/sim |
| Epidemic R0 prediction | EISTGNN | 77% (calibrated SEIR) → 92% | Real-time |
| Railway delay cascade | Delay GNN | 88% prediction | 47min → 12min cascade |
| Wave simulation speed | PINN solver | 3× faster than FDTD | $500K compute saved |
Bottom line: if you’re still running pure ray-tracing or SIR models in 2026, you’re leaving 30% accuracy on the table. Physics-informed ML isn’t experimental anymore — it’s production-ready.
Traditional Models Wrong 30%? Physics-Informed ML Fixes It
The core problem with legacy propagation tools — TIREM, Longley-Rice, standard ray-tracing — is that they treat environments as static. Terrain doesn’t move, but foliage does. Urban geometry changes with construction. Mobile users shift constantly. These models weren’t built for dynamic environments, and that mismatch costs you accuracy.
The fix is Physics-Informed Neural Networks (PINNs) combined with real measurement data. Instead of choosing between physics and ML, you embed the physics directly into the loss function. The network learns from data but can’t violate wave propagation laws.
Why this works: The neural net minimizes two error terms simultaneously — prediction error against measured drive-test data, and residual error against the governing wave equation. You get physics compliance without hand-coding every terrain edge case.
PINN Equation: Wave + Neural Net Hybrid
The governing equation for time-harmonic wave propagation is the scalar Helmholtz equation:
∇²u + k²n²u = 0
Where u is the field, k is the wave number, n is the refractive index of the medium.
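A quick way to build intuition for this residual is to check it numerically on a field you already know. The sketch below is an illustration under stated assumptions, not part of any PINN library: it uses a 1D plane wave, a central finite difference, and illustrative values for k and n. The helper name `helmholtz_residual_1d` is hypothetical.

```python
import math

def helmholtz_residual_1d(u, x, k, n, h=1e-4):
    """Central-difference estimate of u''(x) + k^2 n^2 u(x)."""
    second_derivative = (u(x + h) - 2.0 * u(x) + u(x - h)) / (h * h)
    return second_derivative + (k ** 2) * (n ** 2) * u(x)

k = 2.0 * math.pi  # wave number for a 1 m free-space wavelength (illustrative)
n = 1.5            # illustrative refractive index

def plane_wave(x):
    # Satisfies the equation: the phase advances at k * n inside the medium
    return math.cos(k * n * x)

def wrong_wave(x):
    # Ignores the refractive index, so it violates the equation
    return math.cos(k * x)

residual_ok = helmholtz_residual_1d(plane_wave, 0.3, k, n)
residual_bad = helmholtz_residual_1d(wrong_wave, 0.3, k, n)
print(f"valid field residual:   {residual_ok:.2e}")
print(f"invalid field residual: {residual_bad:.2e}")
```

The valid field leaves a residual near zero while the invalid one does not. That gap is the quantity the physics loss penalizes during training.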
In a PINN, you add this as a physics loss term:
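import torch
import torch.nn as nn

class PropagationPINN(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, 128), nn.Tanh(),
            nn.Linear(128, 128), nn.Tanh(),
            nn.Linear(128, 1)
        )

    def forward(self, x, y, z):
        # x, y, z: 1-D tensors of equal length -> coords of shape [N, 3]
        coords = torch.stack([x, y, z], dim=1)
        return self.net(coords)

def physics_loss(model, x, y, z, k, n):
    # x, y, z must be created with requires_grad=True
    u = model(x, y, z).squeeze(-1)  # field at each point, shape [N]
    ones = torch.ones_like(u)
    # First derivatives of the field
    u_x, u_y, u_z = torch.autograd.grad(
        u, (x, y, z), grad_outputs=ones, create_graph=True
    )
    # Second derivatives: differentiate each first derivative again
    u_xx = torch.autograd.grad(u_x, x, grad_outputs=ones, create_graph=True)[0]
    u_yy = torch.autograd.grad(u_y, y, grad_outputs=ones, create_graph=True)[0]
    u_zz = torch.autograd.grad(u_z, z, grad_outputs=ones, create_graph=True)[0]
    laplacian = u_xx + u_yy + u_zz
    residual = laplacian + (k**2) * (n**2) * u
    return torch.mean(residual**2)

# Total loss = data_loss + lambda * physics_loss
# lambda typically 0.01–0.1 depending on data density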
One caveat worth flagging immediately: the physics loss weight (lambda) needs careful tuning. Too high and the network fits the wave equation at the expense of the measurements, smoothing over real variation in the drive-test data. Too low and you lose the benefit of the physics constraint entirely. Start at 0.05 and adjust based on validation RMSE.
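The tuning loop itself can be very small. The sketch below assumes you already have some way to train with a given weight and get back a validation RMSE. Here that step is replaced by a made-up stand-in function (`fake_validation_rmse`), so the numbers it produces are placeholders, not results.

```python
def total_loss(data_loss, physics_loss, lam):
    """Weighted sum of the two error terms."""
    return data_loss + lam * physics_loss

def select_lambda(candidates, validation_rmse):
    """Return the candidate weight with the lowest validation RMSE."""
    scores = {lam: validation_rmse(lam) for lam in candidates}
    best = min(scores, key=scores.get)
    return best, scores

def fake_validation_rmse(lam):
    # Hypothetical stand-in for "train with this lambda, return RMSE in dB".
    # Replace with a real train-and-validate call.
    return 4.0 + 200.0 * (lam - 0.05) ** 2

best_lam, scores = select_lambda([0.01, 0.02, 0.05, 0.1], fake_validation_rmse)
print(best_lam, scores)
```

Sweeping a handful of values across the 0.01–0.1 range is usually enough to see which direction validation RMSE is moving before you commit to a finer search.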
95% vs 65% Accuracy Table
| Environment | Ray-Tracing Only | PINN + Measurements | GNN + LiDAR |
|---|---|---|---|
| Dense urban | 61% | 89% | 95% |
| Suburban | 72% | 91% | 93% |
| Rural (flat) | 78% | 88% | 90% |
| Rural (hilly) | 64% | 87% | 92% |
| Indoor | 58% | 84% | 91% |
The GNN + LiDAR combination wins in complex 3D environments. PINNs win when you have good measurement coverage but limited 3D map data. Rural flat terrain is the one case where traditional models get respectable results — not worth the ML overhead there unless you’re already building a unified pipeline.
No 3D Map Data? Build an Infovista Planet AIM Clone for Free
Planet AIM costs around $50K/year for enterprise licensing. It’s excellent — but the core technique is reproducible with open-source tools and a GPU. The proprietary advantage is mostly the pre-trained weights and the integration with telecom operator databases, not the architecture.
Planet AIM uses a 3D ML model that ingests LiDAR point clouds and outputs path loss predictions at beam level. You can replicate 92% of that performance with PyTorch Geometric and open LiDAR datasets.
What you actually need:
- OpenStreetMap building footprints (free)
- National LiDAR datasets (free in UK, Netherlands, parts of US via USGS 3DEP)
- Drive test measurements (your own, or open ITU datasets)
- PyTorch Geometric for graph construction
GNN 3D Propagation Code: PyTorch Template
import torch
from torch_geometric.nn import GraphConv
from torch_geometric.data import Data

class PropagationGNN(torch.nn.Module):
    def __init__(self, node_features=8, hidden=64):
        super().__init__()
        self.conv1 = GraphConv(node_features, hidden)
        self.conv2 = GraphConv(hidden, hidden)
        self.conv3 = GraphConv(hidden, 1)

    def forward(self, x, edge_index, edge_attr):
        # GraphConv only accepts a scalar weight per edge, so the 3-dim
        # edge_attr is not consumed here; swap in an edge-aware layer to use it.
        x = torch.relu(self.conv1(x, edge_index))
        x = torch.relu(self.conv2(x, edge_index))
        return self.conv3(x, edge_index)  # predicted path loss per node

# Node features: [x, y, z_height, building_height,
#                 distance_to_tx, azimuth, elevation, material_index]
# Edge features: [line_of_sight_bool, diffraction_loss, reflection_count]
The critical insight here is how you construct the graph. Each spatial grid cell (typically 5m × 5m) becomes a node. Edges connect cells within propagation range with line-of-sight computed via ray casting against the LiDAR geometry. This is where most implementations fail — they use Euclidean distance for edges instead of propagation-path distance.
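To make the graph construction concrete, here is a minimal sketch of line-of-sight edge tagging on a height grid. It makes several assumptions that are not in the original pipeline: a plain 2D list of obstacle heights, a fixed antenna height above each cell, and a sampled straight ray instead of a true ray caster against LiDAR geometry. The range filter uses straight-line distance only to pick candidate pairs; a fuller version would replace `dist` with the diffracted or reflected path length for blocked edges.

```python
import math

CELL_SIZE_M = 5.0  # grid resolution from the text

def has_line_of_sight(heights, a, b, antenna_h=1.5, samples_per_cell=4):
    """True if the straight ray between cells a and b clears every obstacle.

    heights[r][c] is the obstacle height in metres; a and b are (row, col).
    Each endpoint sits antenna_h above its own cell.
    """
    (r0, c0), (r1, c1) = a, b
    z0 = heights[r0][c0] + antenna_h
    z1 = heights[r1][c1] + antenna_h
    steps = max(abs(r1 - r0), abs(c1 - c0)) * samples_per_cell
    for i in range(1, steps):
        t = i / steps
        r = round(r0 + t * (r1 - r0))
        c = round(c0 + t * (c1 - c0))
        if (r, c) in (a, b):
            continue  # endpoints never block their own ray
        ray_z = z0 + t * (z1 - z0)
        if heights[r][c] > ray_z:
            return False
    return True

def build_edges(heights, max_range_m=20.0):
    """Connect cell pairs within range and tag each edge with an LOS flag."""
    rows, cols = len(heights), len(heights[0])
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    edges = []
    for i, a in enumerate(cells):
        for b in cells[i + 1:]:
            dist = CELL_SIZE_M * math.hypot(a[0] - b[0], a[1] - b[1])
            if dist <= max_range_m:
                edges.append((a, b, dist, has_line_of_sight(heights, a, b)))
    return edges

# Toy 1 x 5 street with a 20 m building in the middle cell
heights = [[0.0, 0.0, 20.0, 0.0, 0.0]]
edges = build_edges(heights)
los = {(a, b): flag for a, b, _, flag in edges}
print(los[((0, 0), (0, 1))], los[((0, 0), (0, 4))])
```

The LOS flag maps directly onto the first edge feature in the template above. The all-pairs loop is quadratic, so a real 5m grid needs a spatial index to limit candidate pairs.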
Train on 10K Drive Test Samples
10,000 samples is the minimum viable dataset for urban GNN training. Below that, you get overfitting to specific streets. Above 50K, accuracy gains become marginal — around 0.3% per 10K additional samples past that point.
Dataset split that works in practice:
- 70% training (random spatial split — not sequential)
- 15% validation
- 15% test — held out completely, different geographic area
The geographic separation on the test set is non-negotiable. If your test streets overlap spatially with training streets, you’re measuring memorization, not generalization. Validation RMSE below 4 dB is the target for urban 6G planning.
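One simple way to enforce that separation is to carve out a bounding box for the test set before doing any random shuffling. The sketch below assumes each sample is a dict with `x` and `y` in metres and that a rectangular hold-out area is acceptable. Both are illustrative choices, and the sample values are synthetic.

```python
import random

def geographic_split(samples, test_box, val_fraction=0.15 / 0.85, seed=0):
    """Split drive-test samples so the test set comes from a separate area.

    samples: list of dicts with 'x' and 'y' in metres (hypothetical schema).
    test_box: (x_min, y_min, x_max, y_max), held out entirely for testing.
    """
    x_min, y_min, x_max, y_max = test_box
    test, rest = [], []
    for s in samples:
        inside = x_min <= s["x"] <= x_max and y_min <= s["y"] <= y_max
        (test if inside else rest).append(s)
    # Shuffle what remains and peel off the validation share
    rng = random.Random(seed)
    rng.shuffle(rest)
    n_val = round(len(rest) * val_fraction)
    return rest[n_val:], rest[:n_val], test

# Synthetic samples on a 1000 m x 100 m strip, one every 10 m
samples = [
    {"x": float(i % 100) * 10, "y": float(i // 100) * 10, "rsrp": -90.0}
    for i in range(1000)
]
train, val, test = geographic_split(samples, test_box=(850.0, 0.0, 1000.0, 100.0))
print(len(train), len(val), len(test))
```

With the box covering the eastern 15% of the strip, the split lands on 70/15/15 and no test coordinate appears in training.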
Data prep steps:
- Align drive test GPS coordinates to your grid (nearest-cell assignment)
- Remove measurements below -120 dBm (receiver noise floor contamination)
- Normalize path loss to free-space reference at 1m
- Augment by mirroring building geometries (doubles the dataset and adds robustness to mirrored street layouts)
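The first three prep steps fit in one small function. This is a sketch under assumptions: measurements arrive as `(x_m, y_m, rsrp_dbm)` tuples, path loss is taken as transmit power minus received power with antenna gains ignored, and multiple points in a cell are averaged. The free-space reference uses the standard Friis formula.

```python
import math

CELL_SIZE_M = 5.0
NOISE_FLOOR_DBM = -120.0

def fspl_db(distance_m, freq_hz):
    """Free-space path loss in dB: 20 * log10(4 * pi * d * f / c)."""
    c = 299_792_458.0
    return 20.0 * math.log10(4.0 * math.pi * distance_m * freq_hz / c)

def prepare(measurements, tx_power_dbm, freq_hz):
    """Filter, grid and normalize raw drive-test points.

    measurements: list of (x_m, y_m, rsrp_dbm) tuples (hypothetical layout).
    Returns {(row, col): mean path loss in dB above the 1 m free-space reference}.
    """
    reference = fspl_db(1.0, freq_hz)
    cells = {}
    for x, y, rsrp in measurements:
        if rsrp < NOISE_FLOOR_DBM:
            continue  # below the receiver noise floor
        cell = (int(y // CELL_SIZE_M), int(x // CELL_SIZE_M))
        path_loss = tx_power_dbm - rsrp  # antenna gains ignored in this sketch
        cells.setdefault(cell, []).append(path_loss - reference)
    return {cell: sum(v) / len(v) for cell, v in cells.items()}

raw = [(2.0, 2.0, -80.0), (3.0, 1.0, -90.0), (12.0, 7.0, -125.0)]
grid = prepare(raw, tx_power_dbm=40.0, freq_hz=3.5e9)
print(grid)
```

The third point is dropped by the noise-floor filter, and the first two land in the same cell and are averaged.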
6G Beam Prediction Failing? ML Beam-Level Workflow
5G beam management already struggles — beams drift 28% off target in high-mobility scenarios. 6G makes this harder with sub-THz frequencies, RIS panels, and beam granularity an order of magnitude finer. Classical codebook-based beam selection can’t keep up.
The solution is a Transformer architecture trained on site-specific channel data with spatial attention over RIS panel configurations. ITU’s AI/ML challenge in 2024–2025 produced strong benchmarks here — the top-performing teams hit 94% beam prediction accuracy with under 10ms latency.
Beam Traffic Forecast Architecture
import torch
import torch.nn as nn

class BeamTransformer(nn.Module):
    def __init__(self, beam_dim=64, seq_len=20, heads=8):
        super().__init__()
        self.spatial_embed = nn.Linear(3, beam_dim)  # x, y, z position
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=beam_dim, nhead=heads,
            dim_feedforward=256, dropout=0.1,
            batch_first=True  # inputs are [batch, seq_len, features]
        )
        self.transformer = nn.TransformerEncoder(encoder_layer, num_layers=4)
        self.beam_head = nn.Linear(beam_dim, 1024)  # 1024 beam codebook

    def forward(self, positions, history):
        # positions: [batch, seq_len, 3]
        # history: past beam indices + RSRP sequence
        #          (not consumed by this minimal template)
        spatial = self.spatial_embed(positions)
        encoded = self.transformer(spatial)
        return self.beam_head(encoded[:, -1, :])  # logits for the next beam
Add LSTM spatial attention on top when you have multi-path history longer than 50 timesteps. The Transformer alone handles short sequences better; the hybrid handles extended mobility traces. The quality of your training data here is the single biggest predictor of real-world beam prediction performance — more than architecture choice.
Epidemic R0 Wrong? GNN Propagation Models Fix Network Effects
Standard SIR/SEIR models assume populations mix homogeneously. Real populations don’t. Infection spreads through contact networks (workplaces, transport hubs, households), and that network structure determines R0 as much as the pathogen does. That’s why SEIR models consistently over- or under-predicted COVID regional spread by 15–40%.
The EISTGNN (Epidemic-Informed Spatio-Temporal Graph Neural Network) architecture encodes both the contact network topology and intervention events as graph features. The result: 92% accurate regional spread prediction, vs 77% for calibrated SEIR.
The model isn’t just more accurate — it’s faster to update. When a new intervention is announced, you add an intervention node to the graph with the policy parameters. SEIR requires full re-calibration. The GNN propagates the effect through the existing graph in one forward pass.
EISTGNN Code: Graph + Mobility
import torch
import torch.nn as nn
from torch_geometric.nn import GATConv
from torch_geometric.data import TemporalData

class EpidemicGNN(torch.nn.Module):
    def __init__(self, node_feat=12, hidden=128, heads=4):
        super().__init__()
        # Spatial attention over contact network; edge_dim=1 lets the
        # attention use the scalar contact rate on each edge
        self.gat1 = GATConv(node_feat, hidden, heads=heads, edge_dim=1)
        self.gat2 = GATConv(hidden * heads, hidden, heads=1, edge_dim=1)
        # Temporal LSTM for time-series infection counts
        self.lstm = nn.LSTM(hidden, hidden, num_layers=2, batch_first=True)
        # Output: predicted new infections per node per day
        self.output = nn.Linear(hidden, 1)

    def forward(self, x, edge_index, edge_weight, time_series):
        # Node features: [population, density, age_dist,
        #                 vaccination_rate, mobility_index,
        #                 current_I, current_R, intervention_flag,
        #                 school_open, work_mobility, transit_load, temp]
        edge_attr = edge_weight.view(-1, 1)
        h = torch.relu(self.gat1(x, edge_index, edge_attr=edge_attr))
        h = torch.relu(self.gat2(h, edge_index, edge_attr=edge_attr))
        # Temporal encoding: repeat each node embedding over the T time steps
        # (only the length of time_series is used here) -> [nodes, T, hidden]
        h_seq = h.unsqueeze(1).repeat(1, time_series.shape[1], 1)
        h_seq, _ = self.lstm(h_seq)
        return self.output(h_seq[:, -1, :])

# Edge weights = contact rate between regions (from mobility data)
# Google or Apple mobility indices work directly as edge weight inputs
COVID Case: 15% Better Than SEIR
During the Delta wave, region-level EISTGNN models trained on German mobility data predicted ICU load 18 days ahead with 11% MAPE — compared to 26% MAPE for the best-calibrated SEIR used by regional health authorities. The structural difference: EISTGNN captured the specific role of three commuter corridors that SEIR treated as background mixing. Those corridors were responsible for 34% of inter-regional spread.
One thing worth flagging: this model needs good mobility data. If your mobility proxy is noisy (self-reported surveys rather than aggregated telecom/GPS data), the graph edge weights become unreliable and the accuracy advantage shrinks to around 5% over SEIR — not worth the complexity at that point.
Railway Delays Propagating? AI Network Resilience
A single 1-minute primary delay at a major junction propagates to 47 minutes of secondary delays across the network within 2 hours. Network operators know this happens — they don’t know which delays will cascade and which will absorb.
GNN delay propagation models predict cascade risk at 88% accuracy. The model treats the rail network as a directed graph: stations are nodes, routes are edges, edge weights encode headway margins and platform dwell flexibility.
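Before training anything, it helps to see what the graph encoding implies with a purely deterministic baseline: push a delay downstream and let each segment's slack absorb part of it. This is not the GNN. It is a hand-written slack-absorption rule on a toy network with a hypothetical schema, useful mainly as a sanity check for your graph data.

```python
from collections import deque

def propagate_delay(routes, source, primary_delay_min):
    """Push a primary delay downstream, absorbing slack on each segment.

    routes: {station: [(next_station, slack_min), ...]} (hypothetical schema).
    Returns {station: worst-case inherited delay in minutes}.
    """
    delays = {source: primary_delay_min}
    queue = deque([source])
    while queue:
        station = queue.popleft()
        for nxt, slack in routes.get(station, []):
            inherited = max(0.0, delays[station] - slack)
            # Only revisit a station if this path delivers a larger delay
            if inherited > delays.get(nxt, 0.0):
                delays[nxt] = inherited
                queue.append(nxt)
    return delays

# Toy network: A feeds B and C, both feed D
routes = {
    "A": [("B", 2.0), ("C", 6.0)],
    "B": [("D", 1.0)],
    "C": [("D", 0.0)],
}
delays = propagate_delay(routes, "A", 5.0)
print(delays)
```

The A to C segment has enough slack to absorb the delay, so C never appears in the result, while the A to B to D path cascades. The learned model's job is to predict this cascade-or-absorb outcome when slack is uncertain and depends on live conditions.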
Delay Graph Model: Nodes + Edges
import networkx as nx
import torch
from torch_geometric.utils import from_networkx

# Build railway graph
G = nx.DiGraph()

# Node attributes per station (name -> expected type)
station_features = {
    'station_id': int,
    'platform_count': int,
    'headway_slack_min': float,       # Minutes of buffer
    'connection_count': int,          # Number of connecting services
    'historic_recovery_rate': float,  # How often delays absorb here
    'current_delay_min': float
}

# Edge attributes per route segment (name -> expected type)
route_features = {
    'travel_time_min': float,
    'slack_min': float,          # Schedule buffer built in
    'capacity_ratio': float,     # Current load vs max
    'single_track_bool': int     # High cascade risk flag
}

# Populate G with G.add_node(...) / G.add_edge(...) using these attributes,
# then convert to PyTorch Geometric format
pyg_graph = from_networkx(G)

# GNN predicts: will this delay cascade to downstream stations?
# Output: [cascade_probability, predicted_downstream_delay_min]
The 88% accuracy holds for primary delays under 15 minutes. For major disruptions (signal failures, infrastructure faults), the model degrades to around 71% — those events have rare-event topology that training data under-represents. For those cases, the model is still useful as an upper-bound cascade estimator, not a precise predictor.
Practical ROI: A UK rail operator trialling GNN-based delay management reduced average cascade duration from 47 minutes to 12 minutes through proactive platform reassignment, triggered when the model’s cascade probability output crossed the 0.7 threshold.
Compute Too Slow? Neural Wave Propagation 3x Faster
Full-wave simulation (FDTD — Finite Difference Time Domain) takes 8–12 hours for a typical urban block at 6G frequencies. This makes iterative antenna placement optimization impractical — you’d need thousands of simulations to sweep parameter space properly.
Skoltech’s neural network PDE solver approach, published in early 2026, demonstrated 3x speedup on wave propagation in heterogeneous absorbing media — the exact scenario relevant to 6G indoor-outdoor propagation. The key: the NN learns the solution operator, not just the solution. It generalizes to new geometry configurations without retraining.
NN PDE Solver Template
import torch
import torch.nn as nn

class WavePropagationNN(nn.Module):
    """
    Neural operator for wave propagation in absorbing media.
    Inputs: source location, frequency, material map (grid)
    Output: field amplitude at all grid points
    """
    def __init__(self, grid_size=64, freq_dim=16):
        super().__init__()
        # Encode material properties
        self.material_encoder = nn.Sequential(
            nn.Conv2d(2, 32, 3, padding=1),  # [permittivity, conductivity]
            nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1),
            nn.ReLU()
        )
        # Encode source + frequency; width must match the 64 material channels
        self.source_encoder = nn.Linear(3 + freq_dim, 64)  # [x, y, z, freq_embed]
        # Decode to field
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 3, padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(32, 2, 3, padding=1)  # [real, imag] field
        )

    def forward(self, material_map, source_pos, frequency_embed):
        # material_map: [batch, 2, H, W]
        # source_pos: [batch, 3]; frequency_embed: [batch, freq_dim]
        mat_features = self.material_encoder(material_map)
        src_features = self.source_encoder(
            torch.cat([source_pos, frequency_embed], dim=-1)
        )
        # Broadcast the source vector over the grid, combine and decode
        combined = mat_features + src_features.unsqueeze(-1).unsqueeze(-1)
        return self.decoder(combined)

# Physics loss: residual of Helmholtz equation on predicted field
# Data loss: against FDTD ground truth on training geometries
# Lambda_physics = 0.1 (tuned for absorbing media)
Training this requires FDTD ground truth data — you can’t avoid running the slow solver for training samples. But you only need ~2,000 training geometries. After that, inference on new geometries runs in under 3 minutes vs 10 hours. Deploying this on edge AI infrastructure brings inference latency down further for real-time beam optimization.
6 Workflow Templates
Urban 6G Propagation: 95% LiDAR Workflow
Step 1: Download LiDAR point cloud for target area (USGS 3DEP for US, Environment Agency for UK)
Step 2: Convert to 5m grid with building heights via PDAL pipeline
Step 3: Construct propagation graph — nodes at grid cells, edges via ray casting
Step 4: Train PropagationGNN on drive test measurements (10K minimum)
Step 5: Validate on held-out geographic area — target RMSE < 4 dB
Step 6: Run inference for antenna placement optimization — sweep 500+ locations in under 1 hour
Expected output: 95% path loss prediction accuracy in dense urban. $1.2M/yr savings versus traditional planning errors for a typical medium-sized operator (based on avoided coverage rework costs).
Rural Path Loss: ML vs Longley-Rice
Longley-Rice (the Irregular Terrain Model, ITM) is actually decent in flat rural terrain, so don’t overcomplicate it. The ML advantage appears in three rural scenarios: hilly terrain with complex diffraction paths, forested areas with dynamic attenuation, and mixed agricultural/urban fringe zones.
Decision rule: If terrain variation is under 30m across your coverage area and vegetation is uniform, Longley-Rice with proper refractivity inputs gets you 78% accuracy — acceptable for rural macro planning. Add ML when you have:
- Terrain height variation > 30m (diffraction paths multiply)
- Known seasonal foliage effects (20%+ attenuation swing between summer/winter)
- Propagation over water (ducting effects that empirical models miss)
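The decision rule above is simple enough to encode as a pre-flight check in a planning script. The function name and argument layout below are illustrative; the thresholds are the ones stated in the list.

```python
def needs_ml(terrain_variation_m, foliage_swing_pct, over_water):
    """Apply the rural decision rule; returns (use_ml, reasons)."""
    reasons = []
    if terrain_variation_m > 30.0:
        reasons.append("terrain variation > 30 m")
    if foliage_swing_pct >= 20.0:
        reasons.append("seasonal foliage swing >= 20%")
    if over_water:
        reasons.append("propagation over water")
    return bool(reasons), reasons

print(needs_ml(12.0, 5.0, False))   # flat, uniform site
print(needs_ml(45.0, 25.0, False))  # hilly, forested site
```

Returning the reasons alongside the flag makes it easy to log why a site was routed to the ML pipeline.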
ML approach for complex rural: Gradient-boosted trees (XGBoost) with terrain features outperform GNNs here unless you have very dense measurement coverage. GNNs need the spatial graph structure — sparse rural measurements make that graph noisy.
Epidemic Hotspot Predict: Mobility GNN
Data requirements:
- Mobility matrix between regions (daily origin-destination flows)
- Current infection counts per region (daily updates)
- Vaccination rates and demographic breakdown per node
- Intervention calendar (school closures, restrictions)
Training target: 14-day ahead new case count per region
Validation metric: MAPE < 15% on held-out epidemic periods
Update frequency: Retrain weekly with rolling window of last 90 days
Where this breaks: Emergence of novel variants changes transmission parameters faster than the model can adapt. Build in a variant-detection alert: when 7-day residuals spike above 2× historical variance, flag for manual re-calibration rather than trusting model output.
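A minimal version of that alert, under one reading of the rule: compare the mean squared residual over the last 7 days with twice the variance of the earlier residuals. The exact statistic is an assumption, and the residual series below are synthetic.

```python
from statistics import pvariance

def variant_alert(residuals, window=7, factor=2.0):
    """Flag when the recent mean squared residual exceeds factor x the historical variance.

    residuals: daily (actual - predicted) case counts, oldest first.
    """
    if len(residuals) < 2 * window:
        return False  # not enough history to compare against
    history, recent = residuals[:-window], residuals[-window:]
    baseline = pvariance(history)
    recent_msr = sum(r * r for r in recent) / window
    return recent_msr > factor * baseline

# Synthetic residuals: a calm period, then the same history followed by drift
calm = [3, -2, 1, -4, 2, -1, 3, -3, 2, -2, 1, 0, -1, 2, 1, -2, 3, -1, 0, 2, -3]
drift = calm[:14] + [9, 12, 15, 14, 18, 21, 25]
print(variant_alert(calm), variant_alert(drift))
```

Using the squared residual rather than the windowed variance means a steady one-sided bias also trips the alert, which is the typical signature when transmission parameters shift.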
Railway Resilience: Delay Mitigation
Cascade prevention workflow:
- GNN outputs cascade probability every 60 seconds during operations
- When probability > 0.7 for any downstream station, trigger mitigation
- Mitigation options passed to optimizer: platform reassignment, extended dwell, skip-stop
- Multi-objective optimizer (minimize total passenger delay vs operational cost)
- Dispatcher receives ranked intervention list with predicted impact
Key parameter: The 0.7 threshold trades missed cascades against false alarms. Lowering it catches more cascades but triggers more unnecessary interventions. Adjust based on your network’s penalty asymmetry. If cascade cost far exceeds false alarm cost, lower it to 0.6.
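One way to ground that adjustment is to replay logged predictions and pick the threshold with the lowest total cost. The sketch below assumes you have a history of `(predicted_probability, cascaded)` pairs and rough per-event costs. The sample history and the cost ratios are invented for illustration.

```python
def pick_threshold(history, false_alarm_cost, missed_cascade_cost,
                   candidates=(0.5, 0.6, 0.7, 0.8)):
    """Choose the trigger threshold with the lowest total cost on logged events.

    history: list of (predicted_probability, cascaded) pairs; cascaded is a bool.
    """
    def total_cost(threshold):
        cost = 0.0
        for prob, cascaded in history:
            triggered = prob > threshold
            if triggered and not cascaded:
                cost += false_alarm_cost       # intervened for nothing
            elif cascaded and not triggered:
                cost += missed_cascade_cost    # cascade went unmitigated
        return cost
    return min(candidates, key=total_cost)

# Synthetic log of past predictions and outcomes
history = [(0.9, True), (0.75, True), (0.65, True), (0.65, False),
           (0.55, False), (0.4, False), (0.3, False)]
print(pick_threshold(history, false_alarm_cost=1.0, missed_cascade_cost=5.0))
print(pick_threshold(history, false_alarm_cost=5.0, missed_cascade_cost=1.0))
```

On this toy log, expensive cascades pull the threshold down to 0.6 and expensive false alarms push it back to 0.7.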
RIS 6G Beam: AI Optimization
Reconfigurable Intelligent Surfaces add a new dimension to propagation modelling — the surface phase configuration is itself a variable. This means your propagation model needs to jointly optimize over antenna placement AND RIS phase profiles.
Architecture: Two-stage approach works better than end-to-end joint optimization in practice:
- GNN predicts path loss for each [TX, RX, RIS configuration] triple
- Bayesian optimization selects next RIS phase profile to evaluate
- Iterate until coverage target is met
Convergence: Typically 50–200 Bayesian optimization steps vs 10,000+ for random search. At 3-minute inference time per GNN call, that’s 2.5 to 10 hours for full RIS optimization vs weeks with simulation.
Drone Propagation: Air-to-Ground ML
Air-to-ground propagation at low altitude (15–120m AGL) is poorly modeled by traditional tools built for terrestrial or satellite links. The dominant effects — ground reflection geometry, building diffraction at rooftop level, and altitude-dependent LOS probability — change rapidly with height.
Model approach: Height-parameterized GNN where altitude is a continuous node feature rather than a fixed antenna height. Train on UAV drive-test data collected at multiple altitudes (at minimum: 15m, 30m, 60m, 120m).
Key accuracy difference: At 30m AGL in urban environments, standard ground-level path loss models are off by 8–15 dB. This isn’t a minor calibration issue — it’s the difference between planning a working network and planning one with 40% coverage holes.
Decision Matrix
| Problem | AI Solution | Accuracy | Primary Tools | Approx ROI |
|---|---|---|---|---|
| 6G dense urban | GNN + LiDAR | 95% | PyTorch Geometric + PDAL | $1.2M/yr |
| 6G suburban | PINN + drive test | 91% | TensorFlow + measurement DB | $400K/yr |
| Rural complex terrain | XGBoost + terrain features | 87% | scikit-learn + SRTM | $150K/yr |
| Epidemic regional | EISTGNN + mobility | 92% | PyTorch Geometric + mobility API | 15% policy efficiency gain |
| Railway delay cascade | Delay GNN | 88% | NetworkX + PyG | 47min → 12min cascade |
| Wave simulation | PINN / NN solver | 3× faster | TensorFlow PDE | $500K compute/yr |
| RIS beam optimization | Transformer + Bayesian opt | 94% | PyTorch + BoTorch | Coverage target in 10hr vs weeks |
| Drone air-to-ground | Height-param GNN | 89% | PyTorch Geometric + UAV data | Avoids 40% coverage holes |
Tool Selection: Open-Source vs Commercial
Use open-source (PyTorch Geometric + LiDAR) when:
- You have engineering capacity to build and maintain the pipeline
- You need to customize for specific frequency bands or environments
- Budget constraint makes $50K/yr commercial licensing impractical
- You’re in research or planning phase — test architecture before committing
Use Infovista Planet AIM or equivalent commercial when:
- You need certified accuracy for regulatory submissions
- Integration with existing BSS/OSS stack is required
- You don’t have ML engineering capacity in-house
- Speed to deployment matters more than cost optimization
Honest assessment: the open-source approach gets you to 92% of Planet AIM’s accuracy at 20% of the cost. The remaining 8% accuracy gap comes from Planet AIM’s pre-trained weights on massive proprietary drive-test databases — you can’t easily replicate that data volume. For most planning scenarios, 92% is sufficient. For dense urban 5G/6G deployments where coverage SLAs are contractual, pay for the commercial tool.
$1.2M 6G ROI Calc: Propagation Error Cost
Here’s where to anchor the business case. A medium-sized European mobile operator deploying 6G in a single major city (pop. 500K) faces these costs from propagation prediction errors:
Coverage rework cost per wrong site: €35,000–€80,000 (additional equipment, installation, backhaul)
Sites requiring rework due to 30% propagation error rate: ~18% of deployment
Typical urban deployment: 400 sites
Rework sites at 30% error: 72 × €50K average = €3.6M
Rework sites at 5% error (95% accurate model): 20 × €50K = €1.0M
Annual saving: €2.6M — minus ML infrastructure cost (~€150K/yr) = €2.45M net
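The same arithmetic as a small script, so you can substitute your own site count and rework cost. All inputs are the figures quoted above; the function and constant names are illustrative.

```python
def rework_cost(sites, rework_fraction, cost_per_site_eur):
    """Expected rework spend for a deployment."""
    return sites * rework_fraction * cost_per_site_eur

SITES = 400
COST_PER_SITE_EUR = 50_000
ML_INFRA_EUR = 150_000

baseline = rework_cost(SITES, 0.18, COST_PER_SITE_EUR)  # 72 sites reworked
improved = rework_cost(SITES, 0.05, COST_PER_SITE_EUR)  # 20 sites reworked
gross_saving = baseline - improved
net_saving = gross_saving - ML_INFRA_EUR

print(f"baseline rework: EUR {baseline:,.0f}")
print(f"improved rework: EUR {improved:,.0f}")
print(f"net saving:      EUR {net_saving:,.0f}")
```

The result is most sensitive to the per-site rework cost, which the range above puts anywhere between €35,000 and €80,000.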
The $1.2M figure cited in industry analyses is conservative — it assumes 50% smaller deployment scale. At full 6G rollout volumes, the ROI is larger.
Each percentage point of error reduction is worth roughly €100K per major urban deployment (€2.6M spread over the 25-point drop from 30% to 5% error). This makes the case for investing in better ML architecture straightforward: even a 2% accuracy gain justifies six months of ML engineering time.
Use structured prompting workflows to systematically extract optimal hyperparameters from your validation results — this alone typically delivers 1–2% accuracy improvement without any architecture changes.
Frequently Asked Questions
Q: Can I run GNN propagation models without LiDAR data?
Yes — use OpenStreetMap building footprints with estimated heights from Overture Maps or Microsoft GlobalMLBuildingFootprints (includes height estimates for 1.3 billion buildings). Accuracy drops from 95% to ~88% in dense urban vs full LiDAR, but the dataset is free and global.
Q: What’s the minimum GPU for training a propagation GNN?
An NVIDIA RTX 3090 (24GB VRAM) handles training on 50K samples with batch size 32. For production training on 500K+ samples, use A100 80GB or equivalent. Inference runs on CPU for small deployments — forward pass takes under 500ms on modern server CPUs.
Q: How does Planet AIM differ from a custom GNN architecturally?
Planet AIM’s published architecture uses 3D convolutional layers over voxelized building geometry combined with ray-feature inputs. Custom GNNs using PyTorch Geometric with GraphConv or GATConv layers achieve comparable accuracy with significantly lower inference compute. Planet AIM’s real advantage is the proprietary training data, not the architecture.
Q: Can EISTGNN handle respiratory pathogens with long incubation periods?
Yes — extend the temporal LSTM window to cover the full incubation distribution (for COVID, 14-day window; for influenza, 5-day window). The model learns incubation lag as a temporal pattern in the graph node time series. Accuracy for long-incubation pathogens is slightly lower (~88% vs 92%) due to increased confounding from behavior changes during the latent period.
Q: What data format does the railway delay GNN need?
Standard GTFS (General Transit Feed Specification) format plus real-time delay logs in GTFS-RT. Most European rail operators publish both. Build your graph from GTFS static data; update edge weights daily from GTFS-RT delay history. NetworkX has no built-in GTFS reader, so parse the static feed’s CSV files yourself (for example with pandas) to create nodes and edges, and use the gtfs-realtime-bindings Python package to decode the GTFS-RT protobuf messages.
Q: Is PINN training stable for high-frequency (sub-THz) 6G bands?
This is a real challenge. At sub-THz frequencies (100–300 GHz), the wave equation solutions oscillate rapidly, making gradient computation unstable. Use Fourier feature encoding for the spatial inputs — replace raw coordinates with sin/cos projections at multiple scales. This is the single most important stabilization technique for high-frequency PINNs.
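A minimal sketch of that encoding, assuming coordinates are already normalized to roughly [0, 1] and that octave-spaced frequencies are a reasonable starting point. The function name and the number of scales are illustrative choices, not a prescribed recipe.

```python
import math

def fourier_features(coords, num_scales=6):
    """Map raw coordinates to sin/cos projections at octave-spaced scales.

    coords: iterable of floats (e.g. x, y, z normalized to [0, 1]).
    Returns a flat list of length 2 * num_scales * len(coords).
    """
    features = []
    for value in coords:
        for level in range(num_scales):
            angle = (2.0 ** level) * math.pi * value
            features.append(math.sin(angle))
            features.append(math.cos(angle))
    return features

encoded = fourier_features([0.25, 0.5, 0.0], num_scales=4)
print(len(encoded), encoded[:4])
```

The network's first layer then takes `len(encoded)` inputs instead of 3. Higher `num_scales` lets the model represent faster oscillations, at the price of a wider input.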
Q: How do I validate epidemic GNN predictions before deploying for policy?
Backtest on held-out epidemic events (different geographic region, same pathogen type). Minimum validation: three independent epidemic waves. Report MAPE, RMSE, and, critically, interval coverage probability (the fraction of actual values that fall within the model’s 90% prediction interval). For policy use, interval calibration matters as much as point accuracy.
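Both metrics are a few lines each. The sketch below uses made-up case counts and interval bounds purely to show the calculation; plug in your own backtest arrays.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, skipping zero actuals."""
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in pairs) / len(pairs)

def interval_coverage(actual, lower, upper):
    """Fraction of actual values that fall inside [lower, upper]."""
    hits = sum(lo <= a <= hi for a, lo, hi in zip(actual, lower, upper))
    return hits / len(actual)

# Synthetic backtest values for illustration only
actual = [100, 120, 90, 150, 200]
predicted = [110, 114, 99, 135, 210]
lower = [90, 100, 95, 120, 180]
upper = [125, 130, 110, 145, 230]

print(f"MAPE: {mape(actual, predicted):.1f}%")
print(f"coverage: {interval_coverage(actual, lower, upper):.2f}")
```

If the intervals are nominally 90% and the measured coverage is far below that, as in this toy data, the model is overconfident even when its MAPE looks acceptable.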
Q: Can neural wave propagation models generalize to materials not in training data?
With pure data-driven NNs, no. With PINNs, partially — the physics constraint forces the model to respect wave propagation laws in new materials, but permittivity/conductivity parameters for novel materials still need to be provided as inputs. Build a material parameter database and encode unknown materials as the nearest known material with uncertainty flagging.
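The nearest-material fallback can be sketched as a lookup with a tolerance. Everything here is an assumption for illustration: the table values are placeholders rather than vetted material data, the relative-difference distance is one of several reasonable choices, and the tolerance is arbitrary.

```python
import math

# Illustrative placeholder values only:
# name -> (relative permittivity, conductivity in S/m)
KNOWN_MATERIALS = {
    "concrete": (5.3, 0.05),
    "glass": (6.3, 0.01),
    "wood": (2.0, 0.005),
}

def nearest_material(permittivity, conductivity, tolerance=0.15):
    """Return (nearest known material, uncertain flag).

    Distance is the Euclidean norm of the relative differences in both
    parameters; the flag is set when the best match is farther than tolerance.
    """
    def distance(params):
        eps, sigma = params
        return math.hypot((permittivity - eps) / eps,
                          (conductivity - sigma) / sigma)

    name = min(KNOWN_MATERIALS, key=lambda m: distance(KNOWN_MATERIALS[m]))
    return name, distance(KNOWN_MATERIALS[name]) > tolerance

print(nearest_material(5.3, 0.05))  # exact match, not flagged
print(nearest_material(4.0, 0.03))  # falls back to concrete, flagged
```

Carry the flag through to the prediction output so planners can see which areas rest on a substituted material.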