Researchers propose a way to measure model errors in alternative futures
New research examines a rarely asked question: how well do predictive models of the future actually perform when the world does not unfold the way the forecast assumed? Emily Howerton and Justin Lessler explore so-called counterfactual scenarios, 'what if' exercises that ask what would happen if a particular decision were or were not made.
Such scenarios are widely used in policy, healthcare, and climate planning. Yet the scenario projections that models produce are rarely evaluated after the fact. When a projection differs from what actually happened, there are two possible reasons: either reality deviated from the assumptions of the scenario, or the model itself was miscalibrated, that is, systematically wrong.
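One simple way to see the distinction is to re-run the model with the inputs that actually occurred. The toy Python sketch below is purely illustrative and not taken from the paper: the linear model, the vaccination-rate input, and all numbers are hypothetical. It shows the basic accounting that separates the part of the error explained by the scenario not holding from the residual attributable to the model.

```python
# Toy illustration (not from the paper): a hypothetical scenario model that
# projects cases as a linear function of an assumed vaccination rate.
def project(vaccination_rate, slope=-100.0, intercept=500.0):
    """Projected cases under a given (assumed or observed) input."""
    return intercept + slope * vaccination_rate

assumed_rate = 0.70     # input the scenario assumed
observed_rate = 0.55    # input the world actually produced
observed_cases = 470.0  # hypothetical observed outcome

projection = project(assumed_rate)       # what the scenario projected
counterfactual = project(observed_rate)  # model re-run with the true input

total_error = observed_cases - projection
scenario_part = counterfactual - projection    # error from inputs not holding
model_part = observed_cases - counterfactual   # residual model miscalibration

print(f"total error: {total_error:+.1f}")
print(f"  from scenario inputs: {scenario_part:+.1f}")
print(f"  from the model:       {model_part:+.1f}")
```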
The researchers emphasize that this calibration error is central to judging whether a model should inform decisions at all. Determining it, however, is difficult, because it requires evaluating errors in worlds that never occurred – precisely those alternative futures.
The article presents and compares three approaches to assessing such counterfactual errors. Their strengths and limitations are tested in a simulation experiment in which the researchers control the 'true' trajectory, so the model's scenario projections can be compared against it under known conditions.
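To give a rough sense of why a simulated environment helps, the sketch below is a hypothetical toy setup, not the authors' design: because the "true" data-generating process is under the experimenter's control, the model's error can be checked directly in the counterfactual world where the scenario's input held.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_process(x, noise_sd=5.0):
    """Ground-truth outcome as a function of the scenario variable x."""
    return 200.0 + 80.0 * x + rng.normal(0.0, noise_sd)

def model(x):
    """Deliberately miscalibrated model of the same relationship."""
    return 180.0 + 90.0 * x

scenario_x = 1.0  # input assumed by the scenario
actual_x = 1.4    # input the simulated "world" actually took

# Counterfactual (calibration) error: simulate the world in which the
# scenario held, which is only possible because the truth is known here.
counterfactual_error = true_process(scenario_x) - model(scenario_x)

# A naive retrospective comparison mixes model error with scenario deviation.
naive_error = true_process(actual_x) - model(scenario_x)

print(f"counterfactual (calibration) error: {counterfactual_error:+.1f}")
print(f"naive retrospective error:          {naive_error:+.1f}")
```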
In closing, Howerton and Lessler offer recommendations on how counterfactual error should be assessed in practice and which aspects of scenario design matter most for doing so. The goal is to make scenario models more transparent and more useful as decision support.
Source: Assessing model error in counterfactual worlds, ArXiv (AI).
This text was generated with AI assistance and may contain errors. Please verify details from the original source.
Original research: Assessing model error in counterfactual worlds
Publisher: ArXiv (AI)
Authors: Emily Howerton, Justin Lessler
December 24, 2025