Last reviewed June 25, 2026

The World Model Hallucination Becomes the Coverage Gap

A June 2026 arXiv paper by Nicklas Hansen and Xiaolong Wang studies hallucination in generative world models: rollouts that look visually fluent while drifting away from the ground-truth dynamics. Its practical lesson is that a simulated future should carry a coverage receipt, not just a convincing video.

Fresh Angle

The paper is Hallucination in World Models is Predictable and Preventable, arXiv:2606.27326 [cs.LG; cs.CV; cs.RO], submitted June 25, 2026. It introduces MMBench2 and uses it to study when action-conditioned video world models produce plausible but dynamically wrong futures.

This page is not a duplicate of the site's World Models and Spatial Intelligence reference entry, generated-world training-ground essay, Yann LeCun world-model bet, or structural certification world-model note. Those pages address world models as a broad architecture, interface, or governance category. This paper is narrower: it names specific hallucination modes and ties them to data coverage.

Fluent But Wrong

The paper defines the failure in operational terms. A generative world model can render an action-controllable rollout that remains visually coherent while the imagined state no longer follows the actual environment dynamics. If such a model is used downstream for planning or policy learning, the fluent video becomes a bad simulator. The problem is not merely that the picture is imperfect. It is that the model can make an incorrect future look usable.

Hansen and Wang argue that these failures concentrate in low-coverage regions of the state-action space. That framing is useful because it demystifies hallucination and turns it into a checkable question: did the training data include enough of this state, action, and transition pattern for the model to imagine it responsibly?

MMBench2

MMBench2 is the paper's testbed for answering that question. The arXiv record describes it as a 427-hour, 210-task dataset for visual world modeling with ground-truth actions, rewards, and live simulators. The paper reports 65,600 mixed-quality trajectories, equivalent to 427 hours of 224-by-224 video at 15 frames per second, spanning 10 task domains. Of the 210 tasks, 200 form the pretraining corpus and 10 are held out as unseen transfer tasks.
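The dataset figures can be cross-checked with a little arithmetic. The sketch below assumes the 427 hours are the total footage at 15 frames per second; the per-trajectory averages it derives are implied by the quoted numbers, not values the paper is cited as reporting.

```python
# Figures quoted above from the paper's dataset description.
HOURS = 427
FPS = 15
TRAJECTORIES = 65_600
TASKS_TOTAL = 210
TASKS_PRETRAIN = 200

# Total frames if all 427 hours are stored at 15 frames per second.
total_frames = HOURS * 3600 * FPS

# Derived averages (an assumption-based estimate; trajectory lengths may vary).
mean_frames_per_trajectory = total_frames / TRAJECTORIES
mean_seconds_per_trajectory = mean_frames_per_trajectory / FPS

# Tasks left over for held-out transfer evaluation.
held_out_tasks = TASKS_TOTAL - TASKS_PRETRAIN

print(total_frames)                          # 23058000
print(round(mean_frames_per_trajectory, 1))  # 351.5
print(round(mean_seconds_per_trajectory, 1)) # 23.4
print(held_out_tasks)                        # 10
```

On these assumptions the average trajectory is roughly 350 frames, or a little over 23 seconds of video.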

On that dataset, the authors train a 350M-parameter action-conditioned world model that largely follows the Dreamer 4 recipe. The model has a 50M-parameter encoder, a 250M-parameter dynamics model, and a 50M-parameter decoder. The project page links code, the Hugging Face dataset, and pretrained and finetuned 350M-parameter models, which makes the benchmark more inspectable than a closed demo.

Failure Modes

The paper separates hallucination into three pipeline-specific modes. Perceptual hallucination occurs in the encoder-decoder pair before rollout: the tokenizer reconstructs an unfamiliar observation as something closer to what it already knows. Action-marginalized hallucination occurs when the dynamics model ignores or washes out the action signal, behaving more like a video generator than a controllable simulator. Scene-diverging hallucination appears during multi-step rollouts, when compounding error creates events or states that no longer match the environment.

The authors then define three predictors. Tokenizer round-trip residual targets perceptual failure. Flow instability measures how much the dynamics head's denoising prediction moves under the same context and action. Inter-seed denoising variance measures disagreement across independently sampled next-latent predictions. In the paper's experiments, these predictors track realized rollout error with Spearman correlation around 0.80 against rollout delta PSNR.
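A minimal sketch of two of these predictors and of the rank correlation used to evaluate them may help make the idea concrete. It is an illustration under stated assumptions, not the authors' implementation: the function names, the use of mean squared error for the round-trip residual, the per-dimension variance for seed disagreement, and all toy numbers are hypothetical. Flow instability is left out because the text here does not pin down its formula.

```python
from statistics import mean, pvariance

def round_trip_residual(obs, encode, decode):
    """Assumed form: mean squared error between an observation and its
    encode-then-decode reconstruction. `encode`/`decode` are placeholders."""
    recon = decode(encode(obs))
    return mean([(a - b) ** 2 for a, b in zip(obs, recon)])

def inter_seed_variance(samples):
    """Assumed form: variance across seeds, averaged over latent dimensions.
    `samples` holds one next-latent prediction (a list of floats) per seed."""
    per_dim = [pvariance(dim) for dim in zip(*samples)]
    return mean(per_dim)

def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out

def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Toy data (made up): seeds that agree versus seeds that disagree.
stable = [[0.10, 0.20], [0.11, 0.19], [0.09, 0.21]]
unstable = [[0.10, 0.20], [0.60, -0.30], [-0.40, 0.70]]
print(inter_seed_variance(stable) < inter_seed_variance(unstable))  # True

# Toy data (made up): predictor score versus realized rollout error.
predictor = [0.02, 0.05, 0.40, 0.11, 0.90]
error = [0.1, 0.3, 1.9, 0.2, 2.5]
print(round(spearman(predictor, error), 2))  # 0.9
```

Rank correlation is a natural fit for this use: a predictor only needs to order rollouts from safest to riskiest, not predict the exact size of the error.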

Mitigation

The mitigation story has two parts. First, coverage-aware training rebalances the existing data by sampling more uniformly across tasks rather than across frames. The authors report that the sampling intervention outperforms loss reweighting in their setup and reduces all three failure modes. Second, when the dataset does not already cover a region, the hallucination predictors can be used as curiosity rewards for targeted online data collection.
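The difference between sampling across frames and sampling across tasks can be sketched with a toy dataset. This is an illustration only: the task names, frame counts, and sampler functions are hypothetical, and the paper's actual sampling scheme may differ in detail.

```python
import random

def frame_uniform(frames_by_task, rng):
    """Pick a frame uniformly from the pooled data: large tasks dominate."""
    tasks = list(frames_by_task)
    weights = [frames_by_task[t] for t in tasks]
    task = rng.choices(tasks, weights=weights, k=1)[0]
    return task, rng.randrange(frames_by_task[task])

def task_uniform(frames_by_task, rng):
    """Pick a task uniformly first, then a frame within that task."""
    task = rng.choice(list(frames_by_task))
    return task, rng.randrange(frames_by_task[task])

def task_share(sampler, frames_by_task, task, draws=10_000, seed=0):
    """Fraction of draws that land on `task` under a given sampler."""
    rng = random.Random(seed)
    hits = sum(sampler(frames_by_task, rng)[0] == task for _ in range(draws))
    return hits / draws

# Hypothetical, heavily imbalanced dataset: frames available per task.
frames_by_task = {"walk": 9_000, "stack": 900, "pour": 100}

rare_frame = task_share(frame_uniform, frames_by_task, "pour")
rare_task = task_share(task_uniform, frames_by_task, "pour")
print(rare_frame < 0.05)        # True: the rare task is about 1% of draws
print(0.28 < rare_task < 0.39)  # True: the rare task is about a third of draws
```

Under frame-level sampling the rare task almost never appears in a batch; under task-level sampling it appears as often as any other task, at the cost of revisiting its few frames more frequently.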

For unseen tasks, the paper reports adaptation with as few as 50 real environment trajectories per task. The important governance point is not that hallucination disappears. The authors explicitly limit the claim to a 350M-parameter model across simulated control tasks and note uncertainty about transfer to billion-parameter models or real robot data with sensor noise and real-world stochasticity. The achievement is more modest and more useful: a way to detect weak coverage, collect targeted data, and document whether mitigation changed the failure surface.

Governance Standard

For Spiralism, the rule is a coverage receipt for simulated futures. Any world-model rollout used for planning, robotics, safety testing, game generation, or agent training should retain the task domain, source dataset, coverage profile, action-conditioning format, model checkpoint, rollout horizon, predictor scores, mitigation method, and whether the task was seen or held out.
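One way to make the receipt concrete is a small record type with one field per item in that list. The sketch below is a hypothetical schema, not a published standard: the class name, field names, and every example value are assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass

@dataclass(frozen=True)
class CoverageReceipt:
    """Hypothetical record attached to one world-model rollout."""
    task_domain: str
    source_dataset: str
    coverage_profile: dict       # e.g. how much training data sits near this task
    action_conditioning_format: str
    model_checkpoint: str
    rollout_horizon: int         # number of imagined steps
    predictor_scores: dict       # hallucination predictor name -> score
    mitigation_method: str
    task_seen_in_training: bool  # False for held-out tasks

    def missing_fields(self):
        """Names of fields left empty, so incomplete receipts can be rejected."""
        missing = [
            name for name, value in asdict(self).items()
            if value is None or value == "" or value == {}
        ]
        if self.rollout_horizon <= 0:
            missing.append("rollout_horizon")
        return missing

# Example values are placeholders, not real measurements or artifact names.
receipt = CoverageReceipt(
    task_domain="example-manipulation",
    source_dataset="MMBench2",
    coverage_profile={"trajectories_for_task": 300},
    action_conditioning_format="continuous, per-frame",
    model_checkpoint="example-checkpoint-350m",
    rollout_horizon=64,
    predictor_scores={"tokenizer_round_trip_residual": 0.03,
                      "flow_instability": 0.12,
                      "inter_seed_variance": 0.08},
    mitigation_method="task-uniform sampling",
    task_seen_in_training=True,
)

print(receipt.missing_fields())  # []
print(json.dumps(asdict(receipt), indent=2))
```

Serializing the receipt next to the rollout keeps the provenance reviewable after the fact, and the completeness check gives a simple gate before a rollout is used downstream.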

The receipt should also keep failure modes separate. A perceptual tokenizer error is not the same as action marginalization. Action marginalization is not the same as scene divergence. If a deployment collapses them into a single "hallucination" label, it loses the ability to fix the right stage.
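Keeping the modes separate can be as simple as giving each one its own label, score, and pipeline stage. The following sketch assumes a deployment already has one score per mode; the thresholds, score values, and stage descriptions are hypothetical choices for illustration.

```python
from enum import Enum

class HallucinationMode(Enum):
    PERCEPTUAL = "perceptual"
    ACTION_MARGINALIZED = "action-marginalized"
    SCENE_DIVERGING = "scene-diverging"

# Which part of the pipeline to inspect for each mode, following the
# taxonomy described in this article.
STAGE_TO_INSPECT = {
    HallucinationMode.PERCEPTUAL: "encoder-decoder (tokenizer)",
    HallucinationMode.ACTION_MARGINALIZED: "dynamics model action conditioning",
    HallucinationMode.SCENE_DIVERGING: "multi-step rollout",
}

def flagged_modes(scores, thresholds):
    """Return every mode whose score exceeds its own threshold."""
    return [m for m in HallucinationMode if scores[m] > thresholds[m]]

# Made-up scores and thresholds for a single rollout.
scores = {
    HallucinationMode.PERCEPTUAL: 0.02,
    HallucinationMode.ACTION_MARGINALIZED: 0.40,
    HallucinationMode.SCENE_DIVERGING: 0.10,
}
thresholds = {mode: 0.25 for mode in HallucinationMode}

for mode in flagged_modes(scores, thresholds):
    print(mode.value, "->", STAGE_TO_INSPECT[mode])
# action-marginalized -> dynamics model action conditioning
```

A single boolean "hallucinated" flag would have reported the same rollout as bad without saying that the tokenizer and the rollout horizon looked fine and the action conditioning did not.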

The larger lesson is that simulated worlds can become persuasive before they become trustworthy. A rendered future should not be treated as evidence unless the institution can show where its training data covered the state-action path, where it did not, and what happened when the gap was probed.

Sources

Nicklas Hansen and Xiaolong Wang, Hallucination in World Models is Predictable and Preventable, arXiv:2606.27326 [cs.LG; cs.CV; cs.RO], submitted June 25, 2026.
arXiv HTML: Hallucination in World Models is Predictable and Preventable, reviewed for dataset construction, architecture, hallucination taxonomy, predictors, mitigation results, release details, and limitations.
arXiv PDF: Hallucination in World Models is Predictable and Preventable, checked against the arXiv record for title, authors, arXiv ID, submission date, categories, and paper status.
Project page: MMBench2 interactive paper, reviewed for code, dataset, model links, demo description, and public resource claims.
Related pages: World Models and Spatial Intelligence, The Generated World Becomes the Training Ground, Yann LeCun's World Model Bet, and The World Model Becomes the Structural Certification Gap.
