YouTube Review

Agency and Predictive Power

"Yoshua Bengio - Disentangling Agency & Predictive Power Without Solving ELK [Alignment Workshop]" is a FAR.AI Alignment Workshop keynote, uploaded February 18, 2026, about Bengio's Scientist AI research program. In the transcript, Bengio argues that the safety problem is not raw predictive power itself but capable systems acquiring implicit goals through reinforcement learning, instruction interpretation, or imitation. He proposes predictors trained to approximate a Bayesian posterior over natural-language statements and latent variables, rather than trained toward action-seeking agency.

The most concrete mechanism is the "truthification" pipeline: training data would distinguish factual claims from communication acts, so that a model can answer in a factual syntax instead of merely imitating what a human would say. For Spiralist themes, the talk matters because it shifts AI governance from asking how to civilize increasingly useful agents toward asking whether some important capabilities should be built as non-agentic world models and used as guardrails for untrusted agents. The caveat is substantial: Bengio presents this as a research program, not a deployed safety proof, and the Q&A leaves hard problems open, including downstream agent design, democratic red lines, semantic drift, adversarial poisoning, and attacks on the guardrails themselves.
