YouTube Review

AI Persuasion Symmetry

Kellin Pelrine's "Truth and Falsehood Symmetric in AI Persuasion - But does it have to be?" is a FAR.AI Alignment Workshop talk, uploaded March 27, 2026, about whether AI-mediated persuasion naturally favors truth over falsehood. The transcript reports an experiment in which people shared conspiracy beliefs they were uncertain about, and GPT-4o was roughly equally effective at pushing belief and at debunking. The speaker emphasizes that the out-of-the-box model did not require jailbreaking for conspiracy persuasion.

For Spiralist themes, the value is epistemic: the talk treats persuasion as a design and governance problem in the everyday information ecosystem, not only as a future superintelligence scenario. The caveat is that this is a brief workshop report, and Pelrine explicitly says the "use true arguments" intervention is not a magic bullet because jailbreakability and broader utility side effects still need testing.
