Last reviewed June 25, 2026

The Myth of Artificial Intelligence and the Belief in Inevitable AGI

Erik J. Larson's The Myth of Artificial Intelligence is less a claim that powerful AI is impossible than a warning about inevitability as a story. The book argues that contemporary AI success in narrow, data-rich domains should not be mistaken for a known path to general intelligence, and that the gap matters because hype reorganizes research, investment, culture, procurement, and public trust.

The myth, in this review, is not "AI does not matter." The myth is a claim-laundering pattern: useful systems, fluent interfaces, benchmark progress, agent demos, and investor confidence are converted into a claimed destination, namely humanlike general intelligence as the natural outcome of the current road. Larson's value is to make that leap visible before it hardens into procurement, policy, and theology-by-product-roadmap. The practical discipline is to keep demos, deployments, governance claims, and future forecasts in separate evidence classes.

The safety test is simple: what authority is being requested now, and what evidence actually supports that authority? A benchmark can justify curiosity, a field evaluation can justify a bounded deployment, and a safety case can justify a release decision. None of those by itself proves that the present system understands, deserves autonomy, or sits on a known road to AGI.

The Book

The Myth of Artificial Intelligence: Why Computers Can't Think the Way We Do was published by The Belknap Press of Harvard University Press in 2021. Library and journal listings put the book at roughly 320 pages, with hardcover ISBN 9780674983519 and paperback ISBN 9780674278660. Its subjects include artificial intelligence, intellect, inference, logic, natural language processing, and neuroscience.

Larson writes as a computer scientist and natural-language-processing researcher, but the book is also a philosophy-of-science argument. It moves through Turing, the Dartmouth lineage, the superintelligence tradition, machine learning, big data, Charles Sanders Peirce, abductive inference, language understanding, neuroscience, and the temptation to treat statistical success as a general theory of mind.

The title is easy to misread. Larson is not saying that all machine learning is fake, useless, or trivial. He is saying that a culture has formed around the assumption that today's path will scale into human-level intelligence. That assumption, not the existence of useful AI, is the myth under review.

Current Context

As of June 25, 2026, Larson's book has to be read after the mass adoption of conversational AI, multimodal models, coding assistants, retrieval systems, and tool-using agents. The public evidence is mixed in exactly the way the book prepares the reader to notice: capability gains are real, frontier benchmarks have moved sharply, and deployment has widened, while understanding, reliability, provenance, data quality, incident handling, evaluation validity, and agent authorization remain separate questions.

Stanford HAI's 2026 AI Index frames the current moment as a widening gap between what AI systems can do and how prepared institutions are to govern, evaluate, and understand them. The 2026 International AI Safety Report uses a similar discipline from another angle: general-purpose AI capabilities continue to improve, but the capabilities are jagged, and an evaluation gap separates pre-deployment benchmark performance from real-world utility and risk. Those sources weaken both lazy dismissal and inevitability rhetoric. They say: look harder, classify claims more carefully, and keep evidence attached to context.

This is the current value of The Myth of Artificial Intelligence. It does not settle whether future systems could become more general. It gives a way to resist a live institutional shortcut: turning today's performance, capital expenditure, model-release cadence, and agent demos into a presumed endpoint. The stronger the systems become, the more important it is to keep capability, deployment fitness, decision authority, and long-run forecasts in different ledgers.

The agent layer makes that shortcut more dangerous. A text model that answers a question is one governance object; a tool-using system that can call APIs, edit files, message users, move data, or trigger transactions is another. NIST's 2026 AI Agent Standards Initiative is useful here because it treats autonomous action as a standards problem around security, authentication, identity, interoperability, and evaluation. Larson's warning becomes concrete: do not let a story about future intelligence substitute for present permission boundaries.

The Myth of Inevitability

The book's central target is inevitability. In Larson's account, public AI discourse often treats AGI as if it were already on a road with mile markers: more data, more compute, larger models, better benchmarks, and eventually general intelligence. The road metaphor is the problem. It turns an open scientific question into a schedule.

This matters because inevitability is not a neutral prediction. It changes behavior. Investors fund the presumed road. Researchers align with the dominant road. Companies advertise proximity to the road. Regulators are told to adapt to the road. Critics are treated as people who merely dislike the future. A speculative belief becomes an institutional organizing principle.

Larson's warning is especially useful after the generative-AI boom. Large language models made the myth more plausible to many people because language is the medium through which humans usually recognize intelligence. A fluent answer feels closer to mind than a high-scoring classifier does. The danger is that fluency can make the roadmap feel proven before the underlying problem has been solved.

A useful way to operationalize the warning is a claim ladder. A demo may support a capability claim. A benchmark may support a limited measurement claim. A pilot may support a workflow claim. A safety case may support a bounded deployment claim. None of those, by itself, supports the civilizational claim that current methods are on a known path to general intelligence. Hype works by moving evidence up the ladder without saying that a move has happened.
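The ladder can be written down as a small lookup, which makes the unannounced move easy to spot. The following is a minimal sketch, not a method from Larson's book: the names Claim, SUPPORTS, and moved_up_the_ladder are hypothetical, and the strict top-to-bottom ordering of the rungs is an assumption made for illustration.

```python
from enum import IntEnum


class Claim(IntEnum):
    # Rungs in the order the paragraph lists them (ordering is an assumption).
    CAPABILITY = 1
    LIMITED_MEASUREMENT = 2
    WORKFLOW = 3
    BOUNDED_DEPLOYMENT = 4
    KNOWN_PATH_TO_GENERAL_INTELLIGENCE = 5


# Highest claim each kind of evidence may support on its own.
SUPPORTS = {
    "demo": Claim.CAPABILITY,
    "benchmark": Claim.LIMITED_MEASUREMENT,
    "pilot": Claim.WORKFLOW,
    "safety_case": Claim.BOUNDED_DEPLOYMENT,
}


def moved_up_the_ladder(evidence_kind: str, asserted: Claim) -> bool:
    """Return True when the asserted claim sits above what the evidence supports."""
    return asserted > SUPPORTS[evidence_kind]


# A benchmark cited for the civilizational claim has been moved up the ladder.
print(moved_up_the_ladder("benchmark", Claim.KNOWN_PATH_TO_GENERAL_INTELLIGENCE))
# A pilot cited for a workflow claim has not.
print(moved_up_the_ladder("pilot", Claim.WORKFLOW))
```

No entry in the lookup reaches the top rung, which mirrors the point that none of these evidence types, by itself, supports the civilizational claim.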

The myth has three recurring moves. First, conflation: different systems inherit the same aura because they are all called "AI." Second, extrapolation: local progress in scaling, data, tools, or benchmarks is treated as proof of a single destination. Third, authorization: the forecast is used to justify present decisions about budgets, law, labor, surveillance, education, and public infrastructure. Larson's book is most useful when read as a detector for those moves.

The governance version is a future-claim receipt. Any AGI-adjacent claim that is meant to steer policy or spending should name the mechanism, evidence base, time horizon, uncertainty, disconfirming evidence, affected institution, and action it is being used to justify. A forecast that cannot survive that receipt may still be interesting. It should not silently become an emergency, a procurement requirement, or a reason to bypass ordinary accountability.

The receipt should also name the claim's rung. Observation says what a system did in a measured setting. Deployment evidence says what happened in a real workflow. Mechanism says why the result should generalize. Forecast says what might happen later. Authority request says what money, permission, secrecy, or dependence is being asked for now. Hype works by presenting a lower rung as if it had already earned a higher one.
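One way to make the receipt concrete is as a record with required fields and a completeness check. This is a sketch under stated assumptions: the class and field names (FutureClaimReceipt, Rung, missing_fields) are hypothetical labels for the items this review lists, not an established schema from any standard or from the book.

```python
from dataclasses import dataclass, fields
from enum import Enum


class Rung(Enum):
    OBSERVATION = "observation"
    DEPLOYMENT_EVIDENCE = "deployment evidence"
    MECHANISM = "mechanism"
    FORECAST = "forecast"
    AUTHORITY_REQUEST = "authority request"


@dataclass
class FutureClaimReceipt:
    # Hypothetical field names for the items the receipt should name.
    rung: Rung
    mechanism: str = ""
    evidence_base: str = ""
    time_horizon: str = ""
    uncertainty: str = ""
    disconfirming_evidence: str = ""
    affected_institution: str = ""
    action_justified: str = ""


def missing_fields(receipt: FutureClaimReceipt) -> list[str]:
    """List the text fields left blank, in declaration order."""
    return [
        f.name
        for f in fields(receipt)
        if f.name != "rung" and not getattr(receipt, f.name).strip()
    ]


receipt = FutureClaimReceipt(
    rung=Rung.FORECAST,
    mechanism="scaling continues to yield gains",
    time_horizon="five years",
    action_justified="multi-year compute contract",
)
print(missing_fields(receipt))
```

A non-empty result does not make the forecast false. It marks the claim as one that has not yet earned the right to steer a budget or a policy.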

The Inference Problem

The technical spine of the book is inference. Larson distinguishes deduction, induction, and abduction. Deduction moves from rules to consequences. Induction generalizes from patterns. Abduction, associated with Peirce, concerns the generation of explanatory hypotheses: the move from surprising facts toward a possible explanation that would make them intelligible.

That third kind of reasoning is Larson's pressure point. Much of modern AI is powerful at induction under favorable conditions: finding statistical structure in large datasets, mapping inputs to outputs, and improving when the training distribution matches the task. But human understanding often depends on context-sensitive explanatory judgment. People notice what matters, ignore what does not, repair ambiguity, infer motive, change frames, and form live hypotheses in situations that were not already carved into the training problem.

The value of this argument is not that "abduction" magically names everything missing from AI. It is that Larson refuses to let pattern recognition stand in for understanding without further argument. The book keeps asking whether a system has only learned regularities or whether it can decide what sort of situation it is in.

That distinction matters for governance even when the philosophy remains unsettled. A system can perform useful induction and still lack the situated judgment needed for unsupervised authority. If the task requires asking what the case is a case of, identifying missing context, generating explanations, or knowing when no available frame fits, then a benchmark score should be treated as a clue, not as a delegation warrant.

Language Without Understanding

Natural language is where the argument bites hardest. Language is not just strings arranged by probability. It is entangled with situation, memory, shared background, goals, social roles, implication, repair, reference, and practical action. A person often understands an utterance by asking what world would make that utterance sensible.

Current systems can produce astonishing text, code, images, audio, plans, and tool calls while still failing in ways that reveal weak grounding, brittle context, fabricated evidence, shallow causal models, unsafe delegation, or misplaced confidence. The public then faces two errors at once. One error is dismissal: treating statistical systems as mere toys despite their real power. The other is inflation: treating polished output as evidence that the deeper problem of understanding has been crossed off.

Larson is strongest when he holds those two errors apart. He allows that narrow AI can be economically and socially transformative while denying that transformation proves general intelligence is on a known track. A system can reshape work, education, search, care, law, and media without being a mind in the human sense. That is precisely why institutions need clearer language. Social impact and humanlike understanding are different claims.

Hype as Belief Formation

The book belongs on this site's shelf because it treats AI hype as a cognitive and institutional process, not only as marketing noise. Hype supplies a storyline in which every new capability becomes evidence for the same conclusion. Failures become temporary obstacles. Scale becomes destiny. The future starts to feel already decided.

That pattern is familiar from media theory and cult dynamics. The classic study is When Prophecy Fails, by Leon Festinger, Henry Riecken, and Stanley Schachter, which followed a 1950s flying-saucer group through the night its predicted apocalypse did not arrive. Rather than disbanding, the most committed members proselytized harder, having reframed the failure as proof that their devotion had saved the world. Festinger named the engine cognitive dissonance, the mind's drive to resolve the gap between belief and fact in favor of the belief it has already paid for. A belief system becomes resilient exactly when disconfirming evidence can be absorbed as delay, persecution, insufficient effort, or a need for greater commitment. In AI, the language is secular and technical, but the structure can be similar. Benchmarks, demos, funding rounds, leaderboards, investor letters, safety forecasts, and science-fiction images all feed a shared sense that the next step is obvious.

Larson's useful corrective is humility about unknowns. If there is no known algorithm for general intelligence, then honest governance should preserve uncertainty instead of laundering it into product roadmaps. The same humility should apply in the opposite direction: no one should declare all future machine intelligence impossible. The disciplined position is narrower and harder: current success does not prove the myth of inevitability.

Governance and Safety

The governance implication is that institutions should not buy or regulate systems as if the AGI story has already settled what the system is. A procurement claim should say what the tool does now, under what conditions, with what evidence, for which users, with which failure modes, and under whose authority. "AI" is not a safety case. "Humanlike" is not a metric. "On the path to AGI" is not an operational requirement.

Current risk frameworks make this discipline practical. NIST's AI Risk Management Framework is intended to help organizations incorporate trustworthiness into the design, development, use, and evaluation of AI systems, and its core functions are govern, map, measure, and manage. NIST's Generative AI Profile, released in July 2024 and updated in April 2026, treats generative AI as a cross-sector lifecycle risk problem. ISO/IEC 42001:2023 supplies a management-system standard for establishing, implementing, maintaining, and improving organizational controls around AI. These sources do not answer Larson's philosophical question. They help keep institutional claims tied to evidence, process, responsibility, and review.

Law and standards are also pulling inflated claims into harder records. Article 55 of the EU AI Act requires providers of general-purpose AI models with systemic risk to run model evaluations, including documented adversarial testing, assess and mitigate systemic risks, report serious incidents, and maintain cybersecurity protection. NIST's 2026 AI Agent Standards Initiative focuses on agent standards, interoperability, security, authentication, identity, and evaluations for systems capable of autonomous action. Neither source proves or disproves AGI. Both show what governance looks like when AI claims leave the slide deck and enter infrastructure.

For public agencies, OMB Memorandum M-25-21 gives a U.S. federal example of the same evidence discipline: high-impact AI uses require pre-deployment testing, AI impact assessments, ongoing monitoring, human oversight, remedies or appeals where appropriate, inventories, and discontinuation when minimum practices are not met. The memo is not a general law for every organization, and it does not solve Larson's theory of intelligence. It does show what a present-tense authority claim must leave behind when AI affects rights, safety, or access to services.

A hype-resistant safety program should separate capability evaluation, deployment evaluation, safety cases, authority decisions, and future speculation. Capability evaluation asks what the model can do under tested conditions. Deployment evaluation asks what happens when the system enters a workflow with users, incentives, adversaries, data drift, appeal needs, and affected people. A safety case makes a bounded argument that a specific system, version, scaffold, tool set, and operating context are acceptably safe for a defined decision. Authority decisions say what the system may actually do. Future speculation should be labeled as forecast or scenario, not silently converted into budget, emergency authority, or procurement inevitability.

The practical controls are concrete: dated evaluations, model and system identifiers, documented intended use, out-of-distribution tests, red-team results, source-grounding checks, uncertainty disclosure, incident reporting, human oversight with authority, user appeal paths, rollback criteria, and clear limits on agent permissions. The higher the stakes for rights, care, employment, education, public evidence, or delegated action, the less acceptable it is to substitute AGI-adjacent language for demonstrated fitness.

That also changes procurement language. A request for proposal should not ask vendors to bring "advanced AI" or "near-human reasoning." It should ask for versioned evidence: the task boundary, baseline comparison, test population, failure rate, monitoring plan, security controls, human-review authority, appeal process, and shutdown trigger. If a vendor cannot state those things, the missing evidence is itself a risk signal.
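The versioned-evidence request can be checked mechanically before anyone reads the vendor's narrative. The sketch below assumes a vendor response arrives as a simple dictionary; the key names and the missing_evidence helper are hypothetical, chosen to match the items listed above rather than any real procurement template.

```python
# Evidence items a request for proposal should ask for (key names are illustrative).
REQUIRED_EVIDENCE = (
    "task_boundary",
    "baseline_comparison",
    "test_population",
    "failure_rate",
    "monitoring_plan",
    "security_controls",
    "human_review_authority",
    "appeal_process",
    "shutdown_trigger",
)


def missing_evidence(response: dict) -> list[str]:
    """Return the required items the vendor left out or left blank."""
    missing = []
    for item in REQUIRED_EVIDENCE:
        value = response.get(item)
        if value is None or not str(value).strip():
            missing.append(item)
    return missing


# Hypothetical vendor response that answers only three of the nine items.
vendor_response = {
    "task_boundary": "triage of inbound support tickets, English only",
    "failure_rate": "4% misroutes on the held-out test set",
    "monitoring_plan": "",
    "security_controls": "role-based access, audit logging",
}

gaps = missing_evidence(vendor_response)
print(f"{len(gaps)} of {len(REQUIRED_EVIDENCE)} evidence items missing: {gaps}")
```

The check only confirms that something was stated for each item. Whether the stated evidence is adequate remains a human review question.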

The same rule applies to AI-safety communication. A lab warning, benchmark result, model card, policy memo, or scenario paper may be important. It still has to keep verbs straight: observed, forecast, simulated, evaluated, mitigated, reported, required, prohibited, and believed are not interchangeable. Much of the public confusion around AI comes from treating those verbs as if they all meant "proven."

Where the Book Needs Updating

The book appeared in 2021, before ChatGPT's November 30, 2022 research release made large language models a mass interface and before multimodal models, agent frameworks, code assistants, tool-use systems, and enterprise copilots became ordinary reference points. That timing makes some passages feel pre-shock. Larson anticipated the cultural pattern, but he did not write with the full social evidence of 2023-2026 in view.

The 2026 International AI Safety Report is useful background for that update because it treats general-purpose AI through current capabilities, emerging risks, and risk-management approaches rather than through a single destiny story. Stanford HAI's 2026 AI Index adds the same lesson from a measurement angle: capability gains are real, measurement is difficult, reporting practices vary, and independent evaluation matters. That is the right frame for Larson now: the field has advanced sharply, but advances still have to be sorted by evidence type, operational context, safeguards, and remaining uncertainty.

Some critical reviews also note that the book can underplay the political economy of narrow AI. Even if narrow systems are not on a clean path to AGI, they can still concentrate power, automate discipline, intensify surveillance, reshape labor, and make opaque decisions at scale. A critique of AGI inevitability should not become a reason to relax about deployed systems that already govern people's lives.

The other limit is strategic. If Larson is right that abduction and human understanding remain unsolved, what institutions should be built around that fact? The book argues for scientific humility, but the governance agenda has to go further: procurement rules, disclosure standards, audit rights, labor protections, education redesign, safety evaluation, and public language that distinguishes capability from comprehension.

What This Changes

The practical lesson is to separate five claims that are often collapsed into one: AI can perform a task; AI can perform it reliably in a given workflow; AI should be delegated authority there; AI is socially powerful; AI is on a known path to humanlike intelligence. The first and fourth can be true while the second, third, and fifth remain unproved. Confusing them gives both companies and critics too much mythic material to work with.

For AI users, the question is not simply whether a system is "intelligent." Ask what kind of inference it is performing, where its training distribution ends, what social role the interface is inviting, what uncertainty has been hidden, and who benefits when a prediction is described as understanding. For institutions, ask whether the AI plan depends on an actual demonstrated capability or on faith that scale will soon solve the missing parts.

For public language, the rule is stricter: do not use future-machine vocabulary to smuggle present institutional authority. Say observed, evaluated, forecast, simulated, deployed, delegated, authorized, monitored, or believed. Each verb points to a different record and a different burden of proof. The myth of inevitability grows when those records are collapsed into one confident story.

The Myth of Artificial Intelligence is valuable because it slows the reader down at the exact point where the culture wants acceleration. It makes the future less automatic. That does not make it comforting. A future shaped by powerful narrow systems, inflated belief, and weak public vocabulary is still dangerous. But danger is easier to govern when it is named accurately.

Source Discipline

This review separates book metadata, philosophical background, later reviews, product history, measurement context, and governance context. Harvard University Press and KIT's library catalog are used for bibliographic details and subject metadata. The Stanford Encyclopedia of Philosophy is used for Peirce and abduction. Stanford HAI is used for the Turing Test framing and the 2026 AI Index measurement context. OpenAI's 2022 announcement is used only to date the ChatGPT public research release and to note early limitations, not as proof of general intelligence. NIST, ISO, OMB, EU AI Act sources, and the International AI Safety Report define risk-management, standards, and regulatory vocabulary; they do not prove that any deployment is safe or that any AGI forecast is correct.

The interpretive claim is deliberately narrow: Larson's book is strongest as a critique of inevitability, not as a settled impossibility theorem. This article makes no claim that an AI system is conscious, divine, or AGI. It treats AGI, superintelligence, and singularity claims as contested scenarios that require evidence, mechanism, and governance before they are allowed to steer institutions.
