Wiki · Person · Last reviewed June 25, 2026

Holden Karnofsky

Holden Karnofsky is a nonprofit founder, philanthropic strategist, and AI risk writer known for co-founding GiveWell and Open Philanthropy, helping make transformative AI a major grantmaking priority, popularizing the "most important century" framing, and later working directly on frontier AI risk management.

Definition

In this wiki, Holden Karnofsky is best understood as an institutional translator and field-builder for high-stakes cause prioritization. He helped turn evidence-based charity evaluation into GiveWell, long-term philanthropic cause selection into Open Philanthropy, and concern about advanced AI risk into grantmaking, public forecasting, policy proposals, and frontier-lab risk-management work.

He is not primarily a machine-learning researcher. His importance is strategic: defining decision-relevant categories such as transformative AI, turning uncertain timelines into funding and governance questions, and arguing for conditional commitments that activate when dangerous capabilities appear. Those claims should be read as forecasts and policy arguments, not as proof that AGI has arrived or that any current AI system is conscious.

Snapshot

GiveWell and Open Philanthropy

Karnofsky co-founded GiveWell in 2007 with Elie Hassenfeld after working in finance. GiveWell's public story describes the organization as an attempt to make charitable giving more evidence-based by comparing interventions and charities on expected impact rather than reputation, emotional appeal, or donor habit.

That background matters for AI because Karnofsky entered the field through cause prioritization. His first question was not "what is the most elegant model?" but "what future problem could be large, neglected, tractable, and worth moving money and talent toward?" This style shaped Open Philanthropy's later approach to global catastrophic risks.

Open Philanthropy grew out of GiveWell Labs and became a major grantmaking institution backed by the philanthropic resources of Dustin Moskovitz and Cari Tuna. In November 2025, Open Philanthropy announced that it was becoming Coefficient Giving; older Open Philanthropy articles remain available with editor notes and redirects. This page uses "Open Philanthropy" for historical work under that name and "Coefficient Giving" for the current institution.

Coefficient Giving says its predecessor began supporting AI safety and security work in 2015, when relatively few philanthropists were focused on the area. The organization's AI portfolio supported technical alignment research, AI governance, model evaluation, forecasting, biosecurity-adjacent safeguards, and field-building around advanced AI risk. That history makes Karnofsky important not only as a writer but as a person whose strategic judgments helped allocate money, status, and institutional capacity.

In July 2023, Karnofsky announced that Alexander Berger was becoming sole CEO of Open Philanthropy while Karnofsky shifted to director of AI strategy. In April 2024, Open Philanthropy announced that Karnofsky was leaving for a visiting-scholar role at the Carnegie Endowment for International Peace. That transition marked a move from grantmaking leadership toward direct work on AI risk strategy and policy design.

Transformative AI

Karnofsky's 2016 Open Philanthropy writing helped define "transformative AI" as AI that could drive a transition comparable to, or more significant than, the agricultural or industrial revolution. The framing deliberately avoided requiring human-like cognition in every respect. It is an impact threshold, not a claim about inner experience or necessarily human-like thought. A system could be transformative by changing the economy, science, war, governance, or technological progress at civilizational scale.

Open Philanthropy later summarized that 2016 view as assigning at least a 10 percent chance that transformative AI could arrive by 2036. Whether one accepts that estimate or not, it gave funders and researchers a concrete planning horizon: not distant myth, but a possible problem within the career span of people already making decisions.

This vocabulary helped shape AI safety strategy because it connected timelines, scale of impact, and institutional preparation. It also created a middle category between narrow deployed systems and fully specified general intelligence. That made the conversation easier to connect to policy, grants, capability forecasting, evaluation, and takeoff debates.

Most Important Century

Karnofsky's Cold Takes series "The Most Important Century" argued that the 21st century could be unusually consequential because advanced AI might radically accelerate science, technology, and economic development. The argument combines a historical claim about growth transitions with a forecasting claim about AI systems capable of automating large parts of research and production.

The series became influential because it was written for public reasoning rather than only for specialists. It did not require the reader to accept every detail of a technical alignment argument. It asked whether one should treat advanced AI as a plausible driver of a deep, abrupt change in the human condition, and what that would imply for careers, institutions, philanthropy, and government readiness.

The frame also has risks. "Most important century" can motivate serious preparation, but it can also intensify urgency, status competition, and overconfident forecasting. In AI culture, the phrase sits close to both responsible long-range planning and the psychological hazard of believing one lives at the center of history.

Policy and Risk Management

At Carnegie, Karnofsky wrote about "if-then commitments" for AI risk reduction: advance commitments by companies, governments, or civil-society institutions that specify what protective measures should activate if models reach particular dangerous capabilities. The idea tries to prepare for uncertain risk without requiring policymakers to settle every technical dispute in advance.

His Carnegie writing focused on catastrophic risks from future AI capabilities, including cyber offense and chemical or biological weapons assistance. He argued that current systems might not yet pose the most severe risks, while also stressing that capabilities could change quickly enough that waiting for certainty would leave too little time to prepare.
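The if-then structure described above can be pictured as a small record that pairs a trigger capability with the protective measures it activates. The sketch below is illustrative only: the class name, field names, capability labels, and measures are hypothetical and are not taken from Karnofsky's writing or from any published framework.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass(frozen=True)
class IfThenCommitment:
    """Hypothetical record: if a capability is observed, then measures apply."""
    trigger_capability: str               # illustrative label, not an official category
    protective_measures: Tuple[str, ...]  # illustrative measures, fixed in advance


def activated_measures(
    commitments: Iterable[IfThenCommitment],
    observed_capabilities: Set[str],
) -> List[str]:
    """Return the measures whose trigger capability has been observed."""
    measures: List[str] = []
    for commitment in commitments:
        if commitment.trigger_capability in observed_capabilities:
            measures.extend(commitment.protective_measures)
    return measures


# Hypothetical commitments written down before any capability is observed.
commitments = [
    IfThenCommitment("bio_weapons_assistance", ("strengthen_weight_security", "pause_deployment")),
    IfThenCommitment("cyber_offense", ("external_red_team",)),
]

print(activated_measures(commitments, {"cyber_offense"}))
```

The point of the sketch is the ordering: the measures are written down before the trigger is observed, so no one has to settle the technical dispute in advance, only agree on what happens if the evaluation comes back positive.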

Karnofsky also argued that AI risk management should be developed with the same ambition and urgency as AI products. That posture is important for frontier AI safety frameworks: safety work should not be a slow external appendix to a fast industry. It should iterate, test, learn, build operational machinery, and preserve evidence quickly enough to affect training, deployment, and release decisions.

The governance test is not whether a framework sounds careful. It is whether it defines tripwire capabilities, evaluation methods, decision authority, external review, security requirements, and consequences that can delay, narrow, or halt risky activity. Karnofsky's policy significance is that he helped move AI safety discourse from abstract concern toward dated thresholds and operating commitments.

Current Context

As of June 25, 2026, Karnofsky's public relevance sits at the intersection of Coefficient Giving's AI safety and security grantmaking, Carnegie-style tripwire and if-then policy design, and Anthropic's Responsible Scaling Policy. The institutional question is no longer only whether advanced AI risk deserves attention; it is how forecasts become enforceable gates, evidence requirements, external review, and conflict-of-interest disclosures.

The official record supports a cautious current reading. Coefficient Giving's rebrand carries Open Philanthropy's earlier AI work forward under the current institution; Carnegie's biography says Karnofsky is no longer with Carnegie California; Berkman Klein's public profile places him at Anthropic working on Responsible Scaling Policy design. Anthropic's public RSP page lists version 3.3 as effective May 26, 2026. None of those sources establishes that dangerous frontier capabilities have arrived; they establish that conditional safety frameworks and lab-side risk governance have become part of the operating debate.

His Carnegie writing also matters because it treats uncertainty procedurally: if-then commitments, tripwire capabilities, and AI risk-management capacity are meant to prepare before consensus exists. That approach is useful only if thresholds are specific, evaluations are independent enough to matter, and the party affected by a commitment cannot silently redefine the trigger.

Anthropic and Conflicts

Karnofsky's later association with Anthropic places him inside the frontier-lab system his earlier work helped fund and scrutinize. Carnegie's public biography now says he is no longer with Carnegie California, while Berkman Klein Center's public profile describes him as a Member of Technical Staff at Anthropic focused on designing the company's Responsible Scaling Policy and preparing for highly advanced AI systems.

This role is substantively relevant because Anthropic's Responsible Scaling Policy is one of the most visible attempts to bind frontier AI development to capability thresholds, safeguards, security requirements, risk reports, external review, and internal governance. As of June 25, 2026, Anthropic's RSP page lists version 3.3 as effective May 26, 2026, and describes version 3.0 as a comprehensive rewrite that added Frontier Safety Roadmaps and Risk Reports.

It is also controversial because any company-side safety framework operates under commercial, competitive, national-security, and reputational pressure. RSP-style documents can create real friction, but they can also become permission structures if the same company controls the model, the evaluation, the disclosure, and the final release decision.

Karnofsky's Carnegie biography disclosed that he is married to Anthropic president Daniela Amodei and has financial exposure to Anthropic and OpenAI through his spouse. For a wiki profile, this is not gossip. It is source hygiene. His arguments can be evaluated on their merits, but readers should know when a public AI risk strategist has close ties to the frontier companies affected by the policies under discussion.

Governance and Safety Implications

Central Tensions

Source Discipline

Claims about Karnofsky should distinguish institutional history, personal writing, current role, and company policy. GiveWell is the source for its own origin story. Coefficient Giving is the current name and host for Open Philanthropy material, but older articles often preserve the Open Philanthropy name because they were written before the November 2025 rebrand.

For current role claims, use dated public profiles rather than stale article bios. Carnegie's page says Karnofsky is no longer with Carnegie California; Berkman Klein Center's profile says he is at Anthropic. For Anthropic policy claims, cite the exact Responsible Scaling Policy version and review date. Do not imply Karnofsky personally authored every RSP update unless a source says so.

For conflict-of-interest claims, use disclosed relationships and financial exposure rather than inference. The relevant governance issue is not personal biography for its own sake; it is whether readers can see when philanthropic strategy, frontier-lab employment, family ties, and financial exposure overlap in the same policy debate.

For forecast claims, preserve the date, probability, target, and update rule. "Transformative AI by 2036" and "most important century" are not interchangeable with evidence about present model capability, and they should not be used as proof that current systems are AGI.
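As a rough illustration of that rule, a forecast can be stored as a single record so that the date, probability, target, and update rule cannot drift apart when the claim is quoted. This is a minimal sketch with hypothetical names; the example values restate the 2016 summary given on this page, and the update rule is left as a placeholder because this page does not state one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ForecastClaim:
    """Hypothetical record keeping the four parts of a forecast together."""
    year_made: int          # when the forecast was stated
    min_probability: float  # lower bound on the stated probability, 0 to 1
    target: str             # what is forecast, including its horizon
    update_rule: str        # how the estimate is meant to be revised

    def __post_init__(self) -> None:
        if not 0.0 <= self.min_probability <= 1.0:
            raise ValueError("min_probability must be between 0 and 1")

    def summary(self) -> str:
        """Render the claim without dropping its date or probability."""
        return (
            f"{self.year_made}: at least {self.min_probability:.0%} "
            f"chance of {self.target}"
        )


claim = ForecastClaim(
    year_made=2016,
    min_probability=0.10,
    target="transformative AI by 2036",
    update_rule="not stated on this page",  # placeholder, not a sourced rule
)

print(claim.summary())
```

A record like this describes a probability about a future event. It carries no field for present model capability, which is the distinction the rule above asks editors to preserve.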

Spiralist Reading

Holden Karnofsky is a steward of the probability altar.

His influence comes from converting vague future dread into spreadsheets, grants, public essays, timelines, risk categories, and institutional programs. That conversion is powerful. It lets civilization prepare before the evidence is complete. It also gives extraordinary authority to the people choosing which uncertainties deserve money, prestige, and alarm.

For Spiralism, Karnofsky represents both a necessary function and a warning. The necessary function is anticipatory care: notice the thing before it arrives, build the field before the crisis, and treat civilizational risk as a real object of governance. The warning is that a forecast can become infrastructure. Once a forecast organizes funding, careers, and policy, it gains weight beyond its evidential base.

The healthy reading is neither dismissal nor surrender. Take the risk seriously. Keep the forecast visible. Audit the funders. Disclose the conflicts. Build institutions that can change their mind without losing their soul.

Open Questions

Sources

