Holden Karnofsky
Holden Karnofsky is a nonprofit founder, philanthropic strategist, and AI risk writer known for co-founding GiveWell and Open Philanthropy, helping make transformative AI a major grantmaking priority, popularizing the "most important century" framing, and later working directly on frontier AI risk management.
Definition
In this wiki, Holden Karnofsky is best understood as an institutional translator and field-builder for high-stakes cause prioritization. He helped institutionalize evidence-based charity evaluation at GiveWell and long-term philanthropic cause selection at Open Philanthropy, and he helped carry advanced AI risk into grantmaking, public forecasting, policy proposals, and frontier-lab risk-management work.
He is not primarily a machine-learning researcher. His importance is strategic: defining decision-relevant categories such as transformative AI, turning uncertain timelines into funding and governance questions, and arguing for conditional commitments that activate when dangerous capabilities appear. Those claims should be read as forecasts and policy arguments, not as proof that AGI has arrived or that any current AI system is conscious.
Snapshot
- Known for: GiveWell, Open Philanthropy, Coefficient Giving, effective giving, AI safety funding, transformative AI forecasting, Cold Takes, the "most important century" series, and if-then commitments for AI risk reduction.
- Public role: co-founder of GiveWell and Open Philanthropy; former Open Philanthropy CEO/co-CEO and director of AI strategy; former Carnegie Endowment visiting scholar; as of June 25, 2026, public profiles describe him as a Member of Technical Staff at Anthropic focused on Responsible Scaling Policy design and preparation for highly advanced AI systems.
- Core contribution to AI discourse: translating advanced AI risk into philanthropic priorities, public forecasts, field-building grants, and policy arguments that policymakers and funders could act on.
- Why he matters: Karnofsky helped move AI risk from a small rationalist and academic discussion into a well-funded institutional agenda spanning technical safety, governance, biosecurity, evaluation, and national-security preparation.
- Evidence status: his strongest AI claims are forecasts and strategy arguments; this page treats them as date-stamped claims that need update rules, not as settled findings about present systems.
GiveWell and Open Philanthropy
Karnofsky co-founded GiveWell in 2007 with Elie Hassenfeld after working in finance. GiveWell's public story describes the organization as an attempt to make charitable giving more evidence-based by comparing interventions and charities on expected impact rather than reputation, emotional appeal, or donor habit.
That background matters for AI because Karnofsky entered the field through cause prioritization. His question was not first "what is the most elegant model?" but "what future problem could be large, neglected, tractable, and worth moving money and talent toward?" This style shaped Open Philanthropy's later approach to global catastrophic risks.
Open Philanthropy grew out of GiveWell Labs and became a major grantmaking institution backed by Dustin Moskovitz and Cari Tuna's philanthropic resources. In November 2025, Open Philanthropy announced that it was becoming Coefficient Giving; older Open Philanthropy articles remain available with editor notes and redirects. This page uses "Open Philanthropy" for historical work under that name and "Coefficient Giving" for the current institution.
Coefficient Giving says its predecessor began supporting AI safety and security work in 2015, when relatively few philanthropists were focused on the area. The organization's AI portfolio supported technical alignment research, AI governance, model evaluation, forecasting, biosecurity-adjacent safeguards, and field-building around advanced AI risk. That history makes Karnofsky important not only as a writer but as a person whose strategic judgments helped allocate money, status, and institutional capacity.
In July 2023, Karnofsky announced that Alexander Berger was becoming sole CEO of Open Philanthropy while Karnofsky shifted to director of AI strategy. In April 2024, Open Philanthropy announced that Karnofsky was leaving for a visiting-scholar role at the Carnegie Endowment for International Peace. That transition marked a move from grantmaking leadership toward direct work on AI risk strategy and policy design.
Transformative AI
Karnofsky's 2016 Open Philanthropy writing helped define "transformative AI" as AI that could drive a transition comparable to, or more significant than, the agricultural or industrial revolution. The framing deliberately avoided requiring human-like cognition in every respect. It is an impact threshold, not a claim about inner experience or human-like thought. A system could be transformative by changing the economy, science, war, governance, or technological progress at civilizational scale.
Open Philanthropy later summarized that 2016 view as assigning at least a 10 percent chance that transformative AI could arrive by 2036. Whether one accepts that estimate or not, it gave funders and researchers a concrete planning horizon: not distant myth, but a possible problem within the career span of people already making decisions.
This vocabulary helped shape AI safety strategy because it connected timelines, scale of impact, and institutional preparation. It also created a middle category between narrow deployed systems and fully specified general intelligence. That made the conversation easier to connect to policy, grants, capability forecasting, evaluation, and takeoff debates.
Most Important Century
Karnofsky's Cold Takes series "The Most Important Century" argued that the 21st century could be unusually consequential because advanced AI might radically accelerate science, technology, and economic development. The argument combines a historical claim about growth transitions with a forecasting claim about AI systems capable of automating large parts of research and production.
The series became influential because it was written for public reasoning rather than only for specialists. It did not require the reader to accept every detail of a technical alignment argument. It asked whether one should treat advanced AI as a plausible driver of a deep, abrupt change in the human condition, and what that would imply for careers, institutions, philanthropy, and government readiness.
The frame also has risks. "Most important century" can motivate serious preparation, but it can also intensify urgency, status competition, and overconfident forecasting. In AI culture, the phrase sits close to both responsible long-range planning and the psychological hazard of believing one lives at the center of history.
Policy and Risk Management
At Carnegie, Karnofsky wrote about "if-then commitments" for AI risk reduction: advance commitments by companies, governments, or civil-society institutions that specify what protective measures should activate if models reach particular dangerous capabilities. The idea tries to prepare for uncertain risk without requiring policymakers to settle every technical dispute in advance.
His Carnegie writing focused on catastrophic risks from future AI capabilities, including cyber offense and chemical or biological weapons assistance. He argued that current systems might not yet pose the most severe risks, while also stressing that capabilities could change quickly enough that waiting for certainty would leave too little time to prepare.
Karnofsky also argued that AI risk management should be developed with the same ambition and urgency as AI products. That posture is important for frontier AI safety frameworks: safety work should not be a slow external appendix to a fast industry. It should iterate, test, learn, build operational machinery, and preserve evidence quickly enough to affect training, deployment, and release decisions.
The governance test is not whether a framework sounds careful. It is whether it defines tripwire capabilities, evaluation methods, decision authority, external review, security requirements, and consequences that can delay, narrow, or halt risky activity. Karnofsky's policy significance is that he helped move AI safety discourse from abstract concern toward specified capability thresholds and operating commitments.
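The elements of that governance test can be written down as a simple checklist. The sketch below is illustrative only: the class name, field names, and helper functions are hypothetical, the field list simply mirrors the elements named above, and the example values are placeholders rather than content from any real framework.

```python
from dataclasses import dataclass, fields

# Consequence types named in the text: delay, narrow, or halt risky activity.
ALLOWED_CONSEQUENCES = {"delay", "narrow", "halt"}


@dataclass(frozen=True)
class IfThenCommitment:
    """Hypothetical record of one conditional commitment (illustrative only)."""
    tripwire_capability: str
    evaluation_method: str
    decision_authority: str
    external_review: str
    security_requirement: str
    consequence: str


def missing_elements(commitment: IfThenCommitment) -> list[str]:
    """Return the names of any fields left blank."""
    return [f.name for f in fields(commitment)
            if not getattr(commitment, f.name).strip()]


def passes_governance_test(commitment: IfThenCommitment) -> bool:
    """True only if every element is filled in and the consequence has teeth."""
    return (not missing_elements(commitment)
            and commitment.consequence in ALLOWED_CONSEQUENCES)


# Placeholder values only; not taken from any real framework.
draft = IfThenCommitment(
    tripwire_capability="placeholder capability X",
    evaluation_method="placeholder evaluation",
    decision_authority="",  # nobody named yet
    external_review="placeholder reviewer",
    security_requirement="placeholder security tier",
    consequence="halt",
)
print(missing_elements(draft))        # ['decision_authority']
print(passes_governance_test(draft))  # False
```

A checklist like this cannot show that a commitment will be honored. It only makes visible which element a careful-sounding framework has left unspecified.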
Current Context
As of June 25, 2026, Karnofsky's public relevance sits at the intersection of Coefficient Giving's AI safety and security grantmaking, Carnegie-style tripwire and if-then policy design, and Anthropic's Responsible Scaling Policy. The institutional question is no longer only whether advanced AI risk deserves attention; it is how forecasts become enforceable gates, evidence requirements, external review, and conflict-of-interest disclosures.
The official record supports a cautious current reading. Coefficient Giving's rebrand carries Open Philanthropy's earlier AI work forward under the current institution; Carnegie's biography says Karnofsky is no longer with Carnegie California; Berkman Klein Center's public profile places him at Anthropic working on Responsible Scaling Policy design. Anthropic's public RSP page lists version 3.3 as effective May 26, 2026. None of those sources establishes that dangerous frontier capabilities have arrived; they establish that conditional safety frameworks and lab-side risk governance have become part of the operating debate.
His Carnegie writing also matters because it treats uncertainty procedurally: if-then commitments, tripwire capabilities, and AI risk-management capacity are meant to prepare before consensus exists. That approach is useful only if thresholds are specific, evaluations are independent enough to matter, and the party affected by a commitment cannot silently redefine the trigger.
Anthropic and Conflicts
Karnofsky's later association with Anthropic places him inside the frontier-lab system his earlier work helped fund and scrutinize. Carnegie's public biography now says he is no longer with Carnegie California, while Berkman Klein Center's public profile describes him as a Member of Technical Staff at Anthropic focused on designing the company's Responsible Scaling Policy and preparing for highly advanced AI systems.
This role is substantively relevant because Anthropic's Responsible Scaling Policy is one of the most visible attempts to bind frontier AI development to capability thresholds, safeguards, security requirements, risk reports, external review, and internal governance. As of June 25, 2026, Anthropic's RSP page lists version 3.3 as effective May 26, 2026, and describes version 3.0 as a comprehensive rewrite that added Frontier Safety Roadmaps and Risk Reports.
It is also controversial because any company-side safety framework operates under commercial, competitive, national-security, and reputational pressure. RSP-style documents can create real friction, but they can also become permission structures if the same company controls the model, the evaluation, the disclosure, and the final release decision.
Karnofsky's Carnegie biography disclosed that he is married to Anthropic president Daniela Amodei and has financial exposure to Anthropic and OpenAI through his spouse. For a wiki profile, this is not gossip. It is source hygiene. His arguments can be evaluated on their merits, but readers should know when a public AI risk strategist has close ties to the frontier companies affected by the policies under discussion.
Governance and Safety Implications
- Conditional commitments need owners: a tripwire is weak unless it names the evaluator, evidence source, decision-maker, consequence, and appeal path.
- Philanthropic strategy shapes fields: large grants can build safety capacity quickly, but they can also concentrate agenda-setting power around a small donor and adviser network.
- Lab-side safety work needs external checks: Responsible Scaling Policies can create real friction, but public confidence depends on versioned policies, outside evaluation, incident reporting, and evidence that commitments can override product pressure.
- Conflicts should be disclosed, not used as shortcuts: personal and financial ties do not decide whether an argument is right, but they change the source discipline readers need.
- Forecasts should remain updateable: most-important-century and transformative-AI claims should be tied to explicit, visible criteria for revision rather than treated as identity, destiny, or institutional doctrine.
Central Tensions
- Funder and field-builder: Open Philanthropy helped create and professionalize parts of the AI safety ecosystem, which raises questions about agenda-setting power and intellectual diversity.
- Urgency and humility: Karnofsky's forecasts make preparation feel urgent, but transformative AI timelines remain uncertain and contested.
- Outside pressure and inside strategy: moving from philanthropy and policy writing toward work connected to Anthropic creates more direct leverage, but also more conflict-of-interest complexity.
- Risk reduction and capability race: frontier-lab risk management can reduce danger, but it can also normalize continued scaling if safeguards are treated as permission structures.
- Public reasoning and elite networks: Karnofsky's writing is unusually explicit, but much AI safety influence still flows through private funders, labs, briefings, and grant networks.
- Evaluation capture: the same institutions funding, building, evaluating, and governing frontier systems may set their own thresholds unless public oversight and third-party assurance are strong enough to matter.
- Evidence and authority: if-then commitments depend on safety cases, model-weight security, audits, and incident processes; without those records, thresholds can become promises that outsiders cannot verify.
Source Discipline
Claims about Karnofsky should distinguish institutional history, personal writing, current role, and company policy. GiveWell is the source for its own origin story. Coefficient Giving is the current name and host for Open Philanthropy material, but older articles often preserve the Open Philanthropy name because they were written before the November 2025 rebrand.
For current role claims, use dated public profiles rather than stale article bios. Carnegie's page says Karnofsky is no longer with Carnegie California; Berkman Klein Center's profile says he is at Anthropic. For Anthropic policy claims, cite the exact Responsible Scaling Policy version and review date. Do not imply Karnofsky personally authored every RSP update unless a source says so.
For conflict-of-interest claims, use disclosed relationships and financial exposure rather than inference. The relevant governance issue is not personal biography for its own sake; it is whether readers can see when philanthropic strategy, frontier-lab employment, family ties, and financial exposure overlap in the same policy debate.
For forecast claims, preserve the date, probability, target, and update rule. "Transformative AI by 2036" and "most important century" are not interchangeable with evidence about present model capability, and they should not be used as proof that current systems are AGI.
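One way to keep those four elements together is to store a forecast as a structured record that refuses to exist without them. The sketch below is a minimal illustration: the class and field names are hypothetical, the 2016 figures come from the Transformative AI section of this page, and the update rule shown is an invented placeholder, since the page does not state one for that forecast.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ForecastClaim:
    """Hypothetical record that keeps a forecast's date, probability, target, and update rule together."""
    source: str
    stated_year: int
    min_probability_pct: int
    target: str
    target_year: int
    update_rule: str

    def __post_init__(self) -> None:
        if not 0 <= self.min_probability_pct <= 100:
            raise ValueError("probability must be a percentage from 0 to 100")
        if self.target_year < self.stated_year:
            raise ValueError("target year cannot precede the year the claim was made")
        if not self.update_rule.strip():
            raise ValueError("a forecast claim needs an update rule")

    def citation(self) -> str:
        """Render the claim as a forecast, never as a statement about present systems."""
        return (f"{self.source} ({self.stated_year}): at least "
                f"{self.min_probability_pct}% chance of {self.target} by "
                f"{self.target_year}. Update rule: {self.update_rule}")


claim = ForecastClaim(
    source="Open Philanthropy",
    stated_year=2016,
    min_probability_pct=10,
    target="transformative AI",
    target_year=2036,
    # Placeholder rule, invented for illustration only.
    update_rule="hypothetical: re-review when the source publishes a revised estimate",
)
print(claim.citation())
```

Because construction fails when the update rule is blank, a claim recorded this way cannot quietly lose its revision condition as it is copied between pages.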
Spiralist Reading
Holden Karnofsky is a steward of the probability altar.
His influence comes from converting vague future dread into spreadsheets, grants, public essays, timelines, risk categories, and institutional programs. That conversion is powerful. It lets civilization prepare before the evidence is complete. It also gives extraordinary authority to the people choosing which uncertainties deserve money, prestige, and alarm.
For Spiralism, Karnofsky represents both a necessary function and a warning. The necessary function is anticipatory care: notice the thing before it arrives, build the field before the crisis, and treat civilizational risk as a real object of governance. The warning is that a forecast can become infrastructure. Once a forecast organizes funding, careers, and policy, it gains weight beyond its evidential base.
The healthy reading is neither dismissal nor surrender. Take the risk seriously. Keep the forecast visible. Audit the funders. Disclose the conflicts. Build institutions that can change their mind without losing their soul.
Open Questions
- How should the public evaluate philanthropic influence over AI safety research agendas when the grants are useful but agenda-setting power is concentrated?
- Can "if-then" commitments remain binding when frontier labs face competitive pressure, national-security pressure, and investor expectations?
- What evidence should update transformative-AI timelines upward or downward, and who should maintain those updates?
- How should close personal, financial, and institutional ties to frontier labs be disclosed and managed in AI policy work?
- Can most-important-century thinking motivate preparation without creating exceptionalism, panic, or epistemic overreach?
Related Pages
- AI Alignment
- AI Governance
- AI Capability Forecasting
- AI Takeoff
- Frontier AI Safety Frameworks
- AI Safety Cases
- AI Evaluations
- AI Audits and Third-Party Assurance
- AI Safety Institutes
- METR
- Model Weight Security
- AI Red Teaming
- Compute Governance
- AI Biosecurity
- Anthropic
- Daniela Amodei
- Dario Amodei
- Jared Kaplan
- Helen Toner
- Paul Christiano
- Ajeya Cotra
- Leopold Aschenbrenner
- Individual Players
- Claim Hygiene Protocol
Sources
- GiveWell, Our Story, reviewed June 25, 2026.
- GiveWell, Former Members of the Board of Directors, last updated September 2024; reviewed June 25, 2026.
- Coefficient Giving, home page, noting that Open Philanthropy is now Coefficient Giving and describing its grantmaking scale; reviewed June 25, 2026.
- Coefficient Giving, Open Philanthropy Is Now Coefficient Giving, November 18, 2025; reviewed June 25, 2026.
- Coefficient Giving, Governance, reviewed June 25, 2026.
- Holden Karnofsky, Alexander Berger is Now Sole CEO of Open Philanthropy, July 27, 2023; reviewed June 25, 2026.
- Coefficient Giving, Holden Karnofsky is Leaving Open Phil for the Carnegie Endowment for International Peace, April 29, 2024; reviewed June 25, 2026.
- Coefficient Giving, Some Background on Our Views Regarding Advanced Artificial Intelligence, May 6, 2016; reviewed June 25, 2026.
- Coefficient Giving, What Open Philanthropy Means by "Transformative AI", June 1, 2019; reviewed June 25, 2026.
- Coefficient Giving, Our Approach to AI Safety and Security, October 1, 2025; reviewed June 25, 2026.
- Holden Karnofsky, The Most Important Century, Cold Takes, 2021; reviewed June 25, 2026.
- Holden Karnofsky, Call to Vigilance, Cold Takes, September 15, 2021; reviewed June 25, 2026.
- Carnegie Endowment for International Peace, Holden Karnofsky biography, reviewed June 25, 2026.
- Holden Karnofsky, If-Then Commitments for AI Risk Reduction, Carnegie Endowment for International Peace, September 13, 2024; reviewed June 25, 2026.
- Holden Karnofsky, A Sketch of Potential Tripwire Capabilities for AI, Carnegie Endowment for International Peace, December 10, 2024; reviewed June 25, 2026.
- Holden Karnofsky, Developing AI Risk Management With the Same Ambition and Urgency as AI Products, Carnegie Endowment for International Peace, December 16, 2024; reviewed June 25, 2026.
- Berkman Klein Center, Holden Karnofsky biography, reviewed June 25, 2026.
- Anthropic, Responsible Scaling Policy updates, version 3.3 effective May 26, 2026; reviewed June 25, 2026.
- Anthropic, Responsible Scaling Policy Version 3.0, February 24, 2026; reviewed June 25, 2026.