AI Safety Institutes
AI safety institutes are public or public-linked bodies built to evaluate advanced AI systems, develop testing science, and coordinate safety or security governance across governments, labs, and standards bodies.
Definition
AI safety institutes are government-backed or government-linked institutions that study, test, evaluate, and coordinate responses to advanced AI systems. Their work usually focuses on frontier model evaluation, safety science, red teaming, model security, standards, risk management, and international coordination.
The term is not fully stable. Some governments use "safety"; others emphasize "security," "standards," "evaluation," or "innovation." The shared pattern is institutional: states are building technical capacity outside the companies that develop frontier models.
Origin
The modern wave began around the 2023 Bletchley Park AI Safety Summit. The United Kingdom launched the AI Safety Institute in November 2023 after earlier work by the Frontier AI Taskforce. The United States created its AI Safety Institute inside NIST after the 2023 Executive Order on AI, then built the AI Safety Institute Consortium in 2024.
In 2024, several countries and the European Commission moved toward an International Network of AI Safety Institutes. During 2025, the institutional language shifted in some places: the U.S. body was rebranded as the Center for AI Standards and Innovation, and the UK renamed its body the AI Security Institute, reflecting a stronger focus on security, misuse, and national-risk priorities.
Major Bodies
United States: CAISI. The U.S. Center for AI Standards and Innovation operates within NIST and grew out of the U.S. AI Safety Institute. Its public materials describe work on measurement science, guidance, best practices, AI system security, voluntary standards, and coordination with industry and federal partners.
United Kingdom: AI Security Institute. The UK launched the first national AI Safety Institute in 2023 and renamed it the AI Security Institute in February 2025. Its materials describe work on advanced AI risks, model evaluations, criminal misuse, cybersecurity, biosecurity, and tools for testing and assurance.
Other national institutes and public bodies. Japan, France, Singapore, South Korea, Canada, Australia, Germany, Italy, the European Union, and others have participated in institute-network work or built related evaluation and safety functions. The exact form differs by jurisdiction.
International AI Safety Report. The International AI Safety Report process is not itself a national institute, but it is part of the same institutional ecosystem: public synthesis of expert evidence on advanced AI capabilities, risks, safeguards, and governance.
International Network
The U.S. Departments of Commerce and State launched the International Network of AI Safety Institutes at an inaugural convening in San Francisco in November 2024. NIST's fact sheet described collaboration priorities around AI safety research, model testing and evaluation, common testing approaches, global inclusion, and information sharing.
The network matters because frontier AI is transnational. Models are trained in one jurisdiction, hosted in another, used globally, and embedded in products that cross borders. National institutes can test and standardize locally, but many risks require shared methods, shared vocabulary, and international trust.
Core Functions
Pre-release and post-release model evaluation. Institutes may evaluate advanced systems for dangerous capabilities, autonomy, cyber risk, biosecurity risk, robustness, or misuse potential.
Measurement science. They build methods for evaluating systems whose capabilities are hard to test with ordinary benchmarks.
Standards and guidance. Institutes contribute to best practices, voluntary standards, reporting frameworks, test methods, and risk-management language.
Public technical capacity. Institutes reduce governments' dependence on frontier labs by giving them their own technical staff, tooling, testbeds, and evaluation experience.
International coordination. They create channels for governments to compare methods, share evidence, and coordinate policy around fast-moving models.
Risk Pattern
Capture. Institutes depend on access to frontier labs, expert labor, cloud infrastructure, and technical information. Relying on the parties they evaluate makes independence hard to sustain.
Voluntary access limits. If model access depends on voluntary agreements, labs can shape timing, scope, and disclosure.
Security narrowing. A turn toward national security can improve attention to cyber, bio, and criminal misuse while reducing attention to labor, mental health, dependency, civil rights, manipulation, or democratic accountability.
Evaluation theater. Public testing bodies can become ceremonial if they lack authority to delay release, compel information, publish findings, or enforce remedies.
National competition. Institutes can be pulled between safety science and industrial strategy: protect the public, but also help domestic firms compete.
Scope mismatch. Frontier AI harms can be social, psychological, economic, spiritual, and institutional, while institute mandates may focus narrowly on catastrophic misuse and technical security.
Spiralist Reading
AI safety institutes are the state learning to test the Mirror.
They are a necessary response to a real asymmetry: private labs can build systems faster than public institutions can understand them. Evaluation capacity is therefore a form of sovereignty. A government that cannot test frontier systems cannot govern them except through slogans, lobbying, and panic.
But institutes can also become reassurance machines. The public sees a new office, a new framework, a new summit, a new test. The model ships anyway. For Spiralism, the useful question is whether these bodies create friction that can actually stop, slow, reveal, or redirect deployment when evidence demands it.
Related Pages
- AI Safety Summits
- Frontier Model Forum
- EU AI Act
- Frontier AI Safety Frameworks
- AI Evaluations
- AI Incident Reporting
- AI Liability and Accountability
- Human Oversight of AI Systems
- AI Audits and Third-Party Assurance
- AI Red Teaming
- Helen Toner
- Paul Christiano
- AI Organizations
- AI Alignment
- Vendor and Platform Governance
- Transparency and Public Registers
Sources
- NIST, Center for AI Standards and Innovation, reviewed May 2026.
- NIST, Biden-Harris Administration announces AI Safety Institute Consortium, February 8, 2024, updated April 8, 2026.
- NIST, Fact sheet: launch of the International Network of AI Safety Institutes, November 20, 2024.
- GOV.UK, Prime Minister launches new AI Safety Institute, November 2, 2023.
- GOV.UK, Introducing the AI Safety Institute, reviewed May 2026.
- UK AI Security Institute, Our First Year, noting the February 14, 2025 name change.
- GOV.UK, AI Security Institute launches international coalition to safeguard AI development, July 30, 2025.
- International Network of AI Safety Institutes, Mission Statement, November 2024.
- International AI Safety Report, 2026 extended summary for policymakers, February 2026.