The Regulatory Sandbox Becomes the Exception Machine
AI sandboxes can teach regulators how systems behave before rules harden around guesses. They can also turn temporary exceptions into a quiet path around public law.
For this essay, a regulatory sandbox is a bounded legal-learning process: a regulator allows a defined system, use case, participant, rule question, testing period, and safeguard set to operate under supervision so public law can learn without surrendering its protective purpose. The exception machine appears when that bounded process becomes a reusable path for regulatory relief, private validation, or market access without equivalent public evidence, affected-person safeguards, and exit discipline.
Why Sandboxes Now
The regulatory sandbox has become one of the favored institutional answers to artificial intelligence.
The phrase sounds modest. Put the system in a controlled environment. Let innovators test. Let regulators observe. Learn before writing rigid rules. Avoid blocking useful systems because old statutes were written for old technology. Avoid letting dangerous systems scale before anyone understands them. In a field where technical behavior, business models, and social harms are all moving quickly, that bargain has obvious appeal.
As of June 25, 2026, the European Union had built AI regulatory sandboxes directly into the AI Act while moving sandbox changes through the AI Omnibus. The current Regulation frames the sandbox as a controlled environment for developing, training, testing, and validating innovative AI systems for a limited time under an agreed plan, and Article 57 says Member States must ensure that at least one national AI regulatory sandbox is operational by August 2, 2026, either alone or jointly with other Member States. The Omnibus track changed the implementation picture: Council and Parliament negotiators reached a provisional agreement on May 7, 2026, and Parliament gave final approval on June 16, 2026 by 423 votes to 57, with 174 abstentions.
A current-status update is now necessary. On June 29, 2026, the Council gave its final green light to the Digital Omnibus on AI. The Council release says the new regulation postpones the national AI regulatory sandbox deadline until August 2, 2027, and that the legislative act will be published in the Official Journal and enter into force on the third day after publication. The disciplined reading is therefore to keep four artifacts separate: the existing EUR-Lex AI Act text, the December 2025 draft sandbox implementing act, the adopted Digital Omnibus text awaiting Official Journal publication and entry into force, and later consolidated legal text.
The underlying EU mechanism still matters. Article 57 says providers can use sandbox documentation in conformity assessment and that authorities should identify risks to fundamental rights, health, and safety. Article 58 requires Commission implementing acts on detailed arrangements; the Commission published a draft implementing act for feedback in December 2025, with feedback closing on January 13, 2026. The draft says participation does not grant approval, market authorization, or a presumption of conformity by itself. That sentence should be the moral center of every sandbox: supervised learning is not public permission.
U.S. states are moving in a different but related direction. Utah created an Office of Artificial Intelligence Policy and an AI Learning Laboratory model aimed at regulatory relief and policy learning; its public process describes applications, consultation, mitigation agreements, safeguards, reporting duties, and ongoing monitoring. Texas's Responsible Artificial Intelligence Governance Act, signed in 2025 and effective January 1, 2026, creates a regulatory sandbox program allowing approved participants to test AI systems with legal protection and limited market access without ordinary licenses or regulatory authorizations, subject to approval, oversight, quarterly reports, and limits on what can be waived.
Outside AI-specific law, the United Kingdom's Medicines and Healthcare products Regulatory Agency is running the AI Airlock for AI as a Medical Device. Its June 2026 update says the pilot and second phase have completed, reports are available, phase 3 design is in progress, and the program is intended to inform future MHRA guidance and policy. On June 9, 2026, MHRA also announced a separate AI sandbox for medicines development and safety, showing how quickly the sandbox label is spreading across regulated life-sciences domains. The UK Government's AI Growth Lab call for evidence, updated in December 2025 and closed in January 2026, went further by proposing a cross-economy sandbox with live-market pilots, targeted regulatory modifications, time-limited exemptions, careful monitoring, public and parliamentary scrutiny, and non-waivable red lines around protections such as consumer rights, safety, fundamental rights, workers' protections, and intellectual property. Singapore's AI Verify Foundation and IMDA have turned a Global AI Assurance Pilot into an ongoing Global AI Assurance Sandbox, described as a testing ground for generative-AI applications rather than the underlying foundation models. The Organisation for Economic Co-operation and Development treats AI sandboxes as a serious governance tool, especially for regulatory learning, interoperability, eligibility criteria, competition effects, and supervised experimentation.
The pattern is now clear: when institutions do not know how to regulate a technology, they build a smaller room and call it learning.
What a Sandbox Does
A sandbox is not just a pilot. It is a negotiated exception with a learning theory.
A regulatory sandbox should be distinguished from a technical sandbox. A technical sandbox isolates code, files, network access, tools, or execution permissions. A regulatory sandbox isolates legal uncertainty. It defines who may test, what rule is unclear or temporarily relaxed, what safeguards apply, which regulator is watching, what records are kept, what happens if harm appears, and whether the results become public learning. Participation is not certification unless the governing law explicitly makes it so.
The exception can take several forms: interpretive guidance, temporary non-enforcement, limited market access, permission to test under modified process, a supervised evidence pathway, data-processing flexibility, or coordination among regulators who would otherwise see only fragments of the system. Those are different legal objects. A serious sandbox names which one is being used.
A sandbox should also name what is not being relaxed. If civil-rights law, product-safety duties, privacy rights, complaint channels, professional duties, or liability pathways remain intact, the record should say so. If any of them are modified, the record should say that too. Regulatory uncertainty is not a license to make the boundary vague.
In the strongest version, a company or public body enters with a defined system, use case, risk profile, testing plan, data boundary, user population, reporting duty, and exit condition. Regulators observe what happens, identify which rules are unclear or poorly fitted, require mitigations, collect evidence, and convert that evidence into guidance, standards, enforcement priorities, or legislative repair.
That can be valuable. AI systems often fail at the boundary between model behavior and institutional use. A medical triage model, lending model, hiring assistant, educational tutor, policing tool, or public-benefits chatbot cannot be understood only from a benchmark score. Its risk depends on workflow, training data, interface design, fallback paths, user trust, appeal rights, data retention, automation bias, vendor contracts, and the incentives of the organization deploying it.
A sandbox can force those details into view. It can let regulators ask questions before a product has already become infrastructure. Who is affected? What data is used? What decision does the model influence? What happens when it is wrong? Can a person override it? Are users told they are part of a test? Which logs exist? Which harms would stop the test?
But the same structure can become evasive. A sandbox can make the exception more visible than the public. The institution may focus on the provider's need for regulatory certainty while affected people become test conditions. The regulator may become too close to the firms it supervises. A confidential pilot may produce private learning and public legitimacy. The phrase "sandbox" can soften the fact that real people, real data, and real decisions may be involved.
The governance question is therefore not whether sandboxes are good or bad. The question is what kind of institutional memory they create.
Relief Taxonomy
The word "sandbox" hides several different legal moves. The first discipline is to classify the relief before deciding whether the process is legitimate.
- Interpretive relief. The regulator explains how existing duties apply to a novel system, but no duty is waived and no enforcement posture changes.
- Non-enforcement relief. The regulator states that it will not pursue specified violations during a defined test window if the participant stays inside agreed safeguards.
- Process substitution. The participant follows an alternative evidence, reporting, registration, conformity, or procurement path instead of the ordinary process.
- Limited market access. The system may operate for a bounded population, geography, sector, time period, user count, or transaction volume before ordinary authorization.
- Data-use accommodation. The participant receives a defined pathway for processing, reusing, or protecting data that would otherwise be unavailable or harder to use.
- Multi-regulator coordination. Several authorities create one supervised channel without necessarily relaxing any underlying law.
Most programs mix these forms. The EU AI Act emphasizes supervised testing, evidence, and strict conditions for further personal-data processing. Texas HB 149 explicitly protects participants against enforcement of waived laws during the testing period while preserving non-waivable duties. Utah's public materials name exemptions, capped penalties, cure periods, safe harbors, and tailored mitigation agreements. The UK AI Growth Lab proposal describes time-limited exemptions and targeted statutory modifications with red lines. These differences are not administrative trivia. Each form changes who bears risk, who learns, who can object, and whether the result can later be sold as approval.
The Exception Ledger
The smallest useful unit is the exception ledger: a record of what ordinary rule, process, license, evidence duty, or enforcement posture changed because the participant entered the sandbox. Without that ledger, "sandbox" becomes a legal mood rather than a governed exception.
A useful exception ledger should identify the baseline rule, the changed treatment, the legal authority for the change, the affected population, the data and decision boundary, the compensating safeguards, the reporting cadence, the stop rule, the expiry date, the public summary, and the exit outcome. It should also say which protections were not changed. If privacy law, civil-rights duties, clinical responsibility, product safety, consumer remedies, complaint routes, or liability remain fully in force, the record should make that explicit.
The ledger should preserve the delta rather than the mood. It should say: here is the ordinary rule, here is the sandbox change, here is the compensating control, here is the evidence that will decide whether the change expires, narrows, expands, or becomes a general rule. If that delta cannot be written down, the participant has not entered a learning process. It has entered a permission cloud.
The ledger has a second function: it prevents an exception from becoming invisible precedent. If several participants receive similar relief, regulators and legislators should be able to see whether the rule itself needs repair or whether a favored class of firms is being coached around it. That connects the sandbox to AI audit trails, post-market monitoring, data minimization, AI procurement, and governance-document revalidation.
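The ledger fields named in this section can be sketched as a small data structure. The Python below is a minimal illustration, not a schema drawn from any statute or regulator: the class names, field names, and the `missing_fields` helper are all assumptions made for this essay. It shows one way to turn "the delta cannot be written down" into a condition that can be checked.

```python
from dataclasses import dataclass, fields
from datetime import date
from enum import Enum
from typing import List, Optional


class ReliefType(Enum):
    """The six relief forms from the taxonomy (labels are illustrative)."""
    INTERPRETIVE = "interpretive relief"
    NON_ENFORCEMENT = "non-enforcement relief"
    PROCESS_SUBSTITUTION = "process substitution"
    LIMITED_MARKET_ACCESS = "limited market access"
    DATA_USE_ACCOMMODATION = "data-use accommodation"
    MULTI_REGULATOR_COORDINATION = "multi-regulator coordination"


@dataclass
class ExceptionLedgerEntry:
    """One exception: the ordinary rule, the change, and its limits."""
    baseline_rule: str
    changed_treatment: str
    relief_type: Optional[ReliefType]
    legal_authority: str
    affected_population: str
    data_and_decision_boundary: str
    compensating_safeguards: List[str]
    reporting_cadence: str
    stop_rule: str
    expiry_date: Optional[date]
    public_summary: str
    unchanged_protections: List[str]
    exit_outcome: str = ""  # empty until the exit decision is recorded

    def missing_fields(self, at_exit: bool = False) -> List[str]:
        """Return names of fields left empty (None, "", or [])."""
        missing = []
        for f in fields(self):
            # The exit outcome is only expected once the test has ended.
            if f.name == "exit_outcome" and not at_exit:
                continue
            if not getattr(self, f.name):
                missing.append(f.name)
        return missing
```

An entry whose `missing_fields()` list is non-empty at entry is, in the essay's terms, closer to a permission cloud than a learning process. The check only proves that something was written in each field. Whether what was written is adequate remains a human judgment.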
From Fintech to AI
The sandbox idea did not begin with AI.
Financial regulators used sandboxes to handle fintech products that did not fit neatly into inherited categories. The United Kingdom's Financial Conduct Authority opened its regulatory sandbox in 2016 and later described lessons around testing innovation while building consumer-protection safeguards. The OECD's 2023 report on AI regulatory sandboxes points to fintech experience as part of the background, while warning that AI sandboxes raise their own challenges: interdisciplinary expertise, eligibility criteria, competition effects, interoperability, and the difficulty of assessing trials.
The migration from fintech to AI changes the stakes. Financial technology often tests payment flows, credit tools, identity products, compliance software, trading tools, and consumer finance interfaces. Those are already consequential. AI expands the sandbox into a broader set of institutional judgments: diagnosis, education, policing, welfare, hiring, immigration, cybersecurity, scientific discovery, workplace management, and public administration.
AI also changes what it means to test. A conventional product test may ask whether a system works as specified. An AI test often has to ask whether the specification is stable enough to govern behavior at all. Model outputs vary with prompts, users, retrieved context, deployment settings, update cycles, and tool access. In agentic systems, the model may act through browsers, APIs, databases, payment rails, or workplace software. The unit being tested is not only a model. It is a socio-technical arrangement.
That is why an AI sandbox cannot be only a compliance-help desk. It must be a site of institutional inquiry. If the regulator merely tells the provider how to reach market faster, the sandbox becomes acceleration with paperwork. If the regulator learns how a system changes rights, records, incentives, and dependency, the sandbox can improve public law.
The Real-World Problem
The hardest word in sandbox governance is "controlled."
The EU AI Act allows sandboxes to include supervised testing in real-world conditions. It also separately regulates real-world testing of certain high-risk AI systems outside sandboxes, including testing plans, authority approval, registration, oversight, informed consent in relevant cases, incident reporting, and liability. Article 60 limits such testing to the time needed to achieve its objective, capped at six months with one possible six-month extension, and requires that predictions, recommendations, or decisions can be reversed and disregarded. Article 61 specifies informed-consent requirements for real-world testing outside sandboxes. Article 59 permits some further processing of personal data in sandboxes for public-interest AI systems, but only under cumulative conditions such as necessity, risk monitoring, isolated processing environments, deletion rules, documentation, and a published project summary.
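The Article 60 time limit described above is simple date arithmetic, and a short sketch makes the cap concrete. This is an illustration under stated assumptions, not legal advice: the function names are hypothetical, the sketch counts calendar months and clamps to the last day of a shorter month, and how the Regulation actually counts a month is a legal question it does not settle.

```python
import calendar
from datetime import date


def add_months(d: date, months: int) -> date:
    """Add calendar months, clamping the day to the target month's length."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def latest_test_end(start: date, extended: bool = False) -> date:
    """Latest end date: six months, plus one six-month extension if granted."""
    base_window = 6
    extension = 6 if extended else 0
    return add_months(start, base_window + extension)
```

A test starting on March 1, 2026 would, under these assumptions, have to end by September 1, 2026, or by March 1, 2027 with the single extension. The cap is an outer limit. The essay's point stands that testing is limited to the time needed to achieve its objective.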
Those details matter because AI systems often need live conditions to reveal their risks. A hiring model may look fair in a dataset and fail when recruiters trust it too much. A medical AI tool may perform well in validation and fail across local workflows, accents, clinical norms, or post-market drift. A public-service chatbot may answer correctly in a demo and mislead people when they ask desperate, underspecified, multilingual questions about benefits or rights.
Real-world testing can expose these failures. It can also expose people to them.
That is the moral tension. A sandbox is supposed to protect the public from untested systems. But testing itself can become a public-facing intervention. If a model ranks a job applicant, nudges a clinician, answers a tenant, flags a student, routes a patient, or advises a caseworker, the person affected is no longer outside the experiment. They are part of the test surface.
This is where sandbox language can become dangerous. "Pilot" sounds temporary. "Learning laboratory" sounds benign. "Regulatory relief" sounds technical. But for the person whose claim, care, school record, insurance premium, immigration file, or employment opportunity is touched by the system, the test may feel indistinguishable from ordinary authority.
The standard should be simple: if a sandboxed system can materially affect a person, that person needs notice, safeguards, a human path, and a way to contest or exit. That connects sandbox governance to notice and appeal, human oversight, and AI liability and accountability. Regulatory learning cannot be purchased with hidden exposure.
Confidential Learning
Sandboxes sit between public law and private information.
Companies entering a sandbox will often disclose business models, data practices, technical designs, evaluation results, trade secrets, failures, and risk mitigations. Some confidentiality is legitimate. A regulator cannot learn much if every technical disclosure instantly becomes a public exhibit or competitor roadmap. The EU AI Act includes confidentiality protections, and Article 57 makes exit reports publicly available only if both provider and competent authority explicitly agree. Texas HB 149 requires confidentiality for intellectual property, trade secrets, and other sensitive information obtained through the program. Utah's AI regulatory-relief materials describe protected information, but also expect public reporting about pilot phase, benchmarks, disagreement rates, and high-level cases where human reviewers disagree with AI decisions.
But if too much stays confidential, the sandbox becomes a private chapel of public legitimacy. The company gets regulator proximity. The regulator gets insight. The public gets a press release.
That asymmetry matters because sandboxes are not only about one product. They shape future policy. If the learning remains mostly inside regulator-provider channels, then public law may be formed by evidence that affected people, competitors, civil-society groups, researchers, and journalists cannot inspect. The regulator may sincerely learn, but the public cannot tell what was learned, whose risks were counted, which harms appeared, or why rules were later softened or tightened.
Good sandbox governance therefore needs public output even when raw materials remain protected. At minimum, each mature sandbox should publish project categories, selection criteria, test objectives, affected populations, safeguards, aggregate findings, failure patterns, stop conditions, and policy implications. The EU Act's exit reports and project-summary logic point in this direction, but much depends on implementation. A summary that says "no major issues identified" is not enough. A sandbox should make regulatory learning auditable without exposing secrets or personal data.
The public summary should also include a redaction map. It should say which classes of evidence were withheld, why they were withheld, and which regulator, auditor, or authorized reviewer can inspect the unredacted record. Otherwise confidentiality becomes a black box: formally justified, practically impossible to contest.
The public does not need every line of code. It needs to know whether the exception taught the institution something real.
Failure Modes
The first failure mode is exception laundering. A participant receives temporary relief or supervised flexibility, then markets participation as if it were approval, certification, or proof of safety.
The second is affected-person invisibility. The application, regulator agreement, and public summary talk about innovation and compliance, while the people whose records, care, benefits, credit, education, employment, or safety are touched by the test are never named as participants who are owed protection.
The third is regulatory capture by intimacy. The sandbox gives regulators useful technical access, but repeated private collaboration with selected firms changes what regulators notice, whose costs feel urgent, and which safeguards seem practical.
The fourth is confidentiality creep. Trade-secret and personal-data protections are legitimate, but they expand until failure patterns, risk mitigations, affected populations, and policy lessons disappear from public view.
The fifth is pilot drift. A temporary test becomes operational infrastructure because users adapt, staff workflows change, vendors integrate deeper, and stopping the pilot starts to look disruptive.
The sixth is evidence capture. The participant controls the metrics, logs, comparison group, user feedback, incident definitions, or evaluation frame, so the sandbox produces evidence optimized for continuation rather than learning.
The seventh is access inequality. Well-funded firms, incumbents, or politically connected sectors get regulatory coaching and relief, while smaller organizations, public-interest projects, civil-society groups, and affected communities remain outside the learning room.
The eighth is liability fog. When harm occurs, the provider, deployer, regulator, sandbox operator, and public agency each point to the test status as context, and affected people struggle to identify who had the duty to prevent or repair the harm.
The ninth is relief ratchet. Temporary flexibility becomes the new baseline because every later participant can point to earlier relief as fairness precedent. The exception starts as learning and ends as quiet deregulation.
The tenth is entry-result substitution. A participant is admitted because the test is worth learning from, but later treats admission as if the system had passed the test. Entry is a question. Exit is the answer.
The eleventh is jurisdictional laundering. Evidence from one sandbox is carried into another country, sector, procurement process, or product claim without saying that the legal duties, affected population, data rights, professional obligations, and liability rules have changed.
The twelfth is mitigation debt. The test works only because it has small scale, unusually close regulator attention, manual review, special staff training, narrow data access, or hand-built incident response. When the system scales, those temporary supports disappear while the legitimacy claim remains.
The Sandbox Record
The core governance artifact should be a sandbox record: a bounded evidence package that survives the exception. It should connect the application, approval, testing plan, safeguards, data-processing boundary, affected-person notice, incident triggers, regulator communications, test logs, public summaries, exit decision, and follow-up policy change.
That record has two audiences. The confidential version lets regulators inspect technical designs, proprietary details, security evidence, raw evaluation artifacts, and sensitive data handling. The public version lets affected people, journalists, researchers, competitors, legislators, and civil-society groups understand what was tested, under what constraints, what was learned, and what remains uncertain.
Public registers matter here. A sandbox should not be discoverable only through press releases or procurement rumor. At minimum, a public register should identify the participant category, sector, legal uncertainty, testing window, affected-person category, safeguards, contact route, current status, and exit outcome. The same record should also connect to the deployer's AI system inventory and vendor governance file. That connects sandbox governance to transparency and public registers, AI audits and assurance, algorithmic impact assessments, privacy and data stewardship, and AI incident reporting.
The record should also say what participation does not mean. A sandbox can supply evidence for conformity assessment, guidance, enforcement discretion, procurement review, or rulemaking. It should not silently become immunity, endorsement, or a private substitute for public accountability.
For consequential deployments, the record should be portable into later assurance. A hospital, benefits agency, school district, employer, insurer, procurement office, or regulator should be able to tell whether a sandbox result covered the same model version, workflow, population, data boundary, human-review path, and complaint route now being scaled. Otherwise the test artifact becomes a detached credential, which is the same failure pattern described in validity certificates and test artifacts.
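The portability test in the paragraph above amounts to comparing two scopes: what the sandbox covered and what is now being scaled. The sketch below is one hypothetical way to express that comparison. The `SandboxScope` fields mirror the six items the paragraph lists, and every name and example value is an assumption for illustration.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class SandboxScope:
    """The six dimensions a sandbox result is tied to."""
    model_version: str
    workflow: str
    population: str
    data_boundary: str
    human_review_path: str
    complaint_route: str


def scope_drift(tested: SandboxScope, proposed: SandboxScope) -> List[str]:
    """List the dimensions where the proposed deployment differs from the test."""
    return [
        f.name
        for f in fields(SandboxScope)
        if getattr(tested, f.name) != getattr(proposed, f.name)
    ]


def needs_revalidation(tested: SandboxScope, proposed: SandboxScope) -> bool:
    """Any drift at all means the old result does not cover the new use."""
    return bool(scope_drift(tested, proposed))
```

Exact string equality is a deliberately blunt comparison. A real reviewer would have to decide whether a changed model version or a wider population is material, but a blunt check at least forces the question to be asked before the credential is reused.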
Boundary Tests
Before a sandbox begins, the regulator should be able to answer ten boundary questions in writing.
- Legal status. Is the participant receiving a waiver, non-enforcement position, interpretive advice, data-processing pathway, procurement trial, conformity-assessment evidence path, or only technical feedback?
- Participant identity. Who is the provider, deployer, public authority, subcontractor, and responsible human contact for the test?
- System boundary. Which model, interface, dataset, workflow, version, tool access, and user population are inside the sandbox?
- Affected people. Whose rights, care, work, benefits, education, credit, immigration status, safety, or records can be touched, and what notice or consent route applies?
- Baseline and comparator. What ordinary process would happen without the sandbox, and how will benefits, errors, appeals, overrides, and harms be measured against it?
- Evidence and logs. Which evaluations, incidents, complaints, overrides, human decisions, data flows, and model changes must be preserved?
- Regulator capacity. Which expertise, access rights, independence controls, funding, and conflict checks let the regulator verify participant claims rather than merely receive them?
- Stop rules. What performance failure, incident, complaint pattern, privacy breach, bias signal, security issue, or deceptive marketing claim pauses or ends the test?
- Expiry and reuse. When does the relief expire, and what evidence must be refreshed before the same claim can support procurement, certification, scaling, or a new jurisdiction?
- Public learning. What will be published at entry, during the test, and on exit, even if trade secrets and personal data remain protected?
These are not paperwork decorations. They prevent a sandbox from becoming a vague permission zone. If the boundary cannot be written down, the system is not ready for supervised public learning.
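The ten boundary questions can be treated as a pre-entry checklist. The sketch below assumes a simple convention invented for this essay: each question has a short key, and the regulator's written answer is stored as text under that key. The key names and function names are hypothetical.

```python
from typing import Dict, List

# One key per boundary question, in the order the essay lists them.
BOUNDARY_QUESTIONS = (
    "legal_status",
    "participant_identity",
    "system_boundary",
    "affected_people",
    "baseline_and_comparator",
    "evidence_and_logs",
    "regulator_capacity",
    "stop_rules",
    "expiry_and_reuse",
    "public_learning",
)


def unanswered(answers: Dict[str, str]) -> List[str]:
    """Return the questions with no written answer (missing or blank)."""
    return [q for q in BOUNDARY_QUESTIONS if not answers.get(q, "").strip()]


def ready_to_begin(answers: Dict[str, str]) -> bool:
    """A sandbox may begin only when every boundary question is answered."""
    return not unanswered(answers)
```

The check confirms only that an answer exists in writing. It cannot judge whether the answer is honest or sufficient, which is where regulator capacity and independent feedback come in.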
The Governance Standard
A serious AI sandbox should meet a higher standard than supervised acceleration.
First, eligibility should be public and narrow. Sandboxes should prioritize real regulatory uncertainty, public-interest learning, and systems whose risks can be bounded. A product should not enter just because a company wants a faster path to market or a regulator's aura.
Second, the testing plan should name the affected public. It should specify who may be touched by the system, what decisions or recommendations are in scope, what data is processed, what consent or notice is required, what complaint path exists, and what would count as unacceptable harm.
Third, exceptions should be explicit. If a law, license, registration, procedure, or compliance step is waived, suspended, relaxed, or interpreted flexibly, the record should say so. Governance fails when "sandbox" hides which ordinary protections have been set aside.
Fourth, human paths should remain live. People affected by sandboxed systems should have access to human review, correction, appeal, and non-AI alternatives where the system touches rights, benefits, care, employment, education, safety, or public services.
Fifth, data boundaries should be strict. Sandbox data should not quietly become product-training data, marketing evidence, or a reusable private asset without a lawful basis, disclosure, retention limits, and deletion rules.
Sixth, public reporting should be useful. Regulators should publish aggregate findings, not only participant counts. The report should explain what risks appeared, which mitigations worked, which failed, which rules were unclear, what incidents or near misses occurred, what affected people reported, and what policy changes follow.
Seventh, regulators need capacity. A sandbox run by under-resourced staff becomes provider-led education. AI sandboxes require technical, legal, domain, civil-rights, privacy, security, procurement, and human-factors expertise.
Eighth, exit should be a decision, not drift. A sandboxed system should not become ordinary infrastructure merely because the test period ended. Exit should produce one of several explicit outcomes: stop, extend, approve under conditions, require redesign, refer for enforcement, or convert learning into general guidance.
Ninth, marketing should be controlled. A provider should not be allowed to imply that sandbox participation means regulatory approval, safety certification, public endorsement, or general legal immunity. The public-facing claim should match the legal effect.
Tenth, participation should be auditable. Sandboxes should track who applied, who was accepted, who was rejected, which sectors and firm sizes benefited, and whether incumbents or well-funded applicants received disproportionate access. An exception machine can create market power as well as regulatory learning.
Eleventh, the sandbox record should be portable. If the participant later sells the system, enters procurement, seeks certification, scales to another jurisdiction, or suffers an incident, the evidence created inside the sandbox should be usable by the next regulator or public body without depending on vendor memory.
Twelfth, affected people should not lose ordinary rights. A sandbox may modify compliance process for a participant, but it should not erase privacy, civil-rights, consumer-protection, safety, complaint, appeal, or liability pathways for people touched by the test.
Thirteenth, independent feedback should reach the regulator. Affected people, frontline staff, clinicians, teachers, caseworkers, auditors, and civil-society observers should have a path to report harms or near misses that does not depend on the participant filtering the story.
Fourteenth, sandbox evidence should not outlive its scope. A participant should not reuse a stale sandbox result after a model update, workflow change, new data source, broader population, tool integration, or jurisdictional expansion without revalidation.
Fifteenth, permanent reform should require a public bridge. If a sandbox result later supports guidance, a code of practice, procurement policy, statutory amendment, or relaxed rule, the regulator should publish which evidence generalized beyond the pilot and which limits did not.
Sixteenth, red lines should be named before testing starts. Some protections may be modifiable only by legislature, not by sandbox agreement. Consumer redress, safety duties, fundamental rights, worker protections, privacy rights, professional accountability, and intellectual property should never vanish through administrative ambiguity.
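The tenth point, auditable participation, reduces to basic arithmetic once applications are recorded. The sketch below computes acceptance rates by applicant category. The function name and the (category, accepted) record shape are assumptions, and the categories a real program would track (sector, firm size, incumbency) are for the regulator to define.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def acceptance_rates(
    applications: Iterable[Tuple[str, bool]]
) -> Dict[str, float]:
    """Share of applications accepted, per applicant category.

    Each application is a (category, accepted) pair.
    """
    applied: Counter = Counter()
    accepted: Counter = Counter()
    for category, was_accepted in applications:
        applied[category] += 1
        if was_accepted:
            accepted[category] += 1
    # Every category in `applied` has at least one application,
    # so the division is always defined.
    return {category: accepted[category] / applied[category] for category in applied}
```

A gap between categories is a prompt for scrutiny, not proof of favoritism: differences in application quality or in the risks proposed can also produce it. The point of publishing the rates is that outsiders can ask the question at all.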
What This Changes
The sandbox is a small model of the larger AI governance problem.
It promises to hold uncertainty inside a controlled frame. It gives regulators a way to learn from the machine before the machine becomes normal. It gives firms a path through ambiguity. It gives politicians a story that innovation and protection can coexist.
That story can be true. But only if the sandbox remembers that it is an exception, not a separate world.
AI governance is full of softening words: pilot, beta, assistant, copilot, preview, experiment, learning lab, airlock, sandbox. These words reduce panic and make adoption administratively possible. They also blur the moment when a system starts acting on the world. A tool that begins as a test can become a workflow. A workflow can become an expectation. An expectation can become a rule no legislature ever debated.
Recursive reality appears here as policy formation. A regulator creates a test environment to learn how AI behaves. The test environment shapes what evidence is visible. That evidence shapes future rules. Future rules shape which AI systems are built. Those systems then reshape the world the regulator later observes. The sandbox is not outside reality. It is one of the machines producing it.
The right answer is not to reject sandboxes. It is to govern them as exception machines. They should produce public learning, not private permission. They should expose hidden risks, not normalize them. They should make affected people more visible, not turn them into test substrate. They should make law smarter without making law quieter.
A good AI sandbox is a room with windows, logs, exits, and witnesses. A bad one is a door around the law.
Source Discipline
The legal claims in this essay should be read from the strongest source available. EUR-Lex is the operative source for the current EU AI Act; the AI Act Service Desk is useful because it reproduces and explains Articles 57 through 61, but its summaries are not binding law. European Commission consultation pages establish the status of draft implementing acts and feedback windows; they do not prove final adoption unless the final implementing act is published. Council, Commission, and Parliament AI Omnibus pages establish proposal, political agreement, Parliament approval, Council final green light, and pending Official Journal steps; the legal effect still depends on publication and entry into force. Texas Legislature Online is the operative source for HB 149's enrolled text. Utah, MHRA, IMDA, AI Verify, and UK AI Growth Lab pages are program documents: they describe how agencies or foundations structure sandboxes, assurance programs, or proposals; they are not proof that any participant is safe, effective, or approved.
Policy reports need a different reading. The FCA lessons-learned report documents the fintech origin story and live-market safeguard model; it should not be treated as evidence that AI sandboxes work in health, education, employment, public benefits, or policing. The OECD AI sandbox report, OECD.AI analysis, and OECD sandbox toolkit are policy references about design principles, limits, cross-jurisdiction learning, and regulatory experimentation; they are not regulator orders. For any actual sandboxed deployment, the evidence hierarchy should separate statute, regulator agreement, participant application, public summary, test logs, incident reports, affected-person complaints, independent evaluation, and marketing claims. Flattening those artifacts into one "approved by the sandbox" label is the failure this essay is warning about.
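The separation described above can be sketched as a small data structure. This is an illustration under stated assumptions, not a schema any regulator uses: ArtifactKind, Artifact, and separate_by_kind are hypothetical names, and the kinds are simply the categories listed in the preceding paragraph. The only rule the sketch enforces is that every piece of evidence must declare what kind of artifact it is before it can be filed.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class ArtifactKind(Enum):
    """Hypothetical labels, one per artifact type named in the paragraph above."""
    STATUTE = "statute"
    REGULATOR_AGREEMENT = "regulator agreement"
    PARTICIPANT_APPLICATION = "participant application"
    PUBLIC_SUMMARY = "public summary"
    TEST_LOG = "test log"
    INCIDENT_REPORT = "incident report"
    AFFECTED_PERSON_COMPLAINT = "affected-person complaint"
    INDEPENDENT_EVALUATION = "independent evaluation"
    MARKETING_CLAIM = "marketing claim"


@dataclass(frozen=True)
class Artifact:
    kind: ArtifactKind
    description: str


def separate_by_kind(artifacts):
    """Group artifact descriptions by kind; reject anything without a declared kind."""
    grouped = defaultdict(list)
    for artifact in artifacts:
        if not isinstance(artifact.kind, ArtifactKind):
            raise ValueError(f"untyped evidence: {artifact.description!r}")
        grouped[artifact.kind].append(artifact.description)
    return dict(grouped)


# Illustrative entries only; they do not describe a real deployment.
evidence = [
    Artifact(ArtifactKind.TEST_LOG, "supervised run, week 3"),
    Artifact(ArtifactKind.MARKETING_CLAIM, "approved by the sandbox"),
    Artifact(ArtifactKind.INCIDENT_REPORT, "near miss reported by frontline staff"),
]

for kind, items in separate_by_kind(evidence).items():
    print(kind.value, "->", items)
```

Filed this way, the phrase "approved by the sandbox" stays where it belongs, under marketing claims, and cannot be read as if it were a test log or a regulator agreement.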
Current-source claims were checked against primary or regulator sources where possible. Most sources were reviewed as of June 25, 2026; a June 29, 2026 status check was added for the Council's final-green-light notice, which postdates that review date.
Related Pages
- AI Regulatory Sandboxes
- EU AI Act
- Compliance Calendar
- The AI Act Omnibus Becomes the Legitimacy Test
- The State AI Law Becomes the Regulator
- The Standard Becomes the Law
- The Regulatory Context Becomes the Protocol
- The Governance Policy Becomes the Mechanical Gate
- The AI Audit Becomes the Compliance Interface
- The Agent Sandbox Becomes the Airlock
- Transparency and Public Registers
- Privacy and Data
- Risk and Insurance
- AI System Inventory
- AI Change Management
- AI Audit Trails
- AI Post-Market Monitoring
- Data Minimization
- AI Procurement
- AI in Healthcare
- AI in Government and Public Services
- Notice and Appeal
- Human Oversight of AI Systems
- AI Liability and Accountability
- AI Audits and Assurance
- Algorithmic Impact Assessments
- AI Incident Reporting
- The Incident Report Becomes Public Memory
- The Governance Document Becomes a Revalidation Problem
- The Validity Certificate Becomes the Policy Proof
- The Test Artifact Becomes the Governance Object
Sources
- European Union, Regulation (EU) 2024/1689, Artificial Intelligence Act, official text.
- European Union AI Act Service Desk, Article 57: AI regulatory sandboxes, reviewed June 25, 2026.
- European Union AI Act Service Desk, Article 58: Detailed arrangements for, and functioning of, AI regulatory sandboxes, reviewed June 25, 2026.
- European Union AI Act Service Desk, Article 59: Further processing of personal data for developing certain AI systems in the public interest in the AI regulatory sandbox, reviewed June 25, 2026.
- European Union AI Act Service Desk, Article 60: Testing of high-risk AI systems in real world conditions outside AI regulatory sandboxes, reviewed June 25, 2026.
- European Union AI Act Service Desk, Article 61: Informed consent to participate in testing in real world conditions outside AI regulatory sandboxes, reviewed June 25, 2026.
- European Commission, Commission seeks feedback on draft implementing act to establish AI regulatory sandboxes under the AI Act, published December 2, 2025, last updated December 15, 2025.
- Council of the European Union, Artificial Intelligence: Council and Parliament agree to simplify and streamline rules, May 7, 2026, reviewed June 25, 2026.
- European Parliament, AI Act: EP approves simplification measures and "nudifier" app ban, June 16, 2026, reviewed June 25, 2026.
- Council of the European Union, Artificial Intelligence: Council gives final green light to simplify and streamline rules, June 29, 2026, checked for current status.
- European Commission, AI Act implementation and AI Omnibus timeline, reviewed June 25, 2026.
- Texas Legislature Online, HB 149 enrolled text, Texas Responsible Artificial Intelligence Governance Act, 2025.
- Utah Office of Artificial Intelligence Policy, AI Learning Lab, reviewed June 25, 2026.
- Utah Department of Commerce, AI Regulatory Relief Process and AI FAQ, reviewed June 25, 2026.
- UK Medicines and Healthcare products Regulatory Agency, AI Airlock: the regulatory sandbox for AI as a Medical Device, updated June 9, 2026.
- UK Medicines and Healthcare products Regulatory Agency, AI Airlock Sandbox Phase 2 Programme Report, published June 9, 2026.
- UK Medicines and Healthcare products Regulatory Agency, MHRA launches AI sandbox to accelerate medicines development and improve safety, June 9, 2026.
- UK Department for Science, Innovation and Technology, AI Growth Lab call for evidence, updated December 18, 2025, reviewed June 25, 2026.
- Singapore Infocomm Media Development Authority and AI Verify Foundation, Global AI Assurance Sandbox, reviewed June 25, 2026.
- OECD, Regulatory Sandboxes in Artificial Intelligence, OECD Digital Economy Papers No. 356, July 13, 2023.
- OECD, Regulatory Sandbox Toolkit: A comprehensive guide for regulators to establish and manage regulatory sandboxes effectively, 2025.
- OECD.AI, Why AI Sandboxes matter for responsible innovation and public trust, March 18, 2026.
- Financial Conduct Authority, Regulatory sandbox lessons learned report, October 2017.