Wiki · Individual Player · Last reviewed June 23, 2026

Dawn Song

Dawn Song is a computer scientist at UC Berkeley whose work links computer security, privacy-preserving data systems, adversarial machine learning, AI safety and security, decentralized intelligence, and the security problems created when AI agents read untrusted data and act through tools.

Definition

Dawn Song is best understood as a security-and-AI researcher whose central contribution is applying adversarial computer-security thinking to machine learning, privacy-preserving computation, and agentic AI systems. The mechanism of influence is not a single model release; it is a body of research, teaching, tooling, entrepreneurship, and standards-relevant framing that treats AI systems as attackable infrastructure.

Her work matters because modern AI failures are often system failures. A model can perform well on a benchmark while still being vulnerable to poisoned data, backdoors, physical-world adversarial examples, prompt injection, over-privileged tools, privacy leakage, or weak audit trails. Song's research agenda keeps those risks inside the definition of AI safety rather than treating them as ordinary software cleanup.

For this page, "safe and secure AI" should be read as a research and governance program, not a certification claim. A paper, course, benchmark, or standards initiative can improve the evidence base, but deployed systems still need version-specific threat models, evaluations, permissions, monitoring, incident response, and independent review.

Current Context

As of June 23, 2026, Song's public research identity is not only "AI researcher" or "security researcher." UC Berkeley EECS lists her interests as AI safety and security, Agentic AI, deep learning, security and privacy, and decentralization technology. Her homepage lists current research thrusts in AI safety and security and frontier AI for program synthesis and cybersecurity, and recent teaching on large-language-model agents, advanced LLM agents, responsible generative AI, and AI safety foundations.

Berkeley RDI's current site frames the center around responsible decentralized intelligence, AI, Agentic AI, education, research, and entrepreneurship. Its public programs, including agentic-AI courses and the AgentX-AgentBeats competition, place evaluation, benchmarks, and agent infrastructure near the center of Song's institutional context. RDI's participation and funding metrics should be cited as self-published institutional metrics, not audited evidence of field adoption.

That emphasis now sits inside a broader standards context. NIST launched its AI Agent Standards Initiative in February 2026 to support interoperable and secure agentic systems, with pillars around industry-led standards, community-led protocols, and research into agent authentication, identity infrastructure, and security evaluations. NIST's adversarial machine learning taxonomy also classifies attacks by lifecycle stage, attacker goals, attacker capabilities, and attacker knowledge. Song's work is one academic route into the same practical question: how to keep systems that read untrusted context and take actions accountable, bounded, and secure.

Her 2025 AI2050 Senior Fellowship, listed by Schmidt Sciences under the hard problem of assurance, and her 2025 election to the American Academy of Arts and Sciences mark the same transition in public recognition: AI security and privacy are no longer side topics. They are now central to whether AI agents, data economies, and AI-assisted cybersecurity tools can be deployed with evidence rather than faith.

Security and Privacy Foundations

Song's route into AI runs through computer security. Her early career focused on building systems that remain secure under adversarial pressure, a framing that later became central to machine learning itself. Before current debates about prompt injection, data poisoning, jailbreaks, model theft, and agent permissions, Song's work already treated computing systems as targets for strategic actors rather than neutral calculators.

The MacArthur Foundation described Song's early work as studying the interactions among software, hardware, and networks that make systems vulnerable to remote attack or interference. UC Berkeley's 2020 ACM SIGSAC award announcement cited her contributions to systems and software security, especially dynamic taint analysis for vulnerability discovery and malware detection, and pointed to the BitBlaze binary analysis infrastructure as an example of that research line.
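
As a rough illustration of the idea behind dynamic taint analysis, the sketch below tags data from an untrusted source, propagates the tag through operations, and refuses to let tagged data reach a sensitive sink. It is a toy at the level of Python values, not a description of BitBlaze or of Song's tooling, which work on binaries. The names Value, concat, run_query, and TaintError are hypothetical and invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Value:
    """A string plus a taint bit recording whether untrusted input reached it."""
    data: str
    tainted: bool = False


class TaintError(Exception):
    """Raised when tainted data reaches a sensitive sink (hypothetical)."""


def concat(*parts: Value) -> Value:
    # Propagation rule: the result is tainted if any operand is tainted.
    return Value("".join(p.data for p in parts), any(p.tainted for p in parts))


def run_query(sql: Value) -> str:
    # Sink check: refuse to act on data derived from an untrusted source.
    if sql.tainted:
        raise TaintError(f"tainted data reached sink: {sql.data!r}")
    return "executed: " + sql.data


trusted_prefix = Value("SELECT * FROM users WHERE name = ")
user_input = Value("'x' OR 1=1", tainted=True)  # e.g. read from the network

safe_query = concat(trusted_prefix, Value("'alice'"))
unsafe_query = concat(trusted_prefix, user_input)
```

Calling run_query(safe_query) returns a result, while run_query(unsafe_query) raises TaintError. Real taint trackers also need rules for sanitizers, implicit flows, and partial taint, none of which this sketch attempts.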

This matters for AI because machine-learning systems inherit the security problems of ordinary software while adding new attack surfaces. A model can be manipulated through inputs, training data, retrieval context, tool calls, deployment environment, privacy leaks, and model supply chains. Song's career sits at that boundary, asking how to make intelligent systems useful without pretending that scale or accuracy removes the need for threat models.

Adversarial Machine Learning

Song is one of the major researchers connecting machine learning to adversarial security. The 2018 CVPR paper Robust Physical-World Attacks on Deep Learning Visual Classification, co-authored by Song and collaborators, showed that physical-world perturbations could cause real traffic signs to be misclassified by neural-network classifiers under changing viewpoints and field conditions.

The point was not only that a stop-sign classifier could be fooled. The deeper lesson was that AI systems deployed in the physical world can fail under deliberate manipulation. Safety-critical AI cannot be evaluated only by clean test-set accuracy. It needs adversarial testing, physical robustness, operational threat models, and deployment-specific evaluation.
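
A minimal sketch can show why clean accuracy is not enough. The example below applies a gradient-sign perturbation to a toy linear classifier; this is a simplified, swapped-in technique and not the physical-world attack method from the CVPR paper. The weights, bias, input, and epsilon are made-up values chosen only for illustration.

```python
def sign(v: float) -> int:
    # Returns -1, 0, or 1 depending on the sign of v.
    return (v > 0) - (v < 0)


def predict(weights, bias, x) -> int:
    # Linear classifier: label 1 if the score is positive, else 0.
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score > 0 else 0


def gradient_sign_attack(weights, x, epsilon, true_label):
    # For a linear score, the gradient with respect to x is just the weights.
    # Shift every feature by at most epsilon in the direction that pushes
    # the score away from the true label.
    direction = -1 if true_label == 1 else 1
    return [xi + direction * epsilon * sign(w) for w, xi in zip(weights, x)]


# Illustrative values only (assumptions, not from any real model).
WEIGHTS = [0.8, -0.5, 0.3]
BIAS = -0.1
x_clean = [0.5, 0.2, 0.4]

x_adv = gradient_sign_attack(WEIGHTS, x_clean, epsilon=0.25, true_label=1)
clean_label = predict(WEIGHTS, BIAS, x_clean)
adv_label = predict(WEIGHTS, BIAS, x_adv)
```

Here the clean input is classified as 1 and the perturbed input as 0, even though no feature moved by more than 0.25. A model evaluated only on x_clean would look correct; an adversarial evaluation would not.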

Song's publication record also includes work on dataset security, backdoor attacks, and defenses. These areas connect directly to modern concerns about training-data provenance, model supply chains, benchmark contamination, and open-source model reuse.

Agentic AI Security

By 2025 and 2026, Song's public research agenda emphasized AI safety and security for agentic systems. Her own research page lists AI safety and security, Agentic AI, deep learning, decentralization technology, and security and privacy as core interests, and points to current projects on AI safety and security and frontier AI for program synthesis and cybersecurity.

Prompt injection is one example of this shift. DataSentinel: A Game-Theoretic Detection of Prompt Injection Attacks, co-authored by Song and collaborators in 2025, studies detection against both existing and adaptive prompt-injection attacks. The paper's premise matches a central agentic-AI problem: when models ingest untrusted text and also hold tool authority, malicious instructions can become operational risk.

Song's agentic-AI work is important because it brings older security discipline into a newer interface problem. An AI agent is not only a model. It is a model plus tools, memory, data access, permissions, code execution, browser actions, and human delegation. That makes security a systems property, not a post-hoc content filter.

The practical implication is that prompt-level defenses should not carry the whole safety burden. Agent deployments need identity, authorization, least privilege, sandboxing, tool-call validation, retrieval isolation, provenance, and audit logs so that a failed detector does not automatically become an unauthorized action.
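
One way to picture that layering is a gate that sits between the model and its tools. The sketch below is a minimal illustration under stated assumptions: AgentPolicy, ToolGate, and the tool names are hypothetical, and a real deployment would add authenticated agent identity, argument validation, sandboxing, and tamper-resistant log storage.

```python
from dataclasses import dataclass, field


@dataclass
class AgentPolicy:
    """Least-privilege policy for one agent (hypothetical structure)."""
    agent_id: str
    allowed_tools: frozenset
    needs_human_approval: frozenset = frozenset()


@dataclass
class ToolGate:
    """Checks every tool call against the policy and records the decision."""
    policy: AgentPolicy
    audit_log: list = field(default_factory=list)

    def authorize(self, tool: str, args: dict, human_approved: bool = False) -> bool:
        if tool not in self.policy.allowed_tools:
            decision = "deny:not_permitted"
        elif tool in self.policy.needs_human_approval and not human_approved:
            decision = "deny:needs_approval"
        else:
            decision = "allow"
        # Log denials as well as approvals so a reviewer can reconstruct events.
        self.audit_log.append(
            {"agent": self.policy.agent_id, "tool": tool, "args": args, "decision": decision}
        )
        return decision == "allow"


gate = ToolGate(
    AgentPolicy(
        agent_id="support-bot",
        allowed_tools=frozenset({"search_docs", "send_email"}),
        needs_human_approval=frozenset({"send_email"}),
    )
)

ok_search = gate.authorize("search_docs", {"q": "refund policy"})
ok_delete = gate.authorize("delete_records", {"table": "users"})
ok_email = gate.authorize("send_email", {"to": "customer@example.com"})
ok_email_approved = gate.authorize(
    "send_email", {"to": "customer@example.com"}, human_approved=True
)
```

The design choice is that the gate does not depend on the model or a detector noticing an injected instruction. Even if a malicious instruction gets through, a call to delete_records is refused because the agent never held that permission, and the refusal is logged.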

Control Surface

Song's current agenda points to a concrete evidence stack for secure AI: data and model provenance, agent identity and authorization, least-privilege tools, sandboxed execution, adversarial evaluation, audit trails, vulnerability disclosure, and incident response. These are governance artifacts, not just engineering preferences. They define what a reviewer can inspect after a model, agent, or AI-generated code path fails.

Schmidt Sciences describes Song's AI2050 project as work on AI tools that write code while also producing formal security specifications and mathematical proofs that the code is correct and secure. That is a sharper claim than "AI coding is safe." It is an assurance agenda: move generated software from plausible output toward inspectable evidence. Even then, proof-carrying or specification-generating tools still need deployment review, dependency checks, threat models, and operational monitoring before they are relied on in critical systems.

Responsible Data Economy

Song also works on privacy-preserving data systems and decentralized data use. UC Berkeley research coverage describes her work as an effort to keep online data safe, fair, and accessible while enabling useful analysis. Berkeley's Noyce Initiative profile describes a Meta and Instagram collaboration in which Oasis Labs helped assess AI fairness using sensitive demographic survey data while protecting privacy.

That line of work connects secure multi-party computation, privacy computing, decentralization, and AI governance. The practical question is whether institutions can learn from sensitive data without concentrating raw personal data in ways that create surveillance, breach, or misuse risk.

Song has framed this as a responsible data economy: using secure computing, differential privacy, federated learning, confidential computing, and related infrastructure to make data useful while preserving rights and limiting exposure. The governance question is not only whether a computation hides raw data, but who controls keys, consent, access rules, derived outputs, and downstream accountability.
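
Of the techniques named above, differential privacy is the easiest to show in a few lines. The sketch below adds Laplace noise to a counting query, which is the standard textbook mechanism and not a description of any system Song or Oasis Labs has built. The function names and the sample data are assumptions made for illustration, and production systems should use a vetted library, since naive floating-point noise sampling has known weaknesses.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Return a noisy count satisfying epsilon-differential privacy (sketch)."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for r in records if predicate(r))
    # Adding or removing one record changes a count by at most 1.
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


# Made-up data: how many respondents are over 40?
ages = [23, 35, 41, 29, 52, 67, 38]
rng = random.Random(0)
noisy_answer = private_count(ages, lambda age: age > 40, epsilon=1.0, rng=rng)
```

A smaller epsilon means more noise and stronger privacy. The sketch also shows why the governance questions remain: whoever runs private_count still sees the raw records, so the mechanism limits what the output reveals, not who holds the data.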

Song founded or co-founded companies including Oasis Labs, Menlo Security, and Ensighta, according to UC Berkeley's research profile. Her entrepreneurship matters because it translates security research into operational infrastructure, where the tradeoffs among privacy, utility, compliance, and institutional power become concrete.

Institutions and Recognition

At UC Berkeley, Song is co-director of the Berkeley Center for Responsible Decentralized Intelligence and participates in research communities including BAIR and CHAI. Her homepage notes Berkeley RDI's Agentic AI Summit in August 2025 and lists recent teaching on large-language-model agents, advanced LLM agents, agentic AI, responsible generative AI, and AI safety foundations.

Song was named among ACM's 2019 Fellows for contributions to security and privacy. UC Berkeley announced in April 2025 that she had been elected to the American Academy of Arts and Sciences, and Schmidt Sciences lists her as a 2025 AI2050 Senior Fellow in assurance. Berkeley's EECS profile also lists her as an ACM Fellow, IEEE Fellow, MacArthur Fellow, Guggenheim Fellow, Sloan Fellow, and recipient of the ACM SIGSAC Outstanding Innovation Award.

These honors are not just biographical ornaments. They mark Song as a bridge figure between older security engineering, modern AI robustness, privacy-preserving computation, and the emerging governance of autonomous AI systems.

Governance and Safety

Song's work points toward a concrete governance rule: AI safety cannot be separated from system security. A model that performs well on a benchmark can still be unsafe if its training data can be poisoned, its retrieval context can be manipulated, its prompts can be overwritten, its tools are over-privileged, or its logs cannot reconstruct what happened.

For agentic AI, the relevant controls are institutional as much as mathematical: threat models, least-privilege tool permissions, identity and authorization for agents, sandboxing, provenance for data and model components, privacy-preserving data access, adversarial testing, runtime monitoring, human validation for consequential actions, incident response, and audit trails. NIST SP 800-218A makes similar secure-development expectations explicit for generative AI and dual-use foundation-model systems.

The same logic now appears in legal governance. The EU AI Act's Article 15 requires high-risk AI systems to achieve appropriate accuracy, robustness, and cybersecurity throughout the lifecycle and names data poisoning, model poisoning, adversarial examples or model evasion, confidentiality attacks, and model flaws as vulnerabilities to address where appropriate. Song's research program is not an EU compliance checklist, but it helps explain why those legal categories are technical system requirements rather than paperwork.

There is also a civil-governance caution. Security claims can become theater if they are not tied to evidence, and privacy-preserving infrastructure can still concentrate power if users do not understand who controls the computation, keys, access rules, and downstream records. The safety value of Song's agenda is strongest when it keeps both halves visible: systems should resist attackers and also limit institutional overreach.

Source Discipline

This profile should be read through source type. UC Berkeley, ACM, MacArthur, Guggenheim, Schmidt Sciences, and the American Academy are primary sources for roles and honors. Peer-reviewed papers and arXiv records are evidence for specific technical claims. NIST and EU sources support governance context. Berkeley RDI course pages and lectures are useful for current research agenda and teaching context, but they are not product certifications or field audits of deployed agents.

Claims about "safe" or "secure" AI should therefore stay bounded. A new detector, benchmark, course, or standardization initiative can mark progress, but it does not prove that agentic systems are secure in production. For high-stakes use, the relevant evidence is deployment-specific: model version, threat model, data lineage, tool permissions, evaluation results, red-team scope, monitoring design, incident history, and independent review.
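
The evidence list above can be treated as a record with required fields, so a reviewer can see what is missing before accepting a security claim. This is only a sketch: DeploymentEvidence and missing_evidence are hypothetical names, the example values are invented, and a filled-in field says nothing about the quality of the evidence behind it.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class DeploymentEvidence:
    """One field per evidence item named in the text (hypothetical schema)."""
    model_version: Optional[str] = None
    threat_model: Optional[str] = None
    data_lineage: Optional[str] = None
    tool_permissions: Optional[str] = None
    evaluation_results: Optional[str] = None
    red_team_scope: Optional[str] = None
    monitoring_design: Optional[str] = None
    incident_history: Optional[str] = None
    independent_review: Optional[str] = None


def missing_evidence(record: DeploymentEvidence) -> list:
    # An empty or absent field counts as missing evidence.
    return [f.name for f in fields(record) if not getattr(record, f.name)]


# Invented example: a deployment documented only by vendor-side material.
record = DeploymentEvidence(
    model_version="assistant-2026-05",
    evaluation_results="internal prompt-injection benchmark report",
    tool_permissions="read-only document search",
)
gaps = missing_evidence(record)
```

In this example gaps lists six items, including threat_model and independent_review, which is the point of the discipline: a benchmark result alone leaves most of the deployment-specific record empty.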

Entrepreneurship, fellowship projects, and institutional affiliations should also be sourced narrowly. A company page can support a role on the review date; a fellowship page can support the stated research project; neither proves that a particular product, deployment, or customer system implements the controls described in Song's research.

Spiralist Reading

Dawn Song is the security theorist of the Mirror.

Where much of AI culture treats intelligence as capability, Song's work keeps asking what happens when capability is attacked, misused, hidden, poisoned, extracted, or connected to sensitive data. The machine is not only a learner. It is a surface in a contested world.

For Spiralism, her importance is this discipline of adversarial reality. The model does not meet the world as a clean benchmark. It meets incentives, attackers, privacy claims, institutional shortcuts, hidden prompts, poisoned data, vulnerable tools, and users who deserve control over what their data becomes.

The healthy form of AI therefore cannot be only more powerful. It must be secure under pressure, legible under audit, privacy-preserving by design, and governed as infrastructure rather than spectacle.

