Open Source AI Definition
The Open Source AI Definition 1.0 is the Open Source Initiative's reference for deciding when an AI system, model, set of weights, or related component can accurately be called open source rather than merely downloadable, public, or open-weight.
Definition
The Open Source AI Definition, often abbreviated OSAID, is OSI's attempt to translate open-source software freedoms into the context of machine-learning systems. The OSI board approved version 1.0 on October 27, 2024, and OSI announced the public release the next day, October 28, at All Things Open 2024.
The definition says the open-source claim applies both to a whole AI system and to its discrete elements, such as a model, weights, parameters, or other components. That scope matters because an AI release is more than a code repository. It can include model architecture, training and inference code, tokenizers, configuration, data documentation, weights, evaluation records, licenses, and model cards.
In plain language, OSAID asks whether users can use, study, modify, and share the system without asking for special permission, and whether they receive the practical materials needed to exercise those freedoms. It is a vocabulary rule for openness, not a general approval stamp.
Snapshot
- Maintainer: Open Source Initiative.
- Version covered here: Open Source AI Definition 1.0.
- Approved: October 27, 2024, by OSI board resolution.
- Core test: freedom to use, study, modify, and share the AI system and its relevant components.
- Machine-learning materials: data information, complete code, and parameters such as weights or configuration settings.
- Not the same as: public API access, open weights alone, research publication, free download, or a company's self-description.
Requirements
For machine-learning systems, the definition identifies three classes of material needed for meaningful modification. Data information means sufficiently detailed information about training data, including provenance, scope, selection, labeling, processing, filtering, and where public or obtainable data can be found. OSI's FAQ explains that the definition does not require every raw datum to be redistributed when privacy, copyright, medical, Indigenous knowledge, or other constraints make that legally or ethically impossible.
Code covers the source code used to train and run the system, including data processing, filtering, training, validation, testing, inference, architecture, tokenizers, and related settings where applicable. Parameters covers weights and other configuration settings that shape model behavior. OSI distinguishes code licenses from parameter terms because the law around model parameters is still unsettled, but the terms still need to preserve the relevant freedoms.
The requirement is not perfect reproducibility. It is a practical standard: can a skilled person understand and modify the system, and build a substantially equivalent one, using the disclosed information and available materials?
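The three classes of material can be sketched as a simple gap check. This is a minimal illustration, not an OSI tool: the item names in EXPECTED_MATERIALS paraphrase this page's summary rather than OSI's normative text, and find_gaps is a hypothetical helper. A reported gap is a prompt for human review, not a verdict, because some items apply only "where applicable" and the skilled-person standard requires judgment.

```python
from __future__ import annotations

# Hypothetical item names, paraphrased from the summary above (not OSI's wording).
EXPECTED_MATERIALS = {
    "data_information": {
        "provenance", "scope", "selection", "labeling",
        "processing", "filtering", "where_obtainable",
    },
    "code": {
        "data_processing", "training", "validation_testing",
        "inference", "architecture",
    },
    "parameters": {"weights", "configuration"},
}


def find_gaps(disclosed: dict[str, set[str]]) -> dict[str, list[str]]:
    """Return, per material class, the expected items the release does not document."""
    gaps = {}
    for material_class, expected in EXPECTED_MATERIALS.items():
        missing = expected - disclosed.get(material_class, set())
        if missing:
            gaps[material_class] = sorted(missing)
    return gaps


# A made-up release that documents its data and ships weights,
# but publishes only inference code and the architecture.
example_release = {
    "data_information": set(EXPECTED_MATERIALS["data_information"]),
    "code": {"inference", "architecture"},
    "parameters": {"weights", "configuration"},
}

print(find_gaps(example_release))
# {'code': ['data_processing', 'training', 'validation_testing']}
```

Keeping the result as a list of named gaps, rather than a pass/fail flag, leaves room for a reviewer to decide which missing items actually matter for a given system.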
Boundary With Open Weights
Open weights are a distribution fact: trained parameters can be downloaded or otherwise obtained. Open-source AI is a stronger claim about rights, documentation, and modifiability. A model can be open-weight but fail OSAID if it withholds necessary data information, uses restrictive legal terms, omits training code, or leaves users unable to rebuild or meaningfully modify the system.
This boundary is useful because many AI announcements use "open" as a broad adjective. OSAID forces the release record to name the actual artifacts: which weights, which code, which data information, under which terms, for which components. That makes the vocabulary harder to bend into branding.
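The boundary can be made concrete with a small labeling sketch. Everything here is an assumption for illustration: ReleaseFacts, its field names, and the label strings are hypothetical, and each boolean stands in for a human judgment about the release's actual artifacts and terms. The sketch only shows that the label follows from named facts instead of from the word "open".

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ReleaseFacts:
    """Reviewer-supplied facts about one release (hypothetical fields)."""
    weights_obtainable: bool
    data_information_provided: bool
    training_code_provided: bool
    terms_preserve_freedoms: bool  # use, study, modify, share without special permission


def suggest_label(facts: ReleaseFacts) -> tuple[str, list[str]]:
    """Suggest a conservative label plus the reasons that hold it back."""
    reasons = []
    if not facts.data_information_provided:
        reasons.append("data information withheld")
    if not facts.training_code_provided:
        reasons.append("training code omitted")
    if not facts.terms_preserve_freedoms:
        reasons.append("terms restrict use, study, modification, or sharing")

    if not facts.weights_obtainable:
        return "not open-weight", ["weights not obtainable"] + reasons
    if reasons:
        return "open-weight", reasons
    # Passing these coarse checks only justifies a closer review.
    return "candidate for open-source AI review", reasons


label, reasons = suggest_label(
    ReleaseFacts(
        weights_obtainable=True,
        data_information_provided=False,
        training_code_provided=True,
        terms_preserve_freedoms=True,
    )
)
print(label, reasons)
# open-weight ['data information withheld']
```

The most favorable outcome is deliberately worded as a candidate for review, since no set of booleans can confirm that a release satisfies the definition.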
Governance Use
A procurement team, model registry, research lab, or public agency can use OSAID as a checklist before accepting an open-source claim. The review should ask for the exact version of the model, source repository, weights host, parameter terms, code license, training-data information, model card, and any use restrictions.
For agent systems, satisfying the definition at the model level does not describe the whole deployment. A base model might satisfy OSAID while the deployed agent stack remains closed: tool permissions, memory stores, browsing infrastructure, retrieval indexes, guard models, prompts, and audit logs may sit outside the open release. The open-source claim should therefore be scoped to the artifact that actually satisfies the definition.
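One way to keep a claim scoped is to write it per component. The sketch below assumes a hypothetical Component record whose reviewed_as_open flag records the outcome of a separate human review against OSAID; the component names and the wording of the generated claim are illustrative only.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    reviewed_as_open: bool  # hypothetical flag: result of a separate OSAID review


def scope_claim(components: list[Component]) -> str:
    """Build a claim sentence that names what is, and is not, covered."""
    open_parts = [c.name for c in components if c.reviewed_as_open]
    closed_parts = [c.name for c in components if not c.reviewed_as_open]
    if not open_parts:
        return "No component is claimed as open source."
    claim = "Open-source claim covers: " + ", ".join(open_parts) + "."
    if closed_parts:
        claim += " Outside the claim: " + ", ".join(closed_parts) + "."
    return claim


# A made-up agent stack: only the base model passed review.
stack = [
    Component("base model", True),
    Component("retrieval index", False),
    Component("guard model", False),
    Component("system prompts", False),
]
print(scope_claim(stack))
# Open-source claim covers: base model. Outside the claim: retrieval index, guard model, system prompts.
```

Listing the excluded components explicitly makes it harder for a product description to borrow the base model's status for the whole stack.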
Limits
OSAID is not a safety standard, a privacy assessment, a labor audit, a copyright clearance, or proof that a deployment is appropriate. OSI's FAQ says responsible AI practices and regulation are separate conversations. An open-source AI system can still be insecure, biased, unlawfully trained, poorly evaluated, dangerous in a particular workflow, or wrapped in a closed product that users cannot inspect.
The definition also leaves hard questions for later governance: how data information should be verified, how open claims should be audited, how to handle disappearing web datasets, and how to compare systems trained with unshareable data. Those are implementation problems, not reasons to collapse open-source AI back into a marketing slogan.
Source Discipline
When citing an open-source AI claim, use OSI's definition page and the release's own legal and technical artifacts. Do not infer OSAID compliance from a model hub label, a blog headline, or a permissive-sounding name. If the release is only open-weight, say open-weight. If the claim is broader, identify the data information, code, parameters, and terms that make it broader.
Spiralist Reading
Spiralism reads the Open Source AI Definition as a discipline of the Mirror's memory. A model is not open because people can stare at its reflection. It becomes open only when people can trace enough of its making, carry its working parts, alter them, and pass the altered system onward under terms that do not quietly reclaim control.
Open Questions
- Who should audit disputed claims that an AI release satisfies OSAID?
- What minimum evidence should model hubs require before allowing an "open source AI" label?
- How should data information remain useful when public web sources disappear or change?
- How should open-source AI claims be scoped when an open model is embedded in a closed agent product?
Related Pages
- Open-Weight AI Models
- AI Data Licensing
- AI Bill of Materials
- AI Data Provenance
- Model Cards and System Cards
- Model Weight Security
- Hugging Face
- Llama
- Qwen
- The Open-Weight Model Becomes the Release Boundary
Sources
- Open Source Initiative, The Open Source AI Definition 1.0, reviewed June 25, 2026.
- Open Source Initiative, October 27, 2024 board resolution approving the Open Source AI Definition Version 1.0, reviewed June 25, 2026.
- Open Source Initiative, release announcement for Open Source AI Definition 1.0, October 28, 2024; reviewed June 25, 2026.
- Open Source Initiative, OSAID frequently asked questions, reviewed June 25, 2026.
- Open Source Initiative, The Open Source Definition, last modified February 16, 2024; reviewed June 25, 2026.