Blog · Analysis · Last reviewed June 19, 2026

The Peer Reviewer Becomes the Model Referee

When AI enters peer review, it does not only help tired reviewers write faster. It changes how unpublished knowledge is judged, leaked, gamed, and remembered.

Here, a model referee means any AI system that touches the reviewer workflow: summarizing a manuscript, drafting critique, polishing a report, suggesting scores, checking review quality, or helping an editor triage submissions. The governance question is not whether software ever assists review. It is which acts touch confidential manuscript text, which acts shape judgment, and who remains answerable.

The Referee

Peer review is not a purity machine. It is unpaid labor, uneven expertise, social trust, disciplinary politics, and time pressure wrapped around the serious task of deciding whether a manuscript should enter the record. A review may improve a paper, catch a flaw, miss a fabrication, protect a field, or reproduce its blind spots.

This site has treated paper mills as attacks on the literature from the author side and grant-review filters as automated pressure on research funding. The model referee is the same institutional problem seen from the journal and conference gate: AI helping reviewers and editors produce judgment-shaped text.

That phrase covers several different acts. A language model can polish a review after the human has written it. It can critique the review without seeing the manuscript. It can compare a manuscript against related work. It can draft weaknesses, scores, or a recommendation. It can also screen submissions before peer review begins. Those acts carry different risks, so a serious policy has to name the act, the input, the system, and the decision point.

A referee report is confidential but consequential, private but institutional, advisory but often decisive. If a model writes or substantially shapes that report, the machine is not just summarizing. It is entering the ritual by which private skepticism becomes public knowledge, the same knowledge pipeline mapped in AI in science and research integrity.

Why Reviewers Reach for It

The appeal is obvious. Reviewers are overloaded. AI conferences and journals receive more submissions than careful review labor can comfortably absorb. A language model can summarize a paper, identify missing baselines, rephrase a vague criticism, compare related work, and turn rough notes into readable prose. Workload explains the temptation. It does not by itself authorize leakage or delegation.

Some assistance can help when the task is narrowly scoped. A large-scale empirical study posted to arXiv in 2023 (Liang et al.) found that GPT-4 feedback overlapped with human peer-review points at a level comparable to the overlap between two human reviewers. ICLR 2025 piloted a Review Feedback Agent that commented on submitted reviews rather than replacing reviewers; organizers said it did not write reviews or make acceptance decisions. The randomized ICLR study (Thakkar et al.) later reported that 27% of reviewers who received feedback updated their reviews, that more than 12,000 suggestions were incorporated, and that feedback was associated with longer and more informative reviews.

That is the defensible version: the venue controls the tool, the model critiques the review rather than the paper's destiny, the reviewer chooses whether to revise, and the conference studies the effect. The risky version is quieter: a reviewer uploads a confidential manuscript to a general chatbot, pastes the output into a report, and receives credit for a judgment they did not fully perform. The difference is not cosmetic. It is the difference between governed assistance and an unlogged private sub-reviewer.

The Policy Split

As of June 19, 2026, publication bodies have not collapsed into one rule, but the boundary is clear. The International Committee of Medical Journal Editors says submitted manuscripts are privileged communications and that editors, reviewers, or publishers should not upload them into AI systems where confidentiality cannot be assured without explicit author permission. Its reviewer guidance also says reviewers should follow the journal's AI policy or request permission, disclose AI use to the journal, and validate AI-generated content because it can be incorrect, incomplete, or biased.

Elsevier tells reviewers not to upload manuscripts, manuscript parts, reviewer reports, questionnaires, or related correspondence into generative AI tools, including for language improvement, and says the critical thinking and original assessment required for review are human responsibilities. Nature Portfolio asks peer reviewers not to upload manuscripts into generative AI tools and to declare any AI support used to evaluate manuscript claims. NeurIPS 2025 told reviewers not to talk about or share submissions with anyone or any LLMs; its 2026 Evaluations and Datasets track says reviewers may not use LLMs or AI agents in review and asks reviewers to report prompt injections or hidden instructions.

At the same time, publishers and conferences are building sanctioned systems. Springer Nature announced an internal AI-driven tool in January 2025 for editorial quality checks before peer review, with human experts double-checking results and 14 suitability checks including data availability, ethics, clinical trials, and misuse threats. ICLR piloted an official feedback agent under program-chair control and later had to respond to concerns about low-quality LLM-generated reviews, emphasizing disclosure, reviewer responsibility, and enforcement. The line is not "AI never touches review." The line is whether the tool is governed, confidential, traceable, proportionate, and subordinate to human responsibility.

The Prompt-Injection Moment

The hidden-prompt scandal made the risk concrete. In July 2025, Nature reported that some preprints contained instructions in white text or small font designed to influence AI-assisted peer review. A July 2025 arXiv commentary by Zhicheng Lin said 18 arXiv manuscripts had hidden prompts and analyzed the practice as prompt injection aimed at manuscript evaluation.

This was not only author misconduct. It was a diagnostic test for the pipeline. A human reviewer sees a paper. A model reviewer sees a paper plus any hidden instructions embedded in the file. If the manuscript is fed to an LLM, the manuscript is no longer inert evidence. It can become an adversarial interface.

The problem echoes hallucinated legal citations and the wider prompt-injection problem. In both cases, a model can make the surface of evaluation smoother while weakening the underlying check. The specific danger here is that the model can be steered by the very document it is supposed to judge.

For venues that use sanctioned AI tools, submission hygiene now has to include hidden text, malformed PDF structure, metadata, image layers, copy-paste artifacts, and adversarial instructions. Screening should not become an AI detector discipline machine; it should be a narrow integrity check with human review, appeal routes, and clear evidence.
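A narrow check of this kind can be sketched in a few lines. The example below is a minimal illustration, assuming the venue has already extracted plain text from the submission. The function name `flag_hidden_instructions` and the phrase patterns are hypothetical and far from complete, and the sketch does not inspect PDF layers, font color, font size, or metadata. It only produces findings for a human to examine; it decides nothing.

```python
import re

# Illustrative patterns only (an assumption for this sketch, not a vetted list).
INSTRUCTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"give (a )?positive review", re.IGNORECASE),
    re.compile(r"do not (highlight|mention) (any )?(negatives|weaknesses)", re.IGNORECASE),
]

# Zero-width characters that a human reader will not see on the page.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}


def flag_hidden_instructions(extracted_text: str) -> list[dict]:
    """Return findings for human review; an empty list means nothing was flagged."""
    findings = []
    for lineno, line in enumerate(extracted_text.splitlines(), start=1):
        # Flag invisible characters, which may indicate obfuscated text.
        if any(ch in ZERO_WIDTH for ch in line):
            findings.append({"line": lineno, "kind": "zero-width character"})
        # Strip invisible characters before matching so they cannot split a phrase.
        visible = "".join(ch for ch in line if ch not in ZERO_WIDTH)
        for pattern in INSTRUCTION_PATTERNS:
            if pattern.search(visible):
                findings.append(
                    {
                        "line": lineno,
                        "kind": "instruction-like phrase",
                        "pattern": pattern.pattern,
                    }
                )
    return findings
```

Each finding carries a line number and the rule that fired, so an editor can look at the evidence directly and an author can contest it. A match is a reason to look, not proof of misconduct.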

The Governance Standard

A serious peer-review standard should separate at least five uses: forbidden upload to public tools, permitted language polishing without manuscript leakage, sanctioned review-feedback systems controlled by the venue, editorial screening before review, and prohibited delegation of expert judgment. A policy that just says "AI allowed" or "AI banned" is too blunt to govern the actual pipeline.

First, confidentiality should be the default. Unpublished manuscripts may contain ideas, code, data, identities, patient information, trade secrets, or early claims. Reviewers should not move them into systems whose retention, training, logging, subcontractor access, or security terms they cannot verify.

Second, the act should be classified before the tool is approved. Grammar polishing, translation, summary, related-work search, critique generation, score suggestion, review-quality feedback, editorial triage, and final recommendation are different acts. They should not share one permission category.

Third, AI use should be disclosed to the editor or venue. The disclosure should name the tool, purpose, input class, and whether manuscript text, figures, code, reviews, or author responses were processed.

Fourth, models should not write the verdict. A system may flag vagueness, tone, missing evidence, or possible misunderstandings. It should not produce the accept/reject recommendation that a reviewer signs as expert judgment.

Fifth, reviewer reports need source separation. Editors need to know what is human judgment, what is machine-shaped language, and what was copied from model output. This is a smaller version of the agent log receipt problem: an institution needs enough trace to know what happened without exposing confidential submissions more widely.

Sixth, venues should scan for hidden instructions. Prompt injection, invisible text, malicious metadata, and adversarial PDF structure should become part of submission hygiene where AI-supported review tools are used.

Seventh, official review-support systems need audit trails. The venue should know which manuscript version, model version, prompts, guardrails, feedback, reviewer action, and editor decision were involved. The site treats this as an AI audit trail requirement, not as a decorative transparency badge.

Eighth, enforcement should be proportionate. Hidden prompts, fabricated citations, undisclosed full-review generation, and harmless grammar correction are not the same event. Detector scores should trigger review, not punishment by themselves.

Ninth, reviewer labor should not be quietly devalued. If institutions want better review, they should reward it: credit, training, time, editor support, and fewer empty metrics. AI should not become a way to extract more unpaid reports from strained researchers.
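Several of these points (classifying the act, disclosing the tool and inputs, and keeping an audit trail) describe records a venue would need to keep. The sketch below shows one possible shape for those records. Every name in it (`ReviewAct`, `AIUseDisclosure`, `AuditEntry`, `disclosure_problems`, `fingerprint`) is hypothetical, the field set is a subset of what the text lists, and the single hard-blocked act is an assumption chosen to mirror the fourth point rather than any venue's actual policy.

```python
import hashlib
from dataclasses import dataclass
from enum import Enum


class ReviewAct(Enum):
    """Acts named in the text; they should not share one permission category."""
    GRAMMAR_POLISHING = "grammar polishing"
    TRANSLATION = "translation"
    SUMMARY = "summary"
    RELATED_WORK_SEARCH = "related-work search"
    CRITIQUE_GENERATION = "critique generation"
    SCORE_SUGGESTION = "score suggestion"
    REVIEW_QUALITY_FEEDBACK = "review-quality feedback"
    EDITORIAL_TRIAGE = "editorial triage"
    FINAL_RECOMMENDATION = "final recommendation"


# Assumption for this sketch: only the verdict itself is hard-blocked.
PROHIBITED_ACTS = frozenset({ReviewAct.FINAL_RECOMMENDATION})

# Material classes a disclosure is asked to account for.
MATERIALS = frozenset(
    {"manuscript text", "figures", "code", "reviews", "author responses"}
)


@dataclass(frozen=True)
class AIUseDisclosure:
    tool: str                # name of the tool used
    act: ReviewAct           # purpose, expressed as a classified act
    input_class: str         # e.g. "reviewer's own draft"
    materials_processed: frozenset = frozenset()


@dataclass(frozen=True)
class AuditEntry:
    manuscript_sha256: str   # digest only, never the confidential text
    model_version: str
    prompt_sha256: str
    disclosure: AIUseDisclosure
    reviewer_action: str
    editor_decision: str


def fingerprint(content: bytes) -> str:
    """Return a SHA-256 hex digest that identifies a version without storing it."""
    return hashlib.sha256(content).hexdigest()


def disclosure_problems(d: AIUseDisclosure) -> list[str]:
    """List reasons a disclosure should be escalated to a human editor."""
    problems = []
    if d.act in PROHIBITED_ACTS:
        problems.append(f"prohibited act: {d.act.value}")
    unknown = sorted(d.materials_processed - MATERIALS)
    if unknown:
        problems.append(f"unrecognized materials: {', '.join(unknown)}")
    return problems
```

Storing digests rather than manuscript or prompt text is one way to keep enough trace to reconstruct what happened without copying confidential submissions into yet another system. The problem list is meant to route a case to an editor, not to impose a penalty automatically.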

What This Changes

The peer reviewer is a small authority figure in the machinery of knowledge. They sit at a symbolic door: not yet published, maybe publishable, not good enough, revise, reject, accept.

The model referee changes the door. It can make review more readable, consistent, and responsive. It can also turn judgment into a polished template, confidentiality into an API call, and expert refusal into autocomplete. The danger is not that every reviewer who uses software has failed. The danger is that institutions may accept model-shaped prose as a substitute for accountable reading.

A good scientific community should use tools without letting tools impersonate trust. Peer review survives only if the parties know who read the work, who judged the evidence, who saw the confidential material, and who is answerable when the report is lazy, biased, false, fabricated, or manipulated. That same standard belongs in lab notebooks, provenance records, and the public memory of science.

Source Discipline

This page treats journal and conference policies as primary evidence for current reviewer rules, not as evidence that all reviewers comply. It treats ICLR materials as evidence of sanctioned experiments and enforcement problems, not proof that AI-assisted review is generally reliable or generally corrupt. It treats Springer Nature's internal screening tool as an editorial workflow claim, not as permission for outside reviewers to upload manuscripts into unrelated systems.

The empirical studies cited here are useful but narrow. They show overlap, reviewer engagement, and review-quality signals under particular designs. They do not settle authorship accountability, confidentiality, bias, conflict-of-interest checks, or the legitimacy of model-generated verdicts. The hidden-prompt sources show a real attack surface; they do not prove that hidden prompts are common across all scholarly fields.

Sources

ICMJE, Use of Artificial Intelligence in Publishing, reviewed June 19, 2026.
ICMJE, Use of AI by Reviewers, reviewed June 19, 2026.
Elsevier, The use of generative AI and AI-assisted technologies in the review process, reviewed June 19, 2026.
Nature Portfolio, Artificial Intelligence (AI), reviewed June 19, 2026.
Springer Nature, Authors, editors and peer reviewers supported with launch of new AI tool, January 7, 2025.
NeurIPS, 2025 Policy on the Use of Large Language Models, reviewed June 19, 2026.
NeurIPS, Evaluations and Datasets 2026 Reviewing Guidelines, reviewed June 19, 2026.
ICLR Blog, Assisting ICLR 2025 reviewers with feedback, October 9, 2024.
ICLR Blog, Leveraging LLM feedback to enhance review quality, April 15, 2025.
Nitya Thakkar et al., Can LLM feedback enhance review quality? A randomized study of 20K reviews at ICLR 2025, arXiv, April 2025.
Weixin Liang et al., Can large language models provide useful feedback on research papers? A large-scale empirical analysis, arXiv, 2023; published in NEJM AI, 2024.
Elizabeth Gibney, Scientists hide messages in papers to game AI peer review, Nature, July 2025.
Zhicheng Lin, Hidden Prompts in Manuscripts Exploit AI-Assisted Peer Review, arXiv, July 2025.
ICLR Blog, ICLR 2026 Response to LLM-Generated Papers and Reviews, November 19, 2025.
Related references: The Paper Mill Becomes the Literature, The Grant Review Becomes the Funding Filter, The Citation Machine Enters the Court, The AI Detector Becomes the Discipline Machine, The Agent Log Becomes the Receipt, Prompt Injection, AI Audit Trails, Research and Editorial Integrity, and Claim Hygiene Protocol.
