The AAC Interface Becomes the Proxy Voice
The June 2026 arXiv paper "It's Complicated: On the Design and Evaluation of AI-Powered AAC Interfaces," by Blade Frisch, Will Wade, Dylan Gaines, Michelle Kinsella, Betts Peters, Tamara Broderick, and Keith Vertanen, asks how to evaluate AI systems that may help people who use AAC to communicate, and how to do so without collapsing intent, identity, and context into a single metric.
Communication Is Not Autocomplete
The paper, arXiv:2606.24854v1 [cs.HC], was submitted on June 23, 2026 and presented at the Speech AI for All workshop at CHI 2026 in Barcelona. Its subject is augmentative and alternative communication, or AAC: systems used by people who communicate through text, symbols, speech-generating devices, switches, eye gaze, brain-computer interfaces, partner support, or other access methods.
The authors' starting point is modest and important. Machine-learning metrics are useful, but an AI-powered AAC interface cannot be judged only by speed, word error rate, or prediction accuracy. For a person using AAC in daily life, the system participates in timing, phrasing, voice, repair, privacy, social presence, and the right to sound like oneself. Once an LLM predicts a phrase, adapts to context, or generates a voice, the interface is no longer just helping type. It is shaping what becomes easy enough to say.
What the Paper Actually Says
Frisch, Wade, Gaines, Kinsella, Peters, Broderick, and Vertanen organize the problem around six AAC design considerations: communication speed and accuracy, physical and mental effort, agency in identity presentation, adaptation to communication context, turn-taking, and changes in physical ability over short and long time scales. Their claim is not that one AI feature solves AAC. Their claim is that evaluation has to match the many ways communication succeeds or fails.
The possible AI features they discuss include letter, word, phoneme, symbol, and whole-utterance prediction; larger sentence-level suggestions; context-sensitive prediction; voice banking and synthesized voice selection; tone indicators; support for code- and context-switching; scripted utterance preparation; backchannel and floor-taking support; and adaptive interfaces that respond to fatigue or changing access methods. The paper repeatedly frames these as possible paths that still need validation with AAC users.
The Proxy Voice Problem
The Spiralist angle is proxy speech. A typing assistant that guesses the next word may simply be wrong; when an AAC system speaks the wrong phrase, the error is socially attributed to the user. A sentence-level suggestion can save effort when the user is tired, but it can also move the authorship boundary. A synthesized voice can support identity, accent, and tone, but it can also make the system's style sound more authoritative than the person intended.
The paper is useful because it refuses the simple cure story. It notes that AAC users and advocates have cautioned against treating AI or other technical advances as cures for disability. It also notes that communication partners may misunderstand AI assistance, including the possibility that partners infer the AI is controlling message content rather than supporting the user. In AAC, "helpful" automation must therefore be judged by agency, not only by output quality.
That connects this paper to existing site concerns about AI as an access clerk, synthetic voice as contested identity, and cognitive proxy systems. The governance question is not whether AI can make communication faster. It is whether the person using AAC remains the author, editor, rejector, and final source of the communicated act.
Evaluation Has to Include the Situation
The paper's strongest contribution is its evaluation discipline. For speed and accuracy, it discusses semantic similarity, user-mediated assessment, and task-based functional success rather than only exact-text matching. For effort, it points to physical action counts, workload questionnaires, eye-tracking, EEG, and correction burden. For voice and identity, it points to usability tests, diary studies, self-identification ratings, and partner perceptions. For code- and context-switching, it names interviews, contextual inquiry, ethnography, questionnaires, and measures such as the Communicative Participation Item Bank.
This matters because AAC communication is situational. A message to a doctor, a joke with a friend, a job interview answer, a group conversation, and a short urgent request do not have the same success criteria. A rare name, acronym, or private string may matter more than a smooth generic sentence. A high-speed interface that forces a user into normative speech can still be a failure.
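The contrast between exact-text matching and the other measures can be sketched in a few lines of Python. This is an illustration, not a method from the paper: the function names are hypothetical, the user rating and task outcome are assumed to be collected from the person and the situation rather than computed, and `token_overlap` is only a crude stand-in for whatever semantic similarity measure an evaluation actually adopts.

```python
from typing import Callable


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # prev[j] is the edit distance between the reference words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, substitution))
        prev = curr
    return prev[-1] / max(len(ref), 1)


def token_overlap(a: str, b: str) -> float:
    """Crude stand-in for semantic similarity: Jaccard overlap of word sets."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)


def score_utterance(
    intended: str,
    produced: str,
    task_succeeded: bool,
    user_rating: int,
    similarity: Callable[[str, str], float] = token_overlap,
) -> dict:
    """Report the measures side by side instead of folding them into one number."""
    return {
        "exact_match": intended.strip().lower() == produced.strip().lower(),
        "word_error_rate": word_error_rate(intended, produced),
        "similarity": similarity(intended, produced),  # pluggable, assumed measure
        "task_succeeded": task_succeeded,              # observed in the situation
        "user_rating": user_rating,                    # supplied by the AAC user
    }
```

Calling `score_utterance("water please", "please water", True, 5)` yields the worst possible exact-match scores alongside a successful task and a satisfied user, which is the kind of disagreement a single metric would hide.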
The governance implication is concrete. Any AI-powered AAC evaluation should name the communication context, access method, user goals, correction path, monitoring data, and partner role. If the system listens to nearby conversation, sees the room, models the partner, or adapts to fatigue, that sensing must be visible and controllable. The metric should not erase the situation that made the communication meaningful.
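One way to make "visible and controllable" checkable is to list every sensed signal and flag any that fails either test. The sketch below is an assumption about how a reviewer might record this; the class, field names, and example signals are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SensedSignal:
    """One environmental or bodily signal the system monitors (hypothetical schema)."""
    name: str
    visible_to_user: bool    # does the interface show that this sensing is active?
    user_can_disable: bool   # can the user switch it off without losing access?


def uncontrolled_signals(signals: List[SensedSignal]) -> List[str]:
    """Return the names of signals that are hidden from, or not controllable by, the user."""
    return [
        s.name
        for s in signals
        if not (s.visible_to_user and s.user_can_disable)
    ]


# Illustrative entries only; a real audit would enumerate the deployed system's sensors.
example_signals = [
    SensedSignal("nearby_conversation_audio", visible_to_user=True, user_can_disable=True),
    SensedSignal("room_camera", visible_to_user=True, user_can_disable=False),
    SensedSignal("fatigue_estimate", visible_to_user=False, user_can_disable=False),
]
```

Here `uncontrolled_signals(example_signals)` returns the camera and the fatigue estimate, the two signals an evaluation would need to question before reporting any speed gain.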
Limits That Matter
This is a workshop paper and design argument, not a completed user trial of a deployed AI AAC product. The authors explicitly state that the potential implementations they discuss have not yet been validated or tested with AAC users. The paper does not prove that AI-powered AAC improves participation, and it does not rank models or products.
That limit is also the point. The paper is trying to keep the evaluation frame from hardening too early around convenient technical metrics. Before institutions procure, prescribe, reimburse, or mandate AI features in AAC systems, they need evidence that the features preserve agency under real communicative conditions, including fatigue, changing ability, different partners, multilingual practice, and the user's own sense of self.
Governance Standard
An evaluation dossier for AI-powered AAC should disclose the user population, access method, language and symbol system, communication contexts, model function, training or personalization data, monitored environmental signals, latency, correction cost, rejection controls, undo path, voice and tone controls, partner role, logging policy, and longitudinal change plan. It should separate model accuracy from user agency, physical effort, mental effort, social participation, and identity fit.
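The disclosure list above can be treated as a checklist with a completeness test. The following is a minimal sketch under stated assumptions: the field names are transcribed from the list, but the dataclass, the free-text field type, and the `missing_disclosures` helper are hypothetical conveniences, not a format proposed by the authors.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class AacEvaluationDossier:
    """Disclosure items for an AI-powered AAC evaluation (hypothetical schema)."""
    user_population: Optional[str] = None
    access_method: Optional[str] = None
    language_and_symbol_system: Optional[str] = None
    communication_contexts: Optional[str] = None
    model_function: Optional[str] = None
    training_or_personalization_data: Optional[str] = None
    monitored_environmental_signals: Optional[str] = None
    latency: Optional[str] = None
    correction_cost: Optional[str] = None
    rejection_controls: Optional[str] = None
    undo_path: Optional[str] = None
    voice_and_tone_controls: Optional[str] = None
    partner_role: Optional[str] = None
    logging_policy: Optional[str] = None
    longitudinal_change_plan: Optional[str] = None


def missing_disclosures(dossier: AacEvaluationDossier) -> List[str]:
    """List every disclosure item that is absent or left blank."""
    missing = []
    for f in fields(dossier):
        value = getattr(dossier, f.name)
        if not (value and value.strip()):
            missing.append(f.name)
    return missing
```

A dossier that fills in only the user population and access method would come back with thirteen missing items, which makes gaps visible before anyone argues about accuracy numbers. The check covers disclosure only; it says nothing about whether the disclosed practices actually preserve agency.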
The practical rule is simple: if the model can help speak through the interface, the evidence has to show that the user remained in command of the speech act. Faster communication is not enough. The proxy voice has to remain the user's instrument, not the system's performance.
Sources
- Blade Frisch, Will Wade, Dylan Gaines, Michelle Kinsella, Betts Peters, Tamara Broderick, and Keith Vertanen, "It's Complicated: On the Design and Evaluation of AI-Powered AAC Interfaces," arXiv:2606.24854 [cs.HC], submitted June 23, 2026.
- arXiv PDF version of "It's Complicated: On the Design and Evaluation of AI-Powered AAC Interfaces," reviewed June 24, 2026.
- arXiv HTML version of "It's Complicated: On the Design and Evaluation of AI-Powered AAC Interfaces," reviewed June 24, 2026.
- Related pages: The Alt Text Model Becomes the Access Clerk, The Synthetic Voice Enters the Ballot, The Cognitive Twin Becomes the Proxy Record, The Agentic Browser Becomes the Assistive Interface, Accessibility, AI Safety Cases, and Model Cards and System Cards.