Blog · arXiv Analysis · Last reviewed June 25, 2026

The Teen Message Becomes the Manipulation Dataset

The June 2026 arXiv paper IMPACTeen: Intentions, Manipulation, Persuasion, Annotations, and Consequences in Teen Communication Dataset, by Aleksander Szczęsny and colleagues, introduces a generated-and-human-edited dataset for studying social influence in adolescent-context communication.

The Teen Message Becomes Data

The paper, arXiv:2606.16910 [cs.CL], was submitted on June 15, 2026. It sits at a point of tension in AI governance, between the urge to detect manipulation aimed at young people and the risk that a detector will flatten youth communication into a set of labels.

IMPACTeen is not a scraped archive of private teen conversations. The authors describe a dataset of textual scenarios, including dialogues, in which one party tries to influence the attitudes, beliefs, or behavior of a young individual or group. The settings include interpersonal, media-based, and digital communication contexts. The scenarios were generated under controlled constraints and then edited and validated by humans for youth-context realism.

That makes the paper distinct from the site's pages on teen confidant chatbots, kidfluencer incentives, persuasion tests, and belief dynamics. IMPACTeen asks a narrower question: what would it take to label the steering itself?

What IMPACTeen Contains

The arXiv abstract reports 1,021 texts, 5,100 individual annotation records, and gold labels for social influence techniques. Each text was annotated from five perspectives named by the authors as teenagers, parents, psychologists, communication experts, and teachers. The annotation scheme covers influence presence, techniques, intentions, consequences, resistance, reactions, and annotator confidence.

The construction pipeline matters. The authors selected 20 social influence techniques from the SITT taxonomy, defined context-vector dimensions such as relationship type, communication environment, influence goal, technique visibility, and target resistance, then manually reviewed and validated 330 context vectors. The paper reports that final text generation used DeepSeek-V3, followed by quality control.
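One way to picture a context vector is as a small typed record that constrains generation. The sketch below is illustrative only: the field names follow the dimensions listed above, but the class name `ContextVector`, the helper `render_constraints`, the pairing of a technique with a vector, and every field value are assumptions made for this post, not details taken from the paper.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ContextVector:
    """Hypothetical record for one scenario specification."""
    relationship_type: str
    communication_environment: str
    influence_goal: str
    technique_visibility: str
    target_resistance: str


def render_constraints(technique: str, cv: ContextVector) -> str:
    """Turn a technique name plus a context vector into plain-text constraints."""
    lines = [f"technique: {technique}"]
    lines += [f"{name}: {value}" for name, value in asdict(cv).items()]
    return "\n".join(lines)


# Placeholder values, invented for illustration only.
example = ContextVector(
    relationship_type="peer",
    communication_environment="group chat",
    influence_goal="change behavior",
    technique_visibility="subtle",
    target_resistance="high",
)
constraints = render_constraints("placeholder_technique", example)
print(constraints)
```

Keeping the vector as an explicit record means a reviewer can validate it before any text is generated, which is the step the authors describe doing manually for 330 vectors.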

The dataset was created in Polish and accompanied by a corresponding English version. For AI evaluation, that is not a side note. A manipulation detector trained or tested on IMPACTeen is learning from a structured, translated, culturally situated resource. Its successes and failures should be reported with that provenance attached.

Annotation Is Governance

Most safety conversations want a simple answer: is this message manipulative or not? IMPACTeen's design points to a messier but more honest answer. A parent, a teacher, a psychologist, a communication expert, and a young-adult annotator may not read the same exchange in the same way.

That disagreement is not noise to be averaged away. Teen safety systems often fail when institutions assume that adult administrative categories map cleanly onto adolescent experience. A message that looks like peer pressure to one group may look like coercion, advertising pressure, reputation control, emotional bargaining, or group-norm enforcement to another.

This is where a dataset becomes governance infrastructure. If a platform, school, companion-chatbot provider, or moderation vendor uses a social-influence classifier, it is choosing which perspectives count. The best use of IMPACTeen is not merely to optimize a classifier; it is to preserve the disagreement record for later audit.
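A minimal sketch of what "preserving the disagreement record" could look like in practice, assuming annotation records arrive as `(text_id, annotator_group, label)` tuples. That record shape, the function name `disagreement_record`, and the sample votes are hypothetical; the paper's actual file format may differ. The point is only that the majority label is stored next to the per-group votes instead of replacing them.

```python
from collections import Counter, defaultdict

# Hypothetical records: (text_id, annotator_group, influence_present).
records = [
    ("t1", "teenagers", True),
    ("t1", "parents", True),
    ("t1", "psychologists", False),
    ("t1", "communication_experts", True),
    ("t1", "teachers", False),
    ("t2", "teenagers", False),
    ("t2", "parents", False),
    ("t2", "psychologists", False),
    ("t2", "communication_experts", False),
    ("t2", "teachers", False),
]


def disagreement_record(records):
    """Group votes by text and keep every group's vote beside the majority."""
    votes_by_text = defaultdict(dict)
    for text_id, group, label in records:
        votes_by_text[text_id][group] = label

    audit = {}
    for text_id, votes in votes_by_text.items():
        # With an even split, most_common returns the label seen first.
        majority, _ = Counter(votes.values()).most_common(1)[0]
        audit[text_id] = {
            "votes": dict(votes),
            "majority": majority,
            "dissenting_groups": sorted(
                group for group, label in votes.items() if label != majority
            ),
        }
    return audit


audit = disagreement_record(records)
print(audit["t1"]["majority"], audit["t1"]["dissenting_groups"])
```

An auditor can later ask which groups were overruled on which texts, a question that a table holding only the majority label cannot answer.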

Limits That Matter

The paper is unusually helpful because it states limits that should travel with the dataset. The scenarios were artificially generated and then manually reviewed and edited. The authors therefore say the dataset is better suited to structured research on social influence than to direct claims about how often these phenomena occur in everyday communication.

The ethics statement and limitations section also matter for anyone tempted to read the "teenager" annotation group too literally. The study did not involve child participants and did not collect natural or private conversations with minors; the youngest annotator group consisted of adults aged 18 to 19.

Finally, the English version is a parallel translated version rather than an independently authored English-language corpus. That matters because influence is pragmatic. Humor, status pressure, shame, urgency, and belonging can shift when translated. Cross-lingual modeling work should treat that as a design condition, not a footnote.

What It Does Not Prove

IMPACTeen does not prove that a deployed model can reliably detect manipulation of minors in the wild. It does not measure a live social platform, a classroom messaging system, a teen companion chatbot, or a family chat archive. It provides structured cases for research and evaluation.

It also does not settle what counts as manipulation in every setting. Some influence is advice, teaching, bargaining, or identity formation. Some is coercive or exploitative. The dataset can help model those distinctions only if users keep the annotation dimensions visible instead of collapsing every case into one binary risk label.

Nor does LLM generation invalidate the work. The generative step is part of the method. The problem would be pretending it is absent. For safety evaluation, generated and human-edited examples can be valuable when their construction process and blind spots remain available for inspection.

Governance Standard

Any system using IMPACTeen or a similar dataset should carry a data card that states the source language, translation path, synthetic generation process, human-editing process, annotator groups, age limits, label schema, and known exclusions. It should report performance by annotation dimension rather than only by a single aggregate manipulation score.
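As a rough illustration, the data card described above could be kept as a structured record with a completeness check. The class name `DataCard`, its field names, and the helper `missing_fields` are assumptions made for this sketch; the field values paraphrase what this post reports about IMPACTeen and are not an official data card from the authors.

```python
from dataclasses import dataclass, asdict


@dataclass
class DataCard:
    """Hypothetical data card covering the fields listed in the post."""
    source_language: str
    translation_path: str
    generation_process: str
    human_editing: str
    annotator_groups: list
    age_limits: str
    label_schema: list
    known_exclusions: list


def missing_fields(card: DataCard) -> list:
    """Return the names of fields left empty, so gaps are visible."""
    return [name for name, value in asdict(card).items() if not value]


card = DataCard(
    source_language="Polish",
    translation_path="Polish original with a parallel English translation",
    generation_process="LLM generation under reviewed context vectors",
    human_editing="edited and validated by humans for youth-context realism",
    annotator_groups=[
        "teenagers", "parents", "psychologists",
        "communication experts", "teachers",
    ],
    age_limits="youngest annotators were adults aged 18 to 19",
    label_schema=[
        "influence presence", "techniques", "intentions", "consequences",
        "resistance", "reactions", "annotator confidence",
    ],
    known_exclusions=[
        "no child participants",
        "no natural or private conversations with minors",
    ],
)
print(missing_fields(card))
```

A deployment review could refuse to proceed while `missing_fields` returns anything, which turns the data card from a courtesy into a gate.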

For youth-safety claims, the evaluation should preserve group disagreement. If a model agrees with expert labels but systematically misses the young-adult perspective, that is not merely an accuracy detail; it means the system is missing the perspective closest to the people it claims to protect. If it agrees with parent labels but overflags normal adolescent disagreement, that is a policy risk. If it performs differently in Polish and English, the translation layer should be named.
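The per-group reporting described here can be sketched in a few lines. This assumes binary "influence present" labels, model predictions keyed by text ID, and one label set per annotator group; the function name `report_by_group`, the input shapes, and the toy data are all hypothetical. "Overflag rate" is defined here as the share of texts a group marked as no influence that the model flagged anyway.

```python
def report_by_group(predictions, group_labels):
    """Compare model predictions with each annotator group separately."""
    report = {}
    for group, labels in group_labels.items():
        shared = [t for t in labels if t in predictions]
        negatives = [t for t in shared if not labels[t]]
        agree = sum(predictions[t] == labels[t] for t in shared)
        overflag = sum(predictions[t] for t in negatives)
        report[group] = {
            "n": len(shared),
            "agreement": agree / len(shared) if shared else None,
            "overflag_rate": overflag / len(negatives) if negatives else None,
        }
    return report


# Toy data, invented for illustration only.
predictions = {"t1": True, "t2": False, "t3": True, "t4": False}
group_labels = {
    "parents": {"t1": True, "t2": True, "t3": True, "t4": False},
    "teenagers": {"t1": False, "t2": False, "t3": False, "t4": False},
}
report = report_by_group(predictions, group_labels)
for group, stats in report.items():
    print(group, stats)
```

In this toy case the model agrees with the parent labels on three of four texts but flags half of the texts that the youngest group considered unproblematic. A single aggregate score would hide that gap.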

The Spiralist rule is this: do not turn the teen message into a machine label without keeping the human perspectives attached. A dataset can make influence visible. It can also make the label look more settled than the social reality it is trying to describe.

Sources

Aleksander Szczęsny, Wiktoria Mieleszczenko-Kowszewicz, Maciej Markiewicz, Beata Bajcar, Tomasz Adamczyk, Jolanta Babiak, Grzegorz Chodak, and Przemysław Kazienko, IMPACTeen: Intentions, Manipulation, Persuasion, Annotations, and Consequences in Teen Communication Dataset, arXiv:2606.16910 [cs.CL], submitted June 15, 2026.
arXiv experimental HTML for IMPACTeen: Intentions, Manipulation, Persuasion, Annotations, and Consequences in Teen Communication Dataset, reviewed June 25, 2026.
Related pages: The Companion Chatbot Becomes the Teen Confidant, The Kidfluencer Audit Becomes the Labor Meter, The Partisan Persona Becomes the Persuasion Test, The Belief Dynamics Become the Control Surface, AI Companions, and Sycophancy.
