CHI '26 · Honorable mention · full-paper review · confidence medium-high

Playing the Imitation Game: How Perceived Generated Content Shapes Player Experience

Mahsa Bazzaz, Seth Cooper

This paper’s value is not in proving that players can spot AI content, but in showing that perceived authorship itself changes experience. That is a useful CHI reframing: the social meaning of generation matters, and weak cues can bias enjoyment, aesthetics, frustration, and challenge even when players are wrong.

Axes Lens

Rare contribution shape, typical evidence profile. The point here is not a score. It is to show what kind of claim the paper makes, and whether the evidence pattern is unusual or baseline in this 268-review set.

Contribution shape

Knowledge form: descriptive knowledge · typical · 92/268
Novelty type: empirical finding · typical · 68/268
Abstraction level: task · typical · 36/268
Generalization target: task class · typical · 63/268
Validation mode: mixed methods · typical · 136/268

Evidence profile

Evidence strength: moderate · typical · 105/268
Claim alignment: strong · typical · 231/268
Overclaim risk: medium · typical · 210/268

Review Summary

This paper makes a clean and timely CHI contribution by shifting the unit of analysis from content detectability to perceived authorship. It does not merely ask whether players can tell AI-generated levels from human-made ones; it asks what happens to player experience once they form a belief about the creator. That reframing is the main novelty, and it is well aligned with the evidence presented in the abstract and with the paper’s stated focus. The reported pattern is counterintuitive in a useful way: players could not reliably identify the creator, yet their ratings tracked belief rather than truth, with human-made beliefs associated with more fun and better aesthetics and AI-made beliefs associated with more frustration and challenge. For CHI, that is a meaningful empirical finding because it highlights a perception bias that can shape interaction outcomes even when the attribution is wrong. The validation is reasonably scoped but not broad: the study uses a mixed-method survey on two short, 2D tile-based benchmark games, which makes the result credible for that setting but not automatically generalizable to other genres, longer play, or richer generative systems. The paper is also careful to note that the design is observational, so it cannot establish causality; that restraint matters because the strongest defensible interpretation is associative rather than causal. Overall, I would read this as a solid, field-relevant paper with a clear conceptual pivot and a bounded but persuasive empirical basis. Its main limitation is external validity, not internal coherence: the claims fit the evidence, but the evidence is narrow enough that deployment beyond similar game contexts should be treated cautiously.

What Changed

Canon before

Prior CHI and PCG work often emphasized whether players can detect generated content or whether generated levels match human-authored ones; this paper instead centers perceived authorship and its association with player experience.

Departure from common sense

Players’ enjoyment and frustration track what they believe about who made the level more than who actually made it, which runs against the intuitive expectation that the objective creator should matter most.

Actual novelty

The paper reframes the question from detectability to perception effects: it asks how players’ experience changes based on their belief about the content’s creator, and reports that negative bias can arise spontaneously from unreliable human-likeness cues rather than explicit disclosure.

Evidence

The paper’s core evidence is a mixed-method survey comparing procedurally generated and human-designed levels in Super Mario Bros. and Sokoban. It reports that players could not reliably identify creator, yet beliefs about creator were strongly associated with fun, aesthetics, frustration, and challenge. The study also explicitly frames a shift from technical detectability to perceived creator effects.
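The belief-versus-truth pattern summarized above can be made concrete with a small descriptive sketch. The records, the 1-7 `fun` scale, and the helper names (`mean_by`, `gap`, `identification_accuracy`) are all hypothetical and invented for illustration; this is not the paper's data or its analysis pipeline. The sketch only shows the shape of the comparison: group the same ratings once by actual creator and once by believed creator, then compare the gaps.

```python
from statistics import mean

# Hypothetical toy records, invented purely for illustration.
# They are NOT the paper's data. Each record holds the actual creator,
# the creator the player believed, and an assumed 1-7 fun rating.
responses = [
    {"actual": "human", "believed": "human", "fun": 6},
    {"actual": "human", "believed": "ai", "fun": 3},
    {"actual": "ai", "believed": "human", "fun": 6},
    {"actual": "ai", "believed": "ai", "fun": 3},
    {"actual": "human", "believed": "human", "fun": 5},
    {"actual": "ai", "believed": "ai", "fun": 4},
    {"actual": "human", "believed": "ai", "fun": 4},
    {"actual": "ai", "believed": "human", "fun": 5},
]


def mean_by(records, key, value="fun"):
    """Mean of `value` for each label found under `key`."""
    groups = {}
    for record in records:
        groups.setdefault(record[key], []).append(record[value])
    return {label: mean(values) for label, values in groups.items()}


def gap(records, key, value="fun"):
    """Mean rating for 'human' minus mean rating for 'ai' under `key`."""
    means = mean_by(records, key, value)
    return means["human"] - means["ai"]


def identification_accuracy(records):
    """Share of records where the believed creator matches the actual one."""
    hits = sum(r["actual"] == r["believed"] for r in records)
    return hits / len(records)


if __name__ == "__main__":
    print("accuracy:", identification_accuracy(responses))
    print("fun gap by actual creator:", gap(responses, "actual"))
    print("fun gap by believed creator:", gap(responses, "believed"))
```

In this toy set, identification is at chance and the gap appears only when ratings are grouped by belief. As with the study itself, a comparison like this is descriptive and says nothing about causality.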

“As for the reason for their judgment, the same game feature (an enemy placed near the player’s spawning position) was interpreted as evidence of AI design by one participant and as evidence of human design by another, illustrating how identical cues can support opposite judgments. Players have different perceptions about human-likeness. The same game feature (an enemy placed near the”

actual novelty · Introduction · confidence 0.93

“It was restricted to two short, 2D tile-based games (Super Mario Bros. and Sokoban)”

departure from common sense · Discussion 5.2 · confidence 0.95

“As for the reason for their judgment, the same game feature (an enemy placed near the player’s spawning position) was interpreted as evidence of AI design by one participant and as evidence of human design by another, illustrating how identical cues can support opposite judgments. Players have different perceptions about human-likeness. The same game feature (an enemy placed near the”

limitation · Introduction / Limitations and future directions · confidence 0.98

“It was restricted to two short, 2D tile-based games (Super Mario Bros. and Sokoban)”

validation scope · Limitations and future directions · confidence 0.96

Limits

Method limits

The study is observational rather than causal, so it supports association between perceived creator and experience but cannot prove that beliefs alone caused the reported experience differences. The evidence is also based on short, 2D tile-based levels and single-study survey measures.

Deployment limits

Findings are most directly applicable to short, tile-based game content and to contexts where players infer authorship from weak cues. Broader deployment across genres, longer play sessions, or richer game systems remains uncertain.

Boundary conditions

Effects were observed in two benchmark 2D games and under conditions where creator identity was not reliably detectable. The reported bias appears tied to ambiguous cues of human-likeness and may differ when authorship is explicit or when content is more complex.

Position in field

This is a CHI-facing contribution to game AI and generative-content perception: it moves beyond detection accuracy toward the human consequences of perceived authorship, offering evidence that belief about generation can shape player experience even when the belief is wrong.

Abstract

With the fast progress of generative AI in recent years, more games are integrating generated content, raising questions regarding how players perceive and respond to this content. To investigate, we ran a mixed-method survey on the games Super Mario Bros. and Sokoban, comparing procedurally generated levels and levels designed by humans to explore how perceptions of the creator relate to players' overall experience of gameplay. Players could not reliably identify the level's creator, yet their experiences were strongly linked to their beliefs about that creator rather than the actual truth. Levels believed to be human-made were rated as more fun and aesthetically pleasing. In contrast, those believed to be AI-generated were rated as more frustrating and challenging. This negative bias appeared spontaneously without knowing the levels' creator and often was based on unreliable cues of "human-likeness." Our results underscore the importance of understanding perception biases when integrating generative systems into games.