CHI '26 · Honorable mention · full-paper review · confidence medium-high

The People's Gaze: Co-Designing and Refining Gaze Gestures with Users and Experts

Yaxiong Lei, Xinya Gong, Shijing He, Yafei Wang, Mohamed Khamis, Juan Ye

This is a solid CHI method paper: the main contribution is not a single clever gesture, but a defensible pipeline for deriving gaze gestures from users and then tightening them with experts. The strongest part is the conceptual correction around gaze’s continuity and intentionality; the main caveat is that the paper stops short of proving deployment performance.

Axes Lens

Rare contribution shape, typical evidence profile. The point here is not a score. It is to show what kind of claim the paper makes, and whether the evidence pattern is unusual or baseline in this 268-review set.

Contribution shape

Knowledge form: method knowledge · typical · 29/268
Novelty type: method · typical · 21/268
Abstraction level: interaction · typical · 22/268
Generalization target: methodological argument · typical · 16/268
Validation mode: mixed methods · typical · 136/268

Evidence profile

Evidence strength: strong · typical · 158/268
Claim alignment: strong · typical · 231/268
Overclaim risk: medium · typical · 210/268
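To make the "rare shape, typical evidence" reading concrete, the counts can be turned into shares of the review set. This is a small sketch that assumes each n/268 figure counts the reviews in the set sharing the same axis value; the page does not define the figure, so treat that reading as an assumption. The names `AXIS_COUNTS` and `share_percent` are illustrative only.

```python
# Counts copied from the axis rows above. ASSUMPTION: each count is the number
# of reviews in the 268-review set that share the same value on that axis.
TOTAL_REVIEWS = 268

AXIS_COUNTS = {
    "Knowledge form: method knowledge": 29,
    "Novelty type: method": 21,
    "Abstraction level: interaction": 22,
    "Generalization target: methodological argument": 16,
    "Validation mode: mixed methods": 136,
    "Evidence strength: strong": 158,
    "Claim alignment: strong": 231,
    "Overclaim risk: medium": 210,
}


def share_percent(count: int, total: int = TOTAL_REVIEWS) -> float:
    """Return count as a percentage of the review set, rounded to one decimal."""
    return round(100 * count / total, 1)


# Print one row per axis so the small and large shares can be compared directly.
for axis, count in AXIS_COUNTS.items():
    print(f"{axis:<48} {count:>3}/{TOTAL_REVIEWS} {share_percent(count):>5}%")
```

Under that assumption, the contribution-shape values other than validation mode each cover roughly 6 to 11 percent of the set, while the evidence-profile values cover well over half.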

Review Summary

This paper reads as a strong honorable-mention-style contribution because it addresses a real mismatch in the field: gaze interaction has often been designed as if the eye were just another pointing device, while the authors argue, convincingly, that gaze is continuous and therefore needs explicit mechanisms for intentionality. The core novelty is methodological rather than a standalone artifact: four co-design workshops with 20 non-experts produced 102 concepts, and four gaze experts then refined those ideas into 32 gestures. That pipeline matters because it turns gesture design into a structured process and surfaces a compositional grammar of activation (dwell) plus action (gaze gesture or blink), in which the activation step serves as the intentional “gate” that helps mitigate the Midas Touch problem. The evidence is strong for a design-method contribution and for a user-grounded account of how people naturally reason about gaze gestures.

At the same time, the paper is careful about scope: it explicitly says the evaluation is based on expert judgement and workshop ratings, not running recognizers, and it does not claim command-level error rates, throughput, or long-term memorability. The limitations are therefore important and well stated: a small expert panel, recruitment from university communities and mailing lists, and a predefined 9-point grid all constrain generalization. The paper’s value therefore lies in the conceptual reframing and the reusable methodology for deriving gaze gestures, not in proving a deployable interaction system end to end.

What Changed

Canon before

Prior gaze-gesture work has often been expert-driven and treated gaze as a direct analogue of touch, with gesture sets designed top-down rather than grounded in how users naturally conceptualize eye movement.

Departure from common sense

The paper challenges the intuitive but flawed “eye as a finger” framing: gaze is continuous, lacks an inherent off state, and therefore needs explicit intentionality mechanisms rather than direct touch-like mapping. The authors argue that non-experts independently converge on dwell and blink as ways to manage this continuous signal.

Actual novelty

The paper’s novelty is a two-phase co-design-and-expert-refinement method that yields a compositional gaze grammar: activation (dwell) plus action (gaze gesture or blink). Rather than only proposing a gesture set, it combines user-generated concepts with expert filtering into a structured design process and a refined set of 32 gestures.
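The activation-plus-action grammar can be illustrated as a tiny gated parser. This is a minimal sketch under stated assumptions, not the authors' recognizer: the paper explicitly does not run recognizers, and the event model (`GazeEvent`), the 600 ms dwell threshold, and the rule that a short fixation disarms the gate are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# HYPOTHETICAL threshold: the paper reports no recognizer parameters.
DWELL_MS = 600


@dataclass
class GazeEvent:
    kind: str          # "fixation", "saccade", or "blink"
    target: str        # fixation location or saccade direction; "" for a blink
    duration_ms: int


def parse_commands(events: List[GazeEvent],
                   dwell_ms: int = DWELL_MS) -> List[Tuple[str, str]]:
    """Return (activation_target, action) pairs; ungated actions are dropped."""
    commands: List[Tuple[str, str]] = []
    armed_on: Optional[str] = None  # where the user dwelled, if the gate is open
    for ev in events:
        if ev.kind == "fixation":
            # A long fixation opens the gate; a short one is ordinary looking
            # and closes it again (an assumed rule, chosen for simplicity).
            armed_on = ev.target if ev.duration_ms >= dwell_ms else None
        elif ev.kind in ("saccade", "blink") and armed_on is not None:
            action = "blink" if ev.kind == "blink" else f"saccade:{ev.target}"
            commands.append((armed_on, action))
            armed_on = None  # one action per activation
        # Saccades and blinks with the gate closed are ignored (Midas Touch guard).
    return commands


demo = [
    GazeEvent("saccade", "right", 40),     # no dwell first: ignored
    GazeEvent("fixation", "center", 700),  # dwell: opens the gate
    GazeEvent("saccade", "left", 45),      # gated action
    GazeEvent("blink", "", 120),           # gate already closed: ignored
]
print(parse_commands(demo))  # [('center', 'saccade:left')]
```

The design point the sketch makes is the one the review credits to the paper: natural eye movements only become commands when they follow a deliberate activation.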

Evidence

Evidence supports a method contribution grounded in co-design workshops and expert review. The paper reports 20 non-experts generating 102 concepts, then four gaze experts refining them to 32 gestures. It also explicitly states that evaluation centers on expert judgement and workshop ratings, not recognizer performance, and that limitations include a small expert panel, university-biased recruitment, and a predefined 9-point grid.

“Through a two-phase methodology that combined participatory co-design workshops with 20 non-experts and a structured peer-review by four domain experts, we developed and validated a set of 32 gestures”

actual novelty · Abstract/Contribution (also echoed in Discussion §7.2.1) · confidence 0.76

“Lesson to Learn: Gaze is not a direct replacement for touch; it’s a distinct modality requiring explicit mechanisms to manage its continuous si”

departure from common sense · Discussion §7.1.1 The Flawed Metaphor: The Eye as a Finger · confidence 0.78

“Despite scaffolding non-expert creativity through a 9-point grid and briefing materials, limited familiarity with gaze interaction may have constrained the realism or technical depth of some proposals”

limitation · Limitations §8 · confidence 0.86

“We do not, for example, report command-level error rates, throughput, or long-term memorability in deployed applications”

validation scope · Limitations §8 (also supported by Phase 2 description) · confidence 0.84

Limits

Method limits

Validation is not an end-to-end system test: the paper states it focuses on expert judgement and workshop ratings rather than running recognizers, and does not report command-level error rates, throughput, or long-term memorability. The expert panel is small, and the design space was constrained by a predefined 9-point grid.
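One way to see how a predefined grid constrains the design space is to quantize gaze samples to grid cells, so that every gesture becomes a sequence over nine labels. The sketch below assumes the 9-point grid is a 3×3 arrangement of equal cells over normalized screen coordinates. The review does not describe the grid's geometry, so the layout, the label names, and both helper functions are hypothetical.

```python
from typing import Iterable, List, Tuple

# ASSUMPTION: the 9-point grid is laid out as 3 rows by 3 columns.
GRID_LABELS = [
    ["top-left", "top", "top-right"],
    ["left", "center", "right"],
    ["bottom-left", "bottom", "bottom-right"],
]


def grid_cell(x: float, y: float) -> str:
    """Map normalized gaze coordinates (0..1, origin at top-left) to a cell label."""
    if not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
        raise ValueError("gaze sample lies outside the screen")
    col = min(int(x * 3), 2)  # clamp so x == 1.0 falls in the last column
    row = min(int(y * 3), 2)  # clamp so y == 1.0 falls in the last row
    return GRID_LABELS[row][col]


def cell_path(samples: Iterable[Tuple[float, float]]) -> List[str]:
    """Collapse a stream of samples into the sequence of distinct consecutive cells."""
    path: List[str] = []
    for x, y in samples:
        cell = grid_cell(x, y)
        if not path or path[-1] != cell:
            path.append(cell)
    return path


samples = [(0.50, 0.50), (0.52, 0.48), (0.10, 0.50), (0.90, 0.90)]
print(cell_path(samples))  # ['center', 'left', 'bottom-right']
```

Under this quantization, any movement that stays inside one cell is invisible to the gesture vocabulary, which is one concrete sense in which the grid bounds what participants could propose.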

Deployment limits

The resulting gesture set is best understood as a design foundation for gaze-enabled mobile interfaces, not as a proven deployable recognizer. Practical deployment would still need recognition robustness, calibration handling, user learning studies, and performance evaluation in real-world contexts.

Boundary conditions

The approach is bounded by mobile gaze interaction, a 9-point grid, and a participant pool drawn mainly from university communities and mailing lists. The expert refinement phase used only four experts, so the resulting set reflects a narrow but informed consensus rather than broad field-wide agreement.

Position in field

This sits in CHI’s interaction-technique/method space: it reframes gaze gestures as a user-grounded, expert-validated design problem and contributes a process for deriving intentional, distinguishable gestures rather than only a final gesture catalog.

Abstract

As eye-tracking becomes increasingly common in modern mobile devices, the potential for hands-free, gaze-based interaction grows, but current gesture sets are largely expert-designed and often misaligned with how users naturally move their eyes. To address this gap, we introduce a two-phase methodology for developing intuitive gaze gestures. First, four co-design workshops with 20 non-expert participants generated 102 initial concepts. Next, four gaze interaction experts reviewed and refined these into a set of 32 gestures. We found that non-experts, after a brief introduction, intuitively anchor gestures in familiar metaphors and develop a compositional grammar; i.e., activation (dwell) + action (gaze gesture or blink), to ensure intentionality and mitigate the classic Midas Touch problem. Experts prioritized gestures that are ergonomically sound, aligned with natural saccades, and reliably distinguishable. The resulting user-grounded, expert-validated gesture set, along with actionable design principles, provides a foundation for developing intuitive, hands-free interfaces for gaze-enabled devices.