CHI '26 · Honorable mention · full-paper review · confidence medium-high

Interaction-Augmented Instruction: Modeling the Synergy of Prompts and Interactions in Human-GenAI Collaboration

Leixian Shen, Yifang Wang, Huamin Qu, Xing Xie, Haotian Li

This is a solid CHI framework paper: it reframes prompt engineering as prompt-plus-interaction composition, then backs that reframing with a compact model, a derived set of atomic paradigms, and scenario-based demonstrations. The contribution is strongest as a descriptive and generative lens, not as a validated predictive theory.

Axes Lens

Rare contribution shape, typical evidence profile. The point here is not a score. It is to show what kind of claim the paper makes, and whether the evidence pattern is unusual or baseline in this 268-review set.

Contribution shape

Knowledge form: descriptive knowledge · typical · 92/268
Novelty type: framework · typical · 59/268
Abstraction level: field · typical · 41/268
Generalization target: field argument · typical · 55/268
Validation mode: mixed methods · typical · 136/268

Evidence profile

Evidence strength: moderate · typical · 105/268
Claim alignment: medium · typical · 32/268
Overclaim risk: medium · typical · 210/268

Review Summary

The paper’s main value is conceptual rather than algorithmic. It identifies a real gap in GenAI interaction design: prompts alone often fail to express fine-grained or referential intent, while GUI interactions alone do not capture the full instruction. The IAI model is therefore a useful reframing because it treats the combined prompt-and-interaction package as the unit of analysis. That is a meaningful departure from common practice in which interactions are often treated as secondary controls.

The novelty is not just naming the combination, but formalizing it as a compact entity–relation graph and using that structure to derive twelve recurring atomic paradigms from prior tools. That gives the paper a design-language contribution that can support comparison and synthesis across systems.

The validation is appropriate for the claim type: the authors revisit a corpus of interfaces, annotate workflows, and show that the model can represent them, then use four scenarios to demonstrate application, refinement, and innovation. That is enough to support a descriptive framework and a generative design argument, but not enough to claim strong causal or predictive power.

The limitations are also clear and important: the model is currently scoped to single-user, single-agent interactions, and it does not yet formalize coordination among multiple GenAI nodes or shared augmented instructions. So the paper is best read as a field-level organizing framework for a specific class of GenAI interfaces, with moderate evidence and a sensible scope, rather than as a universal theory of human-GenAI collaboration.

What Changed

Canon before

Prior CHI work on GenAI interaction often treated prompts as the primary interface, with GUI interactions used as auxiliary controls or task-specific refinements. This paper positions prompt-plus-interaction composition as the unit of analysis and formalizes it as an explicit instruction model.

Departure from common sense

The paper’s core move is to reject the default assumption that a prompt alone is the instruction, or that interactions are merely UI affordances. Instead, it argues that intent is often best expressed by composing text and interaction into one augmented instruction, which is a non-obvious reframing of human-GenAI communication.

Actual novelty

The paper’s novelty is the Interaction-Augmented Instruction (IAI) model: a compact entity–relation graph that formalizes how text prompts and GUI interactions combine into an augmented instruction for GenAI. It also derives twelve atomic interaction paradigms and uses them to characterize and compare prior tools.
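To make the idea of an "augmented instruction" concrete, here is a minimal Python sketch of a prompt and a GUI interaction being composed into one instruction inside a small entity–relation graph. It is an illustration only, not the paper's model: the node kinds (`prompt`, `interaction`, `instruction`), the relation names (`composes`, `augments`), and the `compose` helper are all hypothetical, since this review does not enumerate the IAI model's actual entities or relations.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    """One entity in the graph. The kinds used below are assumptions."""
    kind: str      # hypothetical: "prompt", "interaction", or "instruction"
    payload: str


@dataclass
class InstructionGraph:
    """A tiny entity-relation graph: nodes plus labeled, directed edges."""
    nodes: list[Node] = field(default_factory=list)
    # Each edge is (source index, relation label, target index).
    edges: list[tuple[int, str, int]] = field(default_factory=list)

    def add(self, node: Node) -> int:
        """Append a node and return its index."""
        self.nodes.append(node)
        return len(self.nodes) - 1

    def relate(self, src: int, relation: str, dst: int) -> None:
        """Record a directed, labeled relation between two nodes."""
        self.edges.append((src, relation, dst))


def compose(prompt_text: str, selection: str) -> InstructionGraph:
    """Combine a free-form prompt with a precise GUI selection.

    Hypothetical helper: the text carries the intent, the interaction
    carries the referent, and both feed a single instruction node.
    """
    graph = InstructionGraph()
    p = graph.add(Node("prompt", prompt_text))
    i = graph.add(Node("interaction", selection))
    a = graph.add(Node("instruction", f"{prompt_text} [target: {selection}]"))
    graph.relate(p, "composes", a)
    graph.relate(i, "augments", a)
    return graph


graph = compose("Make this warmer", "brushed region (40, 60, 120, 90)")
```

In this sketch the instruction node, not the prompt node, is what a downstream model would consume, which mirrors the review's point that the combined package is the unit of analysis.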

Evidence

The paper proposes a formal model for prompt-plus-interaction composition, then validates it by revisiting a corpus of GenAI interfaces, extracting recurring atomic paradigms, and demonstrating use through four scenarios. The evidence supports a descriptive and generative framework claim more than a system or artifact claim.

“To fill this gap, through a deductive and iterative process, we propose the Interaction-Augmented Instruction (IAI) model: a compact entity–relation graph that makes explicit how precise GUI interactions and free-form prompts are composed into the executable instructions consumed by GenAI (Figure 1-B”

actual novelty · Abstract + 3 Interaction-Augmented Instruction Model · confidence 0.82

“To fill this gap, through a deductive and iterative process, we propose the Interaction-Augmented Instruction (IAI) model: a compact entity–relation graph that makes explicit how precise GUI interactions and free-form prompts are composed into the executable instructions consumed by GenAI (Figure 1-B”

departure from common sense · 6 Discussion (Reflection) · confidence 0.74

“First, the IAI model and the analyzed atomic paradigms currently focus on single-user, single-agent interactions and do not yet formalize coordination among multiple GenAI nodes or the construction of shared augmented instructions”

limitation · Limitations · confidence 0.90

“4 Design Paradigms To examine if our IAI model can be generally applied to describe and differentiate existing designs, we revisited and annotated 66 GenAI system interfaces that combine prompts and interactions in human-GenAI communicat”

validation scope · 4 Design Paradigms (4.1 Revisit Corpora) · confidence 0.66

Limits

Method limits

The paper’s own limitation is that the model and paradigms currently focus on single-user, single-agent interactions and do not yet formalize coordination among multiple GenAI nodes or shared augmented instructions. The validation is also qualitative and corpus-based rather than experimental.

Deployment limits

The framework is most directly applicable to single-user GenAI interfaces that combine text prompts with direct manipulation. It is less ready for multi-agent, collaborative, or distributed GenAI settings where coordination and shared instruction structures matter.

Boundary conditions

The model is bounded by the interface patterns represented in the analyzed corpus and by the assumption that prompt-interaction composition can be captured as a compact entity–relation graph. Its stated scope does not yet cover multi-node coordination or shared augmented instructions.

Position in field

This sits in the CHI line of work that tries to formalize emerging GenAI interaction patterns rather than only propose one-off interfaces. Its contribution is a conceptual model and design vocabulary for combining prompts with GUI interactions, aimed at comparison, characterization, and future design.

Abstract

Text prompt is the most common way for human-generative AI (GenAI) communication. Though convenient, it is challenging to convey fine-grained and referential intent. One promising solution is to combine text prompts with precise GUI interactions, like brushing and clicking. However, there lacks a formal model to capture synergistic designs between prompts and interactions, hindering their comparison and innovation. To fill this gap, via an iterative and deductive process, we develop the Interaction-Augmented Instruction (IAI) model, a compact entity–relation graph formalizing how the combination of interactions and text prompts enhances human-GenAI communication. With the model, we distill twelve recurring and composable atomic interaction paradigms from prior tools, verifying our model’s capability to facilitate systematic design characterization and comparison. Four usage scenarios further demonstrate the model’s utility in applying, refining, and innovating these paradigms. These results illustrate the IAI model’s descriptive, discriminative, and generative power for shaping future GenAI systems.