Interaction-Augmented Instruction: Modeling the Synergy of Prompts and Interactions in Human-GenAI Collaboration
This is a solid CHI framework paper: it reframes prompt engineering as prompt-plus-interaction composition, then backs that reframing with a compact model, a derived set of atomic paradigms, and scenario-based demonstrations. The contribution is strongest as a descriptive and generative lens, not as a validated predictive theory.
Axes Lens
Rare contribution shape, typical evidence profile. The point here is not a score. It is to show what kind of claim the paper makes, and whether the evidence pattern is unusual or baseline in this 268-review set.
Contribution shape
- Knowledge form: descriptive knowledge · typical · 92/268
- Novelty type: framework · typical · 59/268
- Abstraction level: field · typical · 41/268
- Generalization target: field argument · typical · 55/268
- Validation mode: mixed methods · typical · 136/268
Evidence profile
- Evidence strength: moderate · typical · 105/268
- Claim alignment: medium · typical · 32/268
- Overclaim risk: medium · typical · 210/268
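The counts in the two lists above are easier to compare as shares of the 268-review set. The sketch below does that conversion. It assumes each "n/268" figure is the number of reviews in the set that carry the same label as this paper; the page does not state that definition, and the helper names `share` and `format_shares` are illustrative only.

```python
# Counts copied from the Axes Lens lists. Reading each count as
# "reviews in the set with the same label" is an assumption.
TOTAL_REVIEWS = 268

AXES = {
    "Knowledge form": 92,
    "Novelty type": 59,
    "Abstraction level": 41,
    "Generalization target": 55,
    "Validation mode": 136,
    "Evidence strength": 105,
    "Claim alignment": 32,
    "Overclaim risk": 210,
}


def share(count: int, total: int = TOTAL_REVIEWS) -> float:
    """Return count/total expressed as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


def format_shares(axes: dict, total: int = TOTAL_REVIEWS) -> list:
    """Render one 'axis: n/total = p%' line per axis, in input order."""
    return [
        f"{name}: {count}/{total} = {share(count, total):.1f}%"
        for name, count in axes.items()
    ]


if __name__ == "__main__":
    for line in format_shares(AXES):
        print(line)
```

Under that reading, the most common label here (overclaim risk: medium) covers roughly 78% of the set, and the least common (claim alignment: medium) roughly 12%.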
Review Summary
The paper’s main value is conceptual rather than algorithmic. It identifies a real gap in GenAI interaction design: prompts alone often fail to express fine-grained or referential intent, while GUI interactions alone do not capture the full instruction. The IAI model is a useful reframing because it treats the combined prompt-and-interaction package as the unit of analysis, a meaningful departure from the common practice of treating interactions as secondary controls. The novelty is not just naming the combination but formalizing it as a compact entity–relation graph and using that structure to derive twelve recurring atomic paradigms from prior tools. That gives the paper a design-language contribution that can support comparison and synthesis across systems.
The validation is appropriate for the claim type: the authors revisit and annotate a corpus of 66 GenAI system interfaces, show that the model can represent their workflows, and then use four scenarios to demonstrate application, refinement, and innovation. That is enough to support a descriptive framework and a generative design argument, but not enough to claim strong causal or predictive power.
The limitations are clear and important: the model is currently scoped to single-user, single-agent interactions, and it does not yet formalize coordination among multiple GenAI nodes or shared augmented instructions. The paper is therefore best read as a field-level organizing framework for a specific class of GenAI interfaces, with moderate evidence and a sensible scope, rather than as a universal theory of human-GenAI collaboration.
What Changed
Canon before
Prior CHI work on GenAI interaction often treated prompts as the primary interface, with GUI interactions used as auxiliary controls or task-specific refinements. This paper positions prompt-plus-interaction composition as the unit of analysis and formalizes it as an explicit instruction model.
Departure from common sense
The paper’s core move is to reject the default assumption that a prompt alone is the instruction, or that interactions are merely UI affordances. Instead, it argues that intent is often best expressed by composing text and interaction into one augmented instruction, which is a non-obvious reframing of human-GenAI communication.
Actual novelty
The paper’s novelty is the Interaction-Augmented Instruction (IAI) model: a compact entity–relation graph that formalizes how text prompts and GUI interactions combine into an augmented instruction for GenAI. It also derives twelve atomic interaction paradigms and uses them to characterize and compare prior tools.
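To make the idea of an entity–relation graph concrete, here is a minimal sketch of one augmented instruction in which a user brushes a region of an image and types a prompt that refers to it. Every name in it is a hypothetical illustration: the entity kinds ("prompt", "interaction", "artifact", "model"), the relation labels, and the `AugmentedInstruction` class are assumptions made for this example, not the paper's actual vocabulary or notation, which this review does not reproduce.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Relation:
    """A labeled, directed edge between two named entities."""
    source: str
    label: str
    target: str


@dataclass
class AugmentedInstruction:
    """Toy graph: entity name -> kind, plus a list of relations.

    Kinds and labels are hypothetical, not taken from the paper.
    """
    entities: dict = field(default_factory=dict)
    relations: list = field(default_factory=list)

    def add_entity(self, name: str, kind: str) -> None:
        self.entities[name] = kind

    def relate(self, source: str, label: str, target: str) -> None:
        # Reject edges that mention entities not yet in the graph.
        for name in (source, target):
            if name not in self.entities:
                raise KeyError(f"unknown entity: {name}")
        self.relations.append(Relation(source, label, target))

    def inputs_to(self, target: str) -> list:
        """Names of all entities with an edge pointing at `target`, sorted."""
        return sorted({r.source for r in self.relations if r.target == target})


def build_example() -> AugmentedInstruction:
    """Brush a region of an image, then type a prompt that refers to it."""
    inst = AugmentedInstruction()
    inst.add_entity("text_prompt", "prompt")
    inst.add_entity("brush_selection", "interaction")
    inst.add_entity("image", "artifact")
    inst.add_entity("genai", "model")
    inst.relate("brush_selection", "selects_region_of", "image")
    inst.relate("text_prompt", "refers_to", "brush_selection")
    inst.relate("text_prompt", "instructs", "genai")
    inst.relate("brush_selection", "instructs", "genai")
    return inst


if __name__ == "__main__":
    print(build_example().inputs_to("genai"))
```

In this toy graph the model node receives both the prompt and the brush selection, which is the sense in which the pair, rather than the prompt alone, acts as the instruction.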
Evidence
The paper proposes a formal model for prompt-plus-interaction composition, then validates it by revisiting and annotating a corpus of 66 GenAI system interfaces, extracting recurring atomic paradigms, and demonstrating use through four scenarios. The evidence supports a descriptive and generative framework claim more than a system or artifact claim.
“To fill this gap, through a deductive and iterative process, we propose the Interaction-Augmented Instruction (IAI) model: a compact entity–relation graph that makes explicit how precise GUI interactions and free-form prompts are composed into the executable instructions consumed by GenAI (Figure 1-B”
actual novelty · Abstract + 3 Interaction-Augmented Instruction Model · confidence 0.82
“To fill this gap, through a deductive and iterative process, we propose the Interaction-Augmented Instruction (IAI) model: a compact entity–relation graph that makes explicit how precise GUI interactions and free-form prompts are composed into the executable instructions consumed by GenAI (Figure 1-B”
departure from common sense · 6 Discussion (Reflection) · confidence 0.74
“First, the IAI model and the analyzed atomic paradigms currently focus on single-user, single-agent interactions and do not yet formalize coordination among multiple GenAI nodes or the construction of shared augmented instructions”
limitation · Limitations · confidence 0.90
“4 Design Paradigms To examine if our IAI model can be generally applied to describe and differentiate existing designs, we revisited and annotated 66 GenAI system interfaces that combine prompts and interactions in human-GenAI communicat”
validation scope · 4 Design Paradigms (4.1 Revisit Corpora) · confidence 0.66
Limits
Method limits
The paper’s own limitation is that the model and paradigms currently focus on single-user, single-agent interactions and do not yet formalize coordination among multiple GenAI nodes or shared augmented instructions. The validation is also qualitative and corpus-based rather than experimental.
Deployment limits
The framework is most directly applicable to single-user GenAI interfaces that combine text prompts with direct manipulation. It is less ready for multi-agent, collaborative, or distributed GenAI settings where coordination and shared instruction structures matter.
Boundary conditions
The model is bounded by the interface patterns represented in the analyzed corpus and by the assumption that prompt-interaction composition can be captured as a compact entity–relation graph. Its stated scope does not yet cover multi-node coordination or shared augmented instructions.
Position in field
This sits in the CHI line of work that tries to formalize emerging GenAI interaction patterns rather than only propose one-off interfaces. Its contribution is a conceptual model and design vocabulary for combining prompts with GUI interactions, aimed at comparison, characterization, and future design.