CHI '26 · Best paper · full-paper review · confidence high

Towards Fluent Interaction with Cyber-Physical Architecture

Jesse T Gonzalez, Neeta M Khanuja, Michael Mingxuan Li, Maggie Guo, Layomi Olaitan, Emily Lau, Jenny Pugh, Alexandra Ion, Scott E Hudson

This paper is compelling because it does not treat shape-changing architecture as just a bigger smart-home interface. Instead, it shows that room-scale robotic environments demand a different interaction model: users mix modalities, leave intent partly unstated, and still expect the system to respond intelligently and preserve autonomy.

Axes Lens

Rare contribution shape, typical evidence profile. The point here is not a score. It is to show what kind of claim the paper makes, and whether the evidence pattern is unusual or baseline in this 268-review set.

Contribution shape

Knowledge form: generative knowledge · typical · 35/268
Novelty type: framework · typical · 59/268
Abstraction level: system · typical · 61/268
Generalization target: design family · typical · 38/268
Validation mode: mixed methods · typical · 136/268

Evidence profile

Evidence strength: strong · typical · 158/268
Claim alignment: strong · typical · 231/268
Overclaim risk: low · typical · 53/268

Review Summary

This is a strong and field-shaping CHI contribution because it addresses a genuinely unusual interaction setting: architectural-scale robotic environments that are persistent, physically consequential, and socially embedded. The paper’s main strength is that it does not rely on futurist speculation alone. Instead, it combines two complementary studies to connect aspirational design values with observed interaction behavior.

The workshop component surfaces eleven values and, importantly, identifies tensions that are likely to define the area, especially proactive assistance versus autonomy and personalization versus public ownership. The elicitation study then grounds those tensions in practice by showing that users do not converge on a single preferred modality. They fluidly combine voice, gesture, and touch, and many shift toward multimodal strategies as they learn what the system can do. That empirical pattern supports the paper’s broader conceptual move: interaction with cyber-physical architecture should not be framed as a simple mapping from explicit commands to responses, but as ongoing inference over evolving user intent using multimodal evidence. That is the paper’s real novelty and its clearest departure from common assumptions in smart environments and HRI.

The evidence is also responsibly bounded. The authors explicitly acknowledge homogeneous participants, a simplified lab task, possible Wizard-of-Oz interpretation effects, and the absence of long-term adaptation data. Those caveats matter, especially for claims about deployment in homes or public spaces. Even with those limits, the paper succeeds because it gives the field a credible conceptual vocabulary and an empirically grounded agenda for designing fluent, trustworthy cyber-physical architecture.

What Changed

Canon before

The dominant assumption in interactive environments has been that built environments are mostly static and that robotic or automation elements embedded within them respond to explicit, user-initiated commands that map precisely to system responses. Prior work in robotic interaction has emphasized direct manipulation or tightly constrained control paradigms, usually with smaller-scale or mobile robots, and has paid less attention to architectural-scale, shape-changing environments that remain continuously present. A further assumption is that users primarily want precise and explicit control, with limited tolerance for autonomous actions by an environment that infers or predicts their intent.

Departure from common sense

This paper challenges the assumption that users desire strictly explicit control in architectural-scale robotic environments. Instead, it shows that users want to feel in control while also expecting the environment to infer context and intent that they do not explicitly communicate. The result is a paradox: to be naturally controllable and trusted, a robotic environment must take some autonomous actions based on implicit and evolving user intent. This breaks the conventional belief that automation in smart environments should remain strictly reactive and fully user-commanded.

Actual novelty

The paper contributes a set of eleven design values for shape-changing environments, derived from speculative design workshops with architects and designers, that reveal fundamental tensions between proactive automation and user autonomy, and between personalization and public ownership. It also provides an empirical characterization of multimodal user interaction strategies with a shape-changing robotic wall, obtained through a Wizard-of-Oz elicitation study and including a modal entropy metric that quantifies the diversity and evolution of interaction styles. Finally, it proposes reframing interaction as a dynamic, modality-agnostic model of evolving user intent that integrates multimodal evidence over time rather than mapping discrete commands to responses, enabling fluent, contextual interactions in cyber-physical architectural systems.
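This review does not reproduce the paper's definition of modal entropy, so the following is only a minimal sketch of one plausible reading: Shannon entropy over the share of coded actions in each modality. The function name `modal_entropy`, the modality labels, and the two example sessions are hypothetical and are not taken from the study data.

```python
import math
from collections import Counter


def modal_entropy(actions):
    """Shannon entropy (in bits) of the modality distribution.

    `actions` is a list of modality labels, one per coded action.
    Assumption: this is a generic entropy measure, not necessarily
    the exact metric defined in the paper.
    """
    counts = Counter(actions)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no actions observed, so no diversity to measure
    probs = [c / total for c in counts.values()]
    return sum(p * math.log2(1 / p) for p in probs)


# Hypothetical sessions: a participant who starts voice-only and
# later mixes voice, gesture, and touch.
early_session = ["voice", "voice", "voice", "voice"]
late_session = ["voice", "gesture", "touch", "gesture"]

print(modal_entropy(early_session))
print(modal_entropy(late_session))
```

Under this reading, a single-modality session scores zero and the score rises as actions spread more evenly across modalities, so comparing early and late sessions gives a rough measure of how a participant's style evolves.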

Evidence

The paper supports its claims with two complementary studies described in the abstract, introduction, workshop analysis, elicitation-study analysis, discussion, and limitations. The workshop study identifies eleven values from speculative design sessions, while the Wizard-of-Oz study analyzes over 2,000 coded multimodal actions and reports that most participants ended with a multimodal preference. The discussion and conclusion tie these findings to a broader argument for intent inference rather than explicit command mapping, and the limitations section clearly bounds generalization by participant homogeneity, lab context, wizard interpretation, and lack of long-term study.

“icate with robotic environments. Specifically, we contribute: (1) Eleven design values for shape-changing environments, derived from speculative workshops with designers and architects. These reveal fundamental tensions between proactive assistance and user autonomy, and between personalization and public ownership. (2) An empirical characterization of multimodal interaction wi”

actual novelty · 1 Introduction · confidence 0.97

“This leads us to a bit of a paradox: In order to make these systems naturally controllable for users, we may have to allow these systems to take actions that users do not explicitly command”

departure from common sense · 5 Discussion · confidence 0.96

“Finally, we note that the study focused on initial interactions and did not examine long-term adaptation or learning over time”

limitation · 7 Limitations · confidence 0.99

“This process resulted in over 2,000 identified actions, tagged with a set of roughly 200 individual codes”

validation scope · 4.2 Analysis Method · confidence 0.95

Limits

Method limits

The speculative design workshops involved participants with architecture and design backgrounds only, limiting population diversity and generalizability of identified values. The Wizard-of-Oz elicitation study was conducted in a controlled lab setting with a simplified block placement task, which may not fully capture naturalistic interactions in real-world environments. The wizard controlling the robotic wall may have introduced interpretation bias. Long-term adaptation and learning over prolonged use were not studied.

Deployment limits

The studied robotic wall prototype is mechanically limited and cannot support body weight, so the findings do not scale directly to large, load-bearing structures. The interaction strategies were studied with a human-operated system; fully autonomous systems will face additional technical challenges in inferring intent robustly. The privacy, personalization, and ownership issues identified need further work before real-world deployment would be acceptable.

Boundary conditions

Findings apply primarily to shape-changing, architectural-scale environments with continuous presence, involving multimodal human input such as voice, gesture, and touch. Results may vary with different user populations, contexts such as private versus public settings, or other robotic and environmental forms. The proposed dynamic intent modeling requires systems with sufficient sensing, memory, and adaptive capabilities. Behavioral generalization beyond the studied prototype requires care due to technological and contextual variability.
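To make the sensing and memory requirement concrete, here is a minimal sketch of one way evidence from different modalities could be folded into a running belief over user intent. It uses a plain Bayesian update, which is an assumption made for illustration and not the model proposed in the paper. The intent names, the likelihood values, and the `update_belief` helper are all hypothetical.

```python
def update_belief(belief, likelihoods):
    """One Bayesian update: posterior is proportional to
    prior * P(observation | intent).

    `belief` maps intent -> prior probability.
    `likelihoods` maps intent -> P(observation | intent); intents
    missing from `likelihoods` are treated as having likelihood 0.
    """
    unnormalized = {
        intent: prior * likelihoods.get(intent, 0.0)
        for intent, prior in belief.items()
    }
    total = sum(unnormalized.values())
    if total == 0:
        return dict(belief)  # observation rules out everything; keep the prior
    return {intent: p / total for intent, p in unnormalized.items()}


# Hypothetical intents for a shape-changing wall, starting from a uniform prior.
belief = {"extend_panel": 0.5, "retract_panel": 0.5}

# Hypothetical observations: each modality contributes a likelihood
# over the same intents, so the update itself is modality-agnostic.
observations = [
    ("voice", {"extend_panel": 0.6, "retract_panel": 0.4}),
    ("gesture", {"extend_panel": 0.9, "retract_panel": 0.1}),
]

for modality, likelihoods in observations:
    belief = update_belief(belief, likelihoods)

print(belief)
```

The belief dictionary plays the role of memory: an ambiguous utterance shifts it only slightly, and a later gesture sharpens it, without either input being treated as a discrete command.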

Position in field

This work advances the emerging field of cyber-physical architecture by foregrounding user values and naturalistic multimodal interaction strategies at architectural scale. It challenges prevailing HCI assumptions about explicit control in smart environments and contributes a framework emphasizing dynamic intent modeling that integrates multimodal evidence. The paper bridges human-robot interaction with tangible and ambient intelligence research, grounding design theory in empirical evidence and setting a foundation for future design and technical research on fluent, trustworthy robotic environments.

Abstract

What happens when your walls begin to move? This paper explores the design of human-robot interaction for architectural-scale, shape-changing environments. We present findings from two studies: (1) a series of speculative design workshops (N=20) that uncovered aspirational visions for these spaces, and (2) a task-based Wizard-of-Oz elicitation study (N=12) that grounded these visions in the challenges of practical interaction. Our workshop findings reveal a complex landscape of user desires, exposing critical tensions between proactive automation and the preservation of user autonomy, and between personalization and public ownership. Our elicitation study reveals a set of core interaction challenges related to multimodal collaboration; and, most critically: suggests the need for a modality-agnostic model of evolving user intent. We conclude with a set of grounded proposals for creating robotic environments that are collaborative and trusted partners in everyday life.