CHI '26 · Honorable mention · full-paper review · confidence medium-high

An Exploration of Default Images in Text-to-Image Generation

Hannu Simonen, Atte Kiviniemi, Hannah Johnston, Helena Barranha, Jonas Oppenlaender

This is a solid mixed-methods CHI paper that turns an intuitive but underexamined generative-AI failure mode into an empirically grounded phenomenon. Its main contribution is not a new interaction technique, but a careful characterization of default images in Midjourney, backed by scale, user evidence, and explicit limits on causal interpretation.

Axes Lens

Rare contribution shape, typical evidence profile. The point here is not a score. It is to show what kind of claim the paper makes, and whether the evidence pattern is unusual or baseline in this 268-review set.

Contribution shape

Knowledge form: descriptive knowledge · typical · 92/268
Novelty type: empirical finding · typical · 68/268
Abstraction level: system · typical · 61/268
Generalization target: specific instance · rare · 1/268
Validation mode: mixed methods · typical · 136/268

Evidence profile

Evidence strength: strong · typical · 158/268
Claim alignment: strong · typical · 231/268
Overclaim risk: medium · typical · 210/268
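To make the "typical" and "rare" labels easier to compare, the counts in the two lists can be turned into shares of the 268-review set. The sketch below only does that arithmetic; the cutoff this review set uses to separate "rare" from "typical" is not stated, so none is assumed here.

```python
# Counts copied from the Axes Lens lists; the denominator is the 268-review set.
TOTAL_REVIEWS = 268

label_counts = {
    "Knowledge form: descriptive knowledge": 92,
    "Novelty type: empirical finding": 68,
    "Abstraction level: system": 61,
    "Generalization target: specific instance": 1,
    "Validation mode: mixed methods": 136,
    "Evidence strength: strong": 158,
    "Claim alignment: strong": 231,
    "Overclaim risk: medium": 210,
}


def share(count: int, total: int = TOTAL_REVIEWS) -> float:
    """Return the percentage of reviews in the set that carry a given label."""
    return 100.0 * count / total


# Percentage share per label, rounded to one decimal place.
shares = {label: round(share(n), 1) for label, n in label_counts.items()}

for label, pct in shares.items():
    print(f"{label}: {pct}%")
```

Read this way, the one "rare" axis stands out clearly: a single review in the set shares the "specific instance" generalization target, while every other label covers roughly a fifth of the set or more.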

Review Summary

This paper’s value lies in making an implicit generative-AI behavior legible as a research object. The common-sense expectation in text-to-image systems is that prompts should steer outputs in a prompt-sensitive way; here, the authors show that some prompts instead converge on recurring default images, which can look plausibly responsive while actually reflecting a system-level fallback. That is a meaningful departure from naive assumptions about prompt-to-image mapping.

The novelty is primarily empirical: the paper claims to be the first investigation of default images on Midjourney, and the evidence supports that framing through a sequence of studies rather than a single anecdotal example. The validation scope is also unusually broad for this topic: manual prompt elicitation and ablation studies establish the phenomenon, a computational analysis over 757,728 images shows consistency across unrelated prompts, and an online user study with 48 participants connects the phenomenon to user satisfaction.

At the same time, the paper is careful about what it cannot conclude. Without access to training data or internal model mechanisms, it cannot prove why default images arise, and the authors explicitly note that platform internals such as sampling or prompt rewriting may contribute. So the strongest reading is descriptive and methodological rather than causal. In CHI terms, this is a strong mixed-methods contribution that expands the field’s understanding of failure modes in text-to-image systems, but it should be cited as Midjourney-specific evidence rather than a universal theory of all TTI models.

The paper also has a useful positioning effect for the field: it gives researchers and designers a concrete label, a detection strategy, and a set of boundary conditions for thinking about fallback behavior in generative interfaces. That makes it especially relevant for future work on transparency, prompt assistance, auditing, and evaluation benchmarks.

The main caution is scope: the phenomenon is demonstrated in one proprietary system under specific prompt and filtering choices, so the paper should not be overgeneralized to all image generators or to all prompt regimes.

What Changed

Canon before

Before this paper, default images in text-to-image generation were not established as a named, empirically studied phenomenon in the CHI literature; the paper positions itself as an initial investigation on Midjourney.

Departure from common sense

The paper argues that a text-to-image system can still return a stable-looking image pattern even when the prompt contains unknown terms, so the output may appear meaningful while actually reflecting a default behavior rather than prompt-specific generation.

Actual novelty

The paper’s novelty is an empirical and methodological characterization of a previously underexamined generative-AI failure mode: it names and operationalizes “default images,” then shows that Midjourney can repeatedly emit visually similar outputs across dissimilar prompts. The contribution is not a new interaction technique or model architecture, but a mixed-methods account that combines prompt elicitation, affinity-based image grouping, ablation studies, a large-scale computational scan, and a user study to establish the phenomenon as a real, user-relevant behavior. That makes the work valuable as a descriptive foundation for later benchmarking, auditing, and interface design around generative-image failures.
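The operational idea, "visually similar outputs across dissimilar prompts," can be illustrated with a small sketch. This is not the paper's pipeline, whose details are not given in this review. It assumes each generated image comes with a prompt embedding and an image embedding from some model of the reader's choice, and the function name and both thresholds (`image_min`, `prompt_max`) are hypothetical values chosen for illustration.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def candidate_default_pairs(items, image_min=0.9, prompt_max=0.3):
    """Flag pairs whose images are similar although their prompts are not.

    items: list of (item_id, prompt_vector, image_vector) tuples.
    image_min and prompt_max are illustrative thresholds, not values from the paper.
    """
    pairs = []
    for (id_a, p_a, i_a), (id_b, p_b, i_b) in combinations(items, 2):
        if cosine(i_a, i_b) >= image_min and cosine(p_a, p_b) <= prompt_max:
            pairs.append((id_a, id_b))
    return pairs


# Toy vectors standing in for real embeddings (assumed data, for illustration only).
items = [
    ("a", [1.0, 0.0], [0.90, 0.10]),  # prompt A
    ("b", [0.0, 1.0], [0.88, 0.12]),  # unrelated prompt, near-identical image
    ("c", [1.0, 0.1], [0.10, 0.90]),  # prompt close to A, different image
]

result = candidate_default_pairs(items)
print(result)
```

Only the pair ("a", "b") is flagged in the toy data, since it combines unrelated prompts with near-identical images. At the scale of the paper's 757,728 images a pairwise loop like this would be too slow, so a real analysis would need clustering or approximate nearest-neighbor search instead.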

Evidence

The paper combines manual prompt construction, ablation studies, a computational analysis of 757,728 Midjourney images, and an online user study with 48 participants. The evidence supports the existence and practical relevance of default images, but the claims remain tied to Midjourney and the studied prompt conditions.

“We present the first investigation into default images on Midjourney. We describe an initial study in which we manually created input prompts triggering default images, and several ablation studies.”

actual novelty · Abstract + Introduction · confidence 0.80

“When such a prompt is given to the TTI model, it may generate what we call default images – visually similar images resulting from dissimilar prompts (see figures 1 and 5, and Appendix”

departure from common sense · Abstract + Introduction · confidence 0.66

“These approaches are applied at training time and assume full access to either the model or training data, or both, making them impracticable for use in black-box models, such as Midjourney, where the training data is not known and the model is not directly accessible”

limitation · Limitations and Future Work (6.6) · confidence 0.70

“[…] 1 · Setting the model version
aspect ratio · 1 · Standardizing the image format to square
stylize · 0 · Controlling the artistic flair
chaos · 0 · Controlling surprising random outputs
weird · 0 · Controlling quirky unconventional outputs
seed value · 123456 · Ensuring consistency and reproducibility
[Table: Image generation parameters used in manual study]
As a sensitivity analysis to test the robustness of default images, we conducted five small-scale ablation studies.”

validation scope · Method overview + Results · confidence 0.62
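For readers who want to try a comparable setup, the quoted parameter table could be expressed as a Midjourney prompt suffix roughly as follows. This is a sketch under assumptions: the helper name `build_prompt` is hypothetical, the flags are Midjourney's documented prompt parameters, "aspect ratio 1" is read as `--ar 1:1` because the table describes a square format, and the model version is left as a caller-supplied argument because the quote truncates it.

```python
def build_prompt(text: str, version: str) -> str:
    """Append fixed generation parameters to a prompt (hypothetical helper).

    The values mirror the quoted parameter table. `version` is deliberately
    not filled in, because the quoted table truncates the model version.
    """
    params = {
        "--ar": "1:1",       # assumption: "aspect ratio 1" means a square 1:1 image
        "--stylize": "0",    # controls the artistic flair
        "--chaos": "0",      # controls surprising random outputs
        "--weird": "0",      # controls quirky unconventional outputs
        "--seed": "123456",  # fixed seed for consistency and reproducibility
        "--v": version,      # model version supplied by the caller
    }
    suffix = " ".join(f"{flag} {value}" for flag, value in params.items())
    return f"{text} {suffix}"


# "<version>" is a placeholder, not a value taken from the paper.
example = build_prompt("example prompt", version="<version>")
print(example)
```

Pinning every stochastic control to zero and fixing the seed means that any remaining similarity between outputs of different prompts is more plausibly a property of the system than of random variation. The review's caveat about platform internals such as sampling or prompt rewriting still applies.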

Limits

Method limits

The study is constrained by lack of access to Midjourney’s training data and internal model mechanisms, so causal explanations remain inferential rather than directly verified.

Deployment limits

Findings are specific to Midjourney and the prompt conditions studied; transfer to other text-to-image systems or broader prompt types is not established here.

Boundary conditions

The phenomenon is examined in Midjourney, with emphasis on short prompts and prompts that trigger default-image behavior; platform internals such as sampling or prompt rewriting may affect outcomes.

Position in field

This work opens a new empirical line of inquiry for CHI on failure modes and default behaviors in generative image systems, connecting prompt engineering, model opacity, and user satisfaction. It is best read as a field-setting descriptive study rather than a universal theory of all text-to-image models.

Abstract

In the creative practice of text-to-image (TTI) generation, images are synthesized from textual prompts. By design, TTI models always yield an output, even if the prompt contains unknown terms. In this case, the model may generate default images: images that closely resemble each other across many unrelated prompts. Studying default images is valuable for designing better solutions for prompt engineering and TTI generation. We present the first investigation into default images on Midjourney. We describe an initial study in which we manually created input prompts triggering default images, and several ablation studies. Building on these, we conduct a computational analysis of over 750,000 images, revealing consistent default images across unrelated prompts. We also conduct an online user study investigating how default images may affect user satisfaction. Our work lays the foundation for understanding default images in TTI generation, highlighting their practical relevance as well as challenges and future research directions.