CHI '26 · Best paper · full-paper review · confidence high

Interactive Explainable Ranking

Chao Zhang, Abe Davis

This paper stands out because it changes the interaction contract of ranking tools: instead of keeping rankings explainable at all times, it lets users create and inspect mismatches between their preferred ordering and the current weighted explanation. That move makes inconsistency visible and actionable. The mixed-methods evaluation suggests the approach can improve consistency and confidence, and the authors remain cautious about small-sample limits.

Axes Lens

Rare contribution shape, typical evidence profile. The point here is not a score. It is to show what kind of claim the paper makes, and whether the evidence pattern is unusual or baseline in this 268-review set.

Contribution shape

Knowledge form: generative knowledge (typical · 35/268)
Novelty type: interaction technique (less common · 7/268)
Abstraction level: task (typical · 36/268)
Generalization target: task class (typical · 63/268)
Validation mode: mixed methods (typical · 136/268)

Evidence profile

Evidence strength: strong (typical · 158/268)
Claim alignment: strong (typical · 231/268)
Overclaim risk: low (typical · 53/268)

Review Summary

Interactive Explainable Ranking is compelling because its main contribution is not merely a better ranking algorithm, but a better way to structure human judgment in messy, subjective decisions. The paper argues that prior decision-making tools often avoid disagreement between a ranking and its explanation by constraining what users can edit. Here, the authors deliberately remove that constraint. Users can manipulate the target ranking, criteria, and weights independently, and the system then visualizes the resulting conflicts. That is a meaningful HCI contribution because it treats inconsistency as diagnostic: a sign of missing criteria, unstable reasoning, or unresolved trade-offs. The Explanation-Rank Resolution (ERR) framing is therefore both a conceptual and an interaction-design contribution.

User Insertion Sort (UIS) complements that idea well. Rather than letting AI or optimization silently determine outcomes, the system uses uncertain priors to accelerate sorting while preserving human verification of each ranking decision. This is a thoughtful middle ground between manual ranking and opaque automation, and it aligns with the paper’s emphasis on accountability and reflection.

The validation package is also appropriate for the claim level. The within-subjects study is explicitly aimed at usability and consistency relative to a traditional tool, while the two case studies probe richer, more realistic tasks such as ranking short films and grading open-ended projects. Those cases help show that the tool is not limited to toy examples and that it can support nuanced reasoning in domains where criteria are contested or emergent.

Just as importantly, the paper is careful about its limits. It directly says the statistically significant study results are not conclusive because the sample is small, and it points to longitudinal deployment and ablation studies as future work. That restraint lowers overclaim risk.

Overall, the paper makes a strong contribution to explainable decision support by showing that systems can help users reason better not by eliminating contradictions, but by surfacing and working through them.

What Changed

Canon before

Existing multi-criteria decision-making tools limit adaptability by requiring rankings to be explainable by construction, often assuming fixed and known criteria. Humans are known to be inconsistent and biased in such complex decision making. Prior tools constrain user interaction to weights or fixed criteria and typically do not allow exploration of inconsistencies between ranking and explanation. Use of AI in decision-making tools raises ethical concerns regarding bias and abuse. Ranking open-ended projects in university courses poses challenges of subjectivity, inconsistency, and scalability that existing methods fail to address.

Departure from common sense

Instead of forcing the user’s ranking to always be explainable by fixed criteria and weights, the paper permits and visualizes conflicts between the ranking and its explanation, treating such conflicts as opportunities to identify inconsistencies or missing criteria. This breaks the common assumption that rankings must be explainable by construction, by allowing users to edit ranking, criteria, and weights independently and then resolve contradictions interactively.
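
To make the idea of a conflict between a ranking and its explanation concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes the explanation is a plain weighted sum of criterion values and that a conflict is any pair of items the user ranks in one order while the weighted scores imply the opposite order. The names `explained_scores` and `find_conflicts`, the example items, and all numbers are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Tuple

def explained_scores(criteria: Dict[str, Dict[str, float]],
                     weights: Dict[str, float]) -> Dict[str, float]:
    """Score each item as a weighted sum of its criterion values (assumed model)."""
    return {
        item: sum(weights[name] * value for name, value in values.items())
        for item, values in criteria.items()
    }

def find_conflicts(target_ranking: List[str],
                   criteria: Dict[str, Dict[str, float]],
                   weights: Dict[str, float]) -> List[Tuple[str, str]]:
    """Return pairs (higher, lower) where the user's ranking puts `higher`
    above `lower` but the weighted explanation scores `higher` below `lower`."""
    scores = explained_scores(criteria, weights)
    conflicts = []
    # combinations() preserves list order, so `higher` always precedes `lower`
    # in the user's target ranking.
    for higher, lower in combinations(target_ranking, 2):
        if scores[higher] < scores[lower]:
            conflicts.append((higher, lower))
    return conflicts

# Hypothetical job offers with criterion values normalized to [0, 1].
offers = {
    "A": {"salary": 0.9, "location": 0.2},
    "B": {"salary": 0.5, "location": 0.9},
    "C": {"salary": 0.3, "location": 0.4},
}
target = ["B", "A", "C"]  # the user's preferred order, best first

salary_heavy = {"salary": 0.7, "location": 0.3}
location_heavy = {"salary": 0.4, "location": 0.6}

conflicts_before = find_conflicts(target, offers, salary_heavy)
conflicts_after = find_conflicts(target, offers, location_heavy)
print(conflicts_before)  # [('B', 'A')]
print(conflicts_after)   # []
```

In this toy case the user prefers B over A while the salary-heavy weights say the opposite. Shifting weight toward location removes the conflict, which is one of several resolutions the review describes; adding a missing criterion or revising the ranking would be others.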

Actual novelty

The paper introduces explainable ranking as a task where rankings and explanations are explored simultaneously, then operationalizes this through Explanation-Rank Resolution (ERR), which lets users independently edit rankings, criteria, and weights while surfacing contradictions between target and explained orderings. It also adds User Insertion Sort (UIS), which uses uncertain priors such as AI or optimization while ensuring each ranking decision is checked by a human user.
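
The review does not spell out how UIS works internally, so the following Python sketch is only one plausible reading of the description above: an insertion sort in which an uncertain prior proposes where each new item should go, and a human comparison confirms or corrects that placement before it is committed. The function name `user_insertion_sort`, the `prior_score` mapping, the `user_prefers` callback, and the example data are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

def user_insertion_sort(items: List[str],
                        prior_score: Dict[str, float],
                        user_prefers: Callable[[str, str], bool]) -> List[str]:
    """Insert items one at a time. The prior suggests a starting position;
    the human (via user_prefers(a, b): True if a should rank above b)
    checks the placement against its neighbors and corrects it if needed."""
    ranked: List[str] = []
    for item in items:
        # Suggested slot: just above the first ranked item the prior scores lower.
        pos = len(ranked)
        for i, other in enumerate(ranked):
            if prior_score[item] > prior_score[other]:
                pos = i
                break
        start = pos
        # Human check upward: move up while the user prefers the new item.
        while pos > 0 and user_prefers(item, ranked[pos - 1]):
            pos -= 1
        # Human check downward, only if the upward check did not move the item.
        if pos == start:
            while pos < len(ranked) and user_prefers(ranked[pos], item):
                pos += 1
        ranked.insert(pos, item)
    return ranked

# Simulated user with a hidden true order d > b > a > c (hypothetical data).
true_rank = {"d": 0, "b": 1, "a": 2, "c": 3}
queries: List[Tuple[str, str]] = []

def simulated_user(a: str, b: str) -> bool:
    queries.append((a, b))  # record every question put to the human
    return true_rank[a] < true_rank[b]

# The prior is mostly right but wrongly scores "a" above "b".
prior = {"a": 0.7, "b": 0.6, "c": 0.1, "d": 0.9}

result = user_insertion_sort(["a", "b", "c", "d"], prior, simulated_user)
print(result)        # ['d', 'b', 'a', 'c']
print(len(queries))  # 3
```

In this sketch the prior only decides where the human starts looking; it never fixes an order on its own. The simulated user overrides the prior's mistake about "a" and "b", and a better prior would simply mean fewer corrective comparisons.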

Evidence

The focused sections support the core claims directly. The introduction and ERR section explicitly state that the system departs from prior tools by allowing disagreement between rankings and explanations rather than preventing it. The abstract and key-innovation framing support the novelty claim that the tool makes ranking, explanation, and conflict resolution interactive while incorporating UIS for human-checked use of uncertain priors. Validation evidence comes from a within-subjects study aimed at testing usability and consistency against a traditional tool, plus two qualitative case studies on harder ranking tasks including short films and open-ended project grading. The paper also includes a clear limitation statement that the statistically significant within-subjects results are not claimed to be conclusive because of the small sample size, and it frames broader deployment and ablation work as future evaluation.

“To assist users in explainable ranking, our tool makes the three loops interactive: (1) rank choices, (2) explain with criteria and weights, and (3) identify & resolve conflicts. It visualizes co”

actual novelty · Abstract · confidence 0.96

“those options (a ranking) that is consistent with some weighted combination of simpler or less ambiguous criteria (an explanation). What distinguishes our framing of explainable ranking from problems addressed in previous work is the simultaneous exploration of both rankings and explanations, without the constraint that the two should always remain consistent”

departure from common sense · 1 Introduction · confidence 0.97

“While we computed statistical significance in the within-subjects study, we do not claim the results to be conclusive due to the small sample size.”

limitation · 10.6.4 Deployment and Longitudinal Study. · confidence 0.98

“The within-subjects study aims to examine the usability of our tool and whether it can help users create more consistent, explainable rankings compared to a traditional ranking tool.”

validation scope · 8 Study 1: Within-Subjects Study · confidence 0.94

Limits

Method limits

The paper explicitly states that the within-subjects study is not conclusive because of the small sample size. The current evidence is therefore strongest as an initial mixed-methods validation rather than definitive proof of broad effectiveness across all ranking contexts.

Deployment limits

Real-world deployment constraints are visible in the grading case study, where AI features were disabled because of privacy regulations. The paper also positions semester-long deployment, higher-stakes use, and ablation studies as future work, so long-term organizational adoption and robustness are not yet established.

Boundary conditions

The contribution is best suited to subjective multi-criteria ranking tasks where users can articulate, revise, and negotiate criteria over a moderate number of items. Its benefits are most plausible when conflicts between preferred rankings and current explanations are informative rather than unacceptable, and where human oversight over AI or optimization suggestions is important.

Position in field

This paper extends decision-making and explainability tools by treating inconsistency between ranking and explanation as a productive interaction resource rather than a failure to be designed away. It contributes an HCI framing of explainable ranking that combines visualization, optimization, and optional AI with explicit human checking and reflective conflict resolution.

Abstract

We propose an interactive decision-making tool for discovering and exploring explainable rankings for a given set of choices (e.g., job offers, vacation destinations, award candidates). We define an explainable ranking as an ordering of choices based on some consistent weighting of measured criteria. Our tool is designed to help users explore different orderings, criteria, and criterion weights in search of an explainable ranking that reflects their own personal preferences. To achieve this, we combine visualization, optimization, and (optionally) the integration of AI to help users identify and correct or explain inconsistencies in their evaluation of different choices. Through user experiments, we demonstrate that our tool leads to more consistent explainable rankings with greater user confidence.