Can LLM-Simulated Practice and Feedback Upskill Human Counselors? A Randomized Study with 90+ Novice Counselors
This is a strong CHI paper because it does not stop at showing that LLM patient simulation is feasible; it tests whether the training actually changes novice counselor behavior. The key result is nuanced and important: practice alone is not enough, and feedback appears to be the mechanism that drives improvement.
Axes Lens
Rare contribution shape, typical evidence profile. The point here is not a score. It is to show what kind of claim the paper makes, and whether the evidence pattern is unusual or baseline in this 268-review set.
Contribution shape
- Knowledge form: causal knowledge (typical · 31/268)
- Novelty type: empirical finding (typical · 68/268)
- Abstraction level: practice (typical · 85/268)
- Generalization target: user population (typical · 75/268)
- Validation mode: controlled experiment (typical · 47/268)
Evidence profile
- Evidence strength: strong (typical · 158/268)
- Claim alignment: strong (typical · 231/268)
- Overclaim risk: medium (typical · 210/268)
Review Summary
This paper’s main contribution is not a new counseling interface in the narrow sense, but a credible causal evaluation of whether LLM-simulated practice can actually upskill novice counselors. That distinction matters in CHI: many systems are compelling as prototypes, but far fewer are tested with randomized evidence against a meaningful comparison condition.
Here, the authors compare practice alone versus practice plus feedback in a study with 94 novice counselors, and the results are clear enough to matter. The feedback condition improved client-centered microskills such as reflections and questions, while practice alone did not improve those skills. Even more striking, empathy declined over time in the practice-only group and was significantly worse than in the feedback group. That is a useful departure from common intuition that more practice by itself should help; the paper shows that simulated interaction without structured feedback may be insufficient and can even leave learners stuck in a solution-oriented mode. The novelty is therefore empirical and methodological: the paper provides randomized evidence about the training value of LLM simulation, rather than only arguing that the simulation is realistic or usable.
At the same time, the validation scope is appropriately bounded. The demonstrated effects are on client-centered microskills in short, text-based, immediate pre/post training sessions with novice counselors. The paper itself notes that it measures immediate skill acquisition rather than long-term retention or transfer to real clinical encounters, so the results should not be overgeneralized to actual therapy practice, other modalities, or broader counselor populations. Within those limits, the study is strong and well aligned with its claims, and it makes a persuasive case that LLM-based training systems need structured feedback to produce meaningful skill development.
What Changed
Canon before
Prior CHI and HCI work on LLM-based training for counseling has emphasized simulated interactions, usability, and output quality, but not rigorous evidence that such systems improve novice counselor skill development.
Departure from common sense
A simulated patient alone is not enough to upskill novice counselors; in this study, practice without feedback not only failed to improve performance but was associated with worse empathy over time than practice plus feedback.
Actual novelty
The paper’s novelty is a randomized evaluation of an LLM-simulated counseling practice-and-feedback system with novice counselors, directly testing whether the system improves skill development rather than only whether the simulation is plausible or usable.
Evidence
The paper reports a randomized study with 94 novice counselors comparing practice alone versus practice with feedback. The feedback condition improved client-centered microskills such as reflections and questions, while practice alone showed no improvement. Empathy declined in the practice-alone group and was significantly worse than in the feedback group. The paper also states that prior work had not evaluated whether LLM-simulated training systems promote novice skill improvements.
“Despite increasing interest in using LLMs in mental health, to our knowledge, this is the first study to conduct a large-scale evaluation (N = 94) of an LLM-based training system for developing core skills in novice counselors”
actual novelty · Introduction gap statement · confidence 0.74
“We developed an LLM-simulated practice and feedback system and conducted a randomized study with 94 novice counselors, comparing practice alone versus practice with feedback”
departure from common sense · Abstract + Results summary (practice-only empathy decline) · confidence 0.80
“Regardless of the measurement approach employed, our pre-post randomized study focused primarily on assessing immediate skill acquisition rather than long-term retention or transfer to real-world clinical encounters with actual patients”
limitation · Limitations and Future Work · confidence 0.78
“The efficacy of LLM-based practice and feedback training was demonstrated only for client-centered microskills, which represent foundational communication techniques that may serve as a base for therapeutic practice”
validation scope · Scope statement in discussion/limitations · confidence 0.72
Limits
Method limits
The evidence is based on a pre-post randomized study focused on immediate skill acquisition, with behavioral scoring of client-centered microskills and qualitative reflections. The paper itself frames the assessment as immediate rather than long-term, so the causal claims should not be extended beyond the measured training interval.
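As a rough illustration of what a pre-post randomized comparison of this kind measures, the sketch below computes each arm's mean change score and the difference between arms. The scores, the arm names, and the `mean_change` helper are hypothetical and invented for the example; they are not data or analysis code from the paper, and the paper's actual statistical method is not reproduced here.

```python
from statistics import mean


def mean_change(pre, post):
    """Average within-participant change (post minus pre) for one arm."""
    if len(pre) != len(post):
        raise ValueError("pre and post must have one score per participant")
    return mean(b - a for a, b in zip(pre, post))


# Hypothetical skill scores, made up for illustration only (not study data).
practice_only = {"pre": [3, 4, 2, 5], "post": [3, 4, 2, 4]}
practice_plus_feedback = {"pre": [3, 4, 2, 5], "post": [5, 5, 4, 6]}

change_practice = mean_change(practice_only["pre"], practice_only["post"])
change_feedback = mean_change(
    practice_plus_feedback["pre"], practice_plus_feedback["post"]
)

# Difference in change scores: how much more the feedback arm improved
# than the practice-only arm over the same measured interval.
difference_in_change = change_feedback - change_practice
print(change_practice, change_feedback, difference_in_change)
```

Under random assignment, a difference like this speaks only to the interval between the pre and post measurements, which is why the immediate-acquisition framing above limits how far the causal claim can be carried.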
Deployment limits
The validated setting is novice counselors in a controlled training context using simulated practice and structured feedback. The findings do not establish effectiveness for experienced counselors, real clinical encounters, other therapeutic modalities, or deployment without feedback scaffolding.
Boundary conditions
The demonstrated effects are bounded to client-centered microskills and short-term training outcomes in text-based counseling practice. The paper notes that the study assessed immediate skill acquisition rather than long-term retention or transfer to real-world clinical encounters with actual patients.
Position in field
This work moves beyond prior LLM counseling-training papers that emphasized simulation quality or usability by providing randomized evidence that feedback is critical for skill gains. It positions LLM simulation as a potentially scalable training tool, but only when paired with structured feedback.