Comparison page · 3 reviewed papers

Major silent speech approaches compared.

The table below draws only on the expert review records. It compares sensing, evidence strength, evaluation, practicality, and open questions without inventing a ranking.

Each row links back to the source review page so every claim stays traceable.

Comparison snapshot

This page uses only fields already stored in the review database. Missing fields are shown as "Not stated in review" instead of being filled in by guesswork.
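The placeholder rule above can be sketched in a few lines. This is a minimal illustration, not the page's actual implementation: the field names in `FIELDS`, the `render_row` helper, and the example record are all hypothetical, since the review database schema is not shown here.

```python
# Hypothetical field names; the real review database schema is not shown on this page.
FIELDS = [
    "paper",
    "evidence_strength",
    "sensing_modality",
    "evaluation_setting",
    "practicality",
    "open_questions",
]
PLACEHOLDER = "Not stated in review"


def render_row(record: dict) -> list[str]:
    """Return one table row, using the placeholder for absent or empty fields."""
    row = []
    for field in FIELDS:
        value = record.get(field)
        # Treat None and whitespace-only strings as missing; never infer a value.
        if value is None or not str(value).strip():
            row.append(PLACEHOLDER)
        else:
            row.append(str(value).strip())
    return row


# Illustrative record with several fields missing or blank.
example = {"paper": "SottoVoce", "sensing_modality": "ultrasound", "practicality": "  "}
print("\t".join(render_row(example)))
```

Under these assumptions, a blank or absent field is always rendered as the literal placeholder, so a reader can tell a gap in the review from a value the review actually recorded.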

Paper	Evidence strength	Sensing modality	Evaluation setting	Practicality	Open questions
SilentSpeller: Towards mobile, hands-free, silent speech text entry using electropalatography CHI '22 · 2022	Confidence high · 8 evidence records	electropalatography · SmartPalate custom dental retainer with 124 capacitive electrodes sampled at 100 Hz, connected wired or wireless to processing device.	Offline isolated word recognition with 10-fold cross validation; reserve testing on 100 unseen words; seated vs walking phrase recognition; live interactive text entry with push-to-talk interface and edit gestures.	High for privacy-sensitive communication and hands-busy users; useful where speech is socially inappropriate and users can manage oral hardware.	user_independence; comfort; social_acceptability; broader_symbol_input · The system targets discreet text entry, explicitly trading away naturalness of silent speech for reliability; not a conversational silent speech system. · Recognition confusions occur mainly for letters with similar palatograms, especially EE-sound letters (B/P, D/T/Z). Strong user-dependence; user-independent recognition remains poor.
SottoVoce: An Ultrasound Imaging-Based Silent Speech Interaction Using Deep Neural Networks CHI '19 · 2019	Confidence high · 7 evidence records	ultrasound · 3.5 MHz convex ultrasound probe attached under the jaw; ultrasound images are shown on a display monitor and stored as digitized video.	Quantitative smart speaker success rates, word error rates with Google speech-to-text, and qualitative user adaptation observations.	Medium as a systems design contribution and research direction; low-to-medium as a directly deployable interface in its reported form.	real_time_interaction; open_vocabulary; speaker_independence; wearable_ultrasound · Prototype supports only a fixed small command vocabulary in speaker-dependent training; no demonstration of open vocabulary or continuous real-time interaction. · Speaker-dependent training; latency unsuitable for real-time use (2.61 s per utterance); differences in silent versus voiced articulation require user adaptation; bulky hardware; potential unknown safety issues with continuous ultrasound emission; small vocabulary size.
NasoVoce: A Nose-Mounted Low-Audibility Speech Interface for Always-Available Speech Interaction CHI '26 / arXiv · 2026	Confidence high · 4 evidence records	acoustic; vibration; multimodal · MEMS microphone (Syntiant SPH0141LM4H-1) and MEMS vibration sensor (Syntiant V2S200D) integrated in smart glasses nose pads providing synchronized PDM output.	Quantitative ASR accuracy (WER, CER) on held-out data, objective perceptual quality metrics (PESQ, STOI), MUSHRA subjective ratings with 50 evaluators, and qualitative in-the-wild recordings in four real-world environments.	High for wearable AI voice agents by addressing sensor placement, noise robustness, perceptual quality, and practical evaluation in diverse contexts.	continuous_streaming; adaptive_sensor_gating; longitudinal_wearability; physiological_variability · Targets low-audibility whispered speech, not fully silent speech without any acoustic leakage; assumes hand-covering mouth for privacy. · Fusion model not fully streaming; whispered vibration signals remain weak limiting enhancement quality; performance under extreme noise favors vibration sensor input only at very low SNR.

Source reviews

reviewed · confidence high

SilentSpeller: Towards mobile, hands-free, silent speech text entry using electropalatography

Naoki Kimura, Tan Gemicioglu, Jonathan Womack, Richard Li, Yuhui Zhao, Abdelkareem Bedri, Zixiong Su, Alex Olwal, Jun Rekimoto, Thad Starner

SilentSpeller is a strong, rigorously tested SSI system that reframes silent speech as silent spelling, enabling large-vocabulary recognition, live text entry, and robustness while walking with in-mouth electropalatography sensors.

reviewed · confidence high

SottoVoce: An Ultrasound Imaging-Based Silent Speech Interaction Using Deep Neural Networks

Naoki Kimura, Michinari Kono, Jun Rekimoto

A solid proof of concept that reconstructs speech audio from ultrasound to control unmodified smart speakers, offering important system design insights despite prototype limitations in latency, hardware bulk, and speaker dependency.

reviewed · confidence high

NasoVoce: A Nose-Mounted Low-Audibility Speech Interface for Always-Available Speech Interaction

Jun Rekimoto, Yu Nishimura, Bojian Yang

A strong deployment-focused speech interface leveraging a novel nose-pad dual-sensor configuration and multimodal fusion to enable robust low-audibility speech interaction with AI under noise, backed by extensive evaluation.