
modality:ultrasound · 16 pages · 16 reviewed · 0 imported

Ultrasound

This page groups the papers in the current SSI review database that carry the real `modality:` tag `modality:ultrasound`.

The list below includes every paper page that currently carries this technique label.

Papers

reviewed · Proceedings of Interspeech 2023 · 2023

Adaptation of Tongue Ultrasound-Based Silent Speech Interfaces Using Spatial Transformer Networks

László Tóth, Amin Honarmandi Shandiz, Gábor Gosztolya, Tamás Gábor Csapó

Strong full-text-backed evidence that most of the gain comes from fast input alignment, not from inventing a new SSI stack.

reviewed · arXiv / imported corpus page · 2023

Speech Reconstruction from Silent Tongue and Lip Articulation By Pseudo Target Generation and Domain Adversarial Training

Rui-Chen Zheng, Yang Ai, Zhen-Hua Ling

Strong SSI paper that improves silent speech reconstruction by generating pseudo acoustic targets and using domain adversarial training to address domain mismatch. Validated on the TaL dataset, showing substantial WER and MOS gains over TaLNet.

reviewed · arXiv / imported corpus page · 2022

Improved Processing of Ultrasound Tongue Videos by Combining ConvLSTM and 3D Convolutional Networks

Amin Honarmandi Shandiz, László Tóth

An empirically supported, incremental advancement showing that hybrid 3D-CNN plus ConvLSTM models modestly outperform prior ultrasound tongue video SSI architectures in mel-spectrogram regression accuracy and model efficiency on single-speaker data.

reviewed · arXiv / imported corpus page · 2021

Speech Synthesis from Text and Ultrasound Tongue Image-based Articulatory Input

Tamás Gábor Csapó, László Tóth, Gábor Gosztolya, Alexandra Markó

Helpful side information, not standalone SSI.

reviewed · arXiv / imported corpus page · 2021

Neural Speaker Embeddings for Ultrasound-based Silent Speech Interfaces

Amin Honarmandi Shandiz, László Tóth, Gábor Gosztolya, Alexandra Markó, Tamás Gábor Csapó

The ultrasound-based x-vector speaker embedding is highly effective for speaker recognition, achieving under 1% error on unseen speakers, but its integration yields only a marginal improvement in multi-speaker ultrasound-to-speech synthesis accuracy.

reviewed · arXiv / imported corpus page · 2021

Voice Activity Detection for Ultrasound-based Silent Speech Interfaces using Convolutional Neural Networks

Amin Honarmandi Shandiz, László Tóth

Preprocessing paper, narrow but legitimate.

reviewed · arXiv / imported corpus page · 2021

Improving Neural Silent Speech Interface Models by Adversarial Training

Amin Honarmandi Shandiz, László Tóth, Gábor Gosztolya, Alexandra Markó, Tamás Gábor Csapó

A clean, well-executed incremental advance that uses a GAN loss to modestly improve articulatory-to-acoustic mapping from ultrasound, validated objectively on two single-speaker corpora.

reviewed · arXiv / imported corpus page · 2021

3D Convolutional Neural Networks for Ultrasound-Based Silent Speech Interfaces

László Tóth, Amin Honarmandi Shandiz

Temporal context helps, but the evidence is a single-speaker vocoder-parameter study.

reviewed · arXiv / imported corpus page · 2021

Convolutional Neural Network-Based Age Estimation Using B-Mode Ultrasound Tongue Image

Kele Xu, Tamás Gábor Csapó, Ming Feng

Real signal, wrong target for SSI.

reviewed · arXiv / imported corpus page · 2020

Ultra2Speech -- A Deep Learning Framework for Formant Frequency Estimation and Tracking from Ultrasound Tongue Images

Pramit Saha, Yadong Liu, Bryan Gick, Sidney Fels

Strong ultrasound SSI paper with unusually clear quantitative gains.

reviewed · arXiv / imported corpus page · 2019

Ultrasound-based Silent Speech Interface Built on a Continuous Vocoder

Tamás Gábor Csapó, Mohammed Salah Al-Radhi, Géza Németh, Gábor Gosztolya, Tamás Grósz, László Tóth, Alexandra Markó

The key advancement is continuous F0 tracking via CNNs, which yields lower pitch error and a slight naturalness improvement over discontinuous F0 pipelines in ultrasound SSI.

reviewed · arXiv / imported corpus page · 2019

Autoencoder-Based Articulatory-to-Acoustic Mapping for Ultrasound Silent Speech Interfaces

Gábor Gosztolya, Ádám Pintér, László Tóth, Tamás Grósz, Alexandra Markó, Tamás Gábor Csapó

The paper advances ultrasound silent speech interfaces by compressing ultrasound images using an autoencoder bottleneck prior to spectral parameter prediction, resulting in improved accuracy and more natural synthesized speech with smaller models.

reviewed · arXiv / imported corpus page · 2019

Denoising convolutional autoencoder based B-mode ultrasound tongue image feature extraction

Bo Li, Kele Xu, Dawei Feng, Haibo Mi, Huaimin Wang, Jian Zhu

The denoising convolutional autoencoder (DCAE) provides cleaner, more robust ultrasound tongue features, leading to improved silent speech recognition and outperforming prior feature extraction strategies.

reviewed · CHI '19 · 2019

SottoVoce: An Ultrasound Imaging-Based Silent Speech Interaction Using Deep Neural Networks

Naoki Kimura, Michinari Kono, Jun Rekimoto

A solid proof of concept that reconstructs speech audio from ultrasound for controlling unmodified smart speakers, showcasing important system design insight despite prototype limitations in latency, hardware bulk, and speaker dependency.

reviewed · arXiv / imported corpus page · 2017

Updating the silent speech challenge benchmark with deep learning

Yan Ji, Licheng Liu, Hongcui Wang, Zhilei Liu, Zhibin Niu, B. Denby

Benchmark update with a real, reproducible WER gain.

reviewed · arXiv / imported corpus page · 2016

Contour-based 3d tongue motion visualization using ultrasound image sequences

Kele Xu, Yin Yang, Clémence Leboullenger, Pierre Roussel, B. Denby

Useful tongue-modeling tool, not a recognizer.