2020 · arXiv / imported corpus page · Field expert review · confidence high

Application of Just-Noticeable Difference in Quality as Environment Suitability Test for Crowdsourcing Speech Quality Assessment Task

Babak Naderi, Sebastian Möller

Strong crowdsourcing methodology paper, not SSI.

Verdict: full-text draft · Priority: medium · Confidence: high · Basis: full text · Coverage: high

Reading guidance

Verdict: full-text draft · priority medium · confidence high
Why it matters: Useful screening method for crowdsourced speech-quality studies, but not an SSI paper.
What to trust: Basis: full text. Coverage: high. 3 evidence records back the review.
What is weak: Not continuous monitoring; the environment can change after the screening step, and inserting the test too often increases session time. Findings are tied to the tested JND levels, degradation conditions, and the specific crowdsourcing setup. Useful only as an evaluation-control mechanism, not an SSI deployment component. Crowdsourced speech-quality environment screening only. Overclaim risk: low.
Read before: SSI review rubric
Read next: SSI archive

Axes

Task: crowdsourcing environment screening
Modality: speech audio
Hardware: listener playback device + headphone/speaker setup
Output: labels
Metrics: The highest correlation to laboratory MOS came from JND 6 dB with at least 3 of 4 answers correct; the lenient setup (JND 10 dB, at least 1 of 4 answers correct) rejected only 15% of answers, versus 61% for the strict setup
Evaluation mode: laboratory and crowdsourcing subjective evaluation with PCC, SRCC, and RMSE against laboratory MOS
Review confidence: high
Overclaim risk: low
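
The evaluation mode above compares crowdsourced MOS with laboratory MOS using PCC, SRCC, and RMSE. A minimal sketch of those three measures follows, using only the standard library. The function names and the per-condition MOS values are hypothetical and are not taken from the paper; the paper's own processing pipeline is not reproduced here.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient (PCC) between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation (SRCC): Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def rmse(x, y):
    """Root mean squared error between paired values."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

# Hypothetical per-condition MOS values, for illustration only.
lab_mos = [1.5, 2.4, 3.1, 3.9, 4.6]
crowd_mos = [1.8, 2.2, 3.4, 3.7, 4.4]

pcc = pearson(lab_mos, crowd_mos)
srcc = spearman(lab_mos, crowd_mos)
err = rmse(lab_mos, crowd_mos)
print(f"PCC={pcc:.3f} SRCC={srcc:.3f} RMSE={err:.3f}")
```

In this framing, a screening setup is judged by how much the crowd MOS of the "passed" answers moves toward the laboratory MOS on these three measures.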

Expert take

The full text supports a practical claim: a short JNDQ-based gate can distinguish better and worse remote listening environments before crowd MOS collection. The strongest result is methodological rather than algorithmic, with the paper quantifying how stricter versus more lenient screening changes correlation to laboratory MOS and rejection rates. That is valuable for speech-quality experiments, but it has no direct SSI sensing or reconstruction contribution.
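
The trade-off between strict and lenient screening can be sketched as a simple k-of-4 pass rule. This is an illustration under stated assumptions, not the authors' implementation: `passes_gate`, `rejection_rate`, and the session data are hypothetical, and the sketch varies only the required number of correct answers, not the JND level (SNR difference) of the stimuli.

```python
def passes_gate(answers_correct, min_correct):
    """Return True if enough of the JNDQ questions were answered correctly.

    answers_correct: list of booleans, one per question (4 in this sketch).
    min_correct: required number of correct answers (hypothetical parameter).
    """
    return sum(answers_correct) >= min_correct

def rejection_rate(sessions, min_correct):
    """Fraction of sessions that fail the gate at the given threshold."""
    failed = sum(1 for s in sessions if not passes_gate(s, min_correct))
    return failed / len(sessions)

# Hypothetical screening results: one list of 4 outcomes per session.
sessions = [
    [True, True, True, True],
    [True, True, True, False],
    [True, False, False, False],
    [False, False, False, False],
]

strict = rejection_rate(sessions, min_correct=3)   # at least 3 of 4 correct
lenient = rejection_rate(sessions, min_correct=1)  # at least 1 of 4 correct
print(f"strict rejects {strict:.0%}, lenient rejects {lenient:.0%}")
```

A stricter threshold discards more answers in exchange for a cleaner subset, which is the same tension the review notes between correlation to laboratory MOS and rejection rate.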

True value

Useful screening method for crowdsourced speech-quality studies, but not an SSI paper.

What changed

Canon before

Crowdsourced speech-quality studies had limited control over participant playback environment and no lightweight suitability screen.

Delta from canon

Introduces a modified JNDQ gate that screens playback device and background-noise suitability before MOS collection.

Position in field

Crowdsourcing methodology paper outside SSI core scope.

Evidence

“ As a consequence, a properly designed JND test seems to be appropriate for distinguishing noisy environment from silent conditions, once the listening device is known. ”

author_claim · IV. DISCUSSION AND CONCLUSION · confidence 0.98

“ Based on the laboratory experiment, we selected three JND (out of 4) in SNR levels for the crowdsourcing evaluation, namely 10, […] ” [Figure axis: Number of Correct Answers, thresholds >=0, >=1, >=2, >=3, ==4]

metric · B. Crowdsourcing evaluation · confidence 0.97

“ We divided the submitted answers into two groups; "Passed" answers which passed the corresponding modified JNDQ test and "Failed" answers which failed the […] ”
“ In this paper we assessed the application of JND in quality as an environment suitability test for crowdsourcing. ”

limitation · IV. DISCUSSION AND CONCLUSION · confidence 0.95

Limits

Technical limits

Not continuous monitoring; the environment can change after the screening step, and inserting the test too often increases session time.

Evaluation limits

Findings are tied to the tested JND levels, degradation conditions, and the specific crowdsourcing setup.

Deployment limits

Useful only as an evaluation-control mechanism, not an SSI deployment component.

Scope limits

Crowdsourced speech-quality environment screening only.